---
title: RAG Pipeline For LLMs
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
tags:
  - rag
  - question-answering
  - nlp
  - faiss
  - transformers
  - wikipedia
  - semantic-search
  - huggingface
  - sentence-transformers
models:
  - sentence-transformers/all-mpnet-base-v2
  - deepset/roberta-base-squad2
datasets:
  - wikipedia
---

# 🔍 RAG Pipeline For LLMs 🚀

[![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/Mehardeep7/rag-pipeline-llm)
[![Python](https://img.shields.io/badge/Python-3.8+-blue.svg)](https://python.org)

## 📖 Project Overview

An intelligent **Retrieval-Augmented Generation (RAG)** pipeline that combines semantic search with question answering. The system fetches Wikipedia articles, processes them into searchable chunks, and uses state-of-the-art AI models to provide accurate, context-aware answers.

## ✨ Key Features

- 📚 **Dynamic Knowledge Retrieval** from Wikipedia with error handling
- 🧮 **Semantic Search** using sentence transformers (no keyword dependency)
- ⚡ **Fast Vector Similarity** with FAISS indexing (sub-second search)
- 🤖 **Intelligent Answer Generation** using pre-trained QA models
- 📊 **Confidence Scoring** for answer quality assessment
- 🎛️ **Customizable Parameters** (chunk size, retrieval count, overlap)
- ✂️ **Smart Text Chunking** with overlapping segments for context preservation

## 🏗️ Architecture

```
User Query → Embedding → FAISS Search → Retrieve Chunks → QA Model → Answer + Confidence
```

## 🤖 AI Models Used

- **📏 Text Chunking**: `sentence-transformers/all-mpnet-base-v2` tokenizer
- **🧮 Vector Embeddings**: `sentence-transformers/all-mpnet-base-v2` (768-dimensional)
- **❓ Question Answering**: `deepset/roberta-base-squad2` (RoBERTa fine-tuned on SQuAD 2.0)
- **🔍 Vector Search**: FAISS `IndexFlatL2` for L2-distance similarity

## 🚀 How to Use

1. **📖 Process Article**: Enter any Wikipedia topic and configure chunk settings
2. **❓ Ask Questions**: Switch to the Q&A tab and enter your questions
3. **📊 View Results**: Explore answers with confidence scores and similarity metrics
4. **🔍 Analyze**: Check the retrieved context and visualization analytics

## 💡 Example Usage

```
Topic: "Artificial Intelligence"
Question: "What is machine learning?"
Answer: "Machine learning is a subset of artificial intelligence..."
Confidence: 89.7%
```

## 🔧 Configuration Options

- **Chunk Size**: 128-512 tokens (default: 256)
- **Overlap**: 10-50 tokens (default: 20)
- **Retrieval Count**: 1-10 chunks (default: 3)

## 📊 Performance

- **Search Speed**: Sub-second retrieval for 1000+ chunks
- **Accuracy**: High precision with confidence scoring
- **Memory Efficient**: Optimized chunk sizes prevent token overflow

## 🔗 Links

- **📝 Full Project**: [GitHub Repository](https://github.com/Mehardeep79/RAG_Pipeline_LLM)
- **📓 Jupyter Notebook**: Complete implementation with explanations
- **🌐 Streamlit App**: Alternative web interface

## 🤝 Credits

Built with ❤️ using:

- 🤗 **Hugging Face** for transformers and model hosting
- ⚡ **FAISS** for efficient vector search
- 🎨 **Gradio** for the interactive interface
- 📖 **Wikipedia API** for knowledge content

---

**⭐ If you find this useful, please give it a star on GitHub!**
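## 🧪 Appendix: Chunking Sketch

The overlapping-chunk step behind the Chunk Size and Overlap settings can be sketched in plain Python. This is a minimal illustration, not the Space's actual `app.py`: whitespace tokenization stands in for the `all-mpnet-base-v2` tokenizer, and the function name `chunk_text` is hypothetical.

```python
def chunk_text(text, chunk_size=256, overlap=20):
    """Split text into overlapping chunks of roughly chunk_size tokens.

    Whitespace splitting stands in for the model tokenizer. Each chunk
    repeats the last `overlap` tokens of its predecessor, so an answer
    that straddles a chunk boundary is still recoverable from one chunk.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    tokens = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # final window already covers the tail of the text
    return chunks
```

With the defaults above (256-token chunks, 20-token overlap), a 600-token article yields three chunks, and adjacent chunks share 20 boundary tokens.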
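## 🧪 Appendix: Retrieval Sketch

The "Embedding → FAISS Search" stage of the architecture diagram amounts to nearest-neighbour search under L2 distance. The toy sketch below mimics what `faiss.IndexFlatL2` computes, using a brute-force loop and made-up 3-dimensional vectors in place of the real 768-dimensional mpnet embeddings; `l2_search` and the sample vectors are illustrative assumptions, not the Space's code.

```python
def l2_search(index_vectors, query_vec, k=3):
    """Return indices of the k vectors closest to query_vec by squared
    L2 distance -- a brute-force stand-in for faiss.IndexFlatL2.search."""
    dists = [
        (sum((a - b) ** 2 for a, b in zip(vec, query_vec)), i)
        for i, vec in enumerate(index_vectors)
    ]
    return [i for _, i in sorted(dists)[:k]]

# Toy 3-d "embeddings" for four chunks (real ones would be 768-d).
chunk_vecs = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
query = [1.0, 0.0, 0.0]
print(l2_search(chunk_vecs, query, k=2))  # indices of the 2 nearest chunks
```

The returned chunk indices select the text passages that get concatenated into the context passed to the `deepset/roberta-base-squad2` QA model.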