---
title: RAG Pipeline For LLMs
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
tags:
- rag
- question-answering
- nlp
- faiss
- transformers
- wikipedia
- semantic-search
- huggingface
- sentence-transformers
models:
- sentence-transformers/all-mpnet-base-v2
- deepset/roberta-base-squad2
datasets:
- wikipedia
---
# ๐Ÿ” RAG Pipeline For LLMs ๐Ÿš€
[![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/Mehardeep7/rag-pipeline-llm)
[![Python](https://img.shields.io/badge/Python-3.8+-blue.svg)](https://python.org)
## 📖 Project Overview
An intelligent **Retrieval-Augmented Generation (RAG)** pipeline that combines semantic search with question-answering capabilities. This system fetches Wikipedia articles, processes them into searchable chunks, and uses state-of-the-art AI models to provide accurate, context-aware answers.
## ✨ Key Features
- 📚 **Dynamic Knowledge Retrieval** from Wikipedia with error handling
- 🧮 **Semantic Search** using sentence transformers (no keyword dependency)
- ⚡ **Fast Vector Similarity** with FAISS indexing (sub-second search)
- 🤖 **Intelligent Answer Generation** using pre-trained QA models
- 📊 **Confidence Scoring** for answer quality assessment
- 🎛️ **Customizable Parameters** (chunk size, retrieval count, overlap)
- ✂️ **Smart Text Chunking** with overlapping segments for context preservation
## ๐Ÿ—๏ธ Architecture
```
User Query โ†’ Embedding โ†’ FAISS Search โ†’ Retrieve Chunks โ†’ QA Model โ†’ Answer + Confidence
```
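In code, these stages compose roughly as follows (a sketch with hypothetical names — `embed`, `index`, and `qa` stand in for the actual model objects):

```python
def answer_query(query, embed, index, chunks, qa, top_k=3):
    """One pass of the RAG pipeline: embed -> search -> retrieve -> QA."""
    query_vec = embed(query)                         # User Query -> Embedding
    _dists, ids = index.search(query_vec, top_k)     # FAISS Search
    context = " ".join(chunks[i] for i in ids)       # Retrieve Chunks
    result = qa(question=query, context=context)     # QA Model
    return result["answer"], result["score"]         # Answer + Confidence
```

The components are injected as arguments, so the same flow works with the real models or with stubs for testing.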
## 🤖 AI Models Used
- **📝 Text Chunking**: `sentence-transformers/all-mpnet-base-v2` tokenizer
- **🧮 Vector Embeddings**: `sentence-transformers/all-mpnet-base-v2` (768-dimensional)
- **❓ Question Answering**: `deepset/roberta-base-squad2` (RoBERTa fine-tuned on SQuAD 2.0)
- **🔍 Vector Search**: FAISS `IndexFlatL2` for L2 distance similarity
## 🚀 How to Use
1. **📖 Process Article**: Enter any Wikipedia topic and configure chunk settings
2. **❓ Ask Questions**: Switch to the Q&A tab and enter your questions
3. **📊 View Results**: Explore answers with confidence scores and similarity metrics
4. **🔍 Analyze**: Check the retrieved context and visualization analytics
## 💡 Example Usage
```
Topic: "Artificial Intelligence"
Question: "What is machine learning?"
Answer: "Machine learning is a subset of artificial intelligence..."
Confidence: 89.7%
```
## 🔧 Configuration Options
- **Chunk Size**: 128-512 tokens (default: 256)
- **Overlap**: 10-50 tokens (default: 20)
- **Retrieval Count**: 1-10 chunks (default: 3)
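These options map naturally onto a small validated config object (a sketch — the field names are assumptions, not the app's actual parameter names):

```python
from dataclasses import dataclass

@dataclass
class RAGConfig:
    chunk_size: int = 256       # tokens per chunk (allowed: 128-512)
    overlap: int = 20           # shared tokens between chunks (allowed: 10-50)
    retrieval_count: int = 3    # chunks passed to the QA model (allowed: 1-10)

    def __post_init__(self):
        # Enforce the documented ranges at construction time
        if not 128 <= self.chunk_size <= 512:
            raise ValueError("chunk_size must be in [128, 512]")
        if not 10 <= self.overlap <= 50:
            raise ValueError("overlap must be in [10, 50]")
        if not 1 <= self.retrieval_count <= 10:
            raise ValueError("retrieval_count must be in [1, 10]")
```

Validating at construction keeps out-of-range slider or API values from silently producing oversized model inputs.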
## 📊 Performance
- **Search Speed**: Sub-second retrieval for 1000+ chunks
- **Answer Quality**: Every answer carries a confidence score, so low-confidence results are easy to spot and filter
- **Memory Efficient**: Bounded chunk sizes keep inputs within the QA model's token limit
## 🔗 Links
- **📁 Full Project**: [GitHub Repository](https://github.com/Mehardeep79/RAG_Pipeline_LLM)
- **📓 Jupyter Notebook**: Complete implementation with explanations
- **🌐 Streamlit App**: Alternative web interface
## ๐Ÿค Credits
Built with โค๏ธ using:
- ๐Ÿค— **Hugging Face** for transformers and model hosting
- โšก **FAISS** for efficient vector search
- ๐ŸŽจ **Gradio** for the interactive interface
- ๐Ÿ“– **Wikipedia API** for knowledge content
---
**โญ If you find this useful, please give it a star on GitHub!**