---
title: RAG Pipeline For LLMs
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
tags:
- rag
- question-answering
- nlp
- faiss
- transformers
- wikipedia
- semantic-search
- huggingface
- sentence-transformers
models:
- sentence-transformers/all-mpnet-base-v2
- deepset/roberta-base-squad2
datasets:
- wikipedia
---
# 🔍 RAG Pipeline For LLMs 🔍
[Hugging Face Space](https://huggingface.co/spaces/Mehardeep7/rag-pipeline-llm) · [Python](https://python.org)
## 📋 Project Overview
An intelligent **Retrieval-Augmented Generation (RAG)** pipeline that combines semantic search with question-answering capabilities. This system fetches Wikipedia articles, processes them into searchable chunks, and uses state-of-the-art AI models to provide accurate, context-aware answers.
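Article fetching presumably goes through the `wikipedia` Python package; the sketch below shows one way to wrap it with the error handling mentioned under Key Features (the function name `fetch_article` is illustrative, not the app's actual API):

```python
# Illustrative sketch, not the app's actual code.
import wikipedia

def fetch_article(topic: str) -> str:
    """Fetch the plain text of a Wikipedia article, handling common lookup errors."""
    try:
        return wikipedia.page(topic, auto_suggest=False).content
    except wikipedia.DisambiguationError as e:
        # Ambiguous topic: fall back to the first disambiguation option.
        return wikipedia.page(e.options[0], auto_suggest=False).content
    except wikipedia.PageError:
        raise ValueError(f"No Wikipedia article found for {topic!r}")
```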
## ✨ Key Features
- 🔍 **Dynamic Knowledge Retrieval** from Wikipedia with error handling
- 🧮 **Semantic Search** using sentence transformers (no keyword dependency)
- ⚡ **Fast Vector Similarity** with FAISS indexing (sub-second search)
- 🤖 **Intelligent Answer Generation** using pre-trained QA models
- 📊 **Confidence Scoring** for answer quality assessment
- 🎛️ **Customizable Parameters** (chunk size, retrieval count, overlap)
- ✂️ **Smart Text Chunking** with overlapping segments for context preservation
## 🏗️ Architecture
```
User Query → Embedding → FAISS Search → Retrieve Chunks → QA Model → Answer + Confidence
```
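A hedged sketch of that query path in Python (the names `embedder`, `index`, `chunks`, and `qa_model` are assumptions, not the app's actual API; they are constructed in the model-loading sketch in the next section):

```python
# Sketch of the query path; `embedder`, `index`, `chunks`, and `qa_model`
# are assumed to exist (see the model-loading sketch below).
import numpy as np

def answer_query(query: str, k: int = 3) -> tuple[str, float]:
    query_vec = embedder.encode([query])                          # 1 x 768 embedding
    _, ids = index.search(np.asarray(query_vec, dtype="float32"), k)
    context = " ".join(chunks[i] for i in ids[0])                 # top-k chunks as context
    result = qa_model(question=query, context=context)            # extractive QA
    return result["answer"], result["score"]                      # answer + confidence
```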
## 🤖 AI Models Used
- 📝 **Text Chunking**: `sentence-transformers/all-mpnet-base-v2` tokenizer
- 🧮 **Vector Embeddings**: `sentence-transformers/all-mpnet-base-v2` (768-dimensional)
- ❓ **Question Answering**: `deepset/roberta-base-squad2` (RoBERTa fine-tuned on SQuAD 2.0)
- 🔍 **Vector Search**: FAISS IndexFlatL2 for L2 distance similarity
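A minimal sketch of wiring these components together, assuming `chunks` is the list of text chunks produced by the chunking step (see the configuration section below):

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer
from transformers import pipeline

# Embedding model (768-dim vectors) and extractive QA model from the list above.
embedder = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")
qa_model = pipeline("question-answering", model="deepset/roberta-base-squad2")

# Embed every chunk and index the vectors for exact L2 search.
embeddings = embedder.encode(chunks)                  # shape: (n_chunks, 768)
index = faiss.IndexFlatL2(embeddings.shape[1])        # FAISS IndexFlatL2
index.add(np.asarray(embeddings, dtype="float32"))
```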
## 🚀 How to Use
1. **📄 Process Article**: Enter any Wikipedia topic and configure chunk settings
2. **❓ Ask Questions**: Switch to the Q&A tab and enter your questions
3. **📊 View Results**: Explore answers with confidence scores and similarity metrics
4. **🔍 Analyze**: Check the retrieved context and visualization analytics
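Under the hood the Space is a Gradio app (per the `sdk` field above); below is a rough sketch of how a two-tab layout like this might be wired, with hypothetical handler stubs standing in for the real pipeline:

```python
import gradio as gr

# Hypothetical handler stubs; the real app.py wires these to the RAG pipeline.
def process_article(topic, chunk_size, overlap):
    return f"Processed '{topic}' (chunk_size={chunk_size}, overlap={overlap})."

def ask_question(question, top_k):
    return f"(answer for {question!r} using top {top_k} chunks)"

with gr.Blocks(title="RAG Pipeline For LLMs") as demo:
    with gr.Tab("Process Article"):
        topic = gr.Textbox(label="Wikipedia topic")
        chunk_size = gr.Slider(128, 512, value=256, step=1, label="Chunk size (tokens)")
        overlap = gr.Slider(10, 50, value=20, step=1, label="Overlap (tokens)")
        status = gr.Markdown()
        gr.Button("Process").click(process_article, [topic, chunk_size, overlap], status)
    with gr.Tab("Q&A"):
        question = gr.Textbox(label="Your question")
        top_k = gr.Slider(1, 10, value=3, step=1, label="Chunks to retrieve")
        answer = gr.Markdown()
        gr.Button("Ask").click(ask_question, [question, top_k], answer)

demo.launch()
```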
## 💡 Example Usage
```
Topic: "Artificial Intelligence"
Question: "What is machine learning?"
Answer: "Machine learning is a subset of artificial intelligence..."
Confidence: 89.7%
```
## 🔧 Configuration Options
- **Chunk Size**: 128-512 tokens (default: 256)
- **Overlap**: 10-50 tokens (default: 20)
- **Retrieval Count**: 1-10 chunks (default: 3)
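A hedged sketch of the overlapping token-window chunking these parameters control, using the mpnet tokenizer named above (the function name `chunk_text` is illustrative):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-mpnet-base-v2")

def chunk_text(text: str, chunk_size: int = 256, overlap: int = 20) -> list[str]:
    """Split text into overlapping windows of at most `chunk_size` tokens."""
    ids = tokenizer.encode(text, add_special_tokens=False)
    step = chunk_size - overlap                   # slide by chunk_size minus overlap
    return [
        tokenizer.decode(ids[start:start + chunk_size])
        for start in range(0, len(ids), step)
    ]
```

Each window starts `chunk_size - overlap` tokens after the previous one, so the shared `overlap` tokens preserve context across chunk boundaries.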
## 📈 Performance
- **Search Speed**: Sub-second retrieval across 1,000+ indexed chunks
- **Answer Quality**: Every answer carries a confidence score, so low-quality answers can be filtered
- **Memory Efficiency**: Bounded chunk sizes keep inputs within the QA model's token limit
## 🔗 Links
- **📂 Full Project**: [GitHub Repository](https://github.com/Mehardeep79/RAG_Pipeline_LLM)
- **📓 Jupyter Notebook**: Complete implementation with explanations
- **🎈 Streamlit App**: Alternative web interface
## 🤝 Credits
Built with ❤️ using:
- 🤗 **Hugging Face** for transformers and model hosting
- ⚡ **FAISS** for efficient vector search
- 🎨 **Gradio** for the interactive interface
- 📚 **Wikipedia API** for knowledge content
---
**⭐ If you find this useful, please give it a star on GitHub!**