---
title: RAG Pipeline For LLMs
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
tags:
  - rag
  - question-answering
  - nlp
  - faiss
  - transformers
  - wikipedia
  - semantic-search
  - huggingface
  - sentence-transformers
models:
  - sentence-transformers/all-mpnet-base-v2
  - deepset/roberta-base-squad2
datasets:
  - wikipedia
---

# 🔍 RAG Pipeline For LLMs 🚀

[![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/Mehardeep7/rag-pipeline-llm)
[![Python](https://img.shields.io/badge/Python-3.8+-blue.svg)](https://python.org)

## 📖 Project Overview

An intelligent **Retrieval-Augmented Generation (RAG)** pipeline that combines semantic search with question-answering capabilities. This system fetches Wikipedia articles, processes them into searchable chunks, and uses state-of-the-art AI models to provide accurate, context-aware answers.
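
One way to implement the Wikipedia fetch, sketched with the `wikipedia` PyPI package (`fetch_article` is an illustrative name, not necessarily the app's own helper):

```python
import wikipedia  # the `wikipedia` PyPI package

def fetch_article(topic: str) -> str:
    """Fetch raw page text for a topic, handling ambiguous titles."""
    try:
        return wikipedia.page(topic, auto_suggest=False).content
    except wikipedia.DisambiguationError as err:
        # Fall back to the first disambiguation option Wikipedia suggests
        return wikipedia.page(err.options[0], auto_suggest=False).content
```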

## ✨ Key Features

- 📚 **Dynamic Knowledge Retrieval** from Wikipedia with error handling
- 🧮 **Semantic Search** using sentence transformers (no keyword dependency)
- ⚡ **Fast Vector Similarity** with FAISS indexing (sub-second search)
- 🤖 **Intelligent Answer Generation** using pre-trained QA models
- 📊 **Confidence Scoring** for answer quality assessment
- 🎛️ **Customizable Parameters** (chunk size, retrieval count, overlap)
- ✂️ **Smart Text Chunking** with overlapping segments for context preservation (sketched below)
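
A minimal sketch of such an overlapping chunker, assuming the mpnet tokenizer named below and parameter names (`chunk_size`, `overlap`) that mirror the Configuration Options section:

```python
from typing import List
from transformers import AutoTokenizer

# Using the embedding model's own tokenizer keeps chunk lengths consistent
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-mpnet-base-v2")

def chunk_text(text: str, chunk_size: int = 256, overlap: int = 20) -> List[str]:
    """Split text into overlapping token windows, decoded back to strings."""
    ids = tokenizer.encode(text, add_special_tokens=False)
    chunks, step = [], chunk_size - overlap
    for start in range(0, len(ids), step):
        chunks.append(tokenizer.decode(ids[start:start + chunk_size]))
        if start + chunk_size >= len(ids):
            break
    return chunks
```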

## ๐Ÿ—๏ธ Architecture

```
User Query → Embedding → FAISS Search → Retrieve Chunks → QA Model → Answer + Confidence
```

## 🤖 AI Models Used

- **📝 Text Chunking**: `sentence-transformers/all-mpnet-base-v2` tokenizer
- **🧮 Vector Embeddings**: `sentence-transformers/all-mpnet-base-v2` (768-dimensional)
- **❓ Question Answering**: `deepset/roberta-base-squad2` (RoBERTa fine-tuned on SQuAD 2.0)
- **🔍 Vector Search**: FAISS IndexFlatL2 for L2 distance similarity
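
Wired together, the query path is only a few calls. A hedged sketch using the models listed above (`answer` and the `chunks` list from the chunker sketch are illustrative, not the app's exact code):

```python
import faiss
from sentence_transformers import SentenceTransformer
from transformers import pipeline

embedder = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")
qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

# Build the index once: 768-d embeddings, exact L2 distance
chunk_vecs = embedder.encode(chunks).astype("float32")  # `chunks` from chunk_text()
index = faiss.IndexFlatL2(chunk_vecs.shape[1])
index.add(chunk_vecs)

def answer(question: str, k: int = 3):
    query_vec = embedder.encode([question]).astype("float32")
    _, ids = index.search(query_vec, k)              # top-k nearest chunks
    context = " ".join(chunks[i] for i in ids[0])    # stitch retrieved context
    result = qa(question=question, context=context)  # extractive QA
    return result["answer"], result["score"]         # answer + confidence
```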

## 🚀 How to Use

1. **📖 Process Article**: Enter any Wikipedia topic and configure chunk settings
2. **❓ Ask Questions**: Switch to the Q&A tab and enter your questions
3. **📊 View Results**: Explore answers with confidence scores and similarity metrics
4. **🔍 Analyze**: Check retrieved context and visualization analytics

## 💡 Example Usage

```
Topic: "Artificial Intelligence"
Question: "What is machine learning?"
Answer: "Machine learning is a subset of artificial intelligence..."
Confidence: 89.7%
```

## 🔧 Configuration Options

- **Chunk Size**: 128-512 tokens (default: 256)
- **Overlap**: 10-50 tokens (default: 20)
- **Retrieval Count**: 1-10 chunks (default: 3)
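
These knobs map naturally onto Gradio inputs. An illustrative wiring for the Q&A tab, reusing the hypothetical `answer` helper sketched above; chunk size and overlap would be wired to the chunking step the same way:

```python
import gradio as gr

demo = gr.Interface(
    fn=answer,  # (question, k) -> (answer_text, confidence)
    inputs=[
        gr.Textbox(label="Question"),
        gr.Slider(1, 10, value=3, step=1, label="Retrieval Count (chunks)"),
    ],
    outputs=[gr.Textbox(label="Answer"), gr.Number(label="Confidence")],
)

if __name__ == "__main__":
    demo.launch()
```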

## 📊 Performance

- **Search Speed**: Sub-second retrieval for 1000+ chunks
- **Accuracy**: Extractive answers are grounded in retrieved text, with confidence scores for filtering weak results
- **Memory Efficient**: Bounded chunk sizes keep inputs within the QA model's token limit

## 🔗 Links

- **📁 Full Project**: [GitHub Repository](https://github.com/Mehardeep79/RAG_Pipeline_LLM)
- **📓 Jupyter Notebook**: Complete implementation with explanations
- **🌐 Streamlit App**: Alternative web interface

## 🤝 Credits

Built with ❤️ using:
- 🤗 **Hugging Face** for transformers and model hosting
- ⚡ **FAISS** for efficient vector search
- 🎨 **Gradio** for the interactive interface
- 📖 **Wikipedia API** for knowledge content

---

**⭐ If you find this useful, please give it a star on GitHub!**