---
title: RAG Pipeline For LLMs
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
tags:
  - rag
  - question-answering
  - nlp
  - faiss
  - transformers
  - wikipedia
  - semantic-search
  - huggingface
  - sentence-transformers
models:
  - sentence-transformers/all-mpnet-base-v2
  - deepset/roberta-base-squad2
datasets:
  - wikipedia
---
# RAG Pipeline For LLMs

[Live Demo on Hugging Face Spaces](https://huggingface.co/spaces/Mehardeep7/rag-pipeline-llm) · [Python](https://python.org)
## Project Overview

An intelligent **Retrieval-Augmented Generation (RAG)** pipeline that combines semantic search with extractive question answering. The system fetches Wikipedia articles, splits them into searchable chunks, and uses pre-trained transformer models to produce accurate, context-aware answers.
## Key Features

- **Dynamic Knowledge Retrieval** from Wikipedia, with error handling for missing or ambiguous topics
- **Semantic Search** using sentence-transformer embeddings (no keyword matching required)
- **Fast Vector Similarity** via FAISS indexing (sub-second search)
- **Answer Generation** using a pre-trained extractive QA model
- **Confidence Scoring** to assess answer quality
- **Customizable Parameters** (chunk size, retrieval count, overlap)
- **Smart Text Chunking** with overlapping segments to preserve context (see the sketch below)
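
A minimal sketch of how overlapping, token-based chunking might look, using the tokenizer listed under "AI Models Used" below. `chunk_text` and its defaults are illustrative, not the app's actual code:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-mpnet-base-v2")

def chunk_text(text: str, chunk_size: int = 256, overlap: int = 20) -> list[str]:
    """Split text into chunks of `chunk_size` tokens, each sharing `overlap` tokens with its predecessor."""
    ids = tokenizer.encode(text, add_special_tokens=False)
    chunks, step = [], chunk_size - overlap
    for start in range(0, len(ids), step):
        # Decode each token window back to text so it can be embedded and displayed.
        chunks.append(tokenizer.decode(ids[start:start + chunk_size]))
        if start + chunk_size >= len(ids):
            break
    return chunks
```

The overlap means a sentence cut off at a chunk boundary still appears intact at the start of the next chunk.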
## Architecture

```
User Query → Embedding → FAISS Search → Retrieve Chunks → QA Model → Answer + Confidence
```
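
A hedged sketch of that query path, assuming a pre-built FAISS `index` and the `chunks` list from the chunking step (both names are illustrative):

```python
from sentence_transformers import SentenceTransformer
from transformers import pipeline

# NOTE: illustrative sketch; not the app's actual code.
embedder = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")
qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

def answer(query: str, index, chunks: list[str], k: int = 3):
    query_vec = embedder.encode([query]).astype("float32")
    distances, ids = index.search(query_vec, k)       # k nearest chunks by L2 distance
    context = " ".join(chunks[i] for i in ids[0])     # stitch retrieved chunks into one context
    result = qa(question=query, context=context)
    return result["answer"], result["score"]          # extracted span + confidence
```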
## AI Models Used

- **Text Chunking**: the `sentence-transformers/all-mpnet-base-v2` tokenizer
- **Vector Embeddings**: `sentence-transformers/all-mpnet-base-v2` (768-dimensional)
- **Question Answering**: `deepset/roberta-base-squad2` (RoBERTa fine-tuned on SQuAD 2.0)
- **Vector Search**: FAISS `IndexFlatL2` (exact L2-distance similarity)
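
For illustration, the embedding and indexing steps might be wired together as below; `build_index` is a hypothetical helper, not the app's actual API:

```python
import faiss
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

def build_index(chunks: list[str]) -> faiss.IndexFlatL2:
    # Encode every chunk into a 768-dimensional vector, then index with exact L2 search.
    embeddings = embedder.encode(chunks, convert_to_numpy=True).astype("float32")
    index = faiss.IndexFlatL2(embeddings.shape[1])
    index.add(embeddings)
    return index
```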
## How to Use

1. **Process Article**: Enter any Wikipedia topic and configure the chunking settings (a fetch sketch follows this list)
2. **Ask Questions**: Switch to the Q&A tab and enter your question
3. **View Results**: Explore answers with confidence scores and similarity metrics
4. **Analyze**: Inspect the retrieved context and the visualization analytics
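
Step 1's article fetch might look like the following, assuming the `wikipedia` PyPI package; the app's actual error handling may differ:

```python
import wikipedia

def fetch_article(topic: str) -> str:
    """Fetch the full text of a Wikipedia article, falling back on ambiguous topics."""
    try:
        return wikipedia.page(topic, auto_suggest=False).content
    except wikipedia.DisambiguationError as err:
        # Ambiguous title: fall back to the first suggested option.
        return wikipedia.page(err.options[0], auto_suggest=False).content
```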
## Example Usage

```
Topic: "Artificial Intelligence"
Question: "What is machine learning?"
Answer: "Machine learning is a subset of artificial intelligence..."
Confidence: 89.7%
```
## Configuration Options

- **Chunk Size**: 128-512 tokens (default: 256)
- **Overlap**: 10-50 tokens (default: 20)
- **Retrieval Count**: 1-10 chunks (default: 3)
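
For illustration only, these knobs could live in a small config object; the names below are assumptions, not the app's actual variables:

```python
from dataclasses import dataclass

@dataclass
class RAGConfig:
    chunk_size: int = 256  # tokens per chunk (valid range: 128-512)
    overlap: int = 20      # tokens shared between adjacent chunks (10-50)
    top_k: int = 3         # chunks retrieved per question (1-10)
```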
## Performance

- **Search Speed**: sub-second retrieval over 1,000+ chunks
- **Answer Quality**: every answer carries a confidence score, so low-quality extractions are easy to spot
- **Memory Efficiency**: bounded chunk sizes keep inputs within the models' token limits
## Links

- **Full Project**: [GitHub Repository](https://github.com/Mehardeep79/RAG_Pipeline_LLM)
- **Jupyter Notebook**: complete implementation with explanations
- **Streamlit App**: alternative web interface
## Credits

Built with ❤️ using:

- **Hugging Face** for transformers and model hosting
- **FAISS** for efficient vector search
- **Gradio** for the interactive interface
- **Wikipedia API** for the knowledge content
---

**⭐ If you find this useful, please give it a star on GitHub!**