---
title: RAG Pipeline For LLMs
emoji: ๐
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
tags:
  - rag
  - question-answering
  - nlp
  - faiss
  - transformers
  - wikipedia
  - semantic-search
  - huggingface
  - sentence-transformers
models:
  - sentence-transformers/all-mpnet-base-v2
  - deepset/roberta-base-squad2
datasets:
  - wikipedia
---
# RAG Pipeline For LLMs

## Project Overview

An intelligent Retrieval-Augmented Generation (RAG) pipeline that combines semantic search with question-answering capabilities. This system fetches Wikipedia articles, processes them into searchable chunks, and uses pre-trained transformer models to provide accurate, context-aware answers.
## Key Features

- **Dynamic Knowledge Retrieval** from Wikipedia with error handling
- **Semantic Search** using sentence transformers (no keyword dependency)
- **Fast Vector Similarity** with FAISS indexing (sub-second search)
- **Intelligent Answer Generation** using pre-trained QA models
- **Confidence Scoring** for answer quality assessment
- **Customizable Parameters** (chunk size, retrieval count, overlap)
- **Smart Text Chunking** with overlapping segments for context preservation
## Architecture

User Query → Embedding → FAISS Search → Retrieve Chunks → QA Model → Answer + Confidence
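The flow above can be sketched with a toy, numpy-only example. The `embed` function here is a hypothetical stand-in for the sentence-transformer encoder, and the softmax "confidence" is purely illustrative; in the actual app the vectors come from all-mpnet-base-v2 and the score from the QA model:

```python
import numpy as np

def embed(text, dim=16, seed=0):
    # Toy stand-in for the sentence-transformer encoder: a fixed random
    # projection of character counts. The real app uses all-mpnet-base-v2.
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((256, dim))
    counts = np.zeros(256)
    for ch in text.lower():
        counts[ord(ch) % 256] += 1
    return counts @ proj

chunks = [
    "Machine learning is a subset of artificial intelligence.",
    "FAISS is a library for efficient similarity search.",
    "Wikipedia is a free online encyclopedia.",
]
index = np.stack([embed(c) for c in chunks])   # stands in for the FAISS index

q = embed("What is machine learning?")
dists = np.sum((index - q) ** 2, axis=1)       # L2 distance, as IndexFlatL2 uses
order = np.argsort(dists)
context = " ".join(chunks[i] for i in order[:2])  # top-k chunks fed to the QA model

# Toy confidence: softmax over negative distances (the real score comes
# from the roberta-base-squad2 answer span, not from retrieval distance).
logits = -dists - (-dists).max()
probs = np.exp(logits) / np.exp(logits).sum()
conf = probs[order[0]]
```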
## AI Models Used

- **Text Chunking:** `sentence-transformers/all-mpnet-base-v2` tokenizer
- **Vector Embeddings:** `sentence-transformers/all-mpnet-base-v2` (768-dimensional)
- **Question Answering:** `deepset/roberta-base-squad2` (RoBERTa fine-tuned on SQuAD 2.0)
- **Vector Search:** FAISS `IndexFlatL2` for L2 distance similarity
## How to Use

1. **Process Article:** Enter any Wikipedia topic and configure chunk settings
2. **Ask Questions:** Switch to the Q&A tab and enter your questions
3. **View Results:** Explore answers with confidence scores and similarity metrics
4. **Analyze:** Check retrieved context and visualization analytics
## Example Usage

```
Topic: "Artificial Intelligence"
Question: "What is machine learning?"
Answer: "Machine learning is a subset of artificial intelligence..."
Confidence: 89.7%
```
## Configuration Options

- **Chunk Size:** 128-512 tokens (default: 256)
- **Overlap:** 10-50 tokens (default: 20)
- **Retrieval Count:** 1-10 chunks (default: 3)
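A minimal sketch of the overlapping chunking these parameters control, using word-level tokens for simplicity (the app chunks by tokens from the mpnet tokenizer, but the sliding-window logic is the same):

```python
def chunk_text(tokens, chunk_size=256, overlap=20):
    """Split a token list into overlapping chunks.

    Each chunk starts (chunk_size - overlap) tokens after the previous one,
    so consecutive chunks share `overlap` tokens of context.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # last window already reached the end of the text
    return chunks

tokens = [f"tok{i}" for i in range(600)]
chunks = chunk_text(tokens, chunk_size=256, overlap=20)
# 600 tokens with step 236 -> windows starting at 0, 236, 472
```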
## Performance

- **Search Speed:** Sub-second retrieval for 1000+ chunks
- **Accuracy:** High precision with confidence scoring
- **Memory:** Optimized chunk sizes prevent token overflow
## Links

- **Full Project:** GitHub Repository
- **Jupyter Notebook:** Complete implementation with explanations
- **Streamlit App:** Alternative web interface
## Credits

Built with ❤️ using:

- **Hugging Face** for transformers and model hosting
- **FAISS** for efficient vector search
- **Gradio** for the interactive interface
- **Wikipedia API** for knowledge content
⭐ If you find this useful, please give it a star on GitHub!