---
title: RAG Pipeline For LLMs
emoji: ๐
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
tags:
  - rag
  - question-answering
  - nlp
  - faiss
  - transformers
  - wikipedia
  - semantic-search
  - huggingface
  - sentence-transformers
models:
  - sentence-transformers/all-mpnet-base-v2
  - deepset/roberta-base-squad2
datasets:
  - wikipedia
---
# RAG Pipeline For LLMs

## Project Overview

An intelligent Retrieval-Augmented Generation (RAG) pipeline that combines semantic search with question-answering capabilities. This system fetches Wikipedia articles, processes them into searchable chunks, and uses pre-trained transformer models to provide accurate, context-aware answers.
## Key Features

- **Dynamic Knowledge Retrieval** from Wikipedia with error handling
- **Semantic Search** using sentence transformers (no keyword dependency)
- **Fast Vector Similarity** with FAISS indexing (sub-second search)
- **Intelligent Answer Generation** using pre-trained QA models
- **Confidence Scoring** for answer quality assessment
- **Customizable Parameters** (chunk size, retrieval count, overlap)
- **Smart Text Chunking** with overlapping segments for context preservation
## Architecture

User Query → Embedding → FAISS Search → Retrieve Chunks → QA Model → Answer + Confidence
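The flow above can be sketched with a toy, numpy-only example. The `embed` function here is a hypothetical stand-in for the sentence-transformer encoder, and the softmax "confidence" is purely illustrative; in the actual app the vectors come from all-mpnet-base-v2 and the score from the QA model:

```python
import numpy as np

def embed(text, dim=16, seed=0):
    # Toy stand-in for the sentence-transformer encoder: a fixed random
    # projection of character counts. The real app uses all-mpnet-base-v2.
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((256, dim))
    counts = np.zeros(256)
    for ch in text.lower():
        counts[ord(ch) % 256] += 1
    return counts @ proj

chunks = [
    "Machine learning is a subset of artificial intelligence.",
    "FAISS is a library for efficient similarity search.",
    "Wikipedia is a free online encyclopedia.",
]
index = np.stack([embed(c) for c in chunks])   # stands in for the FAISS index

q = embed("What is machine learning?")
dists = np.sum((index - q) ** 2, axis=1)       # L2 distance, as IndexFlatL2 uses
order = np.argsort(dists)
context = " ".join(chunks[i] for i in order[:2])  # top-k chunks fed to the QA model

# Toy confidence: softmax over negative distances (the real score comes
# from the roberta-base-squad2 answer span, not from retrieval distance).
logits = -dists - (-dists).max()
probs = np.exp(logits) / np.exp(logits).sum()
conf = probs[order[0]]
```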
## AI Models Used

- **Text Chunking:** `sentence-transformers/all-mpnet-base-v2` tokenizer
- **Vector Embeddings:** `sentence-transformers/all-mpnet-base-v2` (768-dimensional)
- **Question Answering:** `deepset/roberta-base-squad2` (RoBERTa fine-tuned on SQuAD 2.0)
- **Vector Search:** FAISS `IndexFlatL2` for L2 distance similarity
## How to Use

1. **Process Article:** Enter any Wikipedia topic and configure chunk settings
2. **Ask Questions:** Switch to the Q&A tab and enter your questions
3. **View Results:** Explore answers with confidence scores and similarity metrics
4. **Analyze:** Check retrieved context and visualization analytics
## Example Usage

```
Topic: "Artificial Intelligence"
Question: "What is machine learning?"
Answer: "Machine learning is a subset of artificial intelligence..."
Confidence: 89.7%
```
## Configuration Options

- **Chunk Size:** 128-512 tokens (default: 256)
- **Overlap:** 10-50 tokens (default: 20)
- **Retrieval Count:** 1-10 chunks (default: 3)
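A minimal sketch of the overlapping chunking these parameters control, using word-level tokens for simplicity (the app chunks by tokens from the mpnet tokenizer, but the sliding-window logic is the same):

```python
def chunk_text(tokens, chunk_size=256, overlap=20):
    """Split a token list into overlapping chunks.

    Each chunk starts (chunk_size - overlap) tokens after the previous one,
    so consecutive chunks share `overlap` tokens of context.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # last window already reached the end of the text
    return chunks

tokens = [f"tok{i}" for i in range(600)]
chunks = chunk_text(tokens, chunk_size=256, overlap=20)
# 600 tokens with step 236 -> windows starting at 0, 236, 472
```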
## Performance

- **Search Speed:** Sub-second retrieval for 1000+ chunks
- **Accuracy:** High precision with confidence scoring
- **Memory:** Optimized chunk sizes prevent token overflow
## Links

- **Full Project:** GitHub Repository
- **Jupyter Notebook:** Complete implementation with explanations
- **Streamlit App:** Alternative web interface
## Credits

Built with ❤️ using:

- **Hugging Face** for transformers and model hosting
- **FAISS** for efficient vector search
- **Gradio** for the interactive interface
- **Wikipedia API** for knowledge content
⭐ If you find this useful, please give it a star on GitHub!