---
title: RAG Pipeline For LLMs
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
tags:
  - rag
  - question-answering
  - nlp
  - faiss
  - transformers
  - wikipedia
  - semantic-search
  - huggingface
  - sentence-transformers
models:
  - sentence-transformers/all-mpnet-base-v2
  - deepset/roberta-base-squad2
datasets:
  - wikipedia
---

# 🔍 RAG Pipeline For LLMs 🚀


## 📖 Project Overview

An intelligent Retrieval-Augmented Generation (RAG) pipeline that combines semantic search with question-answering capabilities. This system fetches Wikipedia articles, processes them into searchable chunks, and uses state-of-the-art AI models to provide accurate, context-aware answers.

## ✨ Key Features

- 📚 **Dynamic Knowledge Retrieval** from Wikipedia, with error handling
- 🧮 **Semantic Search** using sentence transformers (no keyword dependency)
- ⚡ **Fast Vector Similarity** with FAISS indexing (sub-second search)
- 🤖 **Intelligent Answer Generation** using a pre-trained QA model
- 📊 **Confidence Scoring** for answer quality assessment
- 🎛️ **Customizable Parameters** (chunk size, retrieval count, overlap)
- ✂️ **Smart Text Chunking** with overlapping segments to preserve context
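The overlapping-chunk idea can be sketched as a sliding window over the article's tokens. This toy version uses placeholder token strings instead of the real `all-mpnet-base-v2` tokenizer, with the default window and overlap values described under Configuration Options:

```python
def chunk_tokens(tokens, chunk_size=256, overlap=20):
    """Split a token sequence into overlapping windows.

    Each window advances by `chunk_size - overlap` tokens, so consecutive
    chunks share `overlap` tokens and context survives chunk boundaries.
    """
    stride = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the current window already reaches the end of the text
    return chunks

# Toy input: 600 placeholder "tokens".
tokens = [f"t{i}" for i in range(600)]
chunks = chunk_tokens(tokens)
# Three windows; each consecutive pair shares exactly 20 tokens.
```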

๐Ÿ—๏ธ Architecture

User Query โ†’ Embedding โ†’ FAISS Search โ†’ Retrieve Chunks โ†’ QA Model โ†’ Answer + Confidence

## 🤖 AI Models Used

- 📝 **Text Chunking**: the `sentence-transformers/all-mpnet-base-v2` tokenizer
- 🧮 **Vector Embeddings**: `sentence-transformers/all-mpnet-base-v2` (768-dimensional)
- ❓ **Question Answering**: `deepset/roberta-base-squad2` (RoBERTa fine-tuned on SQuAD 2.0)
- 🔍 **Vector Search**: FAISS `IndexFlatL2` for exact L2-distance similarity

## 🚀 How to Use

1. 📖 **Process Article**: enter any Wikipedia topic and configure the chunk settings
2. ❓ **Ask Questions**: switch to the Q&A tab and enter your questions
3. 📊 **View Results**: explore answers with confidence scores and similarity metrics
4. 🔍 **Analyze**: inspect the retrieved context and visualization analytics

## 💡 Example Usage

```text
Topic:      "Artificial Intelligence"
Question:   "What is machine learning?"
Answer:     "Machine learning is a subset of artificial intelligence..."
Confidence: 89.7%
```
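Under the hood, the answer and confidence in an example like the one above come from the extractive QA model. A minimal sketch with the `transformers` pipeline API (the context string here is illustrative, not the Space's actual retrieved chunk):

```python
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

context = (
    "Machine learning is a subset of artificial intelligence concerned with "
    "algorithms whose performance improves automatically through experience."
)
result = qa(question="What is machine learning?", context=context)
# result is a dict with "answer" (a span copied verbatim from the context),
# "score" (the confidence, between 0 and 1), and "start"/"end" character offsets.
```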

## 🔧 Configuration Options

- **Chunk Size**: 128-512 tokens (default: 256)
- **Overlap**: 10-50 tokens (default: 20)
- **Retrieval Count**: 1-10 chunks (default: 3)
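These knobs trade context per chunk against index size. As a hypothetical back-of-the-envelope helper (the formula follows from a window that advances `chunk_size - overlap` tokens per step):

```python
import math

def estimate_chunk_count(n_tokens: int, chunk_size: int = 256, overlap: int = 20) -> int:
    """Estimate how many overlapping chunks an article of n_tokens produces."""
    if n_tokens <= chunk_size:
        return 1
    stride = chunk_size - overlap  # tokens the window advances each step
    return math.ceil((n_tokens - overlap) / stride)

estimate_chunk_count(5000)  # a ~5000-token article at the defaults -> 22 chunks
```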

## 📊 Performance

- **Search Speed**: sub-second retrieval across 1000+ chunks
- **Accuracy**: every answer carries a confidence score, so low-confidence extractions can be filtered out
- **Memory Efficient**: bounded chunk sizes keep each QA input within the model's token limit

## 🔗 Links

- 📁 **Full Project**: GitHub repository
- 📓 **Jupyter Notebook**: complete implementation with explanations
- 🌐 **Streamlit App**: alternative web interface

## 🤝 Credits

Built with ❤️ using:

- 🤗 Hugging Face for transformers and model hosting
- ⚡ FAISS for efficient vector search
- 🎨 Gradio for the interactive interface
- 📖 The Wikipedia API for knowledge content

⭐ If you find this project useful, please give it a star on GitHub!