---
title: Rag As A Service
emoji: 📉
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 5.42.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Minimal RAG API with MiniLM embeddings and FAISS
---

RAG API (Minimal): MiniLM + FAISS (Gradio)

Minimal Retrieval-Augmented Generation (RAG) service built with:

  • Sentence-Transformers MiniLM for embeddings
  • FAISS for vector search (cosine similarity)
  • Gradio for both UI and API exposure

Features

  • Ingest documents (one per line) with configurable chunk size/overlap
  • Query top-K relevant chunks with similarity search
  • Get concise answers composed from retrieved context
  • Reset index at any time
  • Call endpoints via UI or API (/api/ingest, /api/answer, /api/reset)
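
The chunk size/overlap setting in the ingest feature can be pictured with a small sketch. This is an illustration only, not the code in app.py: it assumes chunks are measured in characters (the README does not say whether the unit is characters or tokens), and the helper names `chunk_text` and `chunk_documents` are hypothetical.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 256, overlap: int = 32) -> List[str]:
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size].strip()
        if piece:
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break  # the current window already reached the end of the text
    return chunks


def chunk_documents(raw: str, chunk_size: int = 256, overlap: int = 32) -> List[str]:
    """Treat each non-empty line as one document (assumed behaviour) and chunk it."""
    chunks: List[str] = []
    for line in raw.splitlines():
        line = line.strip()
        if line:
            chunks.extend(chunk_text(line, chunk_size, overlap))
    return chunks
```

A larger overlap repeats more text between neighbouring chunks, which tends to help when an answer straddles a chunk boundary, at the cost of a bigger index.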

Quick Start

  1. Load sample docs → Ingest → Ask a query using the Gradio UI.
  2. Programmatic access:

```bash
# Ingest text (one document per line)
curl -s -X POST https://.hf.space/api/ingest \
  -H "content-type: application/json" \
  -d '{"data": ["PySpark scales ETL across clusters.\nFAISS powers fast vector similarity search used in retrieval.", 256, 32]}'

# Answer a question from the ingested index
curl -s -X POST https://.hf.space/api/answer \
  -H "content-type: application/json" \
  -d '{"data": ["What does FAISS do?", 5, 1000]}'
```

Python Client

```python
from gradio_client import Client

client = Client("https://.hf.space")

# Ingest text into the index
status, size = client.predict("FAISS powers fast vector search.", 256, 32, api_name="/ingest")

# Ask a question and print the composed answer
res = client.predict("What does FAISS do?", 5, 1000, api_name="/answer")
print(res["answer"])
```

Tech Stack

  • Embeddings: sentence-transformers/all-MiniLM-L6-v2 (384-dim)

  • Vector DB: FAISS (FlatIP index, normalized vectors)

  • UI & API: Gradio Blocks
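
The FlatIP index with normalized vectors gives cosine similarity because the inner product of two unit-length vectors equals the cosine of the angle between them. The sketch below shows that idea in plain Python with toy 2-D vectors. It does not call FAISS or the MiniLM model, and the helper names (`l2_normalize`, `inner_product`, `top_k`) are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned as zeros."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return [0.0 for _ in vec]
    return [x / norm for x in vec]


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k(query: Sequence[float],
          vectors: Sequence[Sequence[float]],
          k: int = 5) -> List[Tuple[int, float]]:
    """Brute-force search: score every stored vector, return the k best (index, score)."""
    q = l2_normalize(query)
    scored = [(i, inner_product(q, l2_normalize(v))) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # highest similarity first
    return scored[:k]
```

Because every vector is normalized before scoring, the scores fall in [-1, 1] and do not depend on vector magnitude. A flat index compares the query against every stored vector, which is exact and usually fast enough at demo scale.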

Notes

  • In-memory index only; resets when Space sleeps.

  • For persistence, extend with save/load to ./data/.

  • Demo-focused: fast, light, minimal surface.
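
One possible shape for the save/load extension mentioned in the notes is sketched below. It is not part of this Space: the file name `store.json` and the functions `save_store` and `load_store` are hypothetical, and it stores chunks and their vectors as JSON using only the standard library. A real extension would more likely persist the FAISS index itself (FAISS provides `write_index` and `read_index` for that) alongside the chunk texts.

```python
import json
from pathlib import Path
from typing import List, Tuple

DATA_DIR = Path("./data")   # directory named in the note above
STORE_FILE = "store.json"   # hypothetical file name


def save_store(chunks: List[str],
               vectors: List[List[float]],
               data_dir: Path = DATA_DIR) -> Path:
    """Write chunks and their vectors to data_dir as a single JSON file."""
    if len(chunks) != len(vectors):
        raise ValueError("chunks and vectors must have the same length")
    data_dir.mkdir(parents=True, exist_ok=True)
    path = data_dir / STORE_FILE
    tmp = path.with_suffix(".tmp")
    # Write to a temporary file first, then rename, so a crash mid-write
    # does not leave a truncated store behind.
    tmp.write_text(json.dumps({"chunks": chunks, "vectors": vectors}), encoding="utf-8")
    tmp.replace(path)
    return path


def load_store(data_dir: Path = DATA_DIR) -> Tuple[List[str], List[List[float]]]:
    """Read chunks and vectors back; return empty lists if nothing was saved."""
    path = data_dir / STORE_FILE
    if not path.exists():
        return [], []
    payload = json.loads(path.read_text(encoding="utf-8"))
    return payload["chunks"], payload["vectors"]
```

Calling `load_store()` at startup and `save_store(...)` after each ingest would let the index be rebuilt after a restart, provided the Space has storage that survives sleeping.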

Author/Developer: Naga Adithya Kaushik (GenAIDevTOProd)

AI copilot used during development: Yes (minimal), for debugging, test cases, and experimentation only

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference