---
title: Rag As A Service
emoji: π
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 5.42.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Minimal RAG API with MiniLM embeddings and FAISS
---
# RAG API (Minimal): MiniLM + FAISS (Gradio)
Minimal Retrieval-Augmented Generation (RAG) service built with:
- **Sentence-Transformers MiniLM** for embeddings
- **FAISS** for vector search (cosine similarity)
- **Gradio** for both UI and API exposure
---
## Features
- Ingest documents (one per line) with configurable chunk size/overlap
- Query top-K relevant chunks with similarity search
- Get concise answers composed from retrieved context
- Reset index at any time
- Call endpoints via **UI or API** (`/api/ingest`, `/api/answer`, `/api/reset`)
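
The chunk size/overlap setting from the first feature can be pictured as a sliding window over each document. The sketch below is an assumption about how such chunking might work, not the Space's actual `app.py`: `chunk_text` and `chunk_documents` are hypothetical names, and sizes are counted in characters here, whereas the real service may count words or tokens. For example, `chunk_text("abcdefghij", 4, 1)` yields three windows that each share one character with the previous one.

```python
def chunk_text(text: str, chunk_size: int = 256, overlap: int = 32) -> list[str]:
    """Split text into windows of chunk_size characters sharing `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece.strip():
            chunks.append(piece)
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def chunk_documents(raw: str, chunk_size: int = 256, overlap: int = 32) -> list[str]:
    """Treat each non-empty line as one document, then chunk every document."""
    chunks = []
    for line in raw.splitlines():
        line = line.strip()
        if line:
            chunks.extend(chunk_text(line, chunk_size, overlap))
    return chunks
```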
---
## Quick Start
1. **Load sample docs → Ingest → Ask a query** using the Gradio UI.
2. Programmatic access:
```bash
# Ingest
curl -s -X POST https://<your-space>.hf.space/api/ingest \
  -H "content-type: application/json" \
  -d '{"data": ["PySpark scales ETL across clusters.\nFAISS powers fast vector similarity search used in retrieval.", 256, 32]}'

# Answer
curl -s -X POST https://<your-space>.hf.space/api/answer \
  -H "content-type: application/json" \
  -d '{"data": ["What does FAISS do?", 5, 1000]}'
```
## Python Client
```python
from gradio_client import Client

client = Client("https://<your-space>.hf.space")
status, size = client.predict("FAISS powers fast vector search.", 256, 32, api_name="/ingest")
res = client.predict("What does FAISS do?", 5, 1000, api_name="/answer")
print(res["answer"])
```
## Tech Stack
- Embeddings: sentence-transformers/all-MiniLM-L6-v2 (384-dim)
- Vector DB: FAISS (FlatIP index, normalized vectors)
- UI & API: Gradio Blocks
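
Pairing a FlatIP index with normalized vectors works because the inner product of two unit-length vectors equals their cosine similarity. The following is a dependency-free sketch of that idea rather than FAISS itself: `normalize`, `inner_product`, and `top_k` are hypothetical helpers, and the 3-dimensional vectors stand in for the 384-dimensional MiniLM embeddings.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def inner_product(a, b):
    """Plain dot product; on unit vectors this is the cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


def top_k(query, vectors, k):
    """Exhaustively score every stored vector and return the k best (score, index) pairs."""
    q = normalize(query)
    scored = [(inner_product(q, normalize(v)), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy "index": vector lengths differ, but only direction matters after normalization.
vectors = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [3.0, 3.0, 0.0]]
hits = top_k([2.0, 0.0, 0.0], vectors, k=2)
```

With FAISS, the same effect is usually achieved by L2-normalizing the embeddings (for instance with `faiss.normalize_L2`) before adding them to, and querying, a `faiss.IndexFlatIP`.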
## Notes
- The index is held in memory only, so it is lost whenever the Space sleeps or restarts.
- For persistence, extend the app to save and load the index under `./data/`.
- Demo-focused: fast, light, minimal surface.
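
A save/load extension along the lines suggested in the notes could look like the sketch below. It is an assumption, not part of the Space: `save_store` and `load_store` are hypothetical names, and the file name `store.json` is made up for illustration. It persists chunk texts and their vectors as JSON using only the standard library; for the FAISS index itself, `faiss.write_index` and `faiss.read_index` serve the same purpose.

```python
import json
from pathlib import Path


def save_store(chunks, vectors, data_dir="./data"):
    """Write chunk texts and their vectors to <data_dir>/store.json."""
    if len(chunks) != len(vectors):
        raise ValueError("chunks and vectors must have the same length")
    path = Path(data_dir)
    path.mkdir(parents=True, exist_ok=True)
    target = path / "store.json"
    tmp = path / "store.json.tmp"
    # Write to a temporary file first so a crash cannot leave a half-written store.
    tmp.write_text(json.dumps({"chunks": chunks, "vectors": vectors}), encoding="utf-8")
    tmp.replace(target)
    return target


def load_store(data_dir="./data"):
    """Return (chunks, vectors); both are empty lists if nothing was saved yet."""
    target = Path(data_dir) / "store.json"
    if not target.exists():
        return [], []
    payload = json.loads(target.read_text(encoding="utf-8"))
    return payload["chunks"], payload["vectors"]
```

Calling `load_store()` at startup and `save_store(...)` after each ingest would restore the index across restarts, provided the directory itself survives; a Space's local disk is typically ephemeral unless persistent storage is enabled.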
## Author/Developer: Naga Adithya Kaushik (GenAIDevTOProd)
## AI copilot used in development: Yes (minimal), for debugging, test cases, and experimentation only
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference