---
title: Arabic RAG with DSPy
emoji: 🧠
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
pinned: false
---

# 🧠 Arabic RAG System with DSPy + Gradio

This Hugging Face Space lets you:

- 🧾 Upload Arabic PDF documents.
- 🧠 Store and index document chunks using ChromaDB and SentenceTransformers.
- ❓ Ask questions and generate answers using DSPy with context retrieval.
- ⚙️ Improve answer accuracy using MIPROv2 optimization based on `trainset.jsonl` and `valset.jsonl`.

---

## 🚀 How to Use

1. Go to the **"📥 تحميل وتخزين"** (Upload and Store) tab to upload your Arabic PDF file.
2. Go to the **"❓ سؤال"** (Question) tab to type a question.
3. (Optional) Go to the **"⚙️ تحسين النموذج"** (Improve the Model) tab to upload training and validation sets for improving answer accuracy.

---

## 📁 Project Structure

| File               | Description                                     |
|--------------------|-------------------------------------------------|
| `app.py`           | Main Gradio interface and DSPy logic            |
| `requirements.txt` | Python dependencies                             |
| `trainset.jsonl`   | Example training data (question/answer pairs)   |
| `valset.jsonl`     | Example validation data (question/answer pairs) |

---

## ✨ Notes

- This project supports full Arabic interaction and is intended for educational and research use.
- You can use any Arabic-compatible LLM via DSPy (e.g., OpenAI, OpenChat, Mistral).

---

## 📚 Resources

- [DSPy Official Documentation](https://dspy.ai)
- [Gradio Docs](https://www.gradio.app)
- [Hugging Face Spaces Guide](https://huggingface.co/docs/hub/spaces)
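
The `trainset.jsonl` and `valset.jsonl` files are described above only as question/answer pairs, so the exact schema is not stated here. As a minimal sketch, assuming each line is a JSON object with `question` and `answer` keys (both field names are assumptions, as is the helper name `load_qa_pairs`), such a file could be read and checked before uploading it like this:

```python
import json


def load_qa_pairs(path):
    """Read a JSONL file and return a list of (question, answer) tuples.

    Assumes one JSON object per line with "question" and "answer" keys;
    these field names are illustrative, not confirmed by the project.
    """
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            if "question" not in record or "answer" not in record:
                raise ValueError(
                    f"{path}:{line_no}: expected 'question' and 'answer' keys"
                )
            pairs.append((record["question"], record["answer"]))
    return pairs
```

Each pair could then be wrapped as `dspy.Example(question=q, answer=a).with_inputs("question")` before being passed to an optimizer; whether `app.py` does it this way is not stated in this README.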