---
title: Smart Quiz Maker
emoji: "🧠"
colorFrom: "indigo"
colorTo: "pink"
sdk: "gradio"
sdk_version: "5.43.1"
app_file: "app.py"
pinned: false
---
# Smart Quiz Maker
Smart Quiz Maker is a modular Hugging Face and Gradio demo that generates interactive multiple-choice quizzes based on a user's topic. The system uses multi-source retrieval (Wikipedia + Wikidata + web snippets), semantic keyword extraction, and a question-generation model to produce quizzes you can try in the browser.
## Data sources used
The app gathers context from multiple sources to improve coverage and reduce missing information:
- **Wikipedia REST summary**: concise topic summary.
- **Wikidata search descriptions**: fallback short facts / labels.
- **Web snippets from Wikipedia search results**: additional paragraphs and context when the summary is sparse.
These sources are combined and chunked to produce the context used by keyword extraction and question generation.
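As a rough illustration of the combine-and-chunk step, here is a minimal sketch. The function name `chunk_context`, the word-based chunk size, and the sample texts are assumptions for illustration; the app's actual chunking logic may differ.

```python
def chunk_context(sources, max_words=80):
    """Join non-empty source texts, then split them into chunks of at most max_words words."""
    # Skip sources that returned nothing (e.g. an empty Wikidata result).
    combined = " ".join(text.strip() for text in sources if text and text.strip())
    words = combined.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


# Hypothetical texts standing in for the three sources.
sources = [
    "Python is a high-level programming language.",  # Wikipedia summary
    "",                                               # empty Wikidata description
    "It emphasizes code readability.",                # web snippet
]
chunks = chunk_context(sources, max_words=6)
for chunk in chunks:
    print(chunk)
```

Splitting on word count keeps the sketch dependency-free; a sentence-aware splitter would avoid cutting sentences in half.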
## Configurable options (UI)
- **Number of questions:** choose **3**, **5**, or **10** questions per quiz. Default is **3**.
- **Difficulty:** choose among **easy**, **medium**, and **hard**. Difficulty affects question templates and phrasing.
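
One way these two options could be validated and mapped to fallback templates is sketched below. The names `validate_options`, `fallback_question`, and `TEMPLATES`, as well as the template wording, are hypothetical and not taken from `app.py`.

```python
import random

ALLOWED_COUNTS = (3, 5, 10)

# Hypothetical templates; the real app's phrasing is not shown in this README.
TEMPLATES = {
    "easy": ["Which of the following is related to {topic}?"],
    "medium": ["Which term best fits this description of {topic}?"],
    "hard": [
        "Which option is most specifically associated with {topic}?",
        "Which concept is central to {topic}?",
    ],
}


def validate_options(n_questions=3, difficulty="easy"):
    """Return the options unchanged if they are allowed, otherwise raise ValueError."""
    if n_questions not in ALLOWED_COUNTS:
        raise ValueError(f"n_questions must be one of {ALLOWED_COUNTS}")
    if difficulty not in TEMPLATES:
        raise ValueError(f"difficulty must be one of {sorted(TEMPLATES)}")
    return n_questions, difficulty


def fallback_question(topic, difficulty, rng=None):
    """Pick a random template for the given difficulty and fill in the topic."""
    rng = rng or random.Random()
    return rng.choice(TEMPLATES[difficulty]).format(topic=topic)


print(validate_options())  # defaults: 3 questions
print(fallback_question("Python", "medium"))
```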
## Key features
- Multi-source retrieval: combines Wikipedia, Wikidata, and web snippets for richer context.
- Robust keyword extraction: spaCy NER is preferred, with a token-frequency fallback when spaCy is unavailable.
- Semantic selection: sentence-transformers (`all-MiniLM-L6-v2`) for semantic ranking / MMR-style selection of candidate answers.
- Question generation: uses a T5-based QG model (if available) with templated fallbacks for robustness.
- Better distractors: semantic-neighbor selection using embeddings to create plausible wrong options; heuristic fallback when embeddings are not available.
- Stable Gradio UI: prevents feedback before selection and enforces 3 options per question (1 correct + 2 distractors).
- Deterministic option de-duplication (normalization to avoid repeated options like `python` vs `python:`).
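
The option de-duplication above can be sketched as follows. This is a minimal illustration, assuming normalization means lowercasing, collapsing whitespace, and stripping trailing punctuation; the helper names are hypothetical and the app's exact rules may differ.

```python
def normalize_option(text):
    """Lowercase, collapse whitespace, and strip surrounding sentence punctuation."""
    collapsed = " ".join(text.lower().split())
    # Only strip common sentence punctuation so names like "C++" stay intact.
    return collapsed.strip(".,:;!? ")


def dedupe_options(options):
    """Keep the first spelling of each option, in input order, so the result is deterministic."""
    seen = set()
    unique = []
    for option in options:
        key = normalize_option(option)
        if key and key not in seen:
            seen.add(key)
            unique.append(option.strip())
    return unique


print(dedupe_options(["python", "Python:", " java ", "Java", "C++"]))
# ['python', 'java', 'C++']
```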
## How it works (pipeline)
1. User enters a topic and selects `n_questions` and `difficulty`.
2. Backend fetches context from Wikipedia, Wikidata, and page snippets.
3. Keywords/candidate answers are extracted (spaCy NER → token frequency fallback).
4. Candidates are ranked by semantic relevance (sentence-transformers) and a top-N set is chosen.
5. For each chosen answer:
   - A question is generated (T5 QG model if available; otherwise, randomized templates by difficulty).
   - Two distractors are generated using semantic similarity among candidates or a heuristic fallback.
6. UI presents each question with exactly three options. User selections show immediate feedback and scoring.
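
The distractor step in the pipeline above can be approximated as follows. This is a sketch only: hand-written 2-D vectors stand in for sentence-transformers embeddings, and the function names are hypothetical rather than taken from the app.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pick_distractors(answer, embeddings, k=2):
    """Return the k candidates most similar to the answer, excluding the answer itself."""
    others = [c for c in embeddings if c != answer]
    others.sort(key=lambda c: cosine(embeddings[answer], embeddings[c]), reverse=True)
    return others[:k]


def build_options(answer, embeddings):
    """One correct answer plus two distractors, giving exactly three options."""
    return [answer] + pick_distractors(answer, embeddings, k=2)


# Toy vectors; in the app these would come from an embedding model.
toy_embeddings = {
    "Python": [1.0, 0.0],
    "Ruby": [0.9, 0.1],
    "Java": [0.8, 0.3],
    "Banana": [0.0, 1.0],
}
options = build_options("Python", toy_embeddings)
print(options)  # ['Python', 'Ruby', 'Java']
```

Choosing the nearest neighbors of the correct answer tends to give plausible wrong options, while unrelated candidates such as `Banana` are left out. A real UI would presumably shuffle the three options before display.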
## Run locally
Install dependencies and run the app:
```bash
pip install -r requirements.txt
python app.py