---
title: Smart Quiz Maker
emoji: "🧠"
colorFrom: "indigo"
colorTo: "pink"
sdk: "gradio"
sdk_version: "5.43.1"
app_file: "app.py"
pinned: false
---

# Smart Quiz Maker

Smart Quiz Maker is a modular Gradio demo for Hugging Face Spaces that generates interactive multiple-choice quizzes from a user-supplied topic. The system uses multi-source retrieval (Wikipedia + Wikidata + web snippets), semantic keyword extraction, and a question-generation model to produce quizzes you can try in the browser.

## Data sources used
The app gathers context from multiple sources to improve coverage and fill gaps when any single source is sparse:

- **Wikipedia REST summary:** a concise summary of the topic.  
- **Wikidata search descriptions:** short facts and labels, used as a fallback.  
- **Web snippets from Wikipedia search results:** additional paragraphs of context when the summary is sparse.

These sources are combined and chunked to produce the context used by keyword extraction and question generation.
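
The combine-and-chunk step can be sketched as follows. This is a minimal illustration using only the standard library; the function names `combine_sources` and `chunk_words`, the word-window strategy, and the `max_words`/`overlap` defaults are assumptions for this example, not the app's actual code.

```python
from typing import Iterable, List


def combine_sources(texts: Iterable[str]) -> str:
    """Join non-empty source texts, skipping exact duplicates (hypothetical helper)."""
    seen = set()
    parts = []
    for text in texts:
        cleaned = " ".join(text.split())  # collapse runs of whitespace
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            parts.append(cleaned)
    return " ".join(parts)


def chunk_words(text: str, max_words: int = 80, overlap: int = 10) -> List[str]:
    """Split text into windows of at most max_words words that overlap by overlap words."""
    if max_words <= overlap:
        raise ValueError("max_words must be larger than overlap")
    words = text.split()
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    # Placeholder strings stand in for the Wikipedia, Wikidata, and snippet texts.
    context = combine_sources(["Summary text here.", "", "Short description."])
    print(chunk_words(context, max_words=4, overlap=1))
```

Overlapping windows are one common way to avoid cutting a fact in half at a chunk boundary; the real app may chunk by sentence or paragraph instead.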

## Configurable options (UI)
- **Number of questions:** choose **3**, **5**, or **10** questions per quiz. Default is **3**.  
- **Difficulty:** choose among **easy**, **medium**, and **hard**. Difficulty affects question templates and phrasing.
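
As a rough sketch, the two options above could be validated on the backend like this. The function name `validate_options` and the constant names are hypothetical; only the allowed values and the default of 3 questions come from this README.

```python
from typing import Tuple

ALLOWED_QUESTION_COUNTS = (3, 5, 10)
ALLOWED_DIFFICULTIES = ("easy", "medium", "hard")


def validate_options(difficulty: str, n_questions: int = 3) -> Tuple[int, str]:
    """Return (n_questions, normalized difficulty) or raise ValueError."""
    level = difficulty.strip().lower()
    if level not in ALLOWED_DIFFICULTIES:
        raise ValueError(f"difficulty must be one of {ALLOWED_DIFFICULTIES}")
    if n_questions not in ALLOWED_QUESTION_COUNTS:
        raise ValueError(f"n_questions must be one of {ALLOWED_QUESTION_COUNTS}")
    return n_questions, level


if __name__ == "__main__":
    print(validate_options("Hard"))      # uses the default question count
    print(validate_options("easy", 10))
```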

## Key features
- Multi-source retrieval: combines Wikipedia, Wikidata, and web snippets for richer context.  
- Robust keyword extraction: spaCy NER is preferred, with frequency-based extraction as a fallback if spaCy is unavailable.  
- Semantic selection: sentence-transformers (`all-MiniLM-L6-v2`) for semantic ranking / MMR-style selection of candidate answers.  
- Question generation: uses a T5-based QG model (if available) with templated fallbacks for robustness.  
- Plausible distractors: wrong options are chosen from semantic neighbors of the answer using embeddings, with a heuristic fallback when embeddings are not available.  
- Stable Gradio UI: prevents feedback before selection and enforces 3 options per question (1 correct + 2 distractors).  
- Deterministic option de-duplication (normalization to avoid repeated options like `python` vs `python:`).
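
The option de-duplication in the last bullet might look roughly like the sketch below. The exact normalization rules used by the app are not documented here, so the helpers `normalize_option` and `dedupe_options` and their rules (collapse whitespace, strip surrounding punctuation, lowercase) are assumptions.

```python
import string
from typing import Iterable, List


def normalize_option(option: str) -> str:
    """Build a comparison key: collapse whitespace, strip edge punctuation, lowercase."""
    collapsed = " ".join(option.split())
    return collapsed.strip(string.punctuation + " ").lower()


def dedupe_options(options: Iterable[str]) -> List[str]:
    """Keep the first occurrence of each option, comparing by normalized key."""
    seen = set()
    unique = []
    for option in options:
        key = normalize_option(option)
        if key and key not in seen:
            seen.add(key)
            unique.append(option.strip())
    return unique


if __name__ == "__main__":
    print(dedupe_options(["python", "python:", " Python ", "Java"]))
```

Because the first surface form is kept and input order is preserved, the result is deterministic for a given list of candidates.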

## How it works (pipeline)
1. User enters a topic and selects `n_questions` and `difficulty`.  
2. Backend fetches context from Wikipedia, Wikidata, and page snippets.  
3. Keywords/candidate answers are extracted (spaCy NER → token frequency fallback).  
4. Candidates are ranked by semantic relevance (sentence-transformers) and a top-N set is chosen.  
5. For each chosen answer:
   - A question is generated (T5 QG model if available; otherwise, randomized templates by difficulty).
   - Two distractors are generated using semantic similarity among candidates or a heuristic fallback.
6. The UI presents each question with exactly three options. Selecting an option shows immediate feedback and updates the score.
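
The distractor step (two wrong options chosen by semantic similarity) can be illustrated with plain cosine similarity. In this sketch the embeddings are toy 2-D vectors supplied by hand; in the app they would presumably come from the sentence-transformers model. The function names `cosine` and `pick_distractors` are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pick_distractors(answer: str, embeddings: Dict[str, Sequence[float]], k: int = 2) -> List[str]:
    """Return the k candidates most similar to the answer, excluding the answer itself."""
    target = embeddings[answer]
    scored = [
        (cosine(target, vector), candidate)
        for candidate, vector in embeddings.items()
        if candidate.lower() != answer.lower()
    ]
    # Highest similarity first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [candidate for _, candidate in scored[:k]]


if __name__ == "__main__":
    toy_embeddings = {
        "Python": [1.0, 0.0],
        "Ruby": [0.9, 0.1],
        "Java": [0.6, 0.4],
        "Banana": [0.0, 1.0],
    }
    print(pick_distractors("Python", toy_embeddings))
```

Picking the nearest neighbors keeps wrong options on topic; a production version would likely also cap similarity so that a distractor is not a near-synonym of the correct answer.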

## Run locally
Install dependencies and run the app:

```bash
pip install -r requirements.txt
python app.py