---
title: Smart Quiz Maker
emoji: "🧠"
colorFrom: "indigo"
colorTo: "pink"
sdk: "gradio"
sdk_version: "5.43.1"
app_file: "app.py"
pinned: false
---

# Smart Quiz Maker

Smart Quiz Maker is a modular Hugging Face and Gradio demo that generates interactive multiple-choice quizzes from a user-supplied topic. The system uses multi-source retrieval (Wikipedia + Wikidata + web snippets), semantic keyword extraction, and a question-generation model to produce quizzes you can try in the browser.

## Data sources used

The app gathers context from multiple sources to improve coverage and reduce missing information:

- **Wikipedia REST summary:** concise topic summary.
- **Wikidata search descriptions:** fallback short facts / labels.
- **Web snippets from Wikipedia search results:** additional paragraphs and context when the summary is sparse.

These sources are combined and chunked to produce the context used by keyword extraction and question generation.

## Configurable options (UI)

- **Number of questions:** choose **3**, **5**, or **10** questions per quiz. Default is **3**.
- **Difficulty:** choose among **easy**, **medium**, and **hard**. Difficulty affects question templates and phrasing.

## Key features

- Multi-source retrieval: combines Wikipedia, Wikidata, and web snippets for richer context.
- Robust keyword extraction: spaCy NER is preferred, with a frequency-based fallback if spaCy is unavailable.
- Semantic selection: sentence-transformers (`all-MiniLM-L6-v2`) for semantic ranking / MMR-style selection of candidate answers.
- Question generation: uses a T5-based QG model (if available) with templated fallbacks for robustness.
- Better distractors: semantic-neighbor selection using embeddings to create plausible wrong options, with a heuristic fallback when embeddings are not available.
- Stable Gradio UI: prevents feedback before selection and enforces 3 options per question (1 correct + 2 distractors).
- Deterministic option de-duplication (normalization to avoid repeated options like `python` vs `python:`).

## How it works (pipeline)

1. User enters a topic and selects `n_questions` and `difficulty`.
2. Backend fetches context from Wikipedia, Wikidata, and page snippets.
3. Keywords/candidate answers are extracted (spaCy NER → token frequency fallback).
4. Candidates are ranked by semantic relevance (sentence-transformers) and a top-N set is chosen.
5. For each chosen answer:
   - A question is generated (T5 QG model if available; otherwise, randomized templates by difficulty).
   - Two distractors are generated using semantic similarity among candidates or a heuristic fallback.
6. UI presents each question with exactly three options. User selections show immediate feedback and scoring.

## Run locally

Install dependencies and run the app:

```bash
pip install -r requirements.txt
python app.py
```
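
To illustrate the option handling described in this README (normalized de-duplication and exactly three options per question), here is a minimal sketch using only the standard library. The function names `normalize_option` and `build_options`, the specific normalization rules, and the fixed shuffle seed are assumptions made for illustration; the actual logic in `app.py` may differ.

```python
import random
import unicodedata


def normalize_option(text: str) -> str:
    """Return a comparison key for an option (hypothetical normalization rules).

    Unicode-normalizes and casefolds the text, strips surrounding whitespace
    and quotes, drops trailing punctuation, and collapses inner whitespace.
    """
    text = unicodedata.normalize("NFKC", text).casefold()
    text = text.strip().strip("\"'").rstrip(":;,.!? ")
    return " ".join(text.split())


def build_options(answer, distractor_candidates, n_options=3, seed=0):
    """Return the correct answer plus distinct distractors, shuffled.

    Candidates are consumed in the order given and shuffled with a seeded
    generator, so the result is deterministic for a fixed seed.
    Raises ValueError if too few distinct distractors are available.
    """
    seen = {normalize_option(answer)}
    options = [answer]
    for candidate in distractor_candidates:
        key = normalize_option(candidate)
        if not key or key in seen:
            continue  # skip empty strings and near-duplicates
        seen.add(key)
        options.append(candidate)
        if len(options) == n_options:
            break
    if len(options) < n_options:
        raise ValueError("not enough distinct distractors")
    random.Random(seed).shuffle(options)
    return options


if __name__ == "__main__":
    # "python:" and " PYTHON " collide with the answer; "java." collides with "Java".
    print(build_options("Python", ["python:", " PYTHON ", "Java", "java.", "Ruby"]))
```

In this sketch the candidate order decides which distractors win, so a caller that ranks candidates by semantic similarity first would keep the most plausible ones. Raising an error when fewer than two distinct distractors remain is one possible design choice; a heuristic fallback, as the README mentions, is another.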