eval_playground / README.md

Narsil HF Staff

Push.

e2152af unverified 14 days ago


	---
	title: Evaluation Dataset Quiz
	emoji: 🧠
	colorFrom: blue
	colorTo: green
	sdk: gradio
	sdk_version: 4.19.2
	app_file: app.py
	pinned: false
	license: mit
	---

	# Hugging Face Evaluation Dataset Quiz

	Test your knowledge with questions from popular evaluation datasets!

	## Features

	- 🎯 Interactive quiz interface built with Gradio
	- 📊 8 popular evaluation datasets including:
	  - GSM8K (Grade School Math)
	  - MMLU (Massive Multitask Language Understanding)
	  - AI2 ARC (Science Questions)
	  - HellaSwag (Commonsense NLI)
	  - WinoGrande (Winograd Schema)
	  - BoolQ (Boolean Questions)
	  - SQuAD (Reading Comprehension)
	  - PIQA (Physical Reasoning)
	- 🎲 Random question selection
	- ✅ Immediate feedback on answers
	- 📈 Score tracking
	- 🔄 Support for multiple question formats:
	  - Multiple choice
	  - True/False
	  - Text input for QA tasks
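To illustrate how the three question formats listed above could share one answer check, here is a minimal sketch. The `Question` class, the `kind` labels, and the `is_correct` helper are hypothetical names chosen for this example; they are not taken from `app.py`, which may handle grading differently.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Question:
    """Hypothetical common shape for a quiz question, whatever its source dataset."""
    prompt: str
    kind: str                                 # "multiple_choice", "true_false", or "text"
    answer: str                               # the reference answer as text
    choices: Optional[Sequence[str]] = None   # only used for multiple choice


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def is_correct(question: Question, user_answer: str) -> bool:
    """Compare a user's answer with the reference answer for each format."""
    if question.kind == "true_false":
        # Assumption: "true" and "yes" both count as a positive answer.
        truthy = {"true", "yes"}
        return (normalize(user_answer) in truthy) == (normalize(question.answer) in truthy)
    # Multiple choice and free text both reduce to a normalized string match here.
    return normalize(user_answer) == normalize(question.answer)
```

Exact string matching is a deliberately simple choice for text answers. Reading-comprehension tasks such as SQuAD often accept several reference answers, so a real grader would likely need something more lenient.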

	## How to Use

	1. Select a Dataset: Choose from the available evaluation datasets
	2. Choose Number of Questions: Select how many questions you want (5-20)
	3. Start Quiz: Click "Start Quiz" to begin
	4. Answer Questions: Select or type your answer and click "Submit Answer"
	5. Get Feedback: See if you got it right and learn the correct answer
	6. Continue: Click "Next Question" to proceed
	7. View Score: See your final score at the end
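The flow above (pick a question count, answer, get feedback, advance, see the final score) can be sketched as a small state object. This is an illustration under assumptions: `QuizSession` and its methods are hypothetical names, questions are simplified to `(prompt, answer)` pairs, and the real app presumably keeps its state in Gradio components instead.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

MIN_QUESTIONS, MAX_QUESTIONS = 5, 20  # the range given in step 2


@dataclass
class QuizSession:
    """Hypothetical in-memory quiz state; each item is a (prompt, answer) pair."""
    questions: List[Tuple[str, str]]
    index: int = 0
    score: int = 0

    @classmethod
    def start(cls, pool, num_questions, seed=None):
        # Clamp the requested count to the allowed range and to the pool size.
        n = max(MIN_QUESTIONS, min(MAX_QUESTIONS, num_questions))
        n = min(n, len(pool))
        rng = random.Random(seed)
        # Sample without replacement so no question repeats within a quiz.
        return cls(questions=rng.sample(list(pool), n))

    @property
    def finished(self) -> bool:
        return self.index >= len(self.questions)

    def submit(self, user_answer: str) -> Tuple[bool, str]:
        """Grade the current question, then advance to the next one.

        Returns (was_correct, correct_answer) so the UI can show feedback.
        """
        _, answer = self.questions[self.index]
        correct = user_answer.strip().lower() == answer.strip().lower()
        if correct:
            self.score += 1
        self.index += 1
        return correct, answer

    def summary(self) -> str:
        return f"{self.score}/{len(self.questions)}"
```

A caller would create a session with `QuizSession.start(pool, 10)`, call `submit` once per question until `finished` is true, and then display `summary()`.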

	## Local Development

	```bash
	# Clone the repository
	git clone <your-repo-url>
	# The directory name matches your repository name; adjust if yours differs
	cd eval_quiz_app

	# Install dependencies
	pip install -r requirements.txt

	# Run the app
	python app.py
	```

	## Deployment

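	This app is designed to run on Hugging Face Spaces. Push to your Space repository and it deploys automatically; Spaces reads the YAML block at the top of this file (`sdk: gradio`, `app_file: app.py`) to build and launch the app.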

	## Contributing

	Feel free to add more datasets or improve the quiz functionality!