---
title: FutureBench Leaderboard
emoji: 🔮
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
---
# FutureBench Leaderboard App
A minimal Gradio application for viewing FutureBench prediction data. This app downloads datasets from HuggingFace on startup and provides a web interface to explore the data.
## Features
- 📊 **Data Summary**: View dataset statistics and information
- 🔍 **Sample Data**: Browse sample prediction records
- 📋 **About**: Learn about the FutureBench system
- 🔄 **Auto-refresh**: Download latest data on startup
- 📅 **Date Range Slider**: Filter the leaderboard by a custom date span
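As a rough illustration of the date range filter, the sketch below keeps only the rows whose date falls inside a chosen span. It assumes rows are dictionaries with an ISO-formatted date string; the `date` field name and the `filter_by_date` helper are hypothetical and not taken from `app.py`.

```python
from datetime import date


def filter_by_date(rows, start, end, key="date"):
    """Keep rows whose ISO date (YYYY-MM-DD) lies in [start, end], inclusive.

    `key` is a hypothetical column name; adjust it to the real schema.
    """
    return [
        row for row in rows
        if start <= date.fromisoformat(row[key]) <= end
    ]


# Illustrative rows only; the real dataset schema may differ.
rows = [
    {"model": "model-a", "date": "2025-06-30"},
    {"model": "model-b", "date": "2025-07-10"},
    {"model": "model-c", "date": "2025-07-20"},
]
selected = filter_by_date(rows, date(2025, 7, 1), date(2025, 7, 15))
```

Both endpoints are inclusive here, which matches how a slider with two handles is usually read.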
## Setup
1. Install dependencies:
```bash
pip install -r requirements.txt
```
2. (Optional) Set your HuggingFace token for private repositories:
```bash
export HF_TOKEN=your_token_here
```
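On the Python side, the token can be read from the environment and treated as optional. This is a minimal sketch, not the code in `app.py`; the `get_hf_token` helper is a hypothetical name.

```python
import os


def get_hf_token(env=os.environ):
    """Return the HF_TOKEN value, or None when it is unset or blank.

    Returning None lets public repositories be accessed anonymously.
    """
    token = env.get("HF_TOKEN", "").strip()
    return token or None
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.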
## Running the App
Launch the Gradio application:
```bash
python app.py
```
The app will:
1. Download datasets from HuggingFace repositories on startup
2. Process the data and create summaries
3. Launch a web interface at `http://localhost:7860`
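The processing and launch steps might look roughly like the sketch below. It assumes the downloaded data has already been loaded into a list of row dictionaries; `summarize`, `summary_markdown`, and `build_app` are hypothetical names, and the actual `app.py` may be organised differently.

```python
def summarize(records):
    """Step 2: reduce a list of row dicts to a few summary statistics."""
    columns = sorted({key for row in records for key in row})
    return {"rows": len(records), "columns": columns}


def summary_markdown(summary):
    """Render the summary as Markdown for display in the UI."""
    column_list = ", ".join(summary["columns"]) or "none"
    return f"**Rows:** {summary['rows']}\n\n**Columns:** {column_list}"


def build_app(records):
    """Step 3: build a small Gradio interface around the summary."""
    # Imported lazily so the helpers above work without Gradio installed.
    import gradio as gr

    with gr.Blocks(title="FutureBench Leaderboard") as demo:
        gr.Markdown("# FutureBench Leaderboard")
        with gr.Tab("Data Summary"):
            gr.Markdown(summary_markdown(summarize(records)))
    return demo
```

Calling `build_app(records).launch(server_port=7860)` would then serve the interface on the port mentioned above.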
## Data Sources
The app downloads data from these HuggingFace repositories:
- `futurebench/requests` - Evaluation queue
- `futurebench/results` - Evaluation results
- `futurebench/data` - Main prediction dataset
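One way to fetch these repositories is `snapshot_download` from `huggingface_hub`, sketched below. It assumes all three are dataset repositories and that `huggingface_hub` is installed; the `download_all` helper, the `data_cache` directory, and the local folder naming are hypothetical choices, not the app's confirmed behaviour.

```python
import os

# Repository IDs as listed above, with their stated roles.
DATASET_REPOS = {
    "futurebench/requests": "Evaluation queue",
    "futurebench/results": "Evaluation results",
    "futurebench/data": "Main prediction dataset",
}


def download_all(cache_root="data_cache", token=None, downloader=None):
    """Download every repository and return {repo_id: local_path}.

    `downloader` can be swapped out (for example in tests); by default
    it is huggingface_hub.snapshot_download.
    """
    if downloader is None:
        from huggingface_hub import snapshot_download
        downloader = snapshot_download

    paths = {}
    for repo_id in DATASET_REPOS:
        local_dir = os.path.join(cache_root, repo_id.replace("/", "__"))
        paths[repo_id] = downloader(
            repo_id=repo_id,
            repo_type="dataset",  # assumption: these are dataset repos
            local_dir=local_dir,
            token=token,  # None means anonymous access
        )
    return paths
```

A token is only needed if any of the repositories are private.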
## Structure
- `app.py` - Main Gradio application
- `process_data/` - Data processing utilities
- `requirements.txt` - Python dependencies
- `README.md` - This file
## Next Steps
This is a minimal version focusing on data download and display. Future enhancements will include:
- Full leaderboard with model rankings
- Interactive filtering and sorting
- Detailed performance metrics
- Model comparison tools