metadata
title: FutureBench Leaderboard
emoji: 🔮
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
FutureBench Leaderboard App
A minimal Gradio application for viewing FutureBench prediction data. This app downloads datasets from HuggingFace on startup and provides a web interface to explore the data.
Features
- Data Summary: View dataset statistics and information
- Sample Data: Browse sample prediction records
- About: Learn about the FutureBench system
- Auto-refresh: Download latest data on startup
- Date Range Slider: Filter the leaderboard by a custom date span
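The date range filter can be pictured as an inclusive comparison on each record's date. The following is a minimal sketch, not the app's actual code: it assumes each record is a dict with a hypothetical `date` field holding an ISO `YYYY-MM-DD` string, which may not match the real FutureBench schema.

```python
from datetime import date


def filter_by_date_range(records, start, end):
    """Keep records whose date falls within [start, end], inclusive.

    The "date" key (ISO "YYYY-MM-DD" string) is a hypothetical field name.
    """
    if start > end:
        raise ValueError("start must not be after end")
    return [r for r in records if start <= date.fromisoformat(r["date"]) <= end]


# Illustrative records only; these are not real FutureBench data.
sample = [
    {"model": "a", "date": "2025-01-05"},
    {"model": "b", "date": "2025-02-10"},
    {"model": "c", "date": "2025-03-15"},
]
filtered = filter_by_date_range(sample, date(2025, 1, 5), date(2025, 2, 28))
```

Both endpoints are included, which matches how a slider with two handles is usually read.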
Setup
- Install dependencies:
pip install -r requirements.txt
- (Optional) Set your HuggingFace token for private repositories:
export HF_TOKEN=your_token_here
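How app.py reads the token is not shown here. One plausible approach, sketched below under that assumption, is to read `HF_TOKEN` from the environment and treat an unset or blank value as "no token". The helper name `get_hf_token` is hypothetical.

```python
import os


def get_hf_token(env=None):
    """Return the HF_TOKEN value, or None when it is unset or blank."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return token or None
```

A `None` result can be passed straight to `huggingface_hub` download calls, which then fall back to any locally saved login or to anonymous access for public repositories.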
Running the App
Launch the Gradio application:
python app.py
The app will:
- Download datasets from HuggingFace repositories on startup
- Process the data and create summaries
- Launch a web interface at http://localhost:7860
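The summary step is not specified in detail. As a rough illustration only, it could amount to counting records and grouping them by a field. The `summarize` helper and the `model` key below are hypothetical and do not come from the app.

```python
from collections import Counter


def summarize(records, key="model"):
    """Return basic statistics for a list of record dicts.

    The grouping key ("model" by default) is a hypothetical field name.
    """
    counts = Counter(r.get(key, "unknown") for r in records)
    return {
        "num_records": len(records),
        "num_distinct": len(counts),
        "top": counts.most_common(3),  # most frequent values first
    }


summary = summarize([{"model": "a"}, {"model": "b"}, {"model": "a"}])
```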
Data Sources
The app downloads data from these HuggingFace repositories:
- futurebench/requests: Evaluation queue
- futurebench/results: Evaluation results
- futurebench/data: Main prediction dataset
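The startup download over these three repositories might look like the sketch below. It assumes they are dataset repositories (hence `repo_type="dataset"`) and uses `snapshot_download` from `huggingface_hub`; the app's real download code may differ. The `download_sources` helper and its injectable `downloader` argument are hypothetical, added so the logic can be exercised without network access.

```python
def download_sources(repo_ids, token=None, downloader=None):
    """Download each repository and return {repo_id: local_path}.

    By default this calls huggingface_hub.snapshot_download with
    repo_type="dataset" (an assumption about these repositories).
    """
    if downloader is None:
        # Imported lazily so the module loads without huggingface_hub installed.
        from huggingface_hub import snapshot_download

        def downloader(repo_id):
            return snapshot_download(
                repo_id=repo_id, repo_type="dataset", token=token
            )

    return {repo_id: downloader(repo_id) for repo_id in repo_ids}


REPOS = ["futurebench/requests", "futurebench/results", "futurebench/data"]
```

Calling `download_sources(REPOS, token=...)` with the value of `HF_TOKEN` would fetch all three sources; passing a stub `downloader` keeps tests offline.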
Structure
- app.py: Main Gradio application
- process_data/: Data processing utilities
- requirements.txt: Python dependencies
- README.md: This file
Next Steps
This is a minimal version focusing on data download and display. Future enhancements will include:
- Full leaderboard with model rankings
- Interactive filtering and sorting
- Detailed performance metrics
- Model comparison tools