---
title: TRAIL Leaderboard
emoji: 🏆
colorFrom: green
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: true
license: mit
short_description: 'Trace Reasoning and Agentic Issue Localization Leaderboard'
sdk_version: 5.19.0
tags:
- leaderboard
---
# Model Performance Leaderboard
This Hugging Face Space hosts a leaderboard for comparing model performance across the metrics of the TRAIL dataset.
## Features
- **Submit Your Answers**: Run your model on the TRAIL dataset according to the instructions. Submit your outputs.
- **Leaderboard**: Once the evaluation is complete, we’ll upload the scores (this process will soon be automated).
## Instructions
1. Follow the instructions in https://github.com/patronus-ai/trail-benchmark to run your model on the benchmark data.
2. Submit the resulting JSON files as zip archives. Package the SWE and GAIA results separately, and prefix each zip filename with `SWE_` or `GAIA_` accordingly.
3. We will run our evaluation on your submission and upload the scores. (This step will soon be automated.)
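
The packaging in step 2 could be scripted along the following lines. This is only a minimal sketch: the helper `make_submission_zip`, the `name` argument, and the assumption that the JSON outputs sit directly in one directory are illustrative and not part of the benchmark tooling, so adjust it to whatever layout the benchmark repository actually produces.

```python
import zipfile
from pathlib import Path

# The two filename prefixes named in the instructions above.
ALLOWED_PREFIXES = ("SWE_", "GAIA_")


def make_submission_zip(json_dir, prefix, name, out_dir="."):
    """Zip every .json file found directly in json_dir.

    The archive is written to <out_dir>/<prefix><name>.zip.
    `name` is a hypothetical label (for example, your model name).
    """
    if prefix not in ALLOWED_PREFIXES:
        raise ValueError(f"prefix must be one of {ALLOWED_PREFIXES}, got {prefix!r}")

    json_files = sorted(Path(json_dir).glob("*.json"))
    if not json_files:
        raise FileNotFoundError(f"no .json files found in {json_dir}")

    zip_path = Path(out_dir) / f"{prefix}{name}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for json_file in json_files:
            # Store files flat, without the local directory path.
            zf.write(json_file, arcname=json_file.name)
    return zip_path
```

Calling `make_submission_zip("outputs/swe", "SWE_", "my_model")` would then write `SWE_my_model.zip` to the current directory; the paths and model name here are placeholders.
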
## License
This project is open source and available under the MIT license.