---
title: TRAIL Leaderboard
emoji: 🏆
colorFrom: green
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: true
license: mit
short_description: 'Trace Reasoning and Agentic Issue Localization Leaderboard'
sdk_version: 5.19.0
tags:
- leaderboard
---
# Model Performance Leaderboard
This Hugging Face Space hosts a leaderboard that compares model performance across the metrics of the TRAIL dataset.
## Features
- **Submit Your Answers**: Run your model on the TRAIL dataset by following the instructions below, then submit your outputs.
- **Leaderboard**: Once the evaluation is complete, we’ll upload the scores to the leaderboard (this step will soon be automated).
## Instructions
1. Follow the instructions in https://github.com/patronus-ai/trail-benchmark to run your model on the benchmark data.
2. Submit the resulting JSON files as zip archives, one per split. Prefix each zip filename with `SWE_` or `GAIA_` to match the split it contains.
3. We will upload the scores after running your submission through our evaluation (this step will soon be automated).
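
The packaging in step 2 can be sketched in Python as follows. This is a minimal illustration, not an official submission script: the function name `package_outputs`, the default archive name `outputs`, and the assumption that your JSON outputs for one split sit under a single directory are all hypothetical. Only the `SWE_` and `GAIA_` filename prefixes come from the instructions above.

```python
import zipfile
from pathlib import Path

# Prefixes required by the submission instructions.
VALID_SPLITS = ("SWE", "GAIA")


def package_outputs(output_dir, split, name="outputs", dest_dir="."):
    """Zip every JSON file under output_dir into <dest_dir>/<SPLIT>_<name>.zip.

    The function name, `name`, and `dest_dir` are illustrative assumptions;
    the benchmark does not prescribe them.
    """
    if split not in VALID_SPLITS:
        raise ValueError(f"split must be one of {VALID_SPLITS}, got {split!r}")

    src = Path(output_dir)
    # Collect JSON files recursively; non-JSON files are left out.
    json_files = sorted(src.rglob("*.json"))
    if not json_files:
        raise FileNotFoundError(f"no JSON files found under {src}")

    zip_path = Path(dest_dir) / f"{split}_{name}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in json_files:
            # Store paths relative to output_dir so the archive has no
            # machine-specific directory prefix.
            archive.write(path, arcname=path.relative_to(src).as_posix())
    return zip_path
```

For example, `package_outputs("results/swe", "SWE")` would write `SWE_outputs.zip` to the current directory, and a second call with `"GAIA"` would produce the separate GAIA archive. The directory `results/swe` is a placeholder for wherever your run wrote its JSON files.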
## License
This project is open source and available under the MIT license.