metadata
title: TRAIL Leaderboard
emoji: 🏆
colorFrom: green
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: true
license: mit
short_description: Trace Reasoning and Agentic Issue Localization Leaderboard
sdk_version: 5.19.0
tags:
- leaderboard
Model Performance Leaderboard
This Hugging Face Space hosts a leaderboard that compares model performance across several metrics on the TRAIL dataset.
Features
- Submit Your Answers: Run your model on the TRAIL dataset by following the instructions below, then submit its outputs.
- Leaderboard: Once the evaluation is complete, we upload the scores to the leaderboard (this step will soon be automated).
Instructions
- Follow the instructions at https://github.com/patronus-ai/trail-benchmark to run your model on the benchmark data.
- Submit the resulting JSON files as zip archives, with the SWE and GAIA outputs zipped separately and the filenames prefixed with SWE_ and GAIA_ respectively.
- We will upload the scores after running your submission through our evaluation (this step will be automated soon).
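The packaging step can be sketched in Python as follows. This is a minimal illustration, not part of the benchmark tooling: the function name `package_outputs`, the flat layout inside the archive, and the directory and submission names in the usage note are all assumptions. If the instructions in the trail-benchmark repository differ, they take precedence.

```python
import zipfile
from pathlib import Path

# Filename prefixes named in the instructions above.
ALLOWED_PREFIXES = ("SWE_", "GAIA_")


def package_outputs(json_dir, prefix, name, out_dir="."):
    """Zip every *.json file in json_dir into <out_dir>/<prefix><name>.zip.

    Hypothetical helper: the flat archive layout is an assumption.
    """
    if prefix not in ALLOWED_PREFIXES:
        raise ValueError(f"prefix must be one of {ALLOWED_PREFIXES}, got {prefix!r}")

    files = sorted(Path(json_dir).glob("*.json"))
    if not files:
        raise FileNotFoundError(f"no JSON files found in {json_dir}")

    zip_path = Path(out_dir) / f"{prefix}{name}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            # Store each file under its bare name, without parent directories.
            zf.write(f, arcname=f.name)
    return zip_path
```

For example, `package_outputs("outputs/swe", "SWE_", "my_model")` and `package_outputs("outputs/gaia", "GAIA_", "my_model")` would write `SWE_my_model.zip` and `GAIA_my_model.zip` to the current directory; the paths and the `my_model` name are placeholders.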
License
This project is open source and available under the MIT license.