---
title: TRAIL Leaderboard
emoji: 🏆
colorFrom: green
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: true
license: mit
short_description: 'Trace Reasoning and Agentic Issue Localization Leaderboard'
sdk_version: 5.19.0
tags:
  - leaderboard
---
# Model Performance Leaderboard

This Hugging Face Space hosts a leaderboard that compares model performance on the TRAIL dataset across several metrics.

## Features

- **Submit Your Answers**: Run your model on the TRAIL dataset according to the instructions. Submit your outputs.
- **Leaderboard**: Once the evaluation is complete, we’ll upload the scores (this process will soon be automated).

## Instructions

1. Follow the instructions at https://github.com/patronus-ai/trail-benchmark to run your model on the benchmark data.
2. Submit the resulting JSON files as zip archives, one per split. Prefix the zip filename with `SWE_` for the SWE outputs and `GAIA_` for the GAIA outputs.
3. We will run our evaluation on your submission and upload the scores. (This step will be automated soon.)
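
The packaging in step 2 can be scripted. The following is a minimal sketch, not part of the official tooling: the function name `make_submission_zip`, the `name` suffix, and the flat archive layout are assumptions for illustration. Check the benchmark repository for the exact layout the evaluation expects.

```python
import zipfile
from pathlib import Path

# Filename prefixes named in the instructions above.
VALID_PREFIXES = ("SWE_", "GAIA_")


def make_submission_zip(results_dir, split, out_dir=".", name="outputs"):
    """Zip every .json file under results_dir into <SPLIT>_<name>.zip.

    `split` is "SWE" or "GAIA" (case-insensitive). `name` is a hypothetical
    suffix chosen by the submitter; only the prefix comes from the instructions.
    """
    prefix = f"{split.upper()}_"
    if prefix not in VALID_PREFIXES:
        raise ValueError(f"split must be 'SWE' or 'GAIA', got {split!r}")

    results_dir = Path(results_dir)
    json_files = sorted(results_dir.rglob("*.json"))
    if not json_files:
        raise FileNotFoundError(f"no JSON files found in {results_dir}")

    zip_path = Path(out_dir) / f"{prefix}{name}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in json_files:
            # Store paths relative to results_dir so the archive
            # contains no absolute paths from the local machine.
            zf.write(path, arcname=path.relative_to(results_dir).as_posix())
    return zip_path
```

For example, `make_submission_zip("results/swe", "SWE", name="my_model")` would write `SWE_my_model.zip` to the current directory, where `results/swe` is a placeholder for wherever your run saved its JSON outputs. Call it once per split to produce the two separately prefixed archives.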

## License

This project is open source and available under the MIT license.