# Hands-On

Now that you’re ready to dive deeper into the creation of your final agent, let’s see how you can submit it for review.

## The Dataset 

The dataset used in this leaderboard consists of 20 questions extracted from the level 1 questions of the GAIA **validation** set.

The chosen questions were filtered based on the number of tools and steps needed to answer them.

Given the current state of the GAIA benchmark, we think that aiming for 30% on level 1 questions is a fair test.
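
To make the target concrete, here is a small sketch of the arithmetic. It assumes the score is simply the share of the 20 selected questions answered correctly. The function names are illustrative and not part of the course API.

```python
# Assumption: the score is the share of the 20 selected questions
# that the agent answers correctly.
TOTAL_QUESTIONS = 20
TARGET_PERCENT = 30


def correct_answers_needed(total: int, target_percent: int) -> int:
    """Smallest number of correct answers that reaches target_percent of total."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-total * target_percent // 100)


def score_percent(correct: int, total: int) -> float:
    """Score as a percentage of correctly answered questions."""
    return 100.0 * correct / total


print(correct_answers_needed(TOTAL_QUESTIONS, TARGET_PERCENT))  # 6
```

Under that assumption, 6 correct answers out of 20 reach the 30% target.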

<img src="https://huggingface.co/datasets/agents-course/course-images/resolve/main/en/unit4/leaderboard%20GAIA%2024%3A04%3A2025.png" alt="GAIA current status!" />

## The process 

Now the big question in your mind is probably: "How do I start submitting?"

For this Unit, we created an API that allows you to get the questions and send your answers for scoring.
Here is a summary of the routes (see the [live documentation](https://agents-course-unit4-scoring.hf.space/docs) for interactive details):

* **`GET /questions`**: Retrieve the full list of filtered evaluation questions.
* **`GET /random-question`**: Fetch a single random question from the list.
* **`GET /files/{task_id}`**: Download a specific file associated with a given task ID.
* **`POST /submit`**: Submit agent answers, calculate the score, and update the leaderboard.
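
As a rough illustration of the `GET` routes, here is a minimal sketch using only the Python standard library. It assumes the API is served from the same host as the live documentation linked above. The helper names (`route_url`, `get_json`, `file_url`) are hypothetical, and the shape of the JSON responses is not specified here, so check the live documentation before relying on any field.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the routes live on the same host as the live documentation.
BASE_URL = "https://agents-course-unit4-scoring.hf.space"


def route_url(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and a route path without doubling slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def file_url(task_id: str) -> str:
    """Build the download URL for the file attached to a task."""
    return route_url("/files/" + urllib.parse.quote(task_id, safe=""))


def get_json(path: str, timeout: float = 30.0):
    """Send a GET request to a route and decode the JSON response body."""
    with urllib.request.urlopen(route_url(path), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With these helpers, `get_json("/questions")` would retrieve the full question list and `get_json("/random-question")` a single question. Both calls need network access.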

The submit function compares each answer to the ground truth using an **EXACT MATCH**, so prompt your agent well! The GAIA team shared a prompting example for your agent [here](https://huggingface.co/spaces/gaia-benchmark/leaderboard) (for the sake of this course, make sure you don't include the text "FINAL ANSWER" in your submission; just make your agent reply with the answer and nothing else).
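
Because scoring is an exact match, any stray text around the answer can cost you the point. The following is a minimal sketch of a post-processing step, assuming your agent's raw output may end with a `FINAL ANSWER: ...` line as in the GAIA prompting example. `clean_answer` is a hypothetical helper, not part of the template.

```python
import re

# Matches the "FINAL ANSWER" marker, with or without a colon, in any case.
_MARKER = re.compile(r"FINAL ANSWER\s*:?", re.IGNORECASE)


def clean_answer(raw: str) -> str:
    """Return only the answer text, without the marker or surrounding whitespace."""
    matches = list(_MARKER.finditer(raw))
    if matches:
        # Keep only what follows the last marker, dropping earlier reasoning.
        raw = raw[matches[-1].end():]
    return raw.strip()


print(clean_answer("The capital is well known. FINAL ANSWER: Paris\n"))  # Paris
```

This only removes the marker and whitespace. It does not fix formatting differences such as trailing punctuation or units, which still need to be handled through your prompt.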

🎨 **Make the Template Your Own!**

To demonstrate the process of interacting with the API, we've included a [basic template](https://huggingface.co/spaces/agents-course/Final_Assignment_Template) as a starting point.

Please feel free—and **actively encouraged**—to change, add to, or completely restructure it! Modify it in any way that best suits your approach and creativity.

To submit, this template computes the 3 things needed by the API:

* **Username:** Your Hugging Face username (here obtained via Gradio login), which is used to identify your submission.
* **Code Link (`agent_code`):** The URL linking to your Hugging Face Space code (`.../tree/main`) for verification purposes, so please keep your Space public.
* **Answers (`answers`):** The list of responses (`{"task_id": ..., "submitted_answer": ...}`) generated by your Agent for scoring.
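
The three items above can be assembled into a request body roughly as follows. This is a sketch under stated assumptions: the `agent_code` and `answers` keys and the per-answer fields come from the list above, while the `username` key name, the submit URL, and the helper names `build_submission` and `submit` are assumptions to verify against the live documentation or the template.

```python
import json
import urllib.request

# Assumption: POST /submit is served from the same host as the live documentation.
SUBMIT_URL = "https://agents-course-unit4-scoring.hf.space/submit"


def build_submission(username: str, agent_code: str, answers_by_task: dict) -> dict:
    """Assemble the request body from a mapping of task_id to answer."""
    answers = [
        {"task_id": task_id, "submitted_answer": str(answer)}
        for task_id, answer in answers_by_task.items()
    ]
    return {
        "username": username.strip(),  # assumed key name
        "agent_code": agent_code,      # your Space URL ending in /tree/main
        "answers": answers,
    }


def submit(payload: dict, url: str = SUBMIT_URL, timeout: float = 60.0):
    """POST the payload as JSON and decode the JSON response (needs network)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping `build_submission` separate from `submit` lets you inspect or log the payload before it affects the leaderboard.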

We therefore encourage you to start by duplicating this [template](https://huggingface.co/spaces/agents-course/Final_Assignment_Template) on your own Hugging Face profile.

🏆 Check out the leaderboard [here](https://huggingface.co/spaces/agents-course/Students_leaderboard)

*A friendly note: This leaderboard is meant for fun! We know it's possible to submit scores without full verification. If we see too many high scores posted without a public link to back them up, we might need to review, adjust, or remove some entries to keep the leaderboard useful.*
The leaderboard will show the link to your Space's codebase. Since this leaderboard is for students only, please keep your Space public if you get a score you're proud of.
<iframe
	src="https://agents-course-students-leaderboard.hf.space"
	frameborder="0"
	width="850"
	height="450"
></iframe>