---
title: JARVIS Gaia Agent
emoji: 🦾
colorFrom: indigo
colorTo: green
sdk: gradio
pinned: false
license: mit
short_description: Enhanced JARVIS AI agent for GAIA benchmark
models:
  - meta-llama/Llama-3.2-1B-Instruct
  - sentence-transformers/all-MiniLM-L6-v2
datasets:
  - gaia-benchmark/GAIA
---

# Evolved JARVIS Gaia Agent

An advanced Python-based AI agent built with `langchain`, `langgraph`, SERPAPI, and OCR for web search, file parsing, image analysis, and data retrieval. It is deployed as a Hugging Face Space (`onisj/jarvis_gaia_agent`) to evaluate performance on the GAIA benchmark, targeting a score of at least 30% (6/20 correct).

## Features

- **Web Search**: Integrates SERPAPI and DuckDuckGo for robust, multi-hop searches.
- **File Parsing**: Processes CSV, TXT, Excel, and PDF files for GAIA tasks.
- **Image Parsing**: Uses OCR (`easyocr`) to extract text from images.
- **Data Retrieval**: Includes a guest info retriever for structured queries.
- **External APIs**: Supports weather data (OpenWeatherMap) and Hugging Face Hub stats.
- **State Management**: Employs `langgraph` for multi-step reasoning workflows.
- **Exact-Match Answers**: Optimized for GAIA Level 1 questions with precise formatting (e.g., USD to two decimals, comma-separated lists).
- **Gradio Interface**: Provides a user-friendly UI for running evaluations and submitting answers.
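
The exact-match formatting mentioned above (USD to two decimals, comma-separated lists) can be illustrated with a minimal sketch. The helper names `format_usd` and `format_list` are hypothetical and are not taken from `app.py`; whether a currency symbol is expected depends on the individual GAIA question, so none is added here.

```python
from typing import Iterable


def format_usd(amount: float) -> str:
    """Render an amount with exactly two decimals (no currency symbol; an assumption)."""
    return f"{amount:.2f}"


def format_list(items: Iterable[str]) -> str:
    """Join items with ', ' after trimming whitespace and dropping empty entries."""
    cleaned = (item.strip() for item in items)
    return ", ".join(item for item in cleaned if item)


if __name__ == "__main__":
    print(format_usd(1000))                      # two-decimal amount
    print(format_list([" Burger", "Cola ", ""]))  # trimmed, comma-separated
```

Normalizing answers in one place like this keeps tool outputs free to be messy while the final answer stays stable for exact-match scoring.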

## Directory Structure

```
jarvis_gaia_agent/
├── app.py                  # Main Gradio application with agent logic
├── state.py                # Defines JARVISState for LangGraph state management
├── search.py               # Web search tools (SERPAPI, multi-hop search)
├── tools/                  # Directory for all tools
│   ├── __init__.py         # Exports all tools
│   ├── file_parser.py      # Parses CSV, TXT, Excel, and PDF files
│   ├── image_parser.py     # OCR-based image parsing
│   ├── calculator.py       # Mathematical calculations
│   ├── document_retriever.py # PDF document retrieval
│   ├── duckduckgo_search.py # DuckDuckGo search integration
│   ├── weather_info.py     # Weather data via OpenWeatherMap
│   ├── hub_stats.py        # Hugging Face Hub statistics
│   └── guest_info.py       # Guest information retrieval
├── requirements.txt        # Python dependencies
├── README.md               # Project documentation
├── .gitignore              # Excludes .env, temp/, etc.
└── temp/                   # Temporary directory for GAIA files (created at runtime)
```

## Models and Datasets

- **Models**:
  - `meta-llama/Llama-3.2-1B-Instruct`: Primary LLM for reasoning and tool selection (Hugging Face Inference API or local).
  - `sentence-transformers/all-MiniLM-L6-v2`: Embedding model for text similarity tasks.
  - Note: Together AI models (`meta-llama/Llama-3.3-70B-Instruct-Turbo-Free`, `deepseek-ai/DeepSeek-R1-Distill-Llama-70B-free`) are used via API but are not hosted on Hugging Face, so they're not listed in the metadata.
- **Datasets**:
  - `gaia-benchmark/GAIA`: Benchmark dataset for evaluating agent performance.

## Prerequisites

- **Python**: 3.9 or higher.
- **Tesseract OCR**: Required for image parsing.
  - macOS: `brew install tesseract`
  - Ubuntu: `sudo apt-get install tesseract-ocr`
  - Windows: Install via [Tesseract Installer](https://github.com/UB-Mannheim/tesseract/wiki).
- **API Keys**: Set in `.env` (local) or Hugging Face Space Secrets (deployment):
  - `HUGGINGFACEHUB_API_TOKEN`: Hugging Face token for model access.
  - `TOGETHER_API_KEY`: Together AI API key for LLM inference.
  - `SERPAPI_API_KEY`: SERPAPI key for web searches.
  - `OPENWEATHERMAP_API_KEY`: OpenWeatherMap key for weather queries.
  - `SPACE_ID`: `onisj/jarvis_gaia_agent`.
- Install dependencies:
  ```bash
  pip install -r requirements.txt
  ```
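
Before launching the app, it can help to confirm that every key listed above is actually set. The following is a minimal sketch, not code from the repository: `missing_keys` is a hypothetical helper, and it assumes the variables are already exported in the environment (how `app.py` loads `.env` is not stated here).

```python
import os
from typing import List, Mapping, Optional

# Keys named in the Prerequisites list above.
REQUIRED_KEYS = (
    "HUGGINGFACEHUB_API_TOKEN",
    "TOGETHER_API_KEY",
    "SERPAPI_API_KEY",
    "OPENWEATHERMAP_API_KEY",
    "SPACE_ID",
)


def missing_keys(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required keys that are unset or empty in the given mapping."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Failing fast on a missing key tends to be easier to diagnose than a 404 or timeout from a downstream API.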

## Setup and Local Testing

1. **Clone the Repository**:
   ```bash
   git clone https://huggingface.co/spaces/onisj/jarvis_gaia_agent
   cd jarvis_gaia_agent
   ```

2. **Create Virtual Environment**:
   ```bash
   python -m venv venv
   source venv/bin/activate  # Windows: venv\Scripts\activate
   ```

3. **Install Dependencies**:
   ```bash
   pip install -r requirements.txt
   ```

4. **Configure Environment Variables**:
   Create a `.env` file:
   ```text
   SPACE_ID=onisj/jarvis_gaia_agent
   HUGGINGFACEHUB_API_TOKEN=your_hf_token
   TOGETHER_API_KEY=your_together_api_key
   SERPAPI_API_KEY=your_serpapi_key
   OPENWEATHERMAP_API_KEY=your_openweather_key
   ```

5. **Test with Mock File** (optional):
   ```bash
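   mkdir -p temp
   # printf expands \n portably; note this writes CSV text under an .xlsx name, not a real workbook
   printf 'Item,Type,Sales\nBurger,Food,1000\nCola,Drink,500\n' > temp/7bd855d8-463d-4ed5-93ca-5fe35145f733.xlsx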
   ```

6. **Run Locally**:
   ```bash
   python app.py
   ```
   - Open `http://127.0.0.1:7860` (port may vary).
   - Log in with Hugging Face credentials.
   - Click “Run Evaluation & Submit All Answers” to test GAIA tasks.
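
The mock data in step 5 is comma-separated text, so a standard-library sketch can write the same rows as a real CSV file and read them back as a sanity check. The file name `mock_sales.csv` and both function names are hypothetical; this does not exercise `tools/file_parser.py`, whose interface is not described here.

```python
import csv
from pathlib import Path

# Same rows as the mock data in step 5.
ROWS = [
    ("Item", "Type", "Sales"),
    ("Burger", "Food", 1000),
    ("Cola", "Drink", 500),
]


def write_mock_csv(path: Path) -> Path:
    """Write the mock rows to `path`, creating parent directories if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(ROWS)
    return path


def food_sales_total(path: Path) -> float:
    """Sum the Sales column over rows whose Type is 'Food'."""
    with path.open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        return sum(float(row["Sales"]) for row in reader if row["Type"] == "Food")


if __name__ == "__main__":
    mock_path = write_mock_csv(Path("temp") / "mock_sales.csv")
    print(food_sales_total(mock_path))
```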

## Deployment to Hugging Face Space

1. **Push Code**:
   ```bash
   git add .
   git commit -m "Update JARVIS Gaia Agent with README metadata"
   git push origin main
   ```

2. **Set Space Secrets**:
   - Go to `https://huggingface.co/spaces/onisj/jarvis_gaia_agent` > Settings > Repository Secrets.
   - Add:
     - `SPACE_ID`: `onisj/jarvis_gaia_agent`
     - `HUGGINGFACEHUB_API_TOKEN`
     - `TOGETHER_API_KEY`
     - `SERPAPI_API_KEY`
     - `OPENWEATHERMAP_API_KEY`

3. **Build and Run**:
   - Hugging Face auto-builds the Space after pushing.
   - Access the Gradio interface at `https://onisj-jarvis-gaia-agent.hf.space`.
   - Log in and click “Run Evaluation & Submit All Answers” to submit GAIA answers.

4. **Verify Submission**:
   - Check `status_output` for:
     ```
     Submission Successful!
     User: your_username
     Overall Score: XX% (Y/20 correct)
     Message: ...
     ```
   - Aim for at least 30% (6/20 correct).
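
If the status text needs to be checked programmatically, a small parser is enough. This is a sketch under the assumption that the message follows the `Overall Score: XX% (Y/20 correct)` shape shown in step 4; `parse_score` and `meets_target` are hypothetical helpers, not part of `app.py`.

```python
import re
from typing import Optional, Tuple

# Matches e.g. "Overall Score: 30% (6/20 correct)"; the format is assumed from step 4.
SCORE_PATTERN = re.compile(
    r"Overall Score:\s*(\d+(?:\.\d+)?)%\s*\((\d+)/(\d+) correct\)"
)


def parse_score(status: str) -> Optional[Tuple[float, int, int]]:
    """Return (percent, correct, total), or None if no score line is found."""
    match = SCORE_PATTERN.search(status)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2)), int(match.group(3))


def meets_target(status: str, target_correct: int = 6) -> bool:
    """True when the parsed number of correct answers reaches the target."""
    parsed = parse_score(status)
    return parsed is not None and parsed[1] >= target_correct
```

Comparing the count of correct answers rather than the percentage avoids rounding questions around the 30% threshold.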

## Troubleshooting

- **Model Access (404)**: Verify API keys; test `initialize_llm` locally.
- **SERPAPI Timeout**: Ensure `SERPAPI_API_KEY` is valid; check `search.py` logs.
- **GAIA File Access**: Confirm `temp/` directory permissions; test `download_file`.
- **Low GAIA Score**: Analyze `results_table` for errors; enhance `multi_hop_search_tool` or answer formatting.
- **Logs**: Check Space > Settings > Logs for build/run errors.
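
For the `temp/` permission check above, one quick test is to try creating a throwaway file in the directory. A minimal sketch, assuming the app writes GAIA files into `temp/` relative to the working directory; `temp_dir_is_writable` is a hypothetical helper and is unrelated to `download_file`.

```python
import tempfile
from pathlib import Path


def temp_dir_is_writable(directory: Path = Path("temp")) -> bool:
    """Create the directory if needed and try writing a throwaway file in it."""
    try:
        directory.mkdir(parents=True, exist_ok=True)
        # The temporary file is deleted automatically when the block exits.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
    except OSError:
        # Covers PermissionError, read-only filesystems, and similar failures.
        return False
    return True


if __name__ == "__main__":
    print("temp/ writable:", temp_dir_is_writable())
```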

## License

MIT License. See [LICENSE](LICENSE) for details.

## Acknowledgements

- Built with `langchain`, `langgraph`, and Hugging Face tools.
- Evaluated on the GAIA benchmark (`gaia-benchmark/GAIA`).