---
title: JARVIS Gaia Agent
emoji: 🦾
colorFrom: indigo
colorTo: green
sdk: gradio
pinned: false
license: mit
short_description: Enhanced JARVIS AI agent for GAIA benchmark
models:
  - meta-llama/Llama-3.2-1B-Instruct
  - sentence-transformers/all-MiniLM-L6-v2
datasets:
  - gaia-benchmark/GAIA
---

# Evolved JARVIS Gaia Agent

An advanced Python-based AI agent built with `langchain`, `langgraph`, SERPAPI, and OCR capabilities for web searches, file parsing, image analysis, and data retrieval. Deployed as a Hugging Face Space (`onisj/jarvis_gaia_agent`) for evaluating performance on the GAIA benchmark, targeting a score of at least 30% (6/20 correct).

## Features

- **Web Search**: Integrates SERPAPI and DuckDuckGo for robust, multi-hop searches.
- **File Parsing**: Processes CSV, TXT, Excel, and PDF files for GAIA tasks.
- **Image Parsing**: Uses OCR (`easyocr`) to extract text from images.
- **Data Retrieval**: Includes a guest info retriever for structured queries.
- **External APIs**: Supports weather data (OpenWeatherMap) and Hugging Face Hub stats.
- **State Management**: Employs `langgraph` for multi-step reasoning workflows.
- **Exact-Match Answers**: Optimized for GAIA Level 1 questions with precise formatting (e.g., USD to two decimals, comma-separated lists).
- **Gradio Interface**: Provides a user-friendly UI for running evaluations and submitting answers.
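
The exact-match formatting mentioned above (USD to two decimals, comma-separated lists) can be illustrated with a small sketch. The helper names `format_usd` and `format_list` are hypothetical and not taken from `app.py`, and omitting the currency symbol and thousands separators is an assumption about what the grader expects.

```python
from decimal import Decimal, ROUND_HALF_UP


def format_usd(amount) -> str:
    """Render a number as a plain string with exactly two decimals."""
    # Going through str() avoids binary float artifacts (2.675 would otherwise round down).
    value = Decimal(str(amount)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return f"{value:.2f}"


def format_list(items) -> str:
    """Join non-empty items with ', ' after trimming surrounding whitespace."""
    cleaned = [str(item).strip() for item in items]
    return ", ".join(item for item in cleaned if item)


if __name__ == "__main__":
    print(format_usd(1500))                    # 1500.00
    print(format_list([" Burger", "Cola "]))   # Burger, Cola
```

Normalizing answers in one place like this keeps formatting mistakes from costing points on questions the agent otherwise answered correctly.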

## Directory Structure

```
jarvis_gaia_agent/
├── app.py                  # Main Gradio application with agent logic
├── state.py                # Defines JARVISState for LangGraph state management
├── search.py               # Web search tools (SERPAPI, multi-hop search)
├── tools/                  # Directory for all tools
│   ├── __init__.py         # Exports all tools
│   ├── file_parser.py      # Parses CSV, TXT, Excel, and PDF files
│   ├── image_parser.py     # OCR-based image parsing
│   ├── calculator.py       # Mathematical calculations
│   ├── document_retriever.py # PDF document retrieval
│   ├── duckduckgo_search.py # DuckDuckGo search integration
│   ├── weather_info.py     # Weather data via OpenWeatherMap
│   ├── hub_stats.py        # Hugging Face Hub statistics
│   └── guest_info.py       # Guest information retrieval
├── requirements.txt        # Python dependencies
├── README.md               # Project documentation
├── .gitignore              # Excludes .env, temp/, etc.
└── temp/                   # Temporary directory for GAIA files (created at runtime)
```

## Models and Datasets

- **Models**:
  - `meta-llama/Llama-3.2-1B-Instruct`: Primary LLM for reasoning and tool selection (Hugging Face Inference API or local).
  - `sentence-transformers/all-MiniLM-L6-v2`: Embedding model for text similarity tasks.
  - Note: Together AI models (`meta-llama/Llama-3.3-70B-Instruct-Turbo-Free`, `deepseek-ai/DeepSeek-R1-Distill-Llama-70B-free`) are called through the Together AI API and are not hosted on Hugging Face, so they are omitted from the Space metadata.
- **Datasets**:
  - `gaia-benchmark/GAIA`: Benchmark dataset for evaluating agent performance.

## Prerequisites

- **Python**: 3.9 or higher.
- **Tesseract OCR**: Required for image parsing.
  - macOS: `brew install tesseract`
  - Ubuntu: `sudo apt-get install tesseract-ocr`
  - Windows: Install via [Tesseract Installer](https://github.com/UB-Mannheim/tesseract/wiki).
- **API Keys**: Set in `.env` (local) or Hugging Face Space Secrets (deployment):
  - `HUGGINGFACEHUB_API_TOKEN`: Hugging Face token for model access.
  - `TOGETHER_API_KEY`: Together AI API key for LLM inference.
  - `SERPAPI_API_KEY`: SERPAPI key for web searches.
  - `OPENWEATHERMAP_API_KEY`: OpenWeatherMap key for weather queries.
  - `SPACE_ID`: `onisj/jarvis_gaia_agent`.
- Install dependencies:
  ```bash
  pip install -r requirements.txt
  ```
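
Before launching the app, it can help to confirm that every key listed above is present. The following is a minimal sketch, not part of the repository: `missing_keys` is a hypothetical helper, and it only inspects the process environment, so a local `.env` file must already have been loaded (or exported) for the check to see its values.

```python
import os

REQUIRED_KEYS = [
    "HUGGINGFACEHUB_API_TOKEN",
    "TOGETHER_API_KEY",
    "SERPAPI_API_KEY",
    "OPENWEATHERMAP_API_KEY",
    "SPACE_ID",
]


def missing_keys(env=None):
    """Return the required keys that are unset or blank in the given mapping."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```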

## Setup and Local Testing

1. **Clone the Repository**:
   ```bash
   git clone https://huggingface.co/spaces/onisj/jarvis_gaia_agent
   cd jarvis_gaia_agent
   ```

2. **Create Virtual Environment**:
   ```bash
   python -m venv venv
   source venv/bin/activate  # Windows: venv\Scripts\activate
   ```

3. **Install Dependencies**:
   ```bash
   pip install -r requirements.txt
   ```

4. **Configure Environment Variables**:
   Create a `.env` file:
   ```text
   SPACE_ID=onisj/jarvis_gaia_agent
   HUGGINGFACEHUB_API_TOKEN=your_hf_token
   TOGETHER_API_KEY=your_together_api_key
   SERPAPI_API_KEY=your_serpapi_key
   OPENWEATHERMAP_API_KEY=your_openweather_key
   ```

5. **Test with Mock File** (optional):
   ```bash
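   mkdir -p temp
   # printf expands \n portably; in bash, plain echo prints \n literally.
   # Note: this writes CSV text under an .xlsx name, so a strict Excel reader may reject it.
   printf 'Item,Type,Sales\nBurger,Food,1000\nCola,Drink,500\n' > temp/7bd855d8-463d-4ed5-93ca-5fe35145f733.xlsx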
   ```

6. **Run Locally**:
   ```bash
   python app.py
   ```
   - Open `http://127.0.0.1:7860` (port may vary).
   - Log in with Hugging Face credentials.
   - Click “Run Evaluation & Submit All Answers” to test GAIA tasks.
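
The mock data from step 5 can also be written from Python as a genuine CSV file, which sidesteps shell quoting differences. This is a sketch under assumptions: `write_mock_csv` and the file name `mock_sales.csv` are hypothetical, and whether the file parser picks up an arbitrarily named CSV in `temp/` depends on how `tools/file_parser.py` resolves paths.

```python
import csv
from pathlib import Path

# Same sample rows as the shell example in step 5.
ROWS = [
    ["Item", "Type", "Sales"],
    ["Burger", "Food", 1000],
    ["Cola", "Drink", 500],
]


def write_mock_csv(directory="temp", name="mock_sales.csv"):
    """Write the sample rows to <directory>/<name> and return the path."""
    target_dir = Path(directory)
    target_dir.mkdir(parents=True, exist_ok=True)
    path = target_dir / name
    # newline="" lets the csv module control line endings on every platform.
    with path.open("w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(ROWS)
    return path


if __name__ == "__main__":
    print(write_mock_csv())
```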

## Deployment to Hugging Face Space

1. **Push Code**:
   ```bash
   git add .
   git commit -m "Update JARVIS Gaia Agent with README metadata"
   git push origin main
   ```

2. **Set Space Secrets**:
   - Go to `https://huggingface.co/spaces/onisj/jarvis_gaia_agent` > Settings > Repository Secrets.
   - Add:
     - `SPACE_ID`: `onisj/jarvis_gaia_agent`
     - `HUGGINGFACEHUB_API_TOKEN`
     - `TOGETHER_API_KEY`
     - `SERPAPI_API_KEY`
     - `OPENWEATHERMAP_API_KEY`

3. **Build and Run**:
   - Hugging Face auto-builds the Space after pushing.
   - Access the Gradio interface at `https://onisj-jarvis-gaia-agent.hf.space`.
   - Log in and click “Run Evaluation & Submit All Answers” to submit GAIA answers.

4. **Verify Submission**:
   - Check `status_output` for:
     ```
     Submission Successful!
     User: your_username
     Overall Score: XX% (Y/20 correct)
     Message: ...
     ```
   - Aim for at least 30% (6/20 correct).
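
To check a run against the target automatically, the score line in the status text can be parsed. The sketch below assumes the message follows the `Overall Score: XX% (Y/20 correct)` layout shown in step 4; `parse_score` and `meets_target` are hypothetical helpers, and the sample text is made up for illustration.

```python
import re

SCORE_PATTERN = re.compile(r"Overall Score:\s*([\d.]+)%\s*\((\d+)/(\d+) correct\)")


def parse_score(status_output: str):
    """Return (percent, correct, total) from the status text, or None if absent."""
    match = SCORE_PATTERN.search(status_output)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2)), int(match.group(3))


def meets_target(status_output: str, target_percent: float = 30.0) -> bool:
    """True when a score is present and reaches the target percentage."""
    parsed = parse_score(status_output)
    return parsed is not None and parsed[0] >= target_percent


if __name__ == "__main__":
    sample = "Submission Successful!\nUser: your_username\nOverall Score: 30.0% (6/20 correct)\nMessage: ..."
    print(parse_score(sample))   # (30.0, 6, 20)
    print(meets_target(sample))  # True
```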

## Troubleshooting

- **Model Access (404)**: Verify API keys; test `initialize_llm` locally.
- **SERPAPI Timeout**: Ensure `SERPAPI_API_KEY` is valid; check `search.py` logs.
- **GAIA File Access**: Confirm `temp/` directory permissions; test `download_file`.
- **Low GAIA Score**: Analyze `results_table` for errors; enhance `multi_hop_search_tool` or answer formatting.
- **Logs**: Check Space > Settings > Logs for build/run errors.
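
For the GAIA file access item, a quick way to rule out permission problems is to try writing a scratch file into `temp/`. This is a minimal sketch, independent of the repository's `download_file`; `check_temp_dir` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def check_temp_dir(directory="temp") -> bool:
    """Create the directory if needed and confirm a file can be written to it."""
    path = Path(directory)
    try:
        path.mkdir(parents=True, exist_ok=True)
        # Actually creating (and auto-deleting) a scratch file tests real write access.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
        return True
    except OSError:
        return False


if __name__ == "__main__":
    print("temp/ is writable" if check_temp_dir() else "temp/ is NOT writable")
```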

## License

MIT License. See [LICENSE](LICENSE) for details.

## Acknowledgements

- Built with `langchain`, `langgraph`, and Hugging Face tools.
- Evaluated on the GAIA benchmark (`gaia-benchmark/GAIA`).