---
title: JARVIS Gaia Agent
emoji: 🦾
colorFrom: indigo
colorTo: green
sdk: gradio
pinned: false
license: mit
short_description: Enhanced JARVIS AI agent for GAIA benchmark
models:
- meta-llama/Llama-3.2-1B-Instruct
- sentence-transformers/all-MiniLM-L6-v2
datasets:
- gaia-benchmark/GAIA
---
# Evolved JARVIS Gaia Agent
An advanced Python-based AI agent built with `langchain`, `langgraph`, SERPAPI, and OCR for web search, file parsing, image analysis, and data retrieval. It is deployed as a Hugging Face Space (`onisj/jarvis_gaia_agent`) to evaluate performance on the GAIA benchmark, targeting a score of at least 30% (6/20 correct).
## Features
- **Web Search**: Integrates SERPAPI and DuckDuckGo for robust, multi-hop searches.
- **File Parsing**: Processes CSV, TXT, Excel, and PDF files for GAIA tasks.
- **Image Parsing**: Uses OCR (`easyocr`) to extract text from images.
- **Data Retrieval**: Includes a guest info retriever for structured queries.
- **External APIs**: Supports weather data (OpenWeatherMap) and Hugging Face Hub stats.
- **State Management**: Employs `langgraph` for multi-step reasoning workflows.
- **Exact-Match Answers**: Optimized for GAIA Level 1 questions with precise formatting (e.g., USD to two decimals, comma-separated lists).
- **Gradio Interface**: Provides a user-friendly UI for running evaluations and submitting answers.
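The exact-match formatting mentioned above (USD to two decimals, comma-separated lists) could look roughly like the sketch below. The helper names `format_usd` and `format_list` are hypothetical and are not taken from `app.py`; the separator (`", "`) and the choice to omit the currency symbol are assumptions about what a given GAIA question expects.

```python
from decimal import Decimal, ROUND_HALF_UP


def format_usd(amount) -> str:
    """Render a number as a plain two-decimal string, e.g. 1234.5 -> '1234.50'.

    Hypothetical helper: going through Decimal(str(...)) avoids binary
    floating-point surprises when rounding values such as 2.675.
    """
    value = Decimal(str(amount)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return f"{value:.2f}"


def format_list(items) -> str:
    """Join items into a comma-separated list, trimming whitespace and skipping blanks."""
    cleaned = [str(item).strip() for item in items]
    return ", ".join(item for item in cleaned if item)
```

For example, `format_usd(1234.5)` returns `"1234.50"` and `format_list([" Burger", "Cola "])` returns `"Burger, Cola"`.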
## Directory Structure
```
jarvis_gaia_agent/
├── app.py                     # Main Gradio application with agent logic
├── state.py                   # Defines JARVISState for LangGraph state management
├── search.py                  # Web search tools (SERPAPI, multi-hop search)
├── tools/                     # Directory for all tools
│   ├── __init__.py            # Exports all tools
│   ├── file_parser.py         # Parses CSV, TXT, Excel, and PDF files
│   ├── image_parser.py        # OCR-based image parsing
│   ├── calculator.py          # Mathematical calculations
│   ├── document_retriever.py  # PDF document retrieval
│   ├── duckduckgo_search.py   # DuckDuckGo search integration
│   ├── weather_info.py        # Weather data via OpenWeatherMap
│   ├── hub_stats.py           # Hugging Face Hub statistics
│   └── guest_info.py          # Guest information retrieval
├── requirements.txt           # Python dependencies
├── README.md                  # Project documentation
├── .gitignore                 # Excludes .env, temp/, etc.
└── temp/                      # Temporary directory for GAIA files (created at runtime)
```
## Models and Datasets
- **Models**:
  - `meta-llama/Llama-3.2-1B-Instruct`: Primary LLM for reasoning and tool selection (Hugging Face Inference API or local).
  - `sentence-transformers/all-MiniLM-L6-v2`: Embedding model for text similarity tasks.
  - Note: Together AI models (`meta-llama/Llama-3.3-70B-Instruct-Turbo-Free`, `deepseek-ai/DeepSeek-R1-Distill-Llama-70B-free`) are used via API but not hosted on Hugging Face, so they're not listed in metadata.
- **Datasets**:
  - `gaia-benchmark/GAIA`: Benchmark dataset for evaluating agent performance.
## Prerequisites
- **Python**: 3.9 or higher.
- **Tesseract OCR**: Required for image parsing.
  - macOS: `brew install tesseract`
  - Ubuntu: `sudo apt-get install tesseract-ocr`
  - Windows: Install via [Tesseract Installer](https://github.com/UB-Mannheim/tesseract/wiki).
- **API Keys**: Set in `.env` (local) or Hugging Face Space Secrets (deployment):
  - `HUGGINGFACEHUB_API_TOKEN`: Hugging Face token for model access.
  - `TOGETHER_API_KEY`: Together AI API key for LLM inference.
  - `SERPAPI_API_KEY`: SERPAPI key for web searches.
  - `OPENWEATHERMAP_API_KEY`: OpenWeatherMap key for weather queries.
  - `SPACE_ID`: `onisj/jarvis_gaia_agent`.
- Install dependencies:
```bash
pip install -r requirements.txt
```
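Before launching the app, it can help to confirm that every key listed above is actually set. The following is a minimal sketch using only the standard library; the function name `missing_keys` is hypothetical and not part of this repository, and it assumes the variables are already exported or have been loaded from `.env` by whatever loader the app uses.

```python
import os

# Variable names taken from the API Keys list above.
REQUIRED_KEYS = [
    "HUGGINGFACEHUB_API_TOKEN",
    "TOGETHER_API_KEY",
    "SERPAPI_API_KEY",
    "OPENWEATHERMAP_API_KEY",
    "SPACE_ID",
]


def missing_keys(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```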
## Setup and Local Testing
1. **Clone the Repository**:
```bash
git clone https://huggingface.co/spaces/onisj/jarvis_gaia_agent
cd jarvis_gaia_agent
```
2. **Create Virtual Environment**:
```bash
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
```
3. **Install Dependencies**:
```bash
pip install -r requirements.txt
```
4. **Configure Environment Variables**:
Create a `.env` file:
```text
SPACE_ID=onisj/jarvis_gaia_agent
HUGGINGFACEHUB_API_TOKEN=your_hf_token
TOGETHER_API_KEY=your_together_api_key
SERPAPI_API_KEY=your_serpapi_key
OPENWEATHERMAP_API_KEY=your_openweather_key
```
5. **Test with Mock File** (optional):
```bash
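mkdir -p temp
# printf expands \n portably (plain echo in bash does not); note that this writes CSV text under an .xlsx name.
printf 'Item,Type,Sales\nBurger,Food,1000\nCola,Drink,500\n' > temp/7bd855d8-463d-4ed5-93ca-5fe35145f733.xlsx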
```
6. **Run Locally**:
```bash
python app.py
```
- Open `http://127.0.0.1:7860` (port may vary).
- Log in with Hugging Face credentials.
- Click "Run Evaluation & Submit All Answers" to test GAIA tasks.
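The optional mock data from step 5 can also be written from Python, which sidesteps differences in how shells handle `\n`. This is a sketch only: `write_mock_csv` is a hypothetical helper, the rows mirror the sample data in step 5, and whether the agent's file parser picks up a `.csv` file from `temp/` is an assumption.

```python
import csv
from pathlib import Path

# Sample rows matching the mock data in step 5.
ROWS = [
    ("Item", "Type", "Sales"),
    ("Burger", "Food", 1000),
    ("Cola", "Drink", 500),
]


def write_mock_csv(path):
    """Write the sample rows to `path` as CSV, creating parent directories as needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # newline="" lets the csv module control line endings itself.
    with path.open("w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(ROWS)
    return path
```

Calling `write_mock_csv("temp/mock_sales.csv")` (the file name is illustrative) creates `temp/` if it does not exist and writes three lines of CSV.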
## Deployment to Hugging Face Space
1. **Push Code**:
```bash
git add .
git commit -m "Update JARVIS Gaia Agent with README metadata"
git push origin main
```
2. **Set Space Secrets**:
- Go to `https://huggingface.co/spaces/onisj/jarvis_gaia_agent` > Settings > Repository Secrets.
- Add:
  - `SPACE_ID`: `onisj/jarvis_gaia_agent`
  - `HUGGINGFACEHUB_API_TOKEN`
  - `TOGETHER_API_KEY`
  - `SERPAPI_API_KEY`
  - `OPENWEATHERMAP_API_KEY`
3. **Build and Run**:
- Hugging Face auto-builds the Space after pushing.
- Access the Gradio interface at `https://onisj-jarvis-gaia-agent.hf.space`.
- Log in and click "Run Evaluation & Submit All Answers" to submit GAIA answers.
4. **Verify Submission**:
- Check `status_output` for:
```
Submission Successful!
User: your_username
Overall Score: XX% (Y/20 correct)
Message: ...
```
- Aim for at least 30% (6/20 correct).
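To check the result programmatically, the score line in the status text above can be parsed with a regular expression. This is a rough sketch that assumes the message keeps the `Overall Score: XX% (Y/20 correct)` shape shown above; `parse_score` and `meets_target` are hypothetical helpers, not functions from `app.py`.

```python
import re

# Matches e.g. "Overall Score: 30.0% (6/20 correct)".
SCORE_PATTERN = re.compile(r"Overall Score:\s*([\d.]+)%\s*\((\d+)/(\d+) correct\)")


def parse_score(status_text):
    """Return (percent, correct, total), or None if no score line is found."""
    match = SCORE_PATTERN.search(status_text)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2)), int(match.group(3))


def meets_target(correct, total, target_percent=30):
    """True if correct/total reaches the target; integer math avoids float rounding."""
    return total > 0 and correct * 100 >= target_percent * total
```

With 6 of 20 answers correct, `meets_target(6, 20)` is `True`; with 5 of 20 it is `False`.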
## Troubleshooting
- **Model Access (404)**: Verify API keys; test `initialize_llm` locally.
- **SERPAPI Timeout**: Ensure `SERPAPI_API_KEY` is valid; check `search.py` logs.
- **GAIA File Access**: Confirm `temp/` directory permissions; test `download_file`.
- **Low GAIA Score**: Analyze `results_table` for errors; enhance `multi_hop_search_tool` or answer formatting.
- **Logs**: Check Space > Settings > Logs for build/run errors.
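For the `temp/` permissions check mentioned above, one quick way to test writability is to create and discard a scratch file in the directory. This is a minimal sketch using the standard library; `check_writable` is a hypothetical helper and is independent of the project's own `download_file`.

```python
import os
import tempfile


def check_writable(directory="temp"):
    """Return True if `directory` exists (or can be created) and accepts new files."""
    try:
        os.makedirs(directory, exist_ok=True)
        # The scratch file is deleted automatically when the context exits.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
    except OSError:
        return False
    return True
```

Running `check_writable("temp")` from the project root returns `False` if the directory cannot be created or written to, which points to a permissions problem rather than a download failure.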
## License
MIT License. See [LICENSE](LICENSE) for details.
## Acknowledgements
- Built with `langchain`, `langgraph`, and Hugging Face tools.
- Evaluated on the GAIA benchmark (`gaia-benchmark/GAIA`).