---
title: GraphiqueAcademia
emoji: 🐠
colorFrom: purple
colorTo: green
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
pinned: false
license: mit
tags:
  - agent-demo-track
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

Sorry this is still a **WIP** 🛠️⚙️



# 🚀 Scientific Paper Assistant

This project is an **AI agent** for scientific papers, built with Gradio, Modal, and the SmolAgents framework. It uses a set of specialized tools to analyze PDFs (AI papers or other educational material), create mind maps, generate data visualizations, perform web searches, and extract text from images.

🌟 **Live demo:** \[link ]

---

## 🧠 What does it do?

* 👉 **Upload a PDF or image**
* 👉 **Ask a question** (e.g., "Summarize this section", "Create a mind map", "Generate Python code to implement this method", etc.)
* 👉 The agent dynamically chooses which specialized tool(s) to use to answer the question.

---

## 📦 Key components

### 1️⃣ Core logic: `main_agent.py`

* Uses **SmolAgents** with a `CodeAgent` and a Hugging Face `InferenceClientModel`.
* Loads **custom tools**:

  * `PDFQATool`: answers questions about PDF content.
  * `MindMapTool`: generates mind maps from text.
  * `DataGraphTool`: creates data graphs and visualizations.
  * `ImageAnalysisTool`: extracts text from images (OCR).
  * `WebSearchTool`: performs real-time web searches.
* Lets the agent decide which tools to use!
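
The tool-loading pattern above can be sketched without any dependencies. In smolagents, a custom tool is a class that declares a `name`, a `description`, and a `forward` method; the stand-in below mirrors that shape in plain Python so the idea is visible without installing the framework. The `SimpleTool` base class, the `KeywordPDFQATool` body, and the `describe_tools` helper are illustrative assumptions, not the code in `main_agent.py`.

```python
class SimpleTool:
    """Plain-Python stand-in for an agent tool (hypothetical, not the smolagents class)."""

    name = "tool"
    description = ""

    def forward(self, *args, **kwargs):
        raise NotImplementedError


class KeywordPDFQATool(SimpleTool):
    # Hypothetical placeholder: a real PDF QA tool would parse the PDF and
    # use a model to answer; this one only does keyword matching on plain text.
    name = "pdf_qa"
    description = "Answers questions about PDF content."

    def forward(self, pdf_text, question):
        # Keep question words longer than three characters as keywords.
        keywords = {w.lower().strip("?.,") for w in question.split() if len(w) > 3}
        # Return the first sentence that shares a keyword with the question.
        for sentence in pdf_text.split(". "):
            words = {w.lower().strip("?.,") for w in sentence.split()}
            if keywords & words:
                return sentence.strip()
        return "No matching passage found."


def describe_tools(tools):
    """Build the name/description catalogue a model could pick tools from."""
    return "\n".join(f"- {t.name}: {t.description}" for t in tools)


tools = [KeywordPDFQATool()]
print(describe_tools(tools))
print(tools[0].forward("Transformers use attention. CNNs use convolutions.",
                       "What uses attention?"))
```

The catalogue string is the important part: the agent can only choose sensibly among tools whose names and descriptions say clearly what each one does.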

---

### 2️⃣ Tools

| Tool name             | File                     | Purpose                                                            |
| --------------------- | ------------------------ | ------------------------------------------------------------------ |
| **PDFQATool**         | `pdf_qa_tool.py`         | Answers questions about scientific PDFs.                           |
| **MindMapTool**       | `mind_map_tool.py`       | Converts concepts into mind maps for clearer understanding.        |
| **DataGraphTool**     | `data_graph_tool.py`     | Creates data visualizations to illustrate key points.              |
| **ImageAnalysisTool** | `image_analysis_tool.py` | Extracts text from images using OCR (Tesseract).                   |
| **WebSearchTool**     | `web_search_tool.py`     | Performs web searches using a **Modal-deployed FastAPI** endpoint. |
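
As a rough illustration of what a mind-map tool has to produce, the sketch below renders a nested dictionary of concepts as an indented outline. The input shape, the `render_outline` name, and the text output format are all assumptions made for this example; the actual `MindMapTool` may work from raw text and emit a different format.

```python
def render_outline(node, depth=0):
    """Render a nested {topic: {subtopic: {...}}} dict as indented outline lines."""
    lines = []
    for topic, children in node.items():
        lines.append("  " * depth + "- " + topic)
        # Recurse into subtopics, indenting one level deeper.
        lines.extend(render_outline(children, depth + 1))
    return lines


# Hypothetical concepts extracted from a paper.
concepts = {"Attention": {"Self-attention": {}, "Multi-head": {"Heads": {}}}}
print("\n".join(render_outline(concepts)))
```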

---

### 3️⃣ Modal Deployments

Two **FastAPI apps** are deployed on Modal:

* 🔍 **Web search** (`modal_web_search_app.py`)
* 🖼️ **Image analysis** (`modal_image_analyzer_app.py`)

These APIs run the heavier work outside the main app, which calls them over HTTP.
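
A minimal sketch of the client side of such a call, using only the standard library. The endpoint URL, the `{"query": ...}` payload shape, and both function names are assumptions for illustration; the real routes and request schema are defined in `modal_web_search_app.py`.

```python
import json
import urllib.request

# Hypothetical placeholder: replace with the URL of your own Modal deployment.
SEARCH_ENDPOINT = "https://example.invalid/search"


def build_search_request(query, endpoint=SEARCH_ENDPOINT):
    """Build (but do not send) a JSON POST request for the search service."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_search(query, timeout=30):
    """Send the request and decode the JSON response body."""
    request = build_search_request(query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect and test without network access.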

---

### 4️⃣ User Interface: `app.py`

* Built with **Gradio**.
* Lets users:

  * Upload PDFs (`pdf_upload`).
  * Upload images (`image_upload`).
  * Enter natural language questions (`user_input`).
* The agent **logically** decides whether to:

  * Use a tool directly (e.g., PDF analysis, mind map creation).
  * Use Modal services (e.g., web search, image analysis).
  * Or combine multiple tools for complex tasks!
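
One way to picture the hand-off from the UI to the agent is a small function that folds the three inputs into a single task string. The sketch below is an assumption about how this could look, not the code in `app.py`; the `build_task` name and the prompt wording are hypothetical, while the parameter names follow the component names listed above.

```python
def build_task(user_input, pdf_upload=None, image_upload=None):
    """Combine the question and any uploaded file paths into one agent task."""
    parts = [f"Question: {user_input.strip()}"]
    # Mention each upload only if the user actually provided it.
    if pdf_upload:
        parts.append(f"Uploaded PDF: {pdf_upload}")
    if image_upload:
        parts.append(f"Uploaded image: {image_upload}")
    return "\n".join(parts)


print(build_task("Summarize the uploaded paper.", pdf_upload="paper.pdf"))
```

In a Gradio app, a function like this would typically sit inside the callback wired to the submit button, with its result passed to the agent.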

---

## ⚙️ Installation & Usage

1️⃣ **Install dependencies** (adjust as needed for your local dev environment):

```bash
pip install -r requirements.txt
```

2️⃣ **Set up Modal deployments** (for example, by running `modal deploy` on each file) for:

* `modal_web_search_app.py`
* `modal_image_analyzer_app.py`

3️⃣ **Launch the app**:

```bash
python app.py
```

Or deploy it to **Hugging Face Spaces** for a live demo 🤗

---

## 💡 Example prompts

* ✅ “Summarize the uploaded paper.”
* ✅ “Generate a mind map of the main contributions.”
* ✅ “Plot a graph of the data trends discussed.”
* ✅ “Analyze this image (uploaded) for any text.”
* ✅ “Web search for related works on this topic.”
* ✅ “Generate code to implement the method in the paper.”

---

## 🛠️ Custom code generation

I plan to add a **Code Implementation Tool** that lets the agent generate Python code snippets to clarify methods or experiments described in the papers.

---

## 📜 License

Open-source under the [MIT License](LICENSE).

---

## ✨ Acknowledgments

* Hugging Face
* Modal
* LangChain
* SmolAgents
* DuckDuckGo (for web search)
* PDFMiner, PyMuPDF (for PDF parsing)