---
title: GraphiqueAcademia
emoji: 🐠
colorFrom: purple
colorTo: green
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
pinned: false
license: mit
tags:
  - agent-demo-track
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

Note: this project is still a **WIP** 🛠️⚙️
# 🚀 Scientific Paper Assistant
This project is an intelligent **AI Agent** for scientific papers, built with Gradio, Modal, and the **SmolAgents** framework. It uses a set of specialized tools to analyze PDFs of AI papers (or other educational PDFs), create mind maps, generate data visualizations, perform web searches, and analyze images.
🌟 **Live demo:** [link]
---
## 🧠 What does it do?
👉 **Upload a PDF or image**
👉 **Ask a question** (e.g., "Summarize this section", "Create a mind map", "Generate Python code to implement this method", etc.)
👉 The agent dynamically chooses which specialized tool(s) to use to generate accurate, insightful answers!
---
## 📦 Key components
### 1️⃣ Core logic: `main_agent.py`
* Uses **SmolAgents** with a `CodeAgent` and a Hugging Face `InferenceClientModel`.
* Loads **custom tools**:
* `PDFQATool`: answers questions about PDF content.
* `MindMapTool`: generates mind maps from text.
* `DataGraphTool`: creates data graphs and visualizations.
* `ImageAnalysisTool`: extracts text from images (OCR).
* `WebSearchTool`: performs real-time web searches.
* Lets the agent decide which tools to use!
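In the project itself, tool selection is delegated to the SmolAgents `CodeAgent`. The dependency-free sketch below only illustrates the *idea* of routing a request to one of several tools; the keyword matching and the tool stubs are illustrative stand-ins, not the real implementations.

```python
# Minimal, dependency-free sketch of tool selection. The real project
# wires its tools into a SmolAgents CodeAgent, which reasons about which
# tool to use; the keyword routing below is only illustrative.

def pdf_qa(question: str) -> str:
    return f"answer from PDF for: {question}"

def mind_map(topic: str) -> str:
    return f"mind map for: {topic}"

def web_search(query: str) -> str:
    return f"search results for: {query}"

# Registry mapping trigger keywords to tools, standing in for the
# agent's own reasoning about which tool fits the request.
TOOLS = {
    "summarize": pdf_qa,
    "mind map": mind_map,
    "search": web_search,
}

def dispatch(request: str) -> str:
    """Route a request to the first tool whose keyword it mentions."""
    lowered = request.lower()
    for keyword, tool in TOOLS.items():
        if keyword in lowered:
            return tool(request)
    return pdf_qa(request)  # default: answer from the uploaded PDF
```

For example, `dispatch("Create a mind map of the paper")` lands on the mind-map tool, while an unmatched question falls back to PDF Q&A.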
---
### 2️⃣ Tools
| Tool name | File | Purpose |
| --------------------- | ------------------------ | ------------------------------------------------------------------ |
| **PDFQATool** | `pdf_qa_tool.py` | Answers questions about scientific PDFs. |
| **MindMapTool** | `mind_map_tool.py` | Converts concepts into mind maps for clearer understanding. |
| **DataGraphTool** | `data_graph_tool.py` | Creates data visualizations to illustrate key points. |
| **ImageAnalysisTool** | `image_analysis_tool.py` | Extracts text from images using OCR (Tesseract). |
| **WebSearchTool** | `web_search_tool.py` | Performs web searches using a **Modal-deployed FastAPI** endpoint. |
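All of these tools share a common shape: a name, a description the agent reads when deciding what to call, and a method that does the work. The sketch below shows that shape with a plain Python class; the real tools subclass the SmolAgents `Tool` base class, and the class and method bodies here are hypothetical stand-ins.

```python
class SketchTool:
    """Illustrative common tool shape (not the real SmolAgents base class)."""
    name: str = ""
    description: str = ""

    def forward(self, query: str) -> str:
        raise NotImplementedError

class PDFQASketch(SketchTool):
    # Stand-in for PDFQATool; the description is what the agent reads
    # when deciding whether this tool fits the request.
    name = "pdf_qa"
    description = "Answers questions about an uploaded PDF."

    def forward(self, query: str) -> str:
        # The real tool would retrieve relevant PDF text before answering.
        return f"[{self.name}] {query}"
```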
---
### 3️⃣ Modal Deployments
Two **FastAPI apps** are deployed on Modal:
* 🔍 **Web search** (`modal_web_search_app.py`)
* 🖼️ **Image analysis** (`modal_image_analyzer_app.py`)
These APIs handle the heavy lifting outside the main app and are integrated via HTTP requests.
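Calling one of these services from the main app amounts to a plain HTTP request. The sketch below assumes a hypothetical URL and JSON response shape (`{"results": [...]}`); the actual route and schema are whatever `modal_web_search_app.py` defines. The fetcher is injectable so the helper can be exercised without a network.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint URL and response shape; the real Modal deployment
# defines its own route and JSON schema in modal_web_search_app.py.
SEARCH_URL = "https://example--web-search.modal.run/search"

def call_web_search(query: str, fetch=urllib.request.urlopen) -> list[str]:
    """Call the Modal-hosted search endpoint and return result snippets.

    `fetch` is injectable so the function can be tested without a network.
    """
    url = SEARCH_URL + "?" + urllib.parse.urlencode({"q": query})
    with fetch(url) as resp:
        payload = json.load(resp)
    return payload.get("results", [])
```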
---
### 4️⃣ User Interface: `app.py`
* Built with **Gradio**.
* Lets users:
* Upload PDFs (`pdf_upload`).
* Upload images (`image_upload`).
* Enter natural language questions (`user_input`).
* The agent **autonomously** decides whether to:
* Use a tool directly (e.g., PDF analysis, mind map creation).
* Use Modal services (e.g., web search, image analysis).
* Or combine multiple tools for complex tasks!
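The decision above is driven in part by which inputs accompany the question. A minimal sketch of that routing, using the same input names as `app.py` (`pdf_upload`, `image_upload`, `user_input`) — in the real app the agent itself makes this call rather than a hard-coded function:

```python
# Illustrative routing only: the real app hands all inputs to the agent
# and lets it choose; here the presence of each upload decides the path.

def route(user_input: str, pdf_upload=None, image_upload=None) -> str:
    if pdf_upload is not None and image_upload is not None:
        return "combine: pdf + image tools"
    if pdf_upload is not None:
        return "pdf tools"
    if image_upload is not None:
        return "image analysis via Modal"
    return "web search via Modal"
```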
---
## ⚙️ Installation & Usage
1️⃣ **Install dependencies** (adjust as needed for your local dev environment):
```bash
pip install -r requirements.txt
```
2️⃣ **Set up Modal deployments** for:
* `modal_web_search_app.py`
* `modal_image_analyzer_app.py`
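Assuming you have a Modal account, the standard workflow looks roughly like this (one-time setup, then one deploy per app):

```bash
pip install modal                         # Modal SDK + CLI
modal setup                               # one-time authentication
modal deploy modal_web_search_app.py      # web search endpoint
modal deploy modal_image_analyzer_app.py  # image analysis endpoint
```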
3️⃣ **Launch the app**:
```bash
python app.py
```
Or deploy it to **Hugging Face Spaces** for a live demo 🤗
---
## 💡 Example prompts
✅ “Summarize the uploaded paper.”
✅ “Generate a mind map of the main contributions.”
✅ “Plot a graph of the data trends discussed.”
✅ “Analyze this image (uploaded) for any text.”
✅ “Web search for related works on this topic.”
✅ “Generate code to implement the method in the paper.”
---
## 🛠️ Custom code generation
A **Code Implementation Tool** is also planned: it will let the agent generate Python code snippets that clarify methods or experiments described in the papers!
---
## 📜 License
Open-source under the [MIT License](LICENSE).
---
## ✨ Acknowledgments
* Hugging Face
* Modal
* LangChain
* SmolAgents
* DuckDuckGo (for web search)
* PDFMiner, PyMuPDF (for PDF parsing)