---
title: GraphiqueAcademia
emoji: 🐠
colorFrom: purple
colorTo: green
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
pinned: false
license: mit
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
Sorry, this is still a **WIP** 🛠️⚙️
---
tags:
- agent-demo-track
---
# 🚀 Scientific Paper Assistant
This project is an **AI agent** for scientific papers, built with Gradio, Modal, and the SmolAgents framework. It uses a set of specialized tools to analyze PDFs of AI papers (or other PDFs with educational content), create mind maps, generate data visualizations, perform web searches, and analyze images.
🌟 **Live demo:** \[link ]
---
## 🧠 What does it do?
* 👉 **Upload a PDF or image**
* 👉 **Ask a question** (e.g., "Summarize this section", "Create a mind map", "Generate Python code to implement this method", etc.)
* 👉 The agent dynamically chooses which specialized tool(s) to use to generate accurate, insightful answers!
---
## 📦 Key components
### 1️⃣ Core logic: `main_agent.py`
* Uses **SmolAgents** with a `CodeAgent` and a Hugging Face `InferenceClientModel`.
* Loads **custom tools**:
  * `PDFQATool`: answers questions about PDF content.
  * `MindMapTool`: generates mind maps from text.
  * `DataGraphTool`: creates data graphs and visualizations.
  * `ImageAnalysisTool`: extracts text from images (OCR).
  * `WebSearchTool`: performs real-time web searches.
* Lets the agent decide which tools to use!
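
To make the "agent decides which tool to use" idea concrete, here is a minimal sketch in plain Python. It does not use SmolAgents: `SimpleTool`, `TOOLS`, and `choose_tool` are hypothetical names, and the keyword-overlap rule is only a stand-in for the decision that the LLM makes in the real `CodeAgent`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SimpleTool:
    """Hypothetical stand-in for an agent tool: a name, a description, a callable."""
    name: str
    description: str
    forward: Callable[[str], str]


# Placeholder tools (assumption); the real ones live in the files listed in this README.
TOOLS = [
    SimpleTool("pdf_qa", "answer summarize question paper pdf", lambda q: f"[pdf_qa] {q}"),
    SimpleTool("mind_map", "mind map concepts", lambda q: f"[mind_map] {q}"),
    SimpleTool("web_search", "web search related works", lambda q: f"[web_search] {q}"),
]


def choose_tool(question: str, tools: list[SimpleTool]) -> SimpleTool:
    """Pick the tool whose description shares the most words with the question.

    Keyword overlap is only a toy replacement for the LLM's choice.
    """
    words = set(question.lower().split())
    return max(tools, key=lambda t: len(words & set(t.description.split())))


tool = choose_tool("Create a mind map of the main contributions", TOOLS)
print(tool.name)
```

In the actual app the tool descriptions are passed to the model, which writes code that calls the tools; the sketch only shows the shape of that dispatch.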
---
### 2️⃣ Tools
| Tool name | File | Purpose |
| --------------------- | ------------------------ | ------------------------------------------------------------------ |
| **PDFQATool** | `pdf_qa_tool.py` | Answers questions about scientific PDFs. |
| **MindMapTool** | `mind_map_tool.py` | Converts concepts into mind maps for clearer understanding. |
| **DataGraphTool** | `data_graph_tool.py` | Creates data visualizations to illustrate key points. |
| **ImageAnalysisTool** | `image_analysis_tool.py` | Extracts text from images using OCR (Tesseract). |
| **WebSearchTool** | `web_search_tool.py` | Performs web searches using a **Modal-deployed FastAPI** endpoint. |
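
As an illustration of what a mind-map tool could emit, the sketch below turns a nested dictionary of topics into Graphviz DOT text. This is an assumption about one possible output format, not a description of `mind_map_tool.py`; the function name `mind_map_to_dot` and the example topics are hypothetical.

```python
def mind_map_to_dot(tree: dict, name: str = "mindmap") -> str:
    """Render a nested dict {topic: {subtopic: {...}}} as Graphviz DOT text."""
    lines = [f"digraph {name} {{"]

    def walk(parent: str, children: dict) -> None:
        # Emit one edge per parent/child pair, depth first.
        for child, grandchildren in children.items():
            lines.append(f'  "{parent}" -> "{child}";')
            walk(child, grandchildren)

    for root, children in tree.items():
        lines.append(f'  "{root}";')
        walk(root, children)
    lines.append("}")
    return "\n".join(lines)


# Illustrative topics only (not taken from any uploaded paper).
example = {"Transformer": {"Attention": {"Multi-head": {}}, "Positional encoding": {}}}
print(mind_map_to_dot(example))
```

The sketch does not escape double quotes inside topic names, so real input would need sanitizing before rendering.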
---
### 3️⃣ Modal Deployments
Two **FastAPI apps** are deployed on Modal:
* 🔍 **Web search** (`modal_web_search_app.py`)
* 🖼️ **Image analysis** (`modal_image_analyzer_app.py`)
The main app calls these APIs over HTTP, so the heavy lifting runs on Modal rather than inside the Gradio process.
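
A call from the main app to one of these endpoints might look like the sketch below, which uses only the standard library. The URL, the `{"query": ...}` payload shape, and the function names are assumptions; check `modal_web_search_app.py` for the actual route and schema.

```python
import json
import urllib.request

# Hypothetical placeholder; substitute the URL printed by your own Modal deployment.
WEB_SEARCH_URL = "https://example.invalid/search"


def build_search_request(query: str, url: str = WEB_SEARCH_URL) -> urllib.request.Request:
    """Build a JSON POST request for the search endpoint without sending it."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def web_search(query: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response (needs a live endpoint)."""
    with urllib.request.urlopen(build_search_request(query), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


req = build_search_request("graph neural networks survey")
print(req.get_method(), req.full_url)
```

Separating request construction from sending keeps the payload easy to inspect and test without network access.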
---
### 4️⃣ User Interface: `app.py`
* Built with **Gradio**.
* Lets users:
  * Upload PDFs (`pdf_upload`).
  * Upload images (`image_upload`).
  * Enter natural language questions (`user_input`).
* The agent then decides whether to:
  * Use a tool directly (e.g., PDF analysis, mind map creation).
  * Use Modal services (e.g., web search, image analysis).
  * Or combine multiple tools for complex tasks!
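
One way the three inputs could be merged into a single task for the agent is sketched below. The parameter names mirror the component names above, but `build_task` itself and the text layout it produces are hypothetical; the Gradio wiring is omitted so the snippet runs without extra dependencies.

```python
from typing import Optional


def build_task(user_input: str, pdf_upload: Optional[str] = None,
               image_upload: Optional[str] = None) -> str:
    """Combine the question and any uploaded file paths into one agent task."""
    if not user_input.strip():
        raise ValueError("Please enter a question.")
    parts = [user_input.strip()]
    # Mention each uploaded file so the agent knows what it can pass to a tool.
    if pdf_upload:
        parts.append(f"PDF file: {pdf_upload}")
    if image_upload:
        parts.append(f"Image file: {image_upload}")
    return "\n".join(parts)


print(build_task("Summarize the uploaded paper.", pdf_upload="paper.pdf"))
```

A function like this would typically be the callback attached to the submit button, with its return value passed on to the agent.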
---
## ⚙️ Installation & Usage
1️⃣ **Install dependencies** (adjust as needed for your local dev environment):
```bash
pip install -r requirements.txt
```
2️⃣ **Set up Modal deployments** (for example, by running `modal deploy` on each file) for:
* `modal_web_search_app.py`
* `modal_image_analyzer_app.py`
3️⃣ **Launch the app**:
```bash
python app.py
```
Or deploy it to **Hugging Face Spaces** for a live demo 🤗
---
## 💡 Example prompts
* ✅ “Summarize the uploaded paper.”
* ✅ “Generate a mind map of the main contributions.”
* ✅ “Plot a graph of the data trends discussed.”
* ✅ “Analyze this image (uploaded) for any text.”
* ✅ “Web search for related works on this topic.”
* ✅ “Generate code to implement the method in the paper.”
---
## 🛠️ Custom code generation
I also plan to add a **Code Implementation Tool**, which will let the agent generate Python code snippets that clarify methods or experiments described in the papers!
---
## 📜 License
Open-source under the [MIT License](LICENSE).
---
## ✨ Acknowledgments
* Hugging Face
* Modal
* LangChain
* SmolAgents
* DuckDuckGo (for web search)
* PDFMiner, PyMuPDF (for PDF parsing)