---
title: GraphiqueAcademia
emoji: 🐠
colorFrom: purple
colorTo: green
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
pinned: false
license: mit
tags:
- agent-demo-track
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

Sorry, this is still a **WIP** 🛠️⚙️

# 🚀 Scientific Paper Assistant

This project is an **AI agent** for working with scientific papers, built with Gradio, Modal, and the smolagents framework. It uses a set of specialized tools to analyze PDFs (AI papers or other educational documents), create mind maps, generate data visualizations, run web searches, and extract text from images.

🌟 **Live demo:** \[link ]

---

## 🧠 What does it do?

👉 **Upload a PDF or image**

👉 **Ask a question** (e.g., "Summarize this section", "Create a mind map", "Generate Python code to implement this method", etc.)

👉 The agent dynamically chooses which specialized tool(s) to use to generate accurate, insightful answers!

---

## 📦 Key components

### 1️⃣ Core logic: `main_agent.py`

* Uses **SmolAgents** with a `CodeAgent` and a Hugging Face `InferenceClientModel`.
* Loads **custom tools**:
  * `PDFQATool`: answers questions about PDF content.
  * `MindMapTool`: generates mind maps from text.
  * `DataGraphTool`: creates data graphs and visualizations.
  * `ImageAnalysisTool`: extracts text from images (OCR).
  * `WebSearchTool`: performs real-time web searches.
* Lets the agent decide which tools to use!
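
The wiring described above might look roughly like the sketch below. It assumes the `smolagents` package is installed. `EchoTool` is a hypothetical stand-in for the project's own tools (their code is not shown in this README), and `build_agent` is an illustrative helper name rather than a function taken from `main_agent.py`.

```python
def build_agent():
    """Create a CodeAgent with one placeholder tool (a sketch, not the project's code)."""
    # Imported lazily so the module can be loaded without smolagents installed.
    from smolagents import CodeAgent, InferenceClientModel, Tool

    class EchoTool(Tool):
        # Hypothetical stand-in for PDFQATool, MindMapTool, and the other tools.
        name = "echo_tool"
        description = "Returns the text it receives unchanged."
        inputs = {"text": {"type": "string", "description": "Text to echo back."}}
        output_type = "string"

        def forward(self, text: str) -> str:
            return text

    # Uses the library's default hosted model; this typically needs a
    # Hugging Face token to be configured at run time.
    model = InferenceClientModel()
    return CodeAgent(tools=[EchoTool()], model=model)
```

With real tools in the list, `build_agent().run("Summarize the uploaded paper.")` would let the agent pick which tool to call based on each tool's `description`.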

---

### 2️⃣ Tools

| Tool name             | File                     | Purpose                                                            |
| --------------------- | ------------------------ | ------------------------------------------------------------------ |
| **PDFQATool**         | `pdf_qa_tool.py`         | Answers questions about scientific PDFs.                           |
| **MindMapTool**       | `mind_map_tool.py`       | Converts concepts into mind maps for clearer understanding.        |
| **DataGraphTool**     | `data_graph_tool.py`     | Creates data visualizations to illustrate key points.              |
| **ImageAnalysisTool** | `image_analysis_tool.py` | Extracts text from images using OCR (Tesseract).                   |
| **WebSearchTool**     | `web_search_tool.py`     | Performs web searches using a **Modal-deployed FastAPI** endpoint. |
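
As an illustration of how a tool such as `WebSearchTool` could call a remote endpoint over HTTP, here is a minimal standard-library sketch. The endpoint URL, the `q` and `max_results` parameter names, and the JSON response format are all assumptions; the actual contract is defined in `modal_web_search_app.py`.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint: replace with the URL Modal reports for the deployed app.
SEARCH_ENDPOINT = "https://example--web-search.modal.run/search"


def build_search_url(query: str, max_results: int = 5, endpoint: str = SEARCH_ENDPOINT) -> str:
    """Encode the query as URL parameters (the parameter names are assumptions)."""
    params = urlencode({"q": query, "max_results": max_results})
    return f"{endpoint}?{params}"


def web_search(query: str, max_results: int = 5, timeout: float = 30.0):
    """Call the remote search service and return its decoded JSON response."""
    with urlopen(build_search_url(query, max_results), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping URL construction separate from the network call makes the request format easy to check without hitting the service.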

---

### 3️⃣ Modal Deployments

Two **FastAPI apps** are deployed on Modal:

* 🔍 **Web search** (`modal_web_search_app.py`)
* 🖼️ **Image analysis** (`modal_image_analyzer_app.py`)

These services do the heavy lifting outside the main app, which calls them over HTTP.

---

### 4️⃣ User Interface: `app.py`

* Built with **Gradio**.
* Lets users:
  * Upload PDFs (`pdf_upload`).
  * Upload images (`image_upload`).
  * Enter natural language questions (`user_input`).
* The agent then decides whether to:
  * Use a tool directly (e.g., PDF analysis, mind map creation).
  * Use Modal services (e.g., web search, image analysis).
  * Or combine multiple tools for complex tasks!
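
A rough sketch of how the three inputs above could be combined and handed to the agent is shown below. It assumes the `gradio` package is installed. `build_task`, `build_ui`, and `run_agent` are hypothetical names used for illustration, and the real `app.py` may pass uploads to the agent differently.

```python
from typing import Optional


def build_task(user_input: str, pdf_path: Optional[str] = None, image_path: Optional[str] = None) -> str:
    """Combine the question and any uploaded file paths into one task string."""
    parts = [user_input.strip()]
    if pdf_path:
        parts.append(f"Uploaded PDF: {pdf_path}")
    if image_path:
        parts.append(f"Uploaded image: {image_path}")
    return "\n".join(parts)


def build_ui(run_agent):
    """Build a Gradio interface; `run_agent` maps a task string to an answer."""
    import gradio as gr  # imported lazily; requires the gradio package

    def respond(pdf_upload, image_upload, user_input):
        # Both upload components are expected to pass file paths (or None).
        return run_agent(build_task(user_input, pdf_upload, image_upload))

    return gr.Interface(
        fn=respond,
        inputs=[
            gr.File(label="PDF", file_types=[".pdf"]),
            gr.Image(label="Image", type="filepath"),
            gr.Textbox(label="Question"),
        ],
        outputs=gr.Textbox(label="Answer"),
    )
```

With an agent object in hand, something like `build_ui(agent.run).launch()` would start the interface locally.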

---

## ⚙️ Installation & Usage

1️⃣ **Install dependencies** (adjust as needed for your local dev environment):

```bash
pip install -r requirements.txt
```

2️⃣ **Set up the Modal deployments** (typically with `modal deploy <file>`, assuming the Modal CLI is installed and authenticated) for:

* `modal_web_search_app.py`
* `modal_image_analyzer_app.py`

3️⃣ **Launch the app**:

```bash
python app.py
```

Or deploy it to **Hugging Face Spaces** for a live demo 🤗

---

## 💡 Example prompts

✅ “Summarize the uploaded paper.”

✅ “Generate a mind map of the main contributions.”

✅ “Plot a graph of the data trends discussed.”

✅ “Analyze this image (uploaded) for any text.”

✅ “Web search for related works on this topic.”

✅ “Generate code to implement the method in the paper.”

---

## 🛠️ Custom code generation

I also plan to add a **Code Implementation Tool**, which would let the agent generate Python code snippets that clarify methods or experiments described in the papers.

---

## 📜 License

Open-source under the [MIT License](LICENSE).

---

## ✨ Acknowledgments

* Hugging Face
* Modal
* LangChain
* SmolAgents
* DuckDuckGo (for web search)
* PDFMiner, PyMuPDF (for PDF parsing)