---
title: GraphiqueAcademia
emoji: 🐠
colorFrom: purple
colorTo: green
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
pinned: false
license: mit
tags:
  - agent-demo-track
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

Sorry, this is still a WIP 🛠️⚙️
# 🚀 Scientific Paper Assistant

This project is an intelligent AI agent for scientific papers, built with Gradio, Modal, and the SmolAgents framework. It uses a set of specialized tools to analyze PDFs of AI papers (or any educational PDF), create mind maps, generate data visualizations, perform web searches, and analyze images.
🌟 Live demo: [link]
## 🧠 What does it do?

👉 Upload a PDF or image
👉 Ask a question (e.g., "Summarize this section", "Create a mind map", "Generate Python code to implement this method", etc.)
👉 The agent dynamically chooses which specialized tool(s) to use to generate accurate, insightful answers!
## 📦 Key components
### 1️⃣ Core logic: `main_agent.py`

- Uses SmolAgents with a `CodeAgent` and a Hugging Face `InferenceClientModel`.
- Loads custom tools:
  - `PDFQATool`: answers questions about PDF content.
  - `MindMapTool`: generates mind maps from text.
  - `DataGraphTool`: creates data graphs and visualizations.
  - `ImageAnalysisTool`: extracts text from images (OCR).
  - `WebSearchTool`: performs real-time web searches.
- Lets the agent decide which tools to use!
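For orientation, here is a minimal sketch of how `main_agent.py` might wire these pieces together. It assumes each tool module exports the class listed in the Tools table below, and the `model_id` is only an illustrative choice, not necessarily the one this Space uses:

```python
# Hypothetical wiring sketch; the real main_agent.py may differ.
from smolagents import CodeAgent, InferenceClientModel

from pdf_qa_tool import PDFQATool
from mind_map_tool import MindMapTool
from data_graph_tool import DataGraphTool
from image_analysis_tool import ImageAnalysisTool
from web_search_tool import WebSearchTool

# Any model served via Hugging Face inference works here; this id is illustrative.
model = InferenceClientModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct")

agent = CodeAgent(
    tools=[
        PDFQATool(),
        MindMapTool(),
        DataGraphTool(),
        ImageAnalysisTool(),
        WebSearchTool(),
    ],
    model=model,
)

# The agent picks the tool(s) it needs for each request on its own.
print(agent.run("Summarize the main contributions of the uploaded paper."))
```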
### 2️⃣ Tools

| Tool name | File | Purpose |
|---|---|---|
| `PDFQATool` | `pdf_qa_tool.py` | Answers questions about scientific PDFs. |
| `MindMapTool` | `mind_map_tool.py` | Converts concepts into mind maps for clearer understanding. |
| `DataGraphTool` | `data_graph_tool.py` | Creates data visualizations to illustrate key points. |
| `ImageAnalysisTool` | `image_analysis_tool.py` | Extracts text from images using OCR (Tesseract). |
| `WebSearchTool` | `web_search_tool.py` | Performs web searches using a Modal-deployed FastAPI endpoint. |
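To give a feel for the tool interface, here is a simplified sketch of what an OCR tool like `ImageAnalysisTool` could look like as a SmolAgents `Tool`. In this Space the heavy OCR work is actually delegated to the Modal image-analysis service described below, so treat this local-Tesseract version as illustrative only:

```python
# Simplified, local-only sketch of an OCR tool; the real image_analysis_tool.py
# in this repo calls a Modal endpoint instead and may use a different schema.
import pytesseract
from PIL import Image
from smolagents import Tool


class ImageAnalysisTool(Tool):
    name = "image_analysis"
    description = "Extracts text from an image file using Tesseract OCR."
    inputs = {
        "image_path": {
            "type": "string",
            "description": "Path to the image file to analyze.",
        }
    }
    output_type = "string"

    def forward(self, image_path: str) -> str:
        # Requires the Tesseract binary to be installed on the machine.
        return pytesseract.image_to_string(Image.open(image_path))
```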
### 3️⃣ Modal Deployments

Two FastAPI apps deployed on Modal:

- 🔍 Web search (`modal_web_search_app.py`)
- 🖼️ Image analysis (`modal_image_analyzer_app.py`)

These APIs handle the heavy lifting outside of the main app, seamlessly integrated via HTTP requests.
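As a rough illustration of this pattern, a Modal-hosted FastAPI search endpoint might look like the sketch below. The app name, route path, parameters, and DuckDuckGo usage are assumptions; the deployed `modal_web_search_app.py` may differ:

```python
# Illustrative Modal + FastAPI sketch (not the exact deployed code).
import modal
from fastapi import FastAPI

web_app = FastAPI()
app = modal.App("web-search")
image = modal.Image.debian_slim().pip_install("fastapi[standard]", "duckduckgo-search")


@web_app.get("/search")
def search(query: str, max_results: int = 5):
    # Imported here so the dependency only needs to exist inside the Modal image.
    from duckduckgo_search import DDGS

    return {"results": DDGS().text(query, max_results=max_results)}


@app.function(image=image)
@modal.asgi_app()
def fastapi_app():
    # Modal serves this FastAPI app at a public *.modal.run URL once deployed.
    return web_app
```

The Gradio app then only needs a plain HTTP request (e.g. with `requests`) against the deployed URL.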
### 4️⃣ User Interface: `app.py`

Built with Gradio. Lets users:

- Upload PDFs (`pdf_upload`).
- Upload images (`image_upload`).
- Enter natural language questions (`user_input`).
The agent logically decides whether to:
- Use a tool directly (e.g., PDF analysis, mind map creation).
- Use Modal services (e.g., web search, image analysis).
- Or combine multiple tools for complex tasks!
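A bare-bones version of that UI could look like the following. The component names mirror the README (`pdf_upload`, `image_upload`, `user_input`), while `run_agent` is a hypothetical handler standing in for the real agent call:

```python
# Minimal Gradio sketch; the actual app.py wires these inputs to the agent.
import gradio as gr


def run_agent(pdf_file, image_file, question):
    # Hypothetical handler: the real app forwards the files and question to the
    # SmolAgents agent, which picks the appropriate tool(s) and returns an answer.
    return f"(agent answer for: {question})"


with gr.Blocks(title="Scientific Paper Assistant") as demo:
    pdf_upload = gr.File(label="Upload a PDF", file_types=[".pdf"])
    image_upload = gr.Image(label="Upload an image", type="filepath")
    user_input = gr.Textbox(label="Ask a question")
    answer = gr.Markdown()
    gr.Button("Ask").click(
        run_agent, inputs=[pdf_upload, image_upload, user_input], outputs=answer
    )

if __name__ == "__main__":
    demo.launch()
```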
## ⚙️ Installation & Usage

1️⃣ Install dependencies (adjust as needed for your local dev environment):

`pip install -r requirements.txt`
2️⃣ Set up Modal deployments for:

- `modal_web_search_app.py`
- `modal_image_analyzer_app.py`
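Both apps are typically pushed with the Modal CLI after authenticating (`modal setup`), roughly:

```bash
modal deploy modal_web_search_app.py
modal deploy modal_image_analyzer_app.py
```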
3️⃣ Launch the app:

`python app.py`

Or deploy it to Hugging Face Spaces for a live demo 🤗
## 💡 Example prompts

✅ “Summarize the uploaded paper.”
✅ “Generate a mind map of the main contributions.”
✅ “Plot a graph of the data trends discussed.”
✅ “Analyze this image (uploaded) for any text.”
✅ “Web search for related works on this topic.”
✅ “Generate code to implement the method in the paper.”
## 🛠️ Custom code generation

I also have plans to add a Code Implementation Tool. This will allow the agent to generate Python code snippets to clarify methods or experiments described in the papers!
## 📜 License

Open-source under the MIT License.
## ✨ Acknowledgments
- Hugging Face
- Modal
- LangChain
- SmolAgents
- DuckDuckGo (for web search)
- PDFMiner, PyMuPDF (for PDF parsing)