---
title: GraphiqueAcademia
emoji: 🐠
colorFrom: purple
colorTo: green
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
pinned: false
license: mit
tags:
  - agent-demo-track
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

Sorry, this is still a WIP 🛠️⚙️
# 🚀 Scientific Paper Assistant

This project is an intelligent AI agent for scientific papers, built with Gradio, Modal, and the SmolAgents framework. It uses a set of specialized tools to analyze PDFs of AI papers (or any educational PDF), create mind maps, generate data visualizations, perform web searches, and analyze images.
🌟 Live demo: [link]
## 🧠 What does it do?

👉 Upload a PDF or image
👉 Ask a question (e.g., "Summarize this section", "Create a mind map", "Generate Python code to implement this method", etc.)
👉 The agent dynamically chooses which specialized tool(s) to use to generate accurate, insightful answers!
## 📦 Key components
### 1️⃣ Core logic: `main_agent.py`

- Uses SmolAgents with a `CodeAgent` and a Hugging Face `InferenceClientModel`.
- Loads custom tools:
  - `PDFQATool`: answers questions about PDF content.
  - `MindMapTool`: generates mind maps from text.
  - `DataGraphTool`: creates data graphs and visualizations.
  - `ImageAnalysisTool`: extracts text from images (OCR).
  - `WebSearchTool`: performs real-time web searches.
- Lets the agent decide which tools to use!
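For orientation, here is a minimal sketch of how `main_agent.py` might wire these pieces together. It assumes each tool module exports the class listed in the Tools table below, and the `model_id` is only an illustrative choice, not necessarily the one this Space uses:

```python
# Hypothetical wiring sketch; the real main_agent.py may differ.
from smolagents import CodeAgent, InferenceClientModel

from pdf_qa_tool import PDFQATool
from mind_map_tool import MindMapTool
from data_graph_tool import DataGraphTool
from image_analysis_tool import ImageAnalysisTool
from web_search_tool import WebSearchTool

# Any model served via Hugging Face inference works here; this id is illustrative.
model = InferenceClientModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct")

agent = CodeAgent(
    tools=[
        PDFQATool(),
        MindMapTool(),
        DataGraphTool(),
        ImageAnalysisTool(),
        WebSearchTool(),
    ],
    model=model,
)

# The agent picks the tool(s) it needs for each request on its own.
print(agent.run("Summarize the main contributions of the uploaded paper."))
```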
### 2️⃣ Tools

| Tool name | File | Purpose |
|---|---|---|
| `PDFQATool` | `pdf_qa_tool.py` | Answers questions about scientific PDFs. |
| `MindMapTool` | `mind_map_tool.py` | Converts concepts into mind maps for clearer understanding. |
| `DataGraphTool` | `data_graph_tool.py` | Creates data visualizations to illustrate key points. |
| `ImageAnalysisTool` | `image_analysis_tool.py` | Extracts text from images using OCR (Tesseract). |
| `WebSearchTool` | `web_search_tool.py` | Performs web searches using a Modal-deployed FastAPI endpoint. |
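To give a feel for the tool interface, here is a simplified sketch of what an OCR tool like `ImageAnalysisTool` could look like as a SmolAgents `Tool`. In this Space the heavy OCR work is actually delegated to the Modal image-analysis service described below, so treat this local-Tesseract version as illustrative only:

```python
# Simplified, local-only sketch of an OCR tool; the real image_analysis_tool.py
# in this repo calls a Modal endpoint instead and may use a different schema.
import pytesseract
from PIL import Image
from smolagents import Tool


class ImageAnalysisTool(Tool):
    name = "image_analysis"
    description = "Extracts text from an image file using Tesseract OCR."
    inputs = {
        "image_path": {
            "type": "string",
            "description": "Path to the image file to analyze.",
        }
    }
    output_type = "string"

    def forward(self, image_path: str) -> str:
        # Requires the Tesseract binary to be installed on the machine.
        return pytesseract.image_to_string(Image.open(image_path))
```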
### 3️⃣ Modal Deployments

Two FastAPI apps deployed on Modal:

- 🔍 Web search (`modal_web_search_app.py`)
- 🖼️ Image analysis (`modal_image_analyzer_app.py`)

These APIs handle the heavy lifting outside of the main app, seamlessly integrated via HTTP requests.
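As a rough illustration of this pattern, a Modal-hosted FastAPI search endpoint might look like the sketch below. The app name, route path, parameters, and DuckDuckGo usage are assumptions; the deployed `modal_web_search_app.py` may differ:

```python
# Illustrative Modal + FastAPI sketch (not the exact deployed code).
import modal
from fastapi import FastAPI

web_app = FastAPI()
app = modal.App("web-search")
image = modal.Image.debian_slim().pip_install("fastapi[standard]", "duckduckgo-search")


@web_app.get("/search")
def search(query: str, max_results: int = 5):
    # Imported here so the dependency only needs to exist inside the Modal image.
    from duckduckgo_search import DDGS

    return {"results": DDGS().text(query, max_results=max_results)}


@app.function(image=image)
@modal.asgi_app()
def fastapi_app():
    # Modal serves this FastAPI app at a public *.modal.run URL once deployed.
    return web_app
```

The Gradio app then only needs a plain HTTP request (e.g. with `requests`) against the deployed URL.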
### 4️⃣ User Interface: `app.py`

Built with Gradio. Lets users:

- Upload PDFs (`pdf_upload`).
- Upload images (`image_upload`).
- Enter natural language questions (`user_input`).
The agent logically decides whether to:
- Use a tool directly (e.g., PDF analysis, mind map creation).
- Use Modal services (e.g., web search, image analysis).
- Or combine multiple tools for complex tasks!
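A bare-bones version of that UI could look like the following. The component names mirror the README (`pdf_upload`, `image_upload`, `user_input`), while `run_agent` is a hypothetical handler standing in for the real agent call:

```python
# Minimal Gradio sketch; the actual app.py wires these inputs to the agent.
import gradio as gr


def run_agent(pdf_file, image_file, question):
    # Hypothetical handler: the real app forwards the files and question to the
    # SmolAgents agent, which picks the appropriate tool(s) and returns an answer.
    return f"(agent answer for: {question})"


with gr.Blocks(title="Scientific Paper Assistant") as demo:
    pdf_upload = gr.File(label="Upload a PDF", file_types=[".pdf"])
    image_upload = gr.Image(label="Upload an image", type="filepath")
    user_input = gr.Textbox(label="Ask a question")
    answer = gr.Markdown()
    gr.Button("Ask").click(
        run_agent, inputs=[pdf_upload, image_upload, user_input], outputs=answer
    )

if __name__ == "__main__":
    demo.launch()
```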
## ⚙️ Installation & Usage

1️⃣ Install dependencies (adjust as needed for your local dev environment):

`pip install -r requirements.txt`
2️⃣ Set up Modal deployments for:

- `modal_web_search_app.py`
- `modal_image_analyzer_app.py`
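Both apps are typically pushed with the Modal CLI after authenticating (`modal setup`), roughly:

```bash
modal deploy modal_web_search_app.py
modal deploy modal_image_analyzer_app.py
```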
3️⃣ Launch the app:

`python app.py`

Or deploy it to Hugging Face Spaces for a live demo 🤗
## 💡 Example prompts

✅ “Summarize the uploaded paper.”
✅ “Generate a mind map of the main contributions.”
✅ “Plot a graph of the data trends discussed.”
✅ “Analyze this image (uploaded) for any text.”
✅ “Web search for related works on this topic.”
✅ “Generate code to implement the method in the paper.”
## 🛠️ Custom code generation

I also have plans to add a Code Implementation Tool. This will allow the agent to generate Python code snippets to clarify methods or experiments described in the papers!
## 📜 License

Open-source under the MIT License.
## ✨ Acknowledgments
- Hugging Face
- Modal
- LangChain
- SmolAgents
- DuckDuckGo (for web search)
- PDFMiner, PyMuPDF (for PDF parsing)