---
title: GraphiqueAcademia
emoji: 🐠
colorFrom: purple
colorTo: green
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
pinned: false
license: mit
tags:
  - agent-demo-track
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

Sorry, this is still a WIP 🛠️⚙️



# 🚀 Scientific Paper Assistant

This project is an intelligent AI agent for scientific papers, built with Gradio, Modal, and the SmolAgents framework. It uses a set of specialized tools to analyze PDFs of AI papers (or any educational PDF), create mind maps, generate data visualizations, perform web searches, and analyze images.

🌟 Live demo: [link ]


## 🧠 What does it do?

👉 Upload a PDF or image

👉 Ask a question (e.g., "Summarize this section", "Create a mind map", "Generate Python code to implement this method")

👉 The agent dynamically chooses which specialized tool(s) to use to generate accurate, insightful answers!


## 📦 Key components

### 1️⃣ Core logic: `main_agent.py`

- Uses SmolAgents with a `CodeAgent` and a Hugging Face `InferenceClientModel`.
- Loads custom tools:
  - `PDFQATool`: answers questions about PDF content.
  - `MindMapTool`: generates mind maps from text.
  - `DataGraphTool`: creates data graphs and visualizations.
  - `ImageAnalysisTool`: extracts text from images (OCR).
  - `WebSearchTool`: performs real-time web searches.
- Lets the agent decide which tools to use (see the sketch below)!

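A minimal sketch of how this wiring might look (the model ID and tool import paths are assumptions, not the repo's exact code):

```python
# Hypothetical sketch of main_agent.py; tool module paths and the model ID are assumptions.
from smolagents import CodeAgent, InferenceClientModel

from pdf_qa_tool import PDFQATool
from mind_map_tool import MindMapTool
from data_graph_tool import DataGraphTool
from image_analysis_tool import ImageAnalysisTool
from web_search_tool import WebSearchTool

# Any chat model served by the Hugging Face Inference API can back the agent.
model = InferenceClientModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct")

agent = CodeAgent(
    tools=[
        PDFQATool(),
        MindMapTool(),
        DataGraphTool(),
        ImageAnalysisTool(),
        WebSearchTool(),
    ],
    model=model,
)

# The CodeAgent writes and executes Python that calls whichever tools it needs.
print(agent.run("Summarize the uploaded paper and draw a mind map of its contributions."))
```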

### 2️⃣ Tools

| Tool name | File | Purpose |
|---|---|---|
| `PDFQATool` | `pdf_qa_tool.py` | Answers questions about scientific PDFs. |
| `MindMapTool` | `mind_map_tool.py` | Converts concepts into mind maps for clearer understanding. |
| `DataGraphTool` | `data_graph_tool.py` | Creates data visualizations to illustrate key points. |
| `ImageAnalysisTool` | `image_analysis_tool.py` | Extracts text from images using OCR (Tesseract). |
| `WebSearchTool` | `web_search_tool.py` | Performs web searches via a Modal-deployed FastAPI endpoint. |
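
Each tool implements the SmolAgents `Tool` interface. Purely as an illustration (the repo's real implementations may differ), a PDF question-answering tool could be as small as:

```python
# Illustrative sketch only; the actual pdf_qa_tool.py may be implemented differently.
from pdfminer.high_level import extract_text  # PDFMiner is among the acknowledged parsers
from smolagents import Tool


class PDFQATool(Tool):
    name = "pdf_qa"
    description = "Returns the full text of a PDF so questions about it can be answered."
    inputs = {
        "pdf_path": {"type": "string", "description": "Path to the PDF file."},
    }
    output_type = "string"

    def forward(self, pdf_path: str) -> str:
        # Hand the raw text back to the agent, which does the actual question answering.
        return extract_text(pdf_path)
```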

### 3️⃣ Modal Deployments

Two FastAPI apps are deployed on Modal:

- 🔍 Web search (`modal_web_search_app.py`)
- 🖼️ Image analysis (`modal_image_analyzer_app.py`)

These APIs handle the heavy lifting outside the main app and are integrated into the agent's tools via simple HTTP requests.
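
For instance, the web search app could be sketched roughly as follows (the endpoint name and parameters are assumptions; DuckDuckGo is the search backend listed in the acknowledgments):

```python
# Rough sketch of modal_web_search_app.py; names and parameters are assumptions.
import modal

image = modal.Image.debian_slim().pip_install("duckduckgo-search", "fastapi[standard]")
app = modal.App("web-search", image=image)


@app.function()
@modal.fastapi_endpoint(method="GET")  # older Modal SDKs call this modal.web_endpoint
def search(query: str, max_results: int = 5):
    from duckduckgo_search import DDGS

    # Return a JSON-serializable list of results for WebSearchTool to consume over HTTP.
    with DDGS() as ddgs:
        return list(ddgs.text(query, max_results=max_results))
```

On the client side, `web_search_tool.py` then only needs a plain HTTP GET (e.g., with `requests`) against the deployed URL.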


### 4️⃣ User Interface: `app.py`

- Built with Gradio (a simplified sketch of the wiring is shown below).
- Lets users:
  - Upload PDFs (`pdf_upload`).
  - Upload images (`image_upload`).
  - Enter natural-language questions (`user_input`).
- The agent then decides whether to:
  - Use a tool directly (e.g., PDF analysis, mind map creation).
  - Call the Modal services (e.g., web search, image analysis).
  - Combine multiple tools for complex tasks!
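
A simplified sketch of that wiring (the handler and component options are assumptions, not the actual app.py):

```python
# Simplified sketch of app.py; component options and the handler are assumptions.
import gradio as gr

from main_agent import agent  # assumes main_agent.py exposes the configured agent


def ask_agent(pdf_upload, image_upload, user_input):
    # Pass the uploaded file paths along with the question; the agent picks the tools.
    prompt = user_input
    if pdf_upload:
        prompt += f"\nPDF path: {pdf_upload}"
    if image_upload:
        prompt += f"\nImage path: {image_upload}"
    return agent.run(prompt)


with gr.Blocks(title="GraphiqueAcademia") as demo:
    pdf_upload = gr.File(label="Upload a PDF", file_types=[".pdf"])
    image_upload = gr.Image(label="Upload an image", type="filepath")
    user_input = gr.Textbox(label="Ask a question")
    answer = gr.Markdown(label="Answer")
    gr.Button("Ask").click(ask_agent, [pdf_upload, image_upload, user_input], answer)

if __name__ == "__main__":
    demo.launch()
```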

## ⚙️ Installation & Usage

1️⃣ Install dependencies (adjust as needed for your local dev environment):

pip install -r requirements.txt

2️⃣ Set up the Modal deployments (see the commands below) for:

- `modal_web_search_app.py`
- `modal_image_analyzer_app.py`
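
Assuming the Modal CLI is installed and authenticated, each app can be deployed with `modal deploy`; Modal prints the public endpoint URLs that the corresponding tools call:

modal deploy modal_web_search_app.py

modal deploy modal_image_analyzer_app.py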

3️⃣ Launch the app:

python app.py

Or deploy it to Hugging Face Spaces for a live demo 🤗


## 💡 Example prompts

✅ “Summarize the uploaded paper.”

✅ “Generate a mind map of the main contributions.”

✅ “Plot a graph of the data trends discussed.”

✅ “Analyze this image (uploaded) for any text.”

✅ “Web search for related works on this topic.”

✅ “Generate code to implement the method in the paper.”


## 🛠️ Custom code generation

I also have plans to add a Code Implementation Tool. This will allow the agent to generate Python code snippets to clarify methods or experiments described in the papers!


## 📜 License

Open-source under the MIT License.


## ✨ Acknowledgments

- Hugging Face
- Modal
- LangChain
- SmolAgents
- DuckDuckGo (for web search)
- PDFMiner, PyMuPDF (for PDF parsing)