metadata
title: Pdf Chat
emoji: 
colorFrom: yellow
colorTo: green
sdk: gradio
sdk_version: 5.31.0
app_file: app.py
pinned: false

📄 Chat with PDF – Gemini AI Agent

A conversational AI agent that lets you upload any PDF and chat with its contents!
Built with smolagents, Google Gemini, pypdf, and Gradio.


🌐 Try it Online

Live Demo:
https://huggingface.co/spaces/your-username/your-space-name


🏗️ Tech Stack

| Layer | Technology | Purpose |
|---|---|---|
| LLM Orchestration | smolagents | Agent framework, tool integration |
| Language Model | Google Gemini 1.5 Flash | Large Language Model for Q&A |
| LLM API Adapter | LiteLLM | Unified LLM API interface |
| PDF Processing | pypdf | Extracts text from uploaded PDF files |
| Web UI | Gradio | Interactive chat interface |
| Environment Mgmt | python-dotenv | Loads environment variables (local dev) |
| Deployment | Hugging Face Spaces | Cloud hosting (public demo) |

🚀 Features

  • Upload any PDF and extract its text instantly.
  • Ask questions about the document and get answers powered by Google Gemini (1.5 Flash).
  • Conversational interface using Gradio.
  • Runs locally or on Hugging Face Spaces.

🛠️ Installation (Local)

  1. Clone this repository:
     git clone https://github.com/KunalGupta25/chat-with-pdf.git
     cd chat-with-pdf
  2. Install dependencies:
     pip install -r requirements.txt
  3. Set up your Gemini API key:
     • Get your key from Google AI Studio.
     • Create a .env file in the project root:
       GEMINI_API_KEY=your_actual_key_here
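
To confirm the key is picked up before launching the app, a small check like the one below can help. This is a minimal sketch, not the code in app.py: the helper name get_gemini_key is hypothetical, and it assumes python-dotenv is used only for local development (on Spaces the key would come from the environment directly).

```python
import os

try:
    # python-dotenv reads the .env file; it is only needed for local development.
    from dotenv import load_dotenv
except ImportError:
    load_dotenv = None


def get_gemini_key(env=None):
    """Return the Gemini API key, or raise a clear error if it is missing.

    `env` is any mapping of variable names to values; it defaults to the
    process environment after loading .env (when python-dotenv is installed).
    """
    if env is None:
        if load_dotenv is not None:
            load_dotenv()
        env = os.environ
    key = env.get("GEMINI_API_KEY", "").strip()
    # Treat the README placeholder as "not configured".
    if not key or key == "your_actual_key_here":
        raise RuntimeError(
            "GEMINI_API_KEY is not set; add it to .env or the environment."
        )
    return key
```

Calling get_gemini_key() at startup makes a missing or placeholder key fail early with a readable message instead of surfacing later as an API error.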
    

📝 Usage

Local

python app.py

Open http://localhost:7860 in your browser.

Hugging Face Spaces


⚙️ Configuration

  • Model: Uses "gemini/gemini-1.5-flash" by default. If you have access to "gemini/gemini-1.5-pro", change the model_id in app.py.
  • API Key: Must be a Google AI Studio key, not a Vertex AI or GCP key.
  • API base: No api_base setting is needed when using a Google AI Studio key.
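
The settings above could be wired up roughly as follows. This is a sketch under assumptions: the function names and the PDF_CHAT_MODEL_ID environment variable are hypothetical and not part of app.py; LiteLLMModel is the smolagents wrapper around LiteLLM.

```python
import os

DEFAULT_MODEL_ID = "gemini/gemini-1.5-flash"


def resolve_model_id(env=None):
    """Pick the model id, allowing an override via a hypothetical env variable."""
    env = os.environ if env is None else env
    # PDF_CHAT_MODEL_ID is an illustrative name, not something app.py defines.
    return env.get("PDF_CHAT_MODEL_ID", DEFAULT_MODEL_ID)


def build_model(api_key, model_id=None):
    """Create the LiteLLM-backed model object that smolagents will call."""
    # Imported lazily so resolve_model_id works without smolagents installed.
    from smolagents import LiteLLMModel

    # No api_base is passed: with the "gemini/" prefix, LiteLLM routes the
    # request to the Google AI Studio endpoint on its own.
    return LiteLLMModel(model_id=model_id or resolve_model_id(), api_key=api_key)
```

Switching to "gemini/gemini-1.5-pro" then only requires passing a different model_id (or setting the override variable), with no other changes.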

🧩 How it Works

  1. Upload a PDF: The app extracts all text using pypdf.
  2. Ask a question: Your question and the extracted text are sent to the Gemini model via smolagents.
  3. Get answers: The agent uses Gemini to answer your question, referencing the PDF content.
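
The three steps above can be sketched as plain functions. This is an illustrative outline, not the actual contents of app.py: the function names and the prompt wording are assumptions, and the agent argument stands for any object with a run(prompt) method, such as a smolagents agent.

```python
def extract_pdf_text(path):
    """Step 1: return the text of every page in the PDF, joined by blank lines."""
    # Imported lazily so the prompt helpers below work without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(path)
    # extract_text() can yield an empty result for image-only (scanned) pages.
    return "\n\n".join((page.extract_text() or "") for page in reader.pages)


def build_prompt(pdf_text, question):
    """Step 2: combine the extracted text and the user's question."""
    return (
        "Answer the question using only the PDF content below.\n\n"
        f"PDF content:\n{pdf_text}\n\n"
        f"Question: {question}"
    )


def answer_question(agent, pdf_text, question):
    """Step 3: send the prompt to the agent and return its answer."""
    return agent.run(build_prompt(pdf_text, question))
```

With smolagents, the agent could be created as CodeAgent(tools=[], model=model), where model is a LiteLLMModel configured for Gemini. Because the full document text is placed in the prompt, very long PDFs may exceed the model's context window.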

🧑‍💻 For Developers

  • Add more tools: Use the @tool decorator from smolagents to add custom functions.
  • Customize the UI: Edit the Gradio blocks in app.py.
  • Chunking or RAG: For large PDFs, consider splitting text into chunks and using retrieval-augmented generation.
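
As a starting point for the chunking idea, the sketch below splits text into overlapping chunks and keeps the ones that share the most words with the question. It uses simple keyword overlap as a crude stand-in for embedding-based retrieval; the function names, chunk sizes, and scoring are all illustrative assumptions rather than part of this project.

```python
import re


def chunk_text(text, chunk_size=1000, overlap=200):
    """Split text into character chunks, each overlapping the previous one."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def _words(s):
    """Lowercased set of alphanumeric words in s."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def top_chunks(chunks, question, k=3):
    """Return the k chunks sharing the most words with the question."""
    q = _words(question)
    # sorted() is stable, so chunks with equal scores keep document order.
    ranked = sorted(chunks, key=lambda c: len(q & _words(c)), reverse=True)
    return ranked[:k]
```

Only the selected chunks would then be placed in the prompt instead of the whole document. A real retrieval-augmented setup would typically replace the word-overlap score with embedding similarity.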

📜 License

MIT License.
See LICENSE for details.


🙌 Acknowledgments


Enjoy chatting with your PDFs!
For questions or contributions, open an issue or pull request.