title: Pdf Chat
emoji: ⚡
colorFrom: yellow
colorTo: green
sdk: gradio
sdk_version: 5.31.0
app_file: app.py
pinned: false
📄 Chat with PDF – Gemini AI Agent
A conversational AI agent that lets you upload any PDF and chat with its contents!
Built with smolagents, Google Gemini, pypdf, and Gradio.
🌐 Try it Online
Live Demo:
https://huggingface.co/spaces/your-username/your-space-name
🏗️ Tech Stack
Layer | Technology | Purpose |
---|---|---|
LLM Orchestration | smolagents | Agent framework, tool integration |
Language Model | Google Gemini 1.5 Flash | Large Language Model for Q&A |
LLM API Adapter | LiteLLM | Unified LLM API interface |
PDF Processing | pypdf | Extracts text from uploaded PDF files |
Web UI | Gradio | Interactive chat interface |
Environment Mgmt | python-dotenv | Loads environment variables (local dev) |
Deployment | Hugging Face Spaces | Cloud hosting (public demo) |
🚀 Features
- Upload any PDF and extract its text instantly.
- Ask questions about the document and get answers powered by Google Gemini 1.5 Flash.
- Conversational interface using Gradio.
- Runs locally or on Hugging Face Spaces.
🛠️ Installation (Local)
- Clone this repository:
git clone https://github.com/KunalGupta25/chat-with-pdf.git
cd chat-with-pdf
- Install dependencies:
pip install -r requirements.txt
- Set up your Gemini API Key:
- Get your key from Google AI Studio.
- Create a `.env` file in the project root:
GEMINI_API_KEY=your_actual_key_here
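The key from the step above could be read at startup roughly as follows. This is a minimal sketch, assuming the app calls python-dotenv's `load_dotenv()` and reads the `GEMINI_API_KEY` variable; the helper name `get_gemini_api_key` is hypothetical and not taken from `app.py`.

```python
import os

try:
    # python-dotenv is listed in the tech stack; this sketch also runs without it.
    from dotenv import load_dotenv
except ImportError:
    load_dotenv = None


def get_gemini_api_key() -> str:
    """Return GEMINI_API_KEY from the environment (hypothetical helper)."""
    if load_dotenv is not None:
        # Loads a .env file if one is found; variables already set are not overridden.
        load_dotenv()
    key = os.environ.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; add it to .env or to the environment."
        )
    return key
```

Failing early with a clear message tends to be easier to debug than an authentication error raised later by the model call.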
📝 Usage
Local
python app.py
Open http://localhost:7860 in your browser.
Hugging Face Spaces
- Go to your Space: https://huggingface.co/spaces/your-username/your-space-name
- Click "Duplicate Space" to make your own copy, or use it directly if public.
- Add your `GEMINI_API_KEY` as a secret in the Space settings (if you duplicate or deploy privately).
⚙️ Configuration
- Model: Uses `"gemini/gemini-1.5-flash"` by default. If you have access to `"gemini/gemini-1.5-pro"`, change the `model_id` in `app.py`.
- API Key: Must be a Google AI Studio key, not a Vertex AI or GCP key.
- No `api_base` needed for Gemini AI Studio keys.
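The settings above might translate into model construction along these lines. This is a sketch under assumptions: it uses the `LiteLLMModel` class from smolagents with its `model_id` and `api_key` parameters, and the helper names `build_model_kwargs` and `build_model` are hypothetical rather than copied from `app.py`.

```python
DEFAULT_MODEL_ID = "gemini/gemini-1.5-flash"


def build_model_kwargs(api_key: str, model_id: str = DEFAULT_MODEL_ID) -> dict:
    """Collect keyword arguments for the model (hypothetical helper)."""
    # LiteLLM routes the "gemini/" prefix to Google AI Studio; other prefixes
    # (for example "vertex_ai/") expect different credentials.
    if not model_id.startswith("gemini/"):
        raise ValueError(f"Expected a 'gemini/...' model id, got {model_id!r}")
    # No api_base entry: AI Studio keys work with the default endpoint.
    return {"model_id": model_id, "api_key": api_key}


def build_model(api_key: str, model_id: str = DEFAULT_MODEL_ID):
    """Create a smolagents LiteLLMModel from the collected arguments."""
    # Imported lazily so build_model_kwargs stays dependency-free.
    from smolagents import LiteLLMModel

    return LiteLLMModel(**build_model_kwargs(api_key, model_id))
```

Switching to the Pro model would then be a matter of passing `model_id="gemini/gemini-1.5-pro"`.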
🧩 How it Works
- Upload a PDF: The app extracts all text using `pypdf`.
- Ask a question: Your question and the extracted text are sent to the Gemini model via smolagents.
- Get answers: The agent uses Gemini to answer your question, referencing the PDF content.
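The first two steps can be sketched as below. The function names and the prompt wording are assumptions made for illustration; the actual `app.py` may structure this differently. Only `PdfReader`, its `pages` attribute, and `extract_text()` are pypdf calls.

```python
def extract_pdf_text(path: str) -> str:
    """Concatenate the text of every page in the PDF (hypothetical helper)."""
    # Imported lazily so build_prompt can be used without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(path)
    # Pages without a text layer (for example scanned images) yield no text.
    return "\n".join((page.extract_text() or "") for page in reader.pages)


def build_prompt(pdf_text: str, question: str) -> str:
    """Combine the document text and the user's question into one prompt."""
    return (
        "Answer the question using only the document below.\n\n"
        f"Document:\n{pdf_text}\n\n"
        f"Question: {question}"
    )
```

With smolagents, the resulting string could then be passed to an agent's `run()` method, which returns the model's answer.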
🧑‍💻 For Developers
- Add more tools: Use the `@tool` decorator from smolagents to add custom functions.
- Customize the UI: Edit the Gradio blocks in `app.py`.
- Chunking or RAG: For large PDFs, consider splitting text into chunks and using retrieval-augmented generation.
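For the chunking suggestion, a minimal sketch might look like this. It is not part of the project: the function names, chunk sizes, and scoring are assumptions, and the retrieval step uses plain keyword overlap as a stand-in for the embedding-based search a real RAG pipeline would normally use.

```python
import re


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character chunks that overlap slightly."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


def _words(s: str) -> set[str]:
    """Lowercased set of word tokens in s."""
    return set(re.findall(r"\w+", s.lower()))


def top_chunks(chunks: list[str], question: str, k: int = 3) -> list[str]:
    """Return the k chunks sharing the most words with the question."""
    q = _words(question)
    ranked = sorted(chunks, key=lambda c: len(q & _words(c)), reverse=True)
    return ranked[:k]
```

Only the selected chunks would then be placed in the prompt, which keeps requests for large PDFs within the model's context window.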
📜 License
MIT License.
See LICENSE for details.
🙌 Acknowledgments
Enjoy chatting with your PDFs!
For questions or contributions, open an issue or pull request.