---
title: Pdf Chat
emoji: 
colorFrom: yellow
colorTo: green
sdk: gradio
sdk_version: 5.31.0
app_file: app.py
pinned: false
---

# 📄 Chat with PDF – Gemini AI Agent

A conversational AI agent that lets you upload any PDF and chat with its contents!  
Built with [smolagents](https://github.com/huggingface/smolagents), [Google Gemini](https://aistudio.google.com/), [pypdf](https://pypdf.readthedocs.io/), and [Gradio](https://gradio.app/).

---

## 🌐 Try it Online

> **Live Demo:**  
> [https://huggingface.co/spaces/LazyHuman/pdf-chat](https://huggingface.co/spaces/LazyHuman/pdf-chat)

---

## 🏗️ Tech Stack

| Layer              | Technology                                | Purpose                                    |
|--------------------|-------------------------------------------|--------------------------------------------|
| LLM Orchestration  | [smolagents](https://github.com/huggingface/smolagents) | Agent framework, tool integration          |
| Language Model     | [Google Gemini 1.5 Flash](https://aistudio.google.com/) | Large Language Model for Q&A               |
| LLM API Adapter    | [LiteLLM](https://github.com/BerriAI/litellm)           | Unified LLM API interface                  |
| PDF Processing     | [pypdf](https://pypdf.readthedocs.io/)                  | Extracts text from uploaded PDF files      |
| Web UI             | [Gradio](https://gradio.app/)                           | Interactive chat interface                 |
| Environment Mgmt   | [python-dotenv](https://pypi.org/project/python-dotenv/) | Loads environment variables (local dev)    |
| Deployment         | [Hugging Face Spaces](https://huggingface.co/spaces)    | Cloud hosting (public demo)                |

---

## 🚀 Features

- **Upload any PDF** and extract its text instantly.
- **Ask questions** about the document and get answers powered by Google Gemini (1.5 Flash).
- **Conversational interface** using Gradio.
- **Runs locally or on Hugging Face Spaces**.

---

## 🛠️ Installation (Local)

1. **Clone this repository:**
   ```bash
   git clone https://github.com/KunalGupta25/chat-with-pdf.git
   cd chat-with-pdf
   ```
2. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```
3. **Set up your Gemini API Key:**
   - Get your key from [Google AI Studio](https://aistudio.google.com/app/apikey).
   - Create a `.env` file in the project root:
     ```
     GEMINI_API_KEY=your_actual_key_here
     ```
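
For reference, here is a minimal sketch of how the key could be read at startup. It assumes the app relies on python-dotenv's `load_dotenv()` and on the `GEMINI_API_KEY` variable named above. The helper name `load_api_key` is hypothetical and not taken from this repository.

```python
import os

try:
    # python-dotenv is listed in the tech stack; fall back gracefully if it is absent.
    from dotenv import load_dotenv
except ImportError:
    load_dotenv = None


def load_api_key(var_name: str = "GEMINI_API_KEY") -> str:
    """Return the API key from the environment (hypothetical helper)."""
    if load_dotenv is not None:
        # Looks for a .env file and loads it; variables already set are kept.
        load_dotenv()
    key = os.getenv(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; add it to .env or export it")
    return key
```

Failing early with a clear message tends to be easier to debug than an authentication error from the first model call.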

---

## 📝 Usage

### **Local**
```bash
python app.py
```
Open [http://localhost:7860](http://localhost:7860) in your browser.

### **Hugging Face Spaces**

- Go to the Space: [https://huggingface.co/spaces/LazyHuman/pdf-chat](https://huggingface.co/spaces/LazyHuman/pdf-chat)
- Click **"Duplicate Space"** to make your own copy, or use it directly if public.
- Add your `GEMINI_API_KEY` as a secret in the Space settings (if you duplicate or deploy privately).


---

## ⚙️ Configuration

- **Model:** Uses `"gemini/gemini-1.5-flash"` by default. If you have access to `"gemini/gemini-1.5-pro"`, change the `model_id` in `app.py`.
- **API Key:** Must be a [Google AI Studio](https://aistudio.google.com/app/apikey) key, not a Vertex AI or GCP key.
- **No `api_base` needed** for Gemini AI Studio keys.
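
The settings above can be sketched as follows. This is an illustration rather than the code in `app.py`: it assumes the model is created with smolagents' `LiteLLMModel`, and the helper names `model_settings` and `make_model` are hypothetical.

```python
import os

DEFAULT_MODEL_ID = "gemini/gemini-1.5-flash"


def model_settings(model_id: str = DEFAULT_MODEL_ID) -> dict:
    """Collect keyword arguments for the model (hypothetical helper)."""
    # No api_base entry: per the notes above, AI Studio keys do not need one.
    return {"model_id": model_id, "api_key": os.environ["GEMINI_API_KEY"]}


def make_model(model_id: str = DEFAULT_MODEL_ID):
    """Build the model object; requires smolagents with LiteLLM support installed."""
    # Imported lazily so model_settings() can be used without smolagents.
    from smolagents import LiteLLMModel

    return LiteLLMModel(**model_settings(model_id))
```

Under these assumptions, switching models would be a one-argument change, for example `make_model("gemini/gemini-1.5-pro")`.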

---

## 🧩 How it Works

1. **Upload a PDF**: The app extracts all text using `pypdf`.
2. **Ask a question**: Your question and the extracted text are sent to the Gemini model via smolagents.
3. **Get answers**: The agent uses Gemini to answer your question, referencing the PDF content.
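
A minimal sketch of steps 1 and 2, assuming the text is pulled page by page with pypdf's `PdfReader` and then combined with the question into a single prompt. The function names and the prompt wording are assumptions for illustration; the actual prompt in `app.py` may differ.

```python
def extract_pdf_text(path: str) -> str:
    """Extract text from every page of a PDF with pypdf."""
    from pypdf import PdfReader  # lazy import: only needed when a file is read

    reader = PdfReader(path)
    # Guard against pages that yield no text (for example, scanned images).
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def build_prompt(pdf_text: str, question: str) -> str:
    """Combine document text and the user's question (hypothetical wording)."""
    return (
        "Answer the question using only the document below.\n\n"
        f"Document:\n{pdf_text.strip()}\n\n"
        f"Question: {question.strip()}"
    )
```

The resulting string would then be handed to the agent, for instance through the `run` method of a smolagents agent.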

---

## 🧑‍💻 For Developers

- **Add more tools**: Use the `@tool` decorator from smolagents to add custom functions.
- **Customize the UI**: Edit the Gradio blocks in `app.py`.
- **Chunking or RAG**: For large PDFs, consider splitting text into chunks and using retrieval-augmented generation.
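
As a starting point for the chunking idea above, here is a small standard-library sketch. It splits text into overlapping character windows and ranks them by word overlap with the question. This naive scoring stands in for a real embedding-based retriever and is not part of the app; all names and default sizes are illustrative.

```python
import re


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into character windows that overlap by `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop early enough that the final window is not a pure repeat of the previous one.
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]


def top_chunks(chunks: list[str], question: str, k: int = 3) -> list[str]:
    """Return the k chunks sharing the most distinct words with the question."""
    q_words = set(re.findall(r"\w+", question.lower()))

    def score(chunk: str) -> int:
        return len(q_words & set(re.findall(r"\w+", chunk.lower())))

    # sorted() is stable, so ties keep their original document order.
    return sorted(chunks, key=score, reverse=True)[:k]
```

With something like this in place, only the selected chunks would be sent to the model instead of the full document text.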

---


## 📜 License

MIT License.  
See [LICENSE](LICENSE) for details.

---

## 🙌 Acknowledgments

- [huggingface/smolagents](https://github.com/huggingface/smolagents)
- [Google AI Studio](https://aistudio.google.com/)
- [Gradio](https://gradio.app/)
- [pypdf](https://pypdf.readthedocs.io/)

---

**Enjoy chatting with your PDFs!**  
For questions or contributions, open an issue or pull request.

---