# 🚀 Speed-Optimized Summarization with DistilBART
The original BART model is large (~1.6GB) and slow, so I replaced it with a much faster, lighter model and tuned the performance settings.
---
## 🚀 Major Speed Optimizations Applied
### 1. Faster Model
- **Switched from** `facebook/bart-large-cnn` (**~1.6GB**)
- **To** `sshleifer/distilbart-cnn-12-6` (**~400MB**)
- 🔥 **~4x smaller download** = much faster loading and inference (see the one-line swap below)
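In code, this is a one-line swap. A minimal sketch using the `transformers` pipeline API (the surrounding setup is illustrative, not the app's actual source):
```python
from transformers import pipeline

# Before: the large, slow model (~1.6GB download)
# summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

# After: the distilled model; same API, much smaller download
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

print(summarizer("Some long input text...", max_length=60, min_length=20)[0]["summary_text"])
```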
### 2. Processing Optimizations
- **Smaller chunks:** 512 words vs 900 (faster processing; a chunking sketch follows this list)
- **Limited chunks:** Max 5 chunks processed (prevents hanging on huge docs)
- **Faster tokenization:** Word count instead of full tokenization for chunking
- **Reduced beam search:** 2 beams instead of 4 (2x faster)
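A minimal sketch of this word-count chunking, assuming plain text already extracted from the PDF; the names `CHUNK_WORDS` and `MAX_CHUNKS` are hypothetical, while the values come from the list above:
```python
CHUNK_WORDS = 512  # words per chunk (vs. 900 before)
MAX_CHUNKS = 5     # cap to avoid hanging on huge documents

def chunk_text(text: str) -> list[str]:
    """Split text into word-count chunks; fast because it avoids full tokenization."""
    words = text.split()
    chunks = [
        " ".join(words[i : i + CHUNK_WORDS])
        for i in range(0, len(words), CHUNK_WORDS)
    ]
    return chunks[:MAX_CHUNKS]
```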
### 3. Smart Summarization
- **Shorter summaries:** Reduced max lengths across all modes
- **Skip final summary:** For documents with ≀2 chunks (saves time)
- **Early stopping:** Enabled for faster convergence
- **Progress tracking:** Shows which chunk is being processed (see the sketch after this list)
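Putting those settings together, the per-chunk loop might look like the sketch below. The `MODE_LENGTHS` mapping and the function name are hypothetical; the beam count, early stopping, skip rule, and progress output follow the list above, and `summarizer` is assumed to be a `transformers` summarization pipeline:
```python
# Hypothetical per-mode length caps; the app's actual values may differ
MODE_LENGTHS = {"short": 60, "medium": 100, "detailed": 150}

def summarize_chunks(summarizer, chunks, mode="medium"):
    max_len = MODE_LENGTHS[mode]
    partial = []
    for i, chunk in enumerate(chunks, start=1):
        print(f"Summarizing chunk {i}/{len(chunks)}...")  # progress tracking
        result = summarizer(
            chunk,
            max_length=max_len,
            min_length=20,
            num_beams=2,          # reduced from 4 for ~2x faster generation
            early_stopping=True,  # stop once all beams are finished
        )
        partial.append(result[0]["summary_text"])
    combined = " ".join(partial)
    if len(chunks) <= 2:
        return combined  # skip the final "summary of summaries" pass
    return summarizer(combined, max_length=max_len, min_length=20)[0]["summary_text"]
```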
### 4. Memory & Performance
- **Float16 precision:** Used when GPU is available (faster inference)
- **Optimized pipeline:** Better model loading with fallback
- **`optimum` library added:** For additional speed improvements (a loading sketch with a CPU fallback follows this list)
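A sketch of loading with fp16 on GPU and a CPU fallback, under those assumptions (the function name is mine, and the `optimum` integration itself is not shown):
```python
import torch
from transformers import pipeline

MODEL_NAME = "sshleifer/distilbart-cnn-12-6"

def load_summarizer():
    """Prefer fp16 on GPU; fall back to fp32 on CPU if anything fails."""
    try:
        if torch.cuda.is_available():
            # Half precision roughly halves memory use and speeds up GPU inference
            return pipeline(
                "summarization",
                model=MODEL_NAME,
                device=0,
                torch_dtype=torch.float16,
            )
    except Exception as exc:
        print(f"GPU load failed ({exc}); falling back to CPU")
    return pipeline("summarization", model=MODEL_NAME, device=-1)
```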
---
## ⚡ Expected Speed Improvements
| Task | Before | After |
|-------------------|----------------------|------------------------------|
| Model loading     | 30+ seconds          | ~10 seconds                  |
| PDF processing | Minutes | ~5–15 seconds |
| Memory usage | ~1.6GB | ~400MB |
| Overall speed     | Slow                 | 🚀 5–10x faster              |
---
## 🧬 What is DistilBART?
**DistilBART** is a **compressed version of the BART model** designed to be **lighter and faster** while retaining most of BART's performance. It's the result of **model distillation**, where a smaller model (the *student*) learns to mimic a larger one (the *teacher*), in this case `facebook/bart-large`. A toy sketch of a distillation objective follows the table below.
| Attribute | Description |
|------------------|---------------------------------------------------------------------|
| **Full Name** | Distilled BART |
| **Base Model** | `facebook/bart-large` |
| **Distilled By** | Hugging Face 🤗 |
| **Purpose** | Faster inference and smaller footprint for tasks like summarization |
| **Architecture** | Encoder-decoder Transformer, like BART, but with fewer decoder layers |
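To make the student/teacher idea concrete, here is a generic knowledge-distillation loss in PyTorch. This is illustrative only: the function name, temperature `T`, and blend weight `alpha` are arbitrary choices, and the actual DistilBART training recipe differs in its details.
```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Toy distillation objective over logits of shape (N, num_classes):
    blend hard-label cross-entropy with a KL term pulling the student
    toward the teacher's softened output distribution."""
    ce = F.cross_entropy(student_logits, labels)
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale gradients for the temperature
    return alpha * ce + (1 - alpha) * kd
```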
---
## βš™οΈ Key Differences: BART vs DistilBART
| Feature | BART (Large) | DistilBART |
|----------------|--------------|------------------------|
| Encoder Layers | 12 | 6 |
| Decoder Layers | 12 | 6 |
| Parameters | ~406M | ~222M |
| Model Size | ~1.6GB | ~400MB (~55% smaller) |
| Speed | Slower | ~2x faster |
| Performance | Very high | Slight drop (~1–2%) |
---
## 🎯 Use Cases
- ✅ **Text summarization** (primary use case)
- 🌐 **Other sequence-to-sequence tasks** (with task-specific fine-tuning)
- ⚡ Ideal for **edge devices** or **real-time systems** where speed and size matter
---
## 🧪 Example: Summarization with DistilBART
You can easily use DistilBART with Hugging Face Transformers:
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
# Load pretrained DistilBART model
tokenizer = AutoTokenizer.from_pretrained("sshleifer/distilbart-cnn-12-6")
model = AutoModelForSeq2SeqLM.from_pretrained("sshleifer/distilbart-cnn-12-6")
# Input text
ARTICLE = "The Indian Space Research Organisation (ISRO) launched a new satellite today from the Satish Dhawan Space Centre..."
# Tokenize and summarize
inputs = tokenizer([ARTICLE], max_length=1024, return_tensors="pt", truncation=True)
summary_ids = model.generate(
    inputs["input_ids"],
    max_length=150,
    min_length=40,
    length_penalty=2.0,
    num_beams=4,
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```
---
## 📦 Available Variants
| Model Name | Task | Description |
| --------------------------------- | ---------------------------- | ---------------------------------------- |
| `sshleifer/distilbart-cnn-12-6` | Summarization | Distilled from `facebook/bart-large-cnn` |
| `philschmid/distilbart-xsum-12-6` | Summarization (XSUM dataset) | Short, abstractive summaries |
🔎 [Find more on Hugging Face Model Hub](https://huggingface.co/models?search=distilbart)
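Since every variant exposes the same API, switching is a one-line change. For example, with the XSUM variant (the input text is a placeholder):
```python
from transformers import pipeline

# Same API, different checkpoint: the XSUM variant favors short,
# one-sentence abstractive summaries
summarizer = pipeline("summarization", model="philschmid/distilbart-xsum-12-6")

text = "The Indian Space Research Organisation (ISRO) launched a new satellite today..."
print(summarizer(text, max_length=60, min_length=10)[0]["summary_text"])
```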
---
## 📘 Summary
* 🧠 **DistilBART** is a distilled, faster version of **BART**
* 🧩 Ideal for summarization tasks with lower memory and latency requirements
* 💡 Trained using **knowledge distillation** from `facebook/bart-large`
* ⚙️ Works well in apps needing faster performance without significant loss in quality
---
✅ **Try it now: it should be significantly faster!** 🏃‍♂️💨
Thank You