# Speed-Optimized Summarization with DistilBART
The BART model is quite large (~1.6GB) and slow, so I replaced it with a lighter, faster model and tuned the processing settings for better performance.
---
## Major Speed Optimizations Applied
### 1. Faster Model
- **Switched from** `facebook/bart-large-cnn` (**~406M parameters, ~1.6GB**)
- **To** `sshleifer/distilbart-cnn-12-6` (**~306M parameters, ~1.2GB**)
- 🔥 **~25% fewer parameters and half the decoder layers** = faster loading and inference
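Swapping checkpoints is a one-line change. A minimal sketch using the Hugging Face `transformers` summarization pipeline (the sample text is a placeholder and the generation lengths are illustrative):

```python
from transformers import pipeline

# The distilled checkpoint is a drop-in replacement for the large one:
# summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

# Replace with a real article of a few hundred words.
text = "Long document text goes here ..."
result = summarizer(text, max_length=130, min_length=30, do_sample=False)
print(result[0]["summary_text"])
```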
### 2. Processing Optimizations
- **Smaller chunks:** 512 words instead of 900 (faster per-chunk processing)
- **Limited chunks:** at most 5 chunks processed (prevents hanging on huge documents)
- **Faster chunking:** chunks are split by word count instead of full tokenization (see the sketch below)
- **Reduced beam search:** 2 beams instead of 4 (roughly 2x faster generation)
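A sketch of the word-count chunking described above; the 512-word chunk size and the 5-chunk cap mirror the settings listed, and the function name is illustrative:

```python
def chunk_by_words(text: str, chunk_size: int = 512, max_chunks: int = 5) -> list[str]:
    """Split text into fixed-size word chunks and cap the chunk count to bound runtime."""
    words = text.split()  # simple whitespace split: no tokenizer needed at this stage
    chunks = [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]
    return chunks[:max_chunks]  # very large documents are cut off after max_chunks
```

For example, a 3,000-word PDF yields five 512-word chunks and the remainder is dropped, which is where the time savings on huge documents come from.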
### 3. Smart Summarization
- **Shorter summaries:** reduced maximum lengths across all modes
- **Skip the final pass:** documents with ≤2 chunks skip the second summarization pass (saves time; see the sketch below)
- **Early stopping:** enabled so beam search terminates sooner
- **Progress tracking:** shows which chunk is currently being processed
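A sketch of the chunk loop with progress tracking, 2-beam generation, and the shortcut that skips the final pass for documents with ≤2 chunks (the length values are illustrative, not the app's exact settings):

```python
def summarize_document(summarizer, chunks, max_length=120, min_length=30):
    """Summarize each chunk, then optionally compress the partial summaries once more."""
    partial_summaries = []
    for i, chunk in enumerate(chunks, start=1):
        print(f"Summarizing chunk {i}/{len(chunks)} ...")  # progress tracking
        output = summarizer(chunk, max_length=max_length, min_length=min_length,
                            num_beams=2, early_stopping=True)
        partial_summaries.append(output[0]["summary_text"])

    combined = " ".join(partial_summaries)
    if len(chunks) <= 2:
        return combined  # short documents: skip the final summarization pass
    # Longer documents: one extra pass over the concatenated partial summaries.
    final = summarizer(combined, max_length=max_length, min_length=min_length,
                       num_beams=2, early_stopping=True, truncation=True)
    return final[0]["summary_text"]
```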
### 4. Memory & Performance
- **Float16 precision:** used when a GPU is available (faster inference, lower memory; see the sketch below)
- **Optimized pipeline loading:** model loading with a fallback if the preferred checkpoint fails
- **`optimum` library added:** for additional inference speed-ups
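A sketch of the device-aware loading with a fallback; it assumes `torch` is installed and uses float16 only when a CUDA GPU is present:

```python
import torch
from transformers import pipeline

MODEL_NAME = "sshleifer/distilbart-cnn-12-6"

def load_summarizer():
    """Load the summarizer in float16 on GPU when possible, otherwise float32 on CPU."""
    try:
        if torch.cuda.is_available():
            return pipeline("summarization", model=MODEL_NAME,
                            torch_dtype=torch.float16, device=0)
        return pipeline("summarization", model=MODEL_NAME, device=-1)
    except Exception as exc:
        # Fall back to the library's default summarization model if loading fails.
        print(f"Could not load {MODEL_NAME} ({exc}); using the default pipeline.")
        return pipeline("summarization")
```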
---
## ⚡ Expected Speed Improvements
| Task           | Before       | After                              |
|----------------|--------------|------------------------------------|
| Model loading  | ~30+ seconds | ~10 seconds                        |
| PDF processing | Minutes      | ~5-15 seconds                      |
| Memory usage   | ~1.6GB       | ~1.2GB (fp32), ~0.6GB in float16   |
| Overall speed  | Slow         | ~5-10x faster overall              |
---
## 🧬 What is DistilBART?
**DistilBART** is a **compressed version of the BART model** designed to be **lighter and faster** while retaining most of BART's performance. It is the result of **model distillation**, where a smaller model (the *student*) learns from a larger one (the *teacher*), in this case `facebook/bart-large`.
| Attribute        | Description                                                            |
|------------------|------------------------------------------------------------------------|
| **Full Name**    | Distilled BART                                                         |
| **Base Model**   | `facebook/bart-large`                                                  |
| **Distilled By** | Hugging Face 🤗                                                        |
| **Purpose**      | Faster inference and a smaller footprint for tasks like summarization  |
| **Architecture** | Encoder-decoder Transformer, like BART, but with fewer layers          |
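For intuition, here is a generic knowledge-distillation training loss (soft teacher targets blended with the usual cross-entropy). This sketch only illustrates the general student/teacher idea; it is not the exact recipe used to produce the DistilBART checkpoints:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Blend a soft-target KL term (teacher guidance) with hard-label cross-entropy.

    Logits are expected as (num_tokens, vocab_size); labels as (num_tokens,).
    """
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        soft_targets,
        reduction="batchmean",
    ) * (temperature ** 2)
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1 - alpha) * hard_loss
```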
---
## Key Differences: BART vs DistilBART
| Feature        | BART (`bart-large-cnn`) | DistilBART (`distilbart-cnn-12-6`) |
|----------------|-------------------------|------------------------------------|
| Encoder Layers | 12                      | 12                                 |
| Decoder Layers | 12                      | 6                                  |
| Parameters     | ~406M                   | ~306M                              |
| Model Size     | ~1.6GB                  | ~1.2GB (~25% smaller)              |
| Speed          | Slower                  | Up to ~2x faster (varies by variant) |
| Performance    | Very high               | Slight drop (~1-2%)                |
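If you want to verify the speed difference on your own hardware, here is a minimal timing sketch (absolute numbers vary a lot by CPU/GPU, and the first call for each model also pays the download and loading cost):

```python
import time
from transformers import pipeline

# Any article of a few hundred words will do; this repeated sentence is just a stand-in.
TEXT = " ".join(["ISRO launched a new satellite today from the Satish Dhawan Space Centre."] * 40)

for name in ("facebook/bart-large-cnn", "sshleifer/distilbart-cnn-12-6"):
    summarizer = pipeline("summarization", model=name)
    start = time.perf_counter()
    summarizer(TEXT, max_length=120, min_length=30, num_beams=2, truncation=True)
    print(f"{name}: {time.perf_counter() - start:.2f} s")
```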
---
## 🎯 Use Cases
- ✅ **Text summarization** (primary use case)
- **Translation** (possible with the BART architecture, but requires task-specific fine-tuning)
- ⚡ Ideal for **edge devices** or **real-time systems** where speed and size matter
---
## 🧪 Example: Summarization with DistilBART
You can easily use DistilBART with Hugging Face Transformers:
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the pretrained DistilBART checkpoint
tokenizer = AutoTokenizer.from_pretrained("sshleifer/distilbart-cnn-12-6")
model = AutoModelForSeq2SeqLM.from_pretrained("sshleifer/distilbart-cnn-12-6")

# Input text
ARTICLE = "The Indian Space Research Organisation (ISRO) launched a new satellite today from the Satish Dhawan Space Centre..."

# Tokenize and summarize
inputs = tokenizer([ARTICLE], max_length=1024, return_tensors="pt", truncation=True)
summary_ids = model.generate(
    inputs["input_ids"],
    max_length=150,
    min_length=40,
    length_penalty=2.0,
    num_beams=4,
    early_stopping=True,
)

print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```
---
## 📦 Available Variants
| Model Name                        | Task                         | Description                              |
|-----------------------------------|------------------------------|------------------------------------------|
| `sshleifer/distilbart-cnn-12-6`   | Summarization                | Distilled from `facebook/bart-large-cnn` |
| `philschmid/distilbart-xsum-12-6` | Summarization (XSum dataset) | Short, abstractive summaries             |

[Find more on the Hugging Face Model Hub](https://huggingface.co/models?search=distilbart)
---
## Summary
* 🧠 **DistilBART** is a distilled, faster version of **BART**
* 🧩 Ideal for summarization tasks with lower memory and latency requirements
* 💡 Trained using **knowledge distillation** from `facebook/bart-large`
* Works well in apps that need faster performance without a significant loss in quality
---
✅ **Try it now; it should be significantly faster!** 🏃‍♂️💨
Thank You