---
license: apache-2.0
tags:
- summarization
- custom-model
- pegasus
- seq2seq
- huggingface
- transformers
library_name: transformers
inference: false
model-index:
- name: Custom Pegasus Summarizer
  results: []
---

# Custom Pegasus Summarizer

This model is a **custom-wrapped version** of [`google/pegasus-xsum`](https://huggingface.co/google/pegasus-xsum) built for **summarization tasks**. It's implemented using Hugging Face's `transformers` library and wrapped with a custom model class for educational and experimental flexibility.

✅ It supports:

- Easy fine-tuning and extension (e.g., adapters, prompt tuning)
- Drop-in replacement for the original model
- Hugging Face Hub compatibility
- Works with `AutoTokenizer` and `CustomSeq2SeqModel`

---

## 🧠 Model Architecture

- **Base**: `google/pegasus-xsum`
- **Wrapper**: `CustomSeq2SeqModel` (inherits from `PreTrainedModel`)
- **Tokenizer**: `AutoTokenizer` from the same repo
- **Configuration**: `CustomSeq2SeqConfig` (inherits from `PretrainedConfig`)
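The wrapper described above can be sketched as follows. This is a minimal illustration assuming a delegation-based design; the attribute names (e.g. `base_model_name`) are assumptions, not the exact contents of the shipped `model.py`:

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    PretrainedConfig,
    PreTrainedModel,
)


class CustomSeq2SeqConfig(PretrainedConfig):
    """Hypothetical config; `base_model_name` is an assumed field."""

    model_type = "custom-seq2seq"

    def __init__(self, base_model_name="google/pegasus-xsum", **kwargs):
        self.base_model_name = base_model_name
        super().__init__(**kwargs)


class CustomSeq2SeqModel(PreTrainedModel):
    """Thin wrapper that delegates to the underlying Pegasus model."""

    config_class = CustomSeq2SeqConfig

    def __init__(self, config):
        super().__init__(config)
        # Load the base model named in the config and delegate everything to it.
        self.model = AutoModelForSeq2SeqLM.from_pretrained(config.base_model_name)

    def forward(self, **inputs):
        return self.model(**inputs)

    def generate(self, *args, **kwargs):
        return self.model.generate(*args, **kwargs)
```

Registering `config_class` on the wrapper is what lets `from_pretrained` route a saved `config.json` back to the custom class automatically.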
---

## 🧪 Training Details

- **Dataset**: `xsum` (500-sample subset)
- **Task**: Abstractive Summarization
- **Epochs**: 1
- **Batch Size**: 4
- **Learning Rate**: 2e-5
- **Training Framework**: Hugging Face Trainer
---

## 💡 Usage Example

```python
from transformers import AutoTokenizer

from model import CustomSeq2SeqModel  # Your custom wrapper

tokenizer = AutoTokenizer.from_pretrained("your-username/custom-pegasus-summarizer")
model = CustomSeq2SeqModel.from_pretrained("your-username/custom-pegasus-summarizer")

# Note: Pegasus takes raw text directly; it does not use a "summarize:" task prefix.
text = "The Apollo program was a major milestone in space exploration..."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, max_length=60)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

---

## 🚀 Live Demos

You can try this model interactively on Hugging Face Spaces:

- Gradio App: https://huggingface.co/spaces/your-username/custom-pegasus-gradio
- Streamlit App: https://huggingface.co/spaces/your-username/custom-pegasus-streamlit

---

## 📦 Files Included

- `config.json` – Model configuration (used by `from_pretrained`)
- `pytorch_model.bin` – Fine-tuned model weights
- `tokenizer_config.json` – Tokenizer settings
- `spiece.model` – SentencePiece vocabulary (Pegasus uses a SentencePiece tokenizer, not `vocab.json`/`merges.txt`)
- `special_tokens_map.json` – Special tokens for summarization
- `README.md` – This model card
- `model.py` – (if included) Your `CustomSeq2SeqModel` class

---

## 📄 License

Apache 2.0 – the same license as the original `google/pegasus-xsum`.

---

## 🙏 Acknowledgments

- Hugging Face for `transformers`, `datasets`, and the Hub
- The authors of PEGASUS
- The educational and research communities building custom NLP models