# πŸ¦… Custom Pegasus Summarizer

This model is a custom-wrapped version of [google/pegasus-xsum](https://huggingface.co/google/pegasus-xsum) built for summarization tasks. It's implemented using Hugging Face's `transformers` library and wrapped with a custom model class for educational and experimental flexibility.

βœ… It supports:

- Easy fine-tuning and extension (e.g., adapters, prompt tuning; see the adapter sketch after this list)
- Drop-in replacement for the original model
- Hugging Face Hub compatibility
- Works with `AutoTokenizer` and `CustomSeq2SeqModel`
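
For the adapter case, a hypothetical sketch using the separate [PEFT](https://github.com/huggingface/peft) library (not shipped with this repo) might look like the following; the `target_modules` names are an assumption based on PEGASUS's attention-layer naming:

```python
# Hypothetical adapter (LoRA) fine-tuning sketch with the PEFT library.
# Assumption: PEGASUS attention projections are named q_proj / v_proj.
from transformers import PegasusForConditionalGeneration
from peft import LoraConfig, get_peft_model

base = PegasusForConditionalGeneration.from_pretrained("google/pegasus-xsum")
lora = LoraConfig(
    task_type="SEQ_2_SEQ_LM",
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # only the small adapter matrices train
```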

## 🧠 Model Architecture

- Base: `google/pegasus-xsum`
- Wrapper: `CustomSeq2SeqModel` (inherits from `PreTrainedModel`)
- Tokenizer: `AutoTokenizer` from the same repo
- Configuration: `CustomSeq2SeqConfig` (inherits from `PretrainedConfig`)
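
The wrapper code itself is not reproduced on this card, but based on the inheritance described above, a minimal sketch of what `model.py` might contain (names and structure beyond the two class names are assumptions) is:

```python
# Minimal sketch of a custom wrapper around PEGASUS.
# The actual model.py shipped with this repo may differ; this only
# mirrors the inheritance described in the architecture list above.
from transformers import (
    PretrainedConfig,
    PreTrainedModel,
    PegasusForConditionalGeneration,
)


class CustomSeq2SeqConfig(PretrainedConfig):
    model_type = "custom-seq2seq"

    def __init__(self, base_model_name="google/pegasus-xsum", **kwargs):
        super().__init__(**kwargs)
        self.base_model_name = base_model_name


class CustomSeq2SeqModel(PreTrainedModel):
    config_class = CustomSeq2SeqConfig

    def __init__(self, config):
        super().__init__(config)
        # Delegate to the underlying PEGASUS model; extension points
        # (adapters, prompt tuning, custom heads) can hook in here.
        self.model = PegasusForConditionalGeneration.from_pretrained(
            config.base_model_name
        )

    def forward(self, **inputs):
        return self.model(**inputs)

    def generate(self, **inputs):
        return self.model.generate(**inputs)
```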

## πŸ§ͺ Training Details

- Dataset: `xsum` (500-sample subset)
- Task: abstractive summarization
- Epochs: 1
- Batch size: 4
- Learning rate: 2e-5
- Training framework: Hugging Face `Trainer`
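
A minimal sketch of that setup is shown below. The hyperparameters come from the list above; the preprocessing lengths and the use of `Seq2SeqTrainer` (the seq2seq subclass of `Trainer`) are assumptions:

```python
# Sketch of the described fine-tuning run on a 500-sample xsum subset.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    PegasusForConditionalGeneration,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("google/pegasus-xsum")
model = PegasusForConditionalGeneration.from_pretrained("google/pegasus-xsum")

dataset = load_dataset("xsum", split="train[:500]")  # 500-sample subset

def preprocess(batch):
    # max_length values are assumptions, not taken from this card
    model_inputs = tokenizer(batch["document"], truncation=True, max_length=512)
    labels = tokenizer(text_target=batch["summary"], truncation=True, max_length=64)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="custom-pegasus-summarizer",
        num_train_epochs=1,
        per_device_train_batch_size=4,
        learning_rate=2e-5,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```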

## πŸ’‘ Usage Example

```python
from transformers import AutoTokenizer
from model import CustomSeq2SeqModel  # your custom wrapper

tokenizer = AutoTokenizer.from_pretrained("your-username/custom-pegasus-summarizer")
model = CustomSeq2SeqModel.from_pretrained("your-username/custom-pegasus-summarizer")

# PEGASUS expects raw text; no task prefix needed (that's a T5 convention).
text = "The Apollo program was a major milestone in space exploration..."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, max_length=60)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```


## πŸŽ› Live Demos

You can try this model interactively on Hugging Face Spaces:

- [yhamidullah/custom-classifier-demo](https://huggingface.co/spaces/yhamidullah/custom-classifier-demo)
## πŸ“¦ Files Included

- `config.json` – Model configuration (used by `from_pretrained`)
- `pytorch_model.bin` – Fine-tuned model weights
- `tokenizer_config.json` – Tokenizer settings
- `vocab.json` / `merges.txt` – Tokenizer vocab (depends on tokenizer type)
- `special_tokens_map.json` – Special tokens for summarization
- `README.md` – This model card
- `model.py` – (if included) Your `CustomSeq2SeqModel` class

## πŸ“œ License

Apache 2.0 β€” same license as the original `pegasus-xsum`.


πŸ™ Acknowledgments

  • Hugging Face for `transformers`, `datasets`, and `hub`
  • Authors of PEGASUS
  • Educational/Research communities building custom NLP models