---
license: mit
datasets:
- pacovaldez/stackoverflow-questions
language:
- en
base_model:
- google/flan-t5-base
tokenizer:
- google/flan-t5-base
library_name: transformers
tags:
- stackoverflow
- flan-t5
- peft
- lora
- seq2seq
---
# πŸ€– FLAN-T5 Base Fine-Tuned on Stack Overflow Questions (LoRA)
This is a fine-tuned version of [`google/flan-t5-base`](https://huggingface.co/google/flan-t5-base) on a curated dataset of Stack Overflow programming questions. It was trained with [LoRA](https://arxiv.org/abs/2106.09685) (Low-Rank Adaptation) for parameter-efficient fine-tuning, keeping the adapter compact and efficient while modeling developer-style Q&A tasks effectively.
---
## 🧠 Model Objective
The model is trained to:
- Rewrite or improve unclear programming questions
- Generate relevant clarifying questions or answers
- Summarize long developer queries
- Serve as a code-aware Q&A assistant
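The exact prompt templates are not published with this card; the snippets below are hypothetical illustrations of how each objective might be phrased for an instruction-tuned model:

```python
# Hypothetical prompt formats, one per objective above
# (the actual training templates are not documented in this card)
example_prompts = [
    "Rewrite this question more clearly: why is my javascript function undefined?",
    "Ask a clarifying question about: my pandas merge silently drops rows",
    "Summarize this question: I have a long Django view that ...",
    "Answer this question: how do I reverse a list in Python?",
]
```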
---
## πŸ“š Training Data
- **Source**: Stack Overflow public questions dataset (cleaned)
- **Format**: Instruction-like examples, Q&A pairs, summarization prompts
- **Cleaning**: HTML stripped, markdown converted to plain text, code blocks preserved (sketched below)
- **Size**: ~15k examples
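The card names only the cleaning steps, so the function below is a minimal sketch, assuming BeautifulSoup for HTML stripping; `clean_question` and the `[CODE]` markers are illustrative, not the actual preprocessing code:

```python
from bs4 import BeautifulSoup

def clean_question(html: str) -> str:
    """Strip HTML to plain text while keeping code blocks verbatim (illustrative)."""
    soup = BeautifulSoup(html, "html.parser")
    # Replace <pre> blocks with their raw text first so code survives untouched;
    # the [CODE] markers are an arbitrary choice for this sketch
    for pre in soup.find_all("pre"):
        pre.replace_with("\n[CODE]\n" + pre.get_text() + "\n[/CODE]\n")
    return soup.get_text().strip()
```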
---
## πŸ—οΈ Training Details
- **Base Model**: `google/flan-t5-base`
- **Adapter Format**: LoRA using [`peft`](https://github.com/huggingface/peft)
- **Files**:
- `adapter_model.safetensors`
- `adapter_config.json`
- **Hyperparameters** (as in the `LoraConfig` sketch below):
- `r`: 8
- `lora_alpha`: 16
- `lora_dropout`: 0.1
- `bias`: "none"
- `task_type`: "SEQ_2_SEQ_LM"
- **Inference Mode**: Enabled
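Together these correspond to a `peft` `LoraConfig` like the sketch below. `target_modules` is not listed in this card; `["q", "v"]` (T5's attention query/value projections) is a common choice and is assumed here:

```python
from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type=TaskType.SEQ_2_SEQ_LM,
    target_modules=["q", "v"],  # assumed; not stated in this card
    inference_mode=True,        # matches "Inference Mode: Enabled"
)
```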
---
## πŸ’‘ How to Use
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel
# Load tokenizer and base model
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "your-model-folder")  # path or hub id of this adapter
model.eval()
# Inference
prompt = "Rewrite this question more clearly: why is my javascript function undefined?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)  # raise the short default generation cap
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
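To serve the model without a `peft` dependency at inference time, the adapter can optionally be folded into the base weights (standard `peft` functionality; the output path is a hypothetical example):

```python
# Merge the LoRA weights into the base model for standalone deployment
merged_model = model.merge_and_unload()
merged_model.save_pretrained("flan-t5-base-stackoverflow-merged")  # hypothetical path
```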
---
## πŸ§ͺ Intended Use
This model is best suited for:
- Code-aware chatbot assistants
- Prompt engineering for developer tools
- Developer-focused summarization / rephrasing
- Auto-moderation / clarification of tech questions
---
## ⚠️ Limitations
- Not trained for code generation or long-form answers
- May hallucinate incorrect or generic responses
- Fine-tuned only on Stack Overflow, so performance is domain-specific
---
## ✨ Acknowledgements
- [Hugging Face Transformers](https://github.com/huggingface/transformers)
- [LoRA (PEFT)](https://github.com/huggingface/peft)
- Stack Overflow for its open data
- [FLAN-T5: Scaling Instruction-Finetuned Language Models](https://arxiv.org/abs/2210.11416)
---
πŸ› οΈ Created with love by Kunj | Model suggestion & guidance by ChatGPT