---
base_model: EleutherAI/pythia-70m-deduped
model_name: "Pythia-70M Sarcasm LoRA by hyvve"
library_name: peft
tags:
- text-generation
- lora
- peft
- sarcasm
- pythia
- fine-tuning
- causal-lm
- EleutherAI
license: apache-2.0
pipeline_tag: text-generation
---
# Model Card for Pythia-70M Sarcasm LoRA
This model is a LoRA (Low-Rank Adaptation) fine-tune of the `EleutherAI/pythia-70m-deduped` model, specifically adapted for tasks related to sarcasm.
## Model Details
### Model Description
This is a PEFT LoRA adapter for the `EleutherAI/pythia-70m-deduped` model. It has been fine-tuned on a dataset related to sarcasm. As a Causal Language Model (CLM), its primary function is to predict the next token in a sequence. This fine-tuning aims to imbue the model with an understanding or stylistic representation of sarcastic language.
- **Developed by:** [hyvve](https://hyvve.xyz) (based on job configurations)
- **Model type:** Causal Language Model (specifically, a LoRA adapter for a GPT-NeoX based model)
- **Language(s) (NLP):** English (derived from the base model and assumed dataset language)
- **License:** Apache-2.0 (inherited from the base model `EleutherAI/pythia-70m-deduped`)
- **Finetuned from model:** `EleutherAI/pythia-70m-deduped`
### Model Sources
- **Repository (LoRA Adapter):** `https://huggingface.co/manny-uncharted/pythia-70m-sarcasm-lora` (based on `hf_target_model_repo_id`)
- **Base Model Repository:** `https://huggingface.co/EleutherAI/pythia-70m-deduped`
- **Paper:** Pythia suite: [arXiv:2304.01373](https://arxiv.org/abs/2304.01373)
- **Demo:** Not provided
## Uses
### Direct Use
This LoRA adapter is intended to be loaded on top of the `EleutherAI/pythia-70m-deduped` base model. It can be used for:
* Generating text with a sarcastic tone or style.
* Completing prompts in a sarcastic manner.
* Research into modeling nuanced aspects of language like sarcasm with smaller LMs.
**Note:** Due to the extremely small fine-tuning dataset (~1,000 examples, split across files of 200 examples each), the model's ability to robustly generate or understand sarcasm is very limited. It primarily serves as a pipeline and integration test.
### Downstream Use
* Further fine-tuning on larger, more diverse sarcasm datasets (see the sketch after this list).
* Integration into applications requiring conditional text generation with a sarcastic flavor (e.g., chatbots, creative writing tools), though extensive further tuning would be necessary.
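As a starting point for continued fine-tuning, here is a minimal sketch using the PEFT and `transformers` Trainer APIs. The dataset file (`sarcasm_corpus.jsonl`), hyperparameters, and output directory are illustrative placeholders, not values from the original run:

```python
from datasets import load_dataset
from peft import PeftModel
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model_id = "EleutherAI/pythia-70m-deduped"
adapter_model_id = "manny-uncharted/pythia-70m-sarcasm-lora"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
# is_trainable=True keeps the LoRA weights unfrozen for continued training
model = PeftModel.from_pretrained(base_model, adapter_model_id, is_trainable=True)

# Placeholder corpus: swap in a real sarcasm dataset with a "text" column
dataset = load_dataset("json", data_files="sarcasm_corpus.jsonl")["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="pythia-70m-sarcasm-lora-continued",  # placeholder
        per_device_train_batch_size=8,
        num_train_epochs=3,
        learning_rate=2e-4,
        logging_steps=10,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```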
### Out-of-Scope Use
* Reliable sarcasm detection or classification without significant further development and evaluation.
* Generating harmful, biased, or offensive content, even if framed as sarcasm.
* Use in critical applications where misinterpretation of sarcasm could have negative consequences.
* Generating fluent, coherent, and factually accurate long-form text beyond the capabilities of the 70M parameter base model.
## Bias, Risks, and Limitations
* **Limited Scope:** Fine-tuned on a very small dataset (~1,000 examples), so its understanding and generation of sarcasm will be superficial and not generalizable.
* **Inherited Biases:** Inherits biases from the `EleutherAI/pythia-70m-deduped` base model, which was trained on The Pile. These can include societal, gender, and racial biases.
* **Misinterpretation of Sarcasm:** Sarcasm is highly context-dependent and subjective. The model may generate text that is inappropriately sarcastic or fail to understand sarcastic prompts correctly.
* **Potential for Harmful Sarcasm:** Sarcasm can be used to convey negativity or veiled aggression. The model might inadvertently generate such content.
* **Numerical Instability:** During the logged training run, an `eval_loss: nan` was observed, indicating potential issues with evaluation on the tiny validation set or numerical instability under the given configuration. The `train_loss: 0.0` also suggests extreme overfitting or issues with the learning process on such limited data.
### Recommendations
* **Thorough Evaluation:** Before any production use, the model (after further fine-tuning on a substantial dataset) would require rigorous evaluation for both sarcasm generation quality and potential biases.
* **Content Moderation:** Downstream applications should implement content moderation and safety filters (a toy illustration follows this list).
* **Context is Key:** Use with clear context and be aware that its sarcastic capabilities are likely very brittle due to the limited training data.
* **Do Not Use for Critical Decisions:** This model, in its current state, is not suitable for any critical applications.
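As a trivial illustration of output filtering, one could post-process generations against a blocklist before displaying them. This is a naive sketch with hypothetical placeholder terms, not a substitute for a dedicated moderation model or service:

```python
# Naive illustrative filter: real systems should use a proper
# moderation model or API, not a keyword blocklist.
BLOCKLIST = {"placeholder_slur", "placeholder_insult"}  # hypothetical terms

def is_safe(text: str) -> bool:
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKLIST)

generated = "Oh great, another Monday. Truly the highlight of my week."
print(generated if is_safe(generated) else "[filtered]")
```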
## How to Get Started with the Model
To use this LoRA adapter, you'll need to load the base model and then apply the adapter using the PEFT library.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model_id = "EleutherAI/pythia-70m-deduped"
adapter_model_id = "manny-uncharted/pythia-70m-sarcasm-lora"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load the base model. If the adapter was trained with QLoRA, pass a
# BitsAndBytesConfig here as during training; for simplicity, this example
# loads without quantization. Adapt as needed.
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    # quantization_config=BitsAndBytesConfig(...),  # add if loading in 4-bit/8-bit
    # torch_dtype=torch.float16,  # or torch.bfloat16
    device_map="auto",
)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, adapter_model_id)
model = model.merge_and_unload()  # optional: merge the adapter for faster inference

# Generate text; adjust generation parameters as needed
prompt = "The weather today is just "  # example prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
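If the adapter is used in a QLoRA-style setup, the base model can instead be loaded in 4-bit before attaching the adapter. A minimal sketch, assuming `bitsandbytes` is installed and a CUDA GPU is available; the NF4 settings below are common defaults, not confirmed training settings:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Common NF4 defaults; assumptions, not the original training configuration
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/pythia-70m-deduped",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "manny-uncharted/pythia-70m-sarcasm-lora")
# With quantized weights it is simplest to keep the adapter attached
# rather than calling merge_and_unload().
```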