---
base_model: EleutherAI/pythia-70m-deduped
model_name: "Pythia-70M Sarcasm LoRA by hyvve"
library_name: peft
tags:
- text-generation
- lora
- peft
- sarcasm
- pythia
- fine-tuning
- causal-lm
- EleutherAI
license: apache-2.0
pipeline_tag: text-generation
---
# Model Card for Pythia-70M Sarcasm LoRA
This model is a LoRA (Low-Rank Adaptation) fine-tune of the `EleutherAI/pythia-70m-deduped` model, specifically adapted for tasks related to sarcasm.
## Model Details
### Model Description
This is a PEFT LoRA adapter for the `EleutherAI/pythia-70m-deduped` model. It has been fine-tuned on a dataset related to sarcasm. As a Causal Language Model (CLM), its primary function is to predict the next token in a sequence. This fine-tuning aims to imbue the model with an understanding or stylistic representation of sarcastic language.
- **Developed by:** [hyvve](https://hyvve.xyz) (based on job configurations)
- **Model type:** Causal Language Model (specifically, a LoRA adapter for a GPT-NeoX based model)
- **Language(s) (NLP):** English (derived from the base model and assumed dataset language)
- **License:** Apache-2.0 (inherited from the base model `EleutherAI/pythia-70m-deduped`)
- **Finetuned from model:** `EleutherAI/pythia-70m-deduped`
### Model Sources
- **Repository (LoRA Adapter):** `https://huggingface.co/manny-uncharted/pythia-70m-sarcasm-lora` (based on `hf_target_model_repo_id`)
- **Base Model Repository:** `https://huggingface.co/EleutherAI/pythia-70m-deduped`
- **Paper:** Pythia suite: [arXiv:2304.01373](https://arxiv.org/abs/2304.01373)
- **Demo:** Not provided
## Uses
### Direct Use
This LoRA adapter is intended to be loaded on top of the `EleutherAI/pythia-70m-deduped` base model. It can be used for:
* Generating text with a sarcastic tone or style.
* Completing prompts in a sarcastic manner.
* Research into modeling nuanced aspects of language like sarcasm with smaller LMs.
**Note:** Due to the extremely small fine-tuning dataset (~1,000 examples, split across files of 200 examples each), the model's ability to robustly generate or understand sarcasm is very limited. It primarily serves as a pipeline and integration test.
### Downstream Use
* Further fine-tuning on larger, more diverse sarcasm datasets (see the sketch after this list).
* Integration into applications requiring conditional text generation with a sarcastic flavor (e.g., chatbots, creative writing tools), though extensive further tuning would be necessary.
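As a starting point for continued fine-tuning, here is a minimal sketch using the PEFT and `transformers` Trainer APIs. The dataset file (`sarcasm_corpus.jsonl`), hyperparameters, and output directory are illustrative placeholders, not values from the original run:

```python
from datasets import load_dataset
from peft import PeftModel
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model_id = "EleutherAI/pythia-70m-deduped"
adapter_model_id = "manny-uncharted/pythia-70m-sarcasm-lora"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
# is_trainable=True keeps the LoRA weights unfrozen for continued training
model = PeftModel.from_pretrained(base_model, adapter_model_id, is_trainable=True)

# Placeholder corpus: swap in a real sarcasm dataset with a "text" column
dataset = load_dataset("json", data_files="sarcasm_corpus.jsonl")["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="pythia-70m-sarcasm-lora-continued",  # placeholder
        per_device_train_batch_size=8,
        num_train_epochs=3,
        learning_rate=2e-4,
        logging_steps=10,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```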
### Out-of-Scope Use
* Reliable sarcasm detection or classification without significant further development and evaluation.
* Generating harmful, biased, or offensive content, even if framed as sarcasm.
* Use in critical applications where misinterpretation of sarcasm could have negative consequences.
* Generating fluent, coherent, and factually accurate long-form text beyond the capabilities of the 70M parameter base model.
## Bias, Risks, and Limitations
* **Limited Scope:** Fine-tuned on a very small dataset (~1,000 examples), so its understanding and generation of sarcasm will be superficial and not generalizable.
* **Inherited Biases:** Inherits biases from the `EleutherAI/pythia-70m-deduped` base model, which was trained on The Pile. These can include societal, gender, and racial biases.
* **Misinterpretation of Sarcasm:** Sarcasm is highly context-dependent and subjective. The model may generate text that is inappropriately sarcastic or fail to understand sarcastic prompts correctly.
* **Potential for Harmful Sarcasm:** Sarcasm can be used to convey negativity or veiled aggression. The model might inadvertently generate such content.
* **Numerical Instability:** During the logged training run, an `eval_loss: nan` was observed, indicating potential issues with evaluation on the tiny validation set or numerical instability under the given configuration. The `train_loss: 0.0` also suggests extreme overfitting or issues with the learning process on such limited data.
### Recommendations
* **Thorough Evaluation:** Before any production use, the model (after further fine-tuning on a substantial dataset) would require rigorous evaluation for both sarcasm generation quality and potential biases.
* **Content Moderation:** Downstream applications should implement content moderation and safety filters (a toy illustration follows this list).
* **Context is Key:** Use with clear context and be aware that its sarcastic capabilities are likely very brittle due to the limited training data.
* **Do Not Use for Critical Decisions:** This model, in its current state, is not suitable for any critical applications.
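As a trivial illustration of output filtering, one could post-process generations against a blocklist before displaying them. This is a naive sketch with hypothetical placeholder terms, not a substitute for a dedicated moderation model or service:

```python
# Naive illustrative filter: real systems should use a proper
# moderation model or API, not a keyword blocklist.
BLOCKLIST = {"placeholder_slur", "placeholder_insult"}  # hypothetical terms

def is_safe(text: str) -> bool:
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKLIST)

generated = "Oh great, another Monday. Truly the highlight of my week."
print(generated if is_safe(generated) else "[filtered]")
```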
## How to Get Started with the Model
To use this LoRA adapter, you'll need to load the base model and then apply the adapter using the PEFT library.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model_id = "EleutherAI/pythia-70m-deduped"
adapter_model_id = "manny-uncharted/pythia-70m-sarcasm-lora"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load the base model. If the adapter was trained with QLoRA, pass a
# BitsAndBytesConfig here as during training; for simplicity, this example
# loads without quantization. Adapt as needed.
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    # quantization_config=BitsAndBytesConfig(...),  # add if loading in 4-bit/8-bit
    # torch_dtype=torch.float16,  # or torch.bfloat16
    device_map="auto",
)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, adapter_model_id)
model = model.merge_and_unload()  # optional: merge the adapter for faster inference

# Generate text; adjust generation parameters as needed
prompt = "The weather today is just "  # example prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
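If the adapter is used in a QLoRA-style setup, the base model can instead be loaded in 4-bit before attaching the adapter. A minimal sketch, assuming `bitsandbytes` is installed and a CUDA GPU is available; the NF4 settings below are common defaults, not confirmed training settings:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Common NF4 defaults; assumptions, not the original training configuration
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/pythia-70m-deduped",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "manny-uncharted/pythia-70m-sarcasm-lora")
# With quantized weights it is simplest to keep the adapter attached
# rather than calling merge_and_unload().
```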