|
--- |
|
tags: |
|
- mistral
|
- alpaca |
|
- qlora |
|
- unsloth |
|
- instruction-tuning |
|
- fine-tuned |
|
base_model: unsloth/mistral-7b-v0.3-bnb-4bit
|
library_name: peft |
|
license: apache-2.0 |
|
datasets: |
|
- yahma/alpaca-cleaned
|
language: |
|
- en |
|
pipeline_tag: text-generation |
|
--- |
|
# unsloth/mistral-7b-v0.3-bnb-4bit Fine-tuned with QLoRA (Unsloth) on Alpaca |
|
|
|
This model is a fine-tuned version of [`unsloth/mistral-7b-v0.3-bnb-4bit`](https://huggingface.co/unsloth/mistral-7b-v0.3-bnb-4bit), trained with **QLoRA** via [Unsloth](https://github.com/unslothai/unsloth) for memory-efficient instruction tuning.
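
One way to try the model is to load the LoRA adapter with PEFT, which pulls in the 4-bit base model automatically. This is a minimal sketch: `your-username/your-adapter` is a placeholder for this repository's id, and the Alpaca-style prompt mirrors the instruction format used during fine-tuning.

```python
# Minimal inference sketch. "your-username/your-adapter" is a PLACEHOLDER
# for this repository's id.
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

adapter_id = "your-username/your-adapter"  # placeholder: replace with this repo's id

# Loads the LoRA adapter plus its 4-bit base model (bitsandbytes NF4)
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
# If the adapter repo does not ship a tokenizer, load it from the base model instead
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

# Alpaca-style prompt, matching the instruction format used for fine-tuning
prompt = (
    "### Instruction:\nExplain what QLoRA is in one sentence.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```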
|
|
|
## Training Details
|
- **Dataset**: [`yahma/alpaca-cleaned`](https://huggingface.co/datasets/yahma/alpaca-cleaned) |
|
- **QLoRA**: 4-bit NF4 quantization via `bitsandbytes`
|
- **LoRA Rank**: 16
|
- **LoRA Alpha**: 16 |
|
- **Batch Size**: 2 per device |
|
- **Gradient Accumulation**: 4 (effective batch size 2 × 4 = 8)
|
- **Learning Rate**: 2e-4 |
|
- **Epochs**: 1 |
|
- **Trainer**: `trl.SFTTrainer` (a full training sketch follows below)
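
For reproducibility, here is a minimal sketch that wires the hyperparameters above into Unsloth and TRL. It follows the classic Unsloth notebook API (newer TRL releases move `dataset_text_field` and `max_seq_length` into `SFTConfig`); `max_seq_length`, `target_modules`, and the `adamw_8bit` optimizer are assumptions not stated in this card.

```python
# Training sketch: the hyperparameters listed above, wired into Unsloth + TRL.
# max_seq_length, target_modules, and optim are ASSUMPTIONS not stated in this card.
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

max_seq_length = 2048  # assumption: pick to fit your GPU memory

# Load the 4-bit NF4 base model (bitsandbytes quantization handled by Unsloth)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-v0.3-bnb-4bit",
    max_seq_length=max_seq_length,
    load_in_4bit=True,
)

# Attach LoRA adapters: rank 16, alpha 16, as listed above
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumption
)

def to_alpaca_text(example):
    """Render one alpaca-cleaned row into a single prompt+response string."""
    text = f"### Instruction:\n{example['instruction']}\n\n"
    if example["input"]:
        text += f"### Input:\n{example['input']}\n\n"
    text += f"### Response:\n{example['output']}{tokenizer.eos_token}"
    return {"text": text}

dataset = load_dataset("yahma/alpaca-cleaned", split="train").map(to_alpaca_text)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,  # effective batch size 2 x 4 = 8
        learning_rate=2e-4,
        num_train_epochs=1,
        fp16=True,
        logging_steps=10,
        optim="adamw_8bit",  # assumption: common in Unsloth examples
        output_dir="outputs",
    ),
)
trainer.train()
```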
|
|
|
## Notes
|
- Optimized for memory-efficient fine-tuning with Unsloth |
|
- No evaluation was run during training; please evaluate the model separately before use (one possible setup is sketched below)
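
One way to run that evaluation, sketched below, uses EleutherAI's lm-evaluation-harness (`pip install lm-eval`, written against the 0.4.x API). The adapter repo id and the task list are placeholders; swap in this repository's id and whichever benchmarks matter for your use case.

```python
# Evaluation sketch with lm-evaluation-harness (0.4.x API). The adapter repo id
# and task list are PLACEHOLDERS; this card reports no evaluation results.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=unsloth/mistral-7b-v0.3-bnb-4bit,"
        "peft=your-username/your-adapter,"  # placeholder adapter id
        "load_in_4bit=True"
    ),
    tasks=["hellaswag"],  # placeholder: choose benchmarks relevant to your use case
    batch_size=8,
)
print(results["results"])
```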
|
|
|
## License
|
Apache 2.0 |