---
library_name: transformers
tags:
- axolotl
- generated_from_trainer
datasets:
- jondurbin/contextual-dpo-v0.1
model-index:
- name: Tiny-Darkllama3.2-1B-Instruct-v0.2
  results:
  - task:
      type: arc_easy
    dataset:
      name: arc_easy
      type: arc_easy
    metrics:
    - name: acc
      type: accuracy
      value: 0.2622
      stderr: 0.0090
    - name: acc_norm
      type: normalized_accuracy
      value: 0.2639
      stderr: 0.0090
    source:
      name: eval-harness
      url: https://github.com/EleutherAI/lm-evaluation-harness
base_model: unsloth/Llama-3.2-1B
---
# Model Card for Tiny-Darkllama3.2-1B-Instruct-v0.2
## Model Details
- **Model Name:** Tiny-Darkllama3.2-1B-Instruct-v0.2
- **Base Model:** [unsloth/Llama-3.2-1B](https://huggingface.co/unsloth/Llama-3.2-1B)
- **Model Type:** LlamaForCausalLM
- **Training Framework:** Transformers 4.48.3
- **Training Hardware:** NVIDIA GPU with CUDA 12.4
## Training Data
- **Dataset:** [jondurbin/contextual-dpo-v0.1](https://huggingface.co/datasets/jondurbin/contextual-dpo-v0.1)
- **Training Split:** train
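A quick way to inspect the preference data before training, as a minimal sketch; the `chosen`/`rejected` field names are taken from the Axolotl config at the bottom of this card:

```python
from datasets import load_dataset

# Load the preference pairs used for DPO.
dataset = load_dataset("jondurbin/contextual-dpo-v0.1", split="train")

# Field names here follow the Axolotl config below (prompt / chosen / rejected).
print(dataset.column_names)
print(dataset[0]["chosen"])
```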
## Training Procedure
### Hyperparameters
- **Learning Rate:** 0.0002
- **Optimizer:** AdamW
- **LR Scheduler:** Linear
- **Batch Size:** 1
- **Gradient Accumulation Steps:** 1
- **Max Steps:** 20
- **Epochs:** 4
- **Warmup Steps:** 10
- **Weight Decay:** 0.0
- **Sequence Length:** 1096
### Training Configuration
- **Gradient Checkpointing:** Enabled
- **Sample Packing:** Enabled
- **Pad to Sequence Length:** True
- **Flash Attention:** Disabled
- **FP16/BF16:** Disabled
- **DeepSpeed/FSDP:** Not used
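Training was run with Axolotl (the full config is in the collapsible section at the bottom of this card). For readers more familiar with TRL, the following is a rough, untested sketch that mirrors the hyperparameters above with `DPOTrainer`; it is not the exact recipe used here, and argument names such as `processing_class` depend on the installed TRL version:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "unsloth/Llama-3.2-1B"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)
train_dataset = load_dataset("jondurbin/contextual-dpo-v0.1", split="train")

# Hyperparameters copied from the list above; output_dir is illustrative.
args = DPOConfig(
    output_dir="./dpo-out",
    learning_rate=2e-4,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=1,
    max_steps=20,
    warmup_steps=10,
    weight_decay=0.0,
    lr_scheduler_type="linear",
    gradient_checkpointing=True,
    logging_steps=1,
)

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # older TRL versions use `tokenizer=` instead
)
trainer.train()
```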
## Evaluation
### Results
- **ARC Easy Dataset:**
- Accuracy: 0.2622
- Standard Error: 0.0090
- Normalized Accuracy: 0.2639
- Normalized Standard Error: 0.0090
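These numbers come from EleutherAI's [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness). A minimal sketch of re-running the ARC Easy task through its Python API; the repo id is assumed to match the model name above:

```python
import lm_eval

# Evaluate the Hugging Face model on ARC Easy with the harness defaults.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=mrcuddle/Tiny-Darkllama3.2-1B-Instruct-v0.2",
    tasks=["arc_easy"],
)
print(results["results"]["arc_easy"])
```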
## Usage
This model is fine-tuned with DPO (Direct Preference Optimization) on the contextual-dpo-v0.1 dataset and is intended for instruction-following tasks and other general natural language processing applications.
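A minimal sketch of loading the model with Transformers and generating text; the repo id is assumed to match the model name above, so adjust it if the hosted name differs:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id, based on this card's model name.
model_id = "mrcuddle/Tiny-Darkllama3.2-1B-Instruct-v0.2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Explain Direct Preference Optimization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```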
## Limitations
- The model was trained for only 20 optimization steps, and its ARC Easy accuracy (~26%) is close to chance for a four-choice task, so out-of-the-box capability is limited.
- Fine-tuning on additional datasets, or longer training, is likely required for acceptable performance on specific tasks.
## Citation
If you use this model in your research, please cite the original Llama model and the Axolotl training framework.
## License
This model is licensed under the terms of the [License Name](link-to-license).
## Contact
For more information, please contact [Your Contact Information].
[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
<details>
<summary>See Axolotl Config</summary>
```yaml
axolotl version: '0.6.0'
base_model: mrcuddle/tiny-darkllama3.2-1B
bf16: false
dataset_prepared_path: last_run_prepared
rl: dpo
datasets:
- path: jondurbin/contextual-dpo-v0.1
  field_messages: prompt
  field_chosen: chosen
  field_rejected: rejected
  split: train
debug: null
deepspeed: null
early_stopping_patience: null
evals_per_epoch: null
flash_attention: false
fp16: false
fsdp: null
fsdp_config: null
gradient_accumulation_steps: 1
gradient_checkpointing: true
group_by_length: false
hub_model_id: mrcuddle/Tiny-Darkllama3.2-1B-Instruct
is_llama_derived_model: true
learning_rate: 0.0002
load_in_4bit: false
load_in_8bit: false
local_rank: null
logging_steps: 1
lr_scheduler: linear
max_steps: 20
micro_batch_size: 1
mlflow_experiment_name: colab-example
model_type: LlamaForCausalLM
num_epochs: 4
optimizer: adamw_torch
output_dir: ./llama2
pad_to_sequence_len: true
resume_from_checkpoint: null
sample_packing: true
saves_per_epoch: null
sequence_len: 1096
special_tokens: null
strict: false
tf32: false
tokenizer_type: LlamaTokenizer
train_on_inputs: false
wandb_entity: null
wandb_log_model: null
wandb_name: null
wandb_project: null
wandb_watch: null
warmup_steps: 10
weight_decay: 0.0
xformers_attention: null
```
</details>