---
library_name: transformers
tags:
- axolotl
- generated_from_trainer
datasets:
- jondurbin/contextual-dpo-v0.1
model-index:
- name: Tiny-Darkllama3.2-1B-Instruct-v0.2
  results:
  - task:
      type: arc_easy
    dataset:
      name: arc_easy
      type: arc_easy
    metrics:
    - name: acc
      type: accuracy
      value: 0.2622
      stderr: 0.0090
    - name: acc_norm
      type: normalized_accuracy
      value: 0.2639
      stderr: 0.0090
    source:
      name: eval-harness
      url: https://github.com/EleutherAI/lm-evaluation-harness
base_model: unsloth/Llama-3.2-1B
---
# Model Card for Tiny-Darkllama3.2-1B-Instruct-v0.2
## Model Details
- **Model Name:** Tiny-Darkllama3.2-1B-Instruct-v0.2
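- **Base Model:** [unsloth/Llama-3.2-1B](https://huggingface.co/unsloth/Llama-3.2-1B) (the Axolotl config at the end of this card lists `mrcuddle/tiny-darkllama3.2-1B` as the checkpoint training started from)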
- **Model Type:** LlamaForCausalLM
- **Training Framework:** Transformers 4.48.3
- **Training Hardware:** NVIDIA GPU with CUDA 12.4
## Training Data
- **Dataset:** [jondurbin/contextual-dpo-v0.1](https://huggingface.co/datasets/jondurbin/contextual-dpo-v0.1)
- **Training Split:** train
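
The Axolotl config at the end of this card sets `rl: dpo` and maps the dataset's `chosen` and `rejected` fields, so each training record is a preference pair. As a rough illustration of what that objective optimizes, here is a minimal sketch of the per-example DPO loss. The function name, the example log-probabilities, and `beta=0.1` are assumptions for illustration; the card does not state the beta used in training.

```python
import math


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-example DPO loss: -log(sigmoid(beta * margin)).

    Each argument is the summed log-probability of a full response under
    the policy being trained or the frozen reference model.
    beta=0.1 is an assumed value, not taken from this model card.
    """
    # How much more the policy prefers "chosen" over "rejected",
    # relative to the reference model.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # -log(sigmoid(z)) == softplus(-z), written in a numerically stable form.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


# Hypothetical log-probabilities, for illustration only.
loss_at_start = dpo_loss(-12.0, -15.0, -12.0, -15.0)   # policy == reference
loss_improved = dpo_loss(-10.0, -18.0, -12.0, -15.0)   # policy favors "chosen" more
```

When the policy and reference agree, the margin is zero and the loss equals `log(2)`; the loss falls as the policy shifts probability toward the chosen response relative to the reference.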
## Training Procedure
### Hyperparameters
- **Learning Rate:** 0.0002
- **Optimizer:** AdamW
- **LR Scheduler:** Linear
- **Batch Size:** 1
- **Gradient Accumulation Steps:** 1
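- **Max Steps:** 20 (when `max_steps` is set, the Hugging Face Trainer stops after that many optimizer steps regardless of the epoch count)
- **Epochs:** 4 (configured value; the 20-step limit takes precedence)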
- **Warmup Steps:** 10
- **Weight Decay:** 0.0
- **Sequence Length:** 1096
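
To make the interaction between the learning rate, warmup, and step limit concrete, the sketch below computes a linear warmup followed by linear decay using the values listed above. It mirrors the usual shape of a linear schedule with warmup but is not the training code; the function name `linear_lr` is hypothetical, and the exact step at which the trainer applies each value may differ by one.

```python
BASE_LR = 2e-4       # Learning Rate from this card
WARMUP_STEPS = 10    # Warmup Steps from this card
MAX_STEPS = 20       # Max Steps from this card


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              warmup_steps: int = WARMUP_STEPS,
              max_steps: int = MAX_STEPS) -> float:
    """Learning rate at a given scheduler step: linear warmup, then linear decay."""
    if step < warmup_steps:
        # Ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay from base_lr down to 0 over the remaining steps.
    remaining = max(0, max_steps - step)
    return base_lr * remaining / max(1, max_steps - warmup_steps)


schedule = [linear_lr(s) for s in range(MAX_STEPS + 1)]
for step, lr in enumerate(schedule):
    print(f"step {step:2d}: lr = {lr:.2e}")
```

With these settings half of the run is warmup: the rate reaches its peak of 0.0002 only at step 10 and decays back to zero by step 20.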
### Training Configuration
- **Gradient Checkpointing:** Enabled
- **Sample Packing:** Enabled
- **Pad to Sequence Length:** True
- **Flash Attention:** Disabled
- **FP16/BF16:** Disabled
- **DeepSpeed/FSDP:** Not used
## Evaluation
### Results
- **ARC Easy Dataset:**
  - Accuracy: 0.2622
  - Standard Error: 0.0090
  - Normalized Accuracy: 0.2639
  - Normalized Standard Error: 0.0090
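
One way to read these numbers is to turn the standard error into an approximate 95% confidence interval and compare it with the guessing baseline. The sketch below does that for the accuracy above. Two values are assumptions not stated in this card: a test set of 2,376 questions (the size of the ARC Easy test split) and a chance level of 0.25 (most ARC questions have four options).

```python
import math

ACC = 0.2622          # Accuracy from this card
STDERR = 0.0090       # Standard Error from this card
N_QUESTIONS = 2376    # assumption: ARC Easy test split size
CHANCE = 0.25         # assumption: four answer options per question


def binomial_stderr(p: float, n: int) -> float:
    """Standard error of a proportion p estimated from n independent trials."""
    return math.sqrt(p * (1.0 - p) / n)


def confidence_interval(p: float, se: float, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation confidence interval (z=1.96 gives about 95%)."""
    return p - z * se, p + z * se


expected_se = binomial_stderr(ACC, N_QUESTIONS)
low, high = confidence_interval(ACC, STDERR)
chance_within_interval = low <= CHANCE <= high

print(f"stderr implied by n={N_QUESTIONS}: {expected_se:.4f}")
print(f"95% interval: [{low:.4f}, {high:.4f}]")
print(f"chance level inside interval: {chance_within_interval}")
```

Under these assumptions the reported standard error is consistent with the assumed test-set size, and the interval contains the chance level, so the result is not clearly distinguishable from guessing.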
## Usage
This model is intended for instruction-following tasks. It was fine-tuned with DPO (Direct Preference Optimization) on the [jondurbin/contextual-dpo-v0.1](https://huggingface.co/datasets/jondurbin/contextual-dpo-v0.1) dataset.
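
A minimal loading sketch with the Transformers library follows. Several details are assumptions rather than documented facts: the repository id is taken from `hub_model_id` in the Axolotl config and may differ for the v0.2 release, the plain-text prompt layout in `build_prompt` is hypothetical (this card does not document a prompt or chat template), and `generate` is just an illustrative helper name.

```python
# Assumption: repo id copied from `hub_model_id` in the Axolotl config below;
# the v0.2 checkpoint may live under a different id.
REPO_ID = "mrcuddle/Tiny-Darkllama3.2-1B-Instruct"


def build_prompt(context: str, question: str) -> str:
    """Hypothetical plain-text prompt layout; not a documented template."""
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def generate(prompt: str, repo_id: str = REPO_ID, max_new_tokens: int = 128) -> str:
    """Load the model and return only the newly generated text."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the continuation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


prompt = build_prompt(
    context="The library opens at 9 am on weekdays.",
    question="When does the library open on Tuesday?",
)
```

Calling `generate(prompt)` downloads the weights on first use and runs on CPU unless the model and inputs are moved to a GPU.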
## Limitations
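- ARC Easy is the only benchmark reported here, and the accuracy of 0.2622 is close to the roughly 25% expected from guessing on a mostly four-option multiple-choice task. Performance on other tasks and datasets may differ.
- Training ran for at most 20 steps at batch size 1, so further fine-tuning on additional data may be required for good performance on specific tasks.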
## Citation
If you use this model in your research, please cite the original Llama model and the Axolotl training framework.
## License
This model is licensed under the terms of the [License Name](link-to-license).
## Contact
For more information, please contact [Your Contact Information].
[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
<details>
<summary>See Axolotl Config</summary>
```yaml
axolotl version: '0.6.0'
base_model: mrcuddle/tiny-darkllama3.2-1B
bf16: false
dataset_prepared_path: last_run_prepared
rl: dpo
datasets:
- path: jondurbin/contextual-dpo-v0.1
  field_messages: prompt
  field_chosen: chosen
  field_rejected: rejected
  split: train
debug: null
deepspeed: null
early_stopping_patience: null
evals_per_epoch: null
flash_attention: false
fp16: false
fsdp: null
fsdp_config: null
gradient_accumulation_steps: 1
gradient_checkpointing: true
group_by_length: false
hub_model_id: mrcuddle/Tiny-Darkllama3.2-1B-Instruct
is_llama_derived_model: true
learning_rate: 0.0002
load_in_4bit: false
load_in_8bit: false
local_rank: null
logging_steps: 1
lr_scheduler: linear
max_steps: 20
micro_batch_size: 1
mlflow_experiment_name: colab-example
model_type: LlamaForCausalLM
num_epochs: 4
optimizer: adamw_torch
output_dir: ./llama2
pad_to_sequence_len: true
resume_from_checkpoint: null
sample_packing: true
saves_per_epoch: null
sequence_len: 1096
special_tokens: null
strict: false
tf32: false
tokenizer_type: LlamaTokenizer
train_on_inputs: false
wandb_entity: null
wandb_log_model: null
wandb_name: null
wandb_project: null
wandb_watch: null
warmup_steps: 10
weight_decay: 0.0
xformers_attention: null
```
</details>