---
library_name: transformers
tags:
- axolotl
- generated_from_trainer
datasets:
- jondurbin/contextual-dpo-v0.1
model-index:
- name: Tiny-Darkllama3.2-1B-Instruct-v0.2
  results:
  - task:
      type: arc_easy
    dataset:
      name: arc_easy
      type: arc_easy
    metrics:
    - name: acc
      type: accuracy
      value: 0.2622
      stderr: 0.0090
    - name: acc_norm
      type: normalized_accuracy
      value: 0.2639
      stderr: 0.0090
    source:
      name: eval-harness
      url: https://github.com/EleutherAI/lm-evaluation-harness
base_model: unsloth/Llama-3.2-1B
---
# Model Card for Tiny-Darkllama3.2-1B-Instruct-v0.2
## Model Details
- **Model Name:** Tiny-Darkllama3.2-1B-Instruct-v0.2
- **Base Model:** [unsloth/Llama-3.2-1B](https://huggingface.co/unsloth/Llama-3.2-1B)
- **Model Type:** LlamaForCausalLM
- **Training Framework:** Transformers 4.48.3
- **Training Hardware:** NVIDIA GPU with CUDA 12.4
## Training Data
- **Dataset:** [jondurbin/contextual-dpo-v0.1](https://huggingface.co/datasets/jondurbin/contextual-dpo-v0.1)
- **Training Split:** train
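A quick way to inspect the preference data before training, as a minimal sketch; the `chosen`/`rejected` field names are taken from the Axolotl config at the bottom of this card:

```python
from datasets import load_dataset

# Load the preference pairs used for DPO.
dataset = load_dataset("jondurbin/contextual-dpo-v0.1", split="train")

# Field names here follow the Axolotl config below (prompt / chosen / rejected).
print(dataset.column_names)
print(dataset[0]["chosen"])
```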
## Training Procedure
### Hyperparameters
- **Learning Rate:** 0.0002
- **Optimizer:** AdamW
- **LR Scheduler:** Linear
- **Batch Size:** 1
- **Gradient Accumulation Steps:** 1
- **Max Steps:** 20
- **Epochs:** 4
- **Warmup Steps:** 10
- **Weight Decay:** 0.0
- **Sequence Length:** 1096
### Training Configuration
- **Gradient Checkpointing:** Enabled
- **Sample Packing:** Enabled
- **Pad to Sequence Length:** True
- **Flash Attention:** Disabled
- **FP16/BF16:** Disabled
- **DeepSpeed/FSDP:** Not used
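Training was run with Axolotl (the full config is in the collapsible section at the bottom of this card). For readers more familiar with TRL, the following is a rough, untested sketch that mirrors the hyperparameters above with `DPOTrainer`; it is not the exact recipe used here, and argument names such as `processing_class` depend on the installed TRL version:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "unsloth/Llama-3.2-1B"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)
train_dataset = load_dataset("jondurbin/contextual-dpo-v0.1", split="train")

# Hyperparameters copied from the list above; output_dir is illustrative.
args = DPOConfig(
    output_dir="./dpo-out",
    learning_rate=2e-4,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=1,
    max_steps=20,
    warmup_steps=10,
    weight_decay=0.0,
    lr_scheduler_type="linear",
    gradient_checkpointing=True,
    logging_steps=1,
)

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # older TRL versions use `tokenizer=` instead
)
trainer.train()
```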
## Evaluation
### Results
- **ARC Easy Dataset:**
- Accuracy: 0.2622
- Standard Error: 0.0090
- Normalized Accuracy: 0.2639
- Normalized Standard Error: 0.0090
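These numbers come from EleutherAI's [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness). A minimal sketch of re-running the ARC Easy task through its Python API; the repo id is assumed to match the model name above:

```python
import lm_eval

# Evaluate the Hugging Face model on ARC Easy with the harness defaults.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=mrcuddle/Tiny-Darkllama3.2-1B-Instruct-v0.2",
    tasks=["arc_easy"],
)
print(results["results"]["arc_easy"])
```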
## Usage
This model is fine-tuned with DPO (Direct Preference Optimization) on the contextual-dpo-v0.1 dataset and is intended for instruction-following tasks and other general natural language processing applications.
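A minimal sketch of loading the model with Transformers and generating text; the repo id is assumed to match the model name above, so adjust it if the hosted name differs:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id, based on this card's model name.
model_id = "mrcuddle/Tiny-Darkllama3.2-1B-Instruct-v0.2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Explain Direct Preference Optimization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```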
## Limitations
- The model was trained for only 20 optimization steps, and its ARC Easy accuracy (~26%) is close to chance for a four-choice task, so out-of-the-box capability is limited.
- Fine-tuning on additional datasets, or longer training, is likely required for acceptable performance on specific tasks.
## Citation
If you use this model in your research, please cite the original Llama model and the Axolotl training framework.
## License
This model is licensed under the terms of the [License Name](link-to-license).
## Contact
For more information, please contact [Your Contact Information].
[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
<details>
<summary>See Axolotl Config</summary>
```yaml
axolotl version: '0.6.0'
base_model: mrcuddle/tiny-darkllama3.2-1B
bf16: false
dataset_prepared_path: last_run_prepared
rl: dpo
datasets:
- path: jondurbin/contextual-dpo-v0.1
  field_messages: prompt
  field_chosen: chosen
  field_rejected: rejected
  split: train
debug: null
deepspeed: null
early_stopping_patience: null
evals_per_epoch: null
flash_attention: false
fp16: false
fsdp: null
fsdp_config: null
gradient_accumulation_steps: 1
gradient_checkpointing: true
group_by_length: false
hub_model_id: mrcuddle/Tiny-Darkllama3.2-1B-Instruct
is_llama_derived_model: true
learning_rate: 0.0002
load_in_4bit: false
load_in_8bit: false
local_rank: null
logging_steps: 1
lr_scheduler: linear
max_steps: 20
micro_batch_size: 1
mlflow_experiment_name: colab-example
model_type: LlamaForCausalLM
num_epochs: 4
optimizer: adamw_torch
output_dir: ./llama2
pad_to_sequence_len: true
resume_from_checkpoint: null
sample_packing: true
saves_per_epoch: null
sequence_len: 1096
special_tokens: null
strict: false
tf32: false
tokenizer_type: LlamaTokenizer
train_on_inputs: false
wandb_entity: null
wandb_log_model: null
wandb_name: null
wandb_project: null
wandb_watch: null
warmup_steps: 10
weight_decay: 0.0
xformers_attention: null
```
</details>