---
library_name: transformers
tags:
- axolotl
- generated_from_trainer
datasets:
- jondurbin/contextual-dpo-v0.1
model-index:
- name: Tiny-Darkllama3.2-1B-Instruct-v0.2
  results:
  - task:
      type: arc_easy
    dataset:
      name: arc_easy
      type: arc_easy
    metrics:
    - name: acc
      type: accuracy
      value: 0.2622
      stderr: 0.0090
    - name: acc_norm
      type: normalized_accuracy
      value: 0.2639
      stderr: 0.0090
    source:
      name: eval-harness
      url: https://github.com/EleutherAI/lm-evaluation-harness
base_model: unsloth/Llama-3.2-1B
---
|
|
|
# Model Card for Tiny-Darkllama3.2-1B-Instruct-v0.2

## Model Details

- **Model Name:** Tiny-Darkllama3.2-1B-Instruct-v0.2
- **Base Model:** [unsloth/Llama-3.2-1B](https://huggingface.co/unsloth/Llama-3.2-1B)
- **Model Type:** LlamaForCausalLM
- **Training Framework:** Axolotl 0.6.0 with Transformers 4.48.3
- **Training Hardware:** NVIDIA GPU with CUDA 12.4
|
|
|
## Training Data

- **Dataset:** [jondurbin/contextual-dpo-v0.1](https://huggingface.co/datasets/jondurbin/contextual-dpo-v0.1)
- **Training Split:** train
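The preference pairs can be inspected directly with the `datasets` library. A minimal sketch is shown below; the column names (`prompt`, `chosen`, `rejected`) follow the field mapping in the Axolotl config at the end of this card.

```python
# Quick look at the DPO preference data used for fine-tuning.
# Column names follow the field mapping in the Axolotl config (prompt / chosen / rejected).
from datasets import load_dataset

dataset = load_dataset("jondurbin/contextual-dpo-v0.1", split="train")
print(dataset)                     # number of rows and column names

example = dataset[0]
print(example["prompt"][:200])     # truncated prompt
print(example["chosen"][:200])     # preferred response
print(example["rejected"][:200])   # dispreferred response
```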
|
|
|
## Training Procedure

### Hyperparameters

- **Learning Rate:** 0.0002
- **Optimizer:** AdamW
- **LR Scheduler:** Linear
- **Batch Size:** 1
- **Gradient Accumulation Steps:** 1
- **Max Steps:** 20
- **Epochs:** 4
- **Warmup Steps:** 10
- **Weight Decay:** 0.0
- **Sequence Length:** 1096

### Training Configuration

- **Gradient Checkpointing:** Enabled
- **Sample Packing:** Enabled
- **Pad to Sequence Length:** True
- **Flash Attention:** Disabled
- **FP16/BF16:** Disabled
- **DeepSpeed/FSDP:** Not used
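Training was run through Axolotl, which drives TRL's DPO implementation internally (the exact config is reproduced at the end of this card). As a rough, non-authoritative equivalent only, the hyperparameters above map onto a TRL `DPOTrainer` run roughly as follows; this sketch assumes a recent TRL release (the `DPOConfig` / `processing_class` style API) and is not the pipeline actually used.

```python
# Approximate TRL equivalent of the Axolotl DPO run described above (a sketch, not the
# exact training code; Axolotl wraps TRL with the config shown at the bottom of this card).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "mrcuddle/tiny-darkllama3.2-1B"  # base_model from the Axolotl config
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Keep only the columns DPOTrainer expects.
dataset = load_dataset("jondurbin/contextual-dpo-v0.1", split="train")
dataset = dataset.select_columns(["prompt", "chosen", "rejected"])

args = DPOConfig(
    output_dir="./dpo-out",
    learning_rate=2e-4,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=1,
    max_steps=20,
    warmup_steps=10,
    weight_decay=0.0,
    lr_scheduler_type="linear",
    gradient_checkpointing=True,
    max_length=1096,   # sequence_len in the Axolotl config
    logging_steps=1,
)

trainer = DPOTrainer(model=model, args=args, train_dataset=dataset, processing_class=tokenizer)
trainer.train()
```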
|
|
|
## Evaluation

### Results

- **ARC Easy** (lm-evaluation-harness):
  - Accuracy: 0.2622 (stderr 0.0090)
  - Normalized Accuracy: 0.2639 (stderr 0.0090)
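The scores above come from EleutherAI's [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness), linked in the card metadata. A minimal reproduction sketch is shown below; the repository id is an assumption taken from `hub_model_id` in the training config.

```python
# Sketch: re-running the ARC Easy evaluation with lm-evaluation-harness (pip install lm_eval).
# The repository id is an assumption taken from hub_model_id in the Axolotl config.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=mrcuddle/Tiny-Darkllama3.2-1B-Instruct",
    tasks=["arc_easy"],
    batch_size=8,
)
print(results["results"]["arc_easy"])  # reports acc / acc_norm with their standard errors
```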
|
|
|
## Usage

This model is intended for instruction-following tasks and can be used for general natural language processing applications. It was fine-tuned with Direct Preference Optimization (DPO) on the contextual-dpo dataset.
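A minimal inference sketch with `transformers` is shown below. The repository id is an assumption based on the `hub_model_id` in the training config; substitute this repo's actual name if it differs.

```python
# Minimal generation example (the repo id below is an assumption; substitute the actual one).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "mrcuddle/Tiny-Darkllama3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.float32)

prompt = "Explain Direct Preference Optimization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```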
|
|
|
## Limitations

- Performance varies by task: the ARC Easy accuracy above (0.2622) is close to the ~25% random-guess baseline for a four-option benchmark, so evaluate on your downstream task before relying on this model.
- Training was limited to 20 optimizer steps on a single preference dataset; fine-tuning on additional data is likely required for good performance on specific tasks.
|
|
|
## Citation

If you use this model in your research, please cite the original Llama model and the Axolotl training framework.

## License

This model is licensed under the terms of the [License Name](link-to-license).

## Contact

For more information, please contact [Your Contact Information].
|
|
|
[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)

<details>
<summary>See Axolotl Config</summary>
|
|
|
axolotl version: `0.6.0`

```yaml
base_model: mrcuddle/tiny-darkllama3.2-1B
bf16: false
dataset_prepared_path: last_run_prepared
rl: dpo
datasets:
- path: jondurbin/contextual-dpo-v0.1
  field_messages: prompt
  field_chosen: chosen
  field_rejected: rejected
  split: train
debug: null
deepspeed: null
early_stopping_patience: null
evals_per_epoch: null
flash_attention: false
fp16: false
fsdp: null
fsdp_config: null
gradient_accumulation_steps: 1
gradient_checkpointing: true
group_by_length: false
hub_model_id: mrcuddle/Tiny-Darkllama3.2-1B-Instruct
is_llama_derived_model: true
learning_rate: 0.0002
load_in_4bit: false
load_in_8bit: false
local_rank: null
logging_steps: 1
lr_scheduler: linear
max_steps: 20
micro_batch_size: 1
mlflow_experiment_name: colab-example
model_type: LlamaForCausalLM
num_epochs: 4
optimizer: adamw_torch
output_dir: ./llama2
pad_to_sequence_len: true
resume_from_checkpoint: null
sample_packing: true
saves_per_epoch: null
sequence_len: 1096
special_tokens: null
strict: false
tf32: false
tokenizer_type: LlamaTokenizer
train_on_inputs: false
wandb_entity: null
wandb_log_model: null
wandb_name: null
wandb_project: null
wandb_watch: null
warmup_steps: 10
weight_decay: 0.0
xformers_attention: null
```
|
</details>