Tags: Transformers, Safetensors, m2m_100, text2text-generation, Generated from Trainer

nllb-200-finetunning-5e-5-32batch-9310steps

This model is a fine-tuned version of facebook/nllb-200-3.3B. The training dataset is not specified. It achieves the following results on the evaluation set:

  • Loss: 0.9026
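
Since the card gives no usage instructions, here is a minimal, unofficial sketch of how a fine-tuned NLLB checkpoint like this one could be loaded for translation. The repository id is the one shown on the model page (Youseff1987/nllb-200-finetuning-20250305). The language codes eng_Latn and fra_Latn are placeholders only, because the card does not state which language pair the model was fine-tuned on. The sketch also assumes that the repository ships tokenizer files; if it does not, the tokenizer of the base model facebook/nllb-200-3.3B would be the natural fallback.

```python
MODEL_ID = "Youseff1987/nllb-200-finetuning-20250305"


def translate(text, src_lang="eng_Latn", tgt_lang="fra_Latn", max_new_tokens=128):
    """Translate `text` with the fine-tuned checkpoint.

    The default language codes are placeholders (assumption): the card does
    not say which language pair was used for fine-tuning.
    """
    # Imported lazily so that defining this function does not load anything heavy.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    # Assumption: the repository contains tokenizer files. Otherwise, load the
    # tokenizer from the base model, facebook/nllb-200-3.3B.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, src_lang=src_lang)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            # NLLB picks the target language through the first decoder token.
            forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
            max_new_tokens=max_new_tokens,
        )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]


# Example call (downloads several GB of weights, so it is not run here):
# print(translate("The weather is nice today."))
```

This sketch reloads the model on every call for brevity; for repeated use, load the tokenizer and model once and reuse them.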

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 3407
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 32
  • optimizer: adamw_8bit with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 10
  • num_epochs: 5
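
To make the list above easier to reuse, the sketch below restates it with the field names of transformers.TrainingArguments and works out two derived quantities: the effective batch size and the learning rate at a given step. It is not the author's training script. It assumes a single device (so the reported train_batch_size of 1 is the per-device value), a total of 9310 optimizer steps (read off the model name), and the usual linear-warmup plus half-cosine decay shape for the "cosine" scheduler.

```python
import math

# Values copied from the hyperparameter list. Key names follow
# transformers.TrainingArguments fields; a single device is assumed.
HPARAMS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 1,
    "seed": 3407,
    "gradient_accumulation_steps": 32,
    "optim": "adamw_8bit",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "warmup_steps": 10,
    "num_train_epochs": 5,
}

# Assumption: total optimizer steps, taken from the "9310steps" in the model name.
TOTAL_STEPS = 9310


def effective_batch_size(hp, num_devices=1):
    """Examples consumed per optimizer step."""
    return (
        hp["per_device_train_batch_size"]
        * hp["gradient_accumulation_steps"]
        * num_devices
    )


def cosine_lr(step, hp=HPARAMS, total_steps=TOTAL_STEPS):
    """Learning rate at an optimizer step: linear warmup, then half-cosine decay."""
    warmup = hp["warmup_steps"]
    if step < warmup:
        scale = step / max(1, warmup)
    else:
        progress = (step - warmup) / max(1, total_steps - warmup)
        scale = max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))
    return hp["learning_rate"] * scale
```

With these values, effective_batch_size(HPARAMS) gives 32, matching total_train_batch_size, and cosine_lr reaches the peak of 5e-05 at step 10 before decaying toward zero at step 9310.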

Training results

Training Loss | Epoch | Step | Validation Loss | Model Preparation Time (reported for the first row only)
1.3132 0.0268 50 1.0922 0.0202
1.3205 0.0537 100 1.0550
1.1442 0.0805 150 1.0364
0.9155 0.1074 200 1.0216
1.2449 0.1342 250 1.0114
0.7856 0.1611 300 1.0021
1.1645 0.1879 350 0.9942
1.168 0.2148 400 0.9872
1.0991 0.2416 450 0.9800
0.9243 0.2684 500 0.9751
1.3314 0.2953 550 0.9689
1.0203 0.3221 600 0.9650
0.9849 0.3490 650 0.9630
1.1588 0.3758 700 0.9581
0.8723 0.4027 750 0.9560
0.8312 0.4295 800 0.9528
0.7363 0.4564 850 0.9519
0.9292 0.4832 900 0.9494
0.9326 0.5100 950 0.9449
0.8994 0.5369 1000 0.9424
1.0149 0.5637 1050 0.9401
0.9467 0.5906 1100 0.9381
1.0278 0.6174 1150 0.9375
0.9975 0.6443 1200 0.9368
1.0143 0.6711 1250 0.9342
1.1696 0.6980 1300 0.9328
0.8223 0.7248 1350 0.9316
0.9384 0.7517 1400 0.9312
0.9466 0.7785 1450 0.9296
0.8791 0.8053 1500 0.9287
1.1393 0.8322 1550 0.9287
0.9288 0.8590 1600 0.9277
0.9249 0.8859 1650 0.9260
1.0053 0.9127 1700 0.9246
0.7675 0.9396 1750 0.9235
1.0448 0.9664 1800 0.9234
0.8382 0.9933 1850 0.9214
0.7418 1.0204 1900 0.9221
0.8152 1.0472 1950 0.9234
0.888 1.0741 2000 0.9225
0.762 1.1009 2050 0.9213
0.8621 1.1278 2100 0.9210
0.9048 1.1546 2150 0.9200
0.7952 1.1815 2200 0.9192
0.9558 1.2083 2250 0.9186
1.0422 1.2352 2300 0.9190
0.7353 1.2620 2350 0.9167
0.924 1.2888 2400 0.9169
0.8895 1.3157 2450 0.9156
0.9581 1.3425 2500 0.9149
0.7616 1.3694 2550 0.9148
0.77 1.3962 2600 0.9143
0.8474 1.4231 2650 0.9134
0.8242 1.4499 2700 0.9133
0.8491 1.4768 2750 0.9131
0.8286 1.5036 2800 0.9119
0.7373 1.5305 2850 0.9116
1.1709 1.5573 2900 0.9109
0.918 1.5841 2950 0.9100
0.8682 1.6110 3000 0.9104
0.7289 1.6378 3050 0.9098
0.9615 1.6647 3100 0.9098
0.9054 1.6915 3150 0.9101
0.9033 1.7184 3200 0.9094
0.8673 1.7452 3250 0.9095
1.0133 1.7721 3300 0.9078
0.8208 1.7989 3350 0.9075
0.8854 1.8257 3400 0.9072
0.81 1.8526 3450 0.9074
0.9013 1.8794 3500 0.9069
0.8539 1.9063 3550 0.9064
0.7346 1.9331 3600 0.9066
0.9698 1.9600 3650 0.9061
0.7256 1.9868 3700 0.9062
0.8813 2.0134 3750 0.9057
1.1117 2.0403 3800 0.9068
0.766 2.0671 3850 0.9062
0.8469 2.0940 3900 0.9066
0.9628 2.1208 3950 0.9063
0.9167 2.1476 4000 0.9062
0.8287 2.1745 4050 0.9058
0.866 2.2013 4100 0.9053
0.9124 2.2282 4150 0.9055
0.722 2.2550 4200 0.9057
0.956 2.2819 4250 0.9056
0.6837 2.3087 4300 0.9050
1.0191 2.3356 4350 0.9045
0.9707 2.3624 4400 0.9050
0.9852 2.3892 4450 0.9054
0.8172 2.4161 4500 0.9050
0.979 2.4429 4550 0.9050
0.9173 2.4698 4600 0.9042
0.8936 2.4966 4650 0.9043
0.6992 2.5235 4700 0.9045
0.79 2.5503 4750 0.9045
0.7661 2.5772 4800 0.9043
0.9067 2.6040 4850 0.9036
0.7251 2.6309 4900 0.9035
0.7873 2.6577 4950 0.9036
0.8441 2.6845 5000 0.9034
0.9242 2.7114 5050 0.9034
0.8931 2.7382 5100 0.9029
1.0847 2.7651 5150 0.9028
0.7797 2.7919 5200 0.9028
0.7537 2.8188 5250 0.9030
0.7131 2.8456 5300 0.9030
0.8321 2.8725 5350 0.9030
0.7554 2.8993 5400 0.9032
0.8003 2.9261 5450 0.9032
0.862 2.9530 5500 0.9034
0.9439 2.9798 5550 0.9031
0.7934 3.0064 5600 0.9030
0.7656 3.0333 5650 0.9030
1.0536 3.0601 5700 0.9033
0.7046 3.0870 5750 0.9032
0.7297 3.1138 5800 0.9028
0.7948 3.1407 5850 0.9028
0.7877 3.1675 5900 0.9030
0.8918 3.1944 5950 0.9028
0.8123 3.2212 6000 0.9030
0.7079 3.2480 6050 0.9029
0.9428 3.2749 6100 0.9030
0.7774 3.3017 6150 0.9030
0.8418 3.3286 6200 0.9033
1.0364 3.3554 6250 0.9032
0.7611 3.3823 6300 0.9031
0.8938 3.4091 6350 0.9030
0.9085 3.4360 6400 0.9030
0.8015 3.4628 6450 0.9030
0.7286 3.4896 6500 0.9030
0.7203 3.5165 6550 0.9030
0.8212 3.5433 6600 0.9030
0.7335 3.5702 6650 0.9028
0.7196 3.5970 6700 0.9029
0.6572 3.6239 6750 0.9030
0.8649 3.6507 6800 0.9029
0.805 3.6776 6850 0.9029
0.8108 3.7044 6900 0.9027
0.8756 3.7313 6950 0.9028
0.895 3.7581 7000 0.9026
0.8497 3.7849 7050 0.9028
0.9445 3.8118 7100 0.9026
0.7153 3.8386 7150 0.9026
0.7897 3.8655 7200 0.9026
0.858 3.8923 7250 0.9027
0.9963 3.9192 7300 0.9028
0.7619 3.9460 7350 0.9027
0.8844 3.9729 7400 0.9028
0.8264 3.9997 7450 0.9028
0.9657 4.0263 7500 0.9026
0.7688 4.0532 7550 0.9028
0.9613 4.0800 7600 0.9027
0.7074 4.1068 7650 0.9025
0.7589 4.1337 7700 0.9028
0.8279 4.1605 7750 0.9028
0.7417 4.1874 7800 0.9027
0.8121 4.2142 7850 0.9026
0.877 4.2411 7900 0.9026
0.7371 4.2679 7950 0.9027
0.8387 4.2948 8000 0.9027
0.8789 4.3216 8050 0.9028
1.0297 4.3484 8100 0.9027
0.7222 4.3753 8150 0.9028
0.8673 4.4021 8200 0.9027
0.7866 4.4290 8250 0.9027
0.7187 4.4558 8300 0.9027
0.8237 4.4827 8350 0.9027
0.8223 4.5095 8400 0.9027
0.8093 4.5364 8450 0.9027
0.815 4.5632 8500 0.9026
0.7278 4.5900 8550 0.9028
0.7515 4.6169 8600 0.9027
0.9041 4.6437 8650 0.9026
0.7683 4.6706 8700 0.9026
0.8538 4.6974 8750 0.9027
0.837 4.7243 8800 0.9027
0.7077 4.7511 8850 0.9027
0.8734 4.7780 8900 0.9027
0.8391 4.8048 8950 0.9027
0.7243 4.8316 9000 0.9028
0.6905 4.8585 9050 0.9026
0.8787 4.8853 9100 0.9026
0.9105 4.9122 9150 0.9026
0.9295 4.9390 9200 0.9025
1.0437 4.9659 9250 0.9026
0.9296 4.9927 9300 0.9026
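
A few quantities can be read off the table with simple arithmetic. The sketch below uses a handful of rows copied from it to estimate the steps per epoch, the approximate number of training examples, and the per-epoch drop in validation loss. The perplexity helper is only meaningful under the assumption that the reported loss is the mean per-token cross-entropy in nats; the example count further assumes that every optimizer step consumes a full batch of 32.

```python
import math

# (epoch, step, validation_loss) rows copied from the table, roughly one per epoch.
ROWS = [
    (0.0268, 50, 1.0922),
    (0.9933, 1850, 0.9214),
    (1.9868, 3700, 0.9062),
    (2.9798, 5550, 0.9031),
    (3.9997, 7450, 0.9028),
    (4.9927, 9300, 0.9026),
]

TOTAL_BATCH_SIZE = 32  # total_train_batch_size from the hyperparameter list


def perplexity(loss):
    """exp(loss); assumes the loss is mean per-token cross-entropy in nats."""
    return math.exp(loss)


def steps_per_epoch(rows=ROWS):
    """Estimate optimizer steps per epoch from the last logged row."""
    epoch, step, _ = rows[-1]
    return step / epoch


def approx_train_examples(rows=ROWS, batch_size=TOTAL_BATCH_SIZE):
    """Rough training-set size, assuming every step uses a full batch."""
    return steps_per_epoch(rows) * batch_size


def validation_gains(rows=ROWS):
    """Drop in validation loss between consecutive sampled rows."""
    losses = [loss for _, _, loss in rows]
    return [earlier - later for earlier, later in zip(losses, losses[1:])]
```

Under these assumptions, the final loss of 0.9026 corresponds to a perplexity of roughly 2.47, an epoch is about 1,863 steps (on the order of 59,600 training examples), and almost all of the validation improvement happens in the first epoch: the loss falls by about 0.17 there and by less than 0.001 over each of the last two epochs.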

Framework versions

  • Transformers 4.49.0
  • PyTorch 2.1.0+cu118
  • Datasets 3.3.2
  • Tokenizers 0.21.0
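
When trying to reproduce the results, it can help to compare a local environment against the versions listed above. The following is a small sketch using only the standard library. The dictionary keys are the PyPI distribution names, and the helper name version_mismatches is made up for this example.

```python
from importlib import metadata

# Versions listed in the card; keys are PyPI distribution names.
EXPECTED = {
    "transformers": "4.49.0",
    "torch": "2.1.0+cu118",
    "datasets": "3.3.2",
    "tokenizers": "0.21.0",
}


def version_mismatches(expected=EXPECTED):
    """Return {package: (expected, installed_or_None)} for each mismatch."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches
```

An empty result means the environment matches the card exactly; a differing version does not necessarily prevent the model from loading.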

Model size: 3B params (Safetensors, tensor type BF16)

Model repository: Youseff1987/nllb-200-finetuning-20250305