
flan-t5laa2-large

This model is a fine-tuned version of hrezaei/flan-t5laa2-large on the sample-350BT subset of the HuggingFaceFW/fineweb dataset. It achieves the following results on the evaluation set:

  • Perplexity: 1.1522
  • Loss: 0.1417
  • Accuracy: 0.0025
  • Lookahead Perplexity: 523.9169
  • Lookahead Loss: 6.2613
  • Base Perplexity: 1.1386
  • Base Loss: 0.1298
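As a sanity check, the reported loss/perplexity pairs are consistent with the usual convention that perplexity is the exponential of the cross-entropy loss:

```python
import math

# Perplexity is conventionally exp(cross-entropy loss); the figures
# reported above agree under that convention.
print(math.exp(0.1417))  # evaluation loss -> ~1.1522 (reported Perplexity)
print(math.exp(6.2613))  # lookahead loss  -> ~523.92 (reported Lookahead Perplexity)
print(math.exp(0.1298))  # base loss       -> ~1.1386 (reported Base Perplexity)
```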

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • training_steps: 524288
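The hyperparameters above map directly onto `transformers.TrainingArguments` keyword arguments. A sketch of that mapping (shown as a plain dict so the snippet stays dependency-free; the actual launch script is not part of this card, only the values are):

```python
# Hyperparameters from the card, expressed as the kwargs one would pass
# to transformers.TrainingArguments. This is an illustrative sketch, not
# the authors' training script.
training_kwargs = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "max_steps": 524288,
}

num_devices = 2  # distributed_type: multi-GPU

# The "total" batch sizes in the card are per-device size times device count.
total_train_batch_size = training_kwargs["per_device_train_batch_size"] * num_devices
total_eval_batch_size = training_kwargs["per_device_eval_batch_size"] * num_devices
print(total_train_batch_size, total_eval_batch_size)  # -> 32 32
```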

Training results

| Training Loss | Epoch | Step | Accuracy | Base Loss | Base Perplexity | Lookahead Loss | Lookahead Perplexity | Validation Loss | Perplexity |
|---|---|---|---|---|---|---|---|---|---|
| 0.3249 | 0.0095 | 5000 | 0.0025 | 0.1298 | 1.1386 | 9.1628 | 9536.0974 | 0.1473 | 1.1587 |
| 0.3149 | 0.0191 | 10000 | 0.0025 | 0.1298 | 1.1386 | 8.2987 | 4018.5587 | 0.1457 | 1.1568 |
| 0.3455 | 0.0286 | 15000 | 0.0025 | 0.1298 | 1.1386 | 7.8543 | 2576.7294 | 0.1448 | 1.1558 |
| 0.3164 | 0.0381 | 20000 | 0.0025 | 0.1298 | 1.1386 | 7.6043 | 2006.8105 | 0.1443 | 1.1552 |
| 0.3412 | 0.0477 | 25000 | 0.0025 | 0.1298 | 1.1386 | 7.4405 | 1703.6775 | 0.1440 | 1.1549 |
| 0.3334 | 0.0572 | 30000 | 0.0025 | 0.1298 | 1.1386 | 7.3224 | 1513.8265 | 0.1438 | 1.1546 |
| 0.3182 | 0.0668 | 35000 | 0.0025 | 0.1298 | 1.1386 | 7.2325 | 1383.6223 | 0.1436 | 1.1544 |
| 0.3193 | 0.0763 | 40000 | 0.0025 | 0.1298 | 1.1386 | 7.1599 | 1286.7995 | 0.1434 | 1.1542 |
| 0.3112 | 0.0858 | 45000 | 0.0025 | 0.1298 | 1.1386 | 7.1003 | 1212.3711 | 0.1433 | 1.1541 |
| 0.3084 | 0.0954 | 50000 | 0.0025 | 0.1298 | 1.1386 | 7.0527 | 1155.9677 | 0.1432 | 1.1540 |
| 0.3281 | 0.1049 | 55000 | 0.0025 | 0.1298 | 1.1386 | 7.0097 | 1107.3615 | 0.1431 | 1.1539 |
| 0.3096 | 0.1144 | 60000 | 0.0025 | 0.1298 | 1.1386 | 6.9716 | 1065.9302 | 0.1431 | 1.1538 |
| 0.3168 | 1.0048 | 65000 | 0.0025 | 0.1298 | 1.1386 | 6.9373 | 1029.9387 | 0.1430 | 1.1537 |
| 0.3158 | 1.0143 | 70000 | 0.0025 | 0.1298 | 1.1386 | 6.9058 | 998.0076 | 0.1429 | 1.1536 |
| 0.3149 | 1.0238 | 75000 | 0.0025 | 0.1298 | 1.1386 | 6.8755 | 968.2614 | 0.1429 | 1.1536 |
| 0.3185 | 1.0334 | 80000 | 0.0025 | 0.1298 | 1.1386 | 6.8495 | 943.4458 | 0.1428 | 1.1535 |
| 0.3247 | 1.0429 | 85000 | 0.0025 | 0.1298 | 1.1386 | 6.8251 | 920.7070 | 0.1428 | 1.1535 |
| 0.3166 | 1.0525 | 90000 | 0.0025 | 0.1298 | 1.1386 | 6.8033 | 900.7799 | 0.1427 | 1.1534 |
| 0.3171 | 1.0620 | 95000 | 0.0025 | 0.1298 | 1.1386 | 6.7802 | 880.2640 | 0.1427 | 1.1534 |
| 0.3247 | 1.0715 | 100000 | 0.0025 | 0.1298 | 1.1386 | 6.7589 | 861.6873 | 0.1426 | 1.1533 |
| 0.3199 | 1.0095 | 105000 | 0.0025 | 0.1298 | 1.1386 | 6.7393 | 844.9302 | 0.1426 | 1.1533 |
| 0.3116 | 1.0191 | 110000 | 0.0025 | 0.1298 | 1.1386 | 6.7202 | 828.9964 | 0.1426 | 1.1532 |
| 0.3431 | 1.0286 | 115000 | 0.0025 | 0.1298 | 1.1386 | 6.7028 | 814.7008 | 0.1425 | 1.1532 |
| 0.3145 | 1.0381 | 120000 | 0.0025 | 0.1298 | 1.1386 | 6.6864 | 801.3963 | 0.1425 | 1.1531 |
| 0.3396 | 1.0477 | 125000 | 0.0025 | 0.1298 | 1.1386 | 6.6714 | 789.4998 | 0.1425 | 1.1531 |
| 0.332 | 1.0572 | 130000 | 0.0025 | 0.1298 | 1.1386 | 6.6562 | 777.5850 | 0.1424 | 1.1531 |
| 0.3169 | 1.0668 | 135000 | 0.0025 | 0.1298 | 1.1386 | 6.6411 | 765.9321 | 0.1424 | 1.1530 |
| 0.3183 | 1.0763 | 140000 | 0.0025 | 0.1298 | 1.1386 | 6.6259 | 754.4168 | 0.1424 | 1.1530 |
| 0.3102 | 1.0858 | 145000 | 0.0025 | 0.1298 | 1.1386 | 6.6117 | 743.7487 | 0.1423 | 1.1530 |
| 0.3075 | 1.0954 | 150000 | 0.0025 | 0.1298 | 1.1386 | 6.6002 | 735.2481 | 0.1423 | 1.1530 |
| 0.3272 | 1.1049 | 155000 | 0.0025 | 0.1298 | 1.1386 | 6.5881 | 726.3988 | 0.1423 | 1.1529 |
| 0.3088 | 1.1144 | 160000 | 0.0025 | 0.1298 | 1.1386 | 6.5765 | 717.9999 | 0.1423 | 1.1529 |
| 0.316 | 2.0048 | 165000 | 0.0025 | 0.1298 | 1.1386 | 6.5648 | 709.6853 | 0.1423 | 1.1529 |
| 0.315 | 2.0143 | 170000 | 0.0025 | 0.1298 | 1.1386 | 6.5536 | 701.7924 | 0.1422 | 1.1528 |
| 0.3142 | 2.0238 | 175000 | 0.0025 | 0.1298 | 1.1386 | 6.5417 | 693.4763 | 0.1422 | 1.1528 |
| 0.3178 | 2.0334 | 180000 | 0.0025 | 0.1298 | 1.1386 | 6.5319 | 686.6713 | 0.1422 | 1.1528 |
| 0.324 | 2.0429 | 185000 | 0.0025 | 0.1298 | 1.1386 | 6.5221 | 680.0125 | 0.1422 | 1.1528 |
| 0.316 | 2.0525 | 190000 | 0.0025 | 0.1298 | 1.1386 | 6.5135 | 674.1869 | 0.1422 | 1.1528 |
| 0.3165 | 2.0620 | 195000 | 0.0025 | 0.1298 | 1.1386 | 6.5032 | 667.2772 | 0.1421 | 1.1527 |
| 0.3241 | 2.0715 | 200000 | 0.0025 | 0.1298 | 1.1386 | 6.4936 | 660.9243 | 0.1421 | 1.1527 |
| 0.3166 | 1.0095 | 205000 | 0.0025 | 0.1298 | 1.1386 | 6.4848 | 655.1015 | 0.1421 | 1.1527 |
| 0.3104 | 1.0191 | 210000 | 0.0025 | 0.1298 | 1.1386 | 6.4757 | 649.1789 | 0.1421 | 1.1527 |
| 0.3416 | 1.0286 | 215000 | 0.0025 | 0.1298 | 1.1386 | 6.4674 | 643.8252 | 0.1421 | 1.1526 |
| 0.3152 | 1.0381 | 220000 | 0.0025 | 0.1298 | 1.1386 | 6.4597 | 638.8783 | 0.1420 | 1.1526 |
| 0.3391 | 1.0477 | 225000 | 0.0025 | 0.1298 | 1.1386 | 6.4528 | 634.4516 | 0.1420 | 1.1526 |
| 0.3319 | 1.0572 | 230000 | 0.0025 | 0.1298 | 1.1386 | 6.4455 | 629.8329 | 0.1420 | 1.1526 |
| 0.3171 | 1.0668 | 235000 | 0.0025 | 0.1298 | 1.1386 | 6.4376 | 624.9101 | 0.1420 | 1.1526 |
| 0.3176 | 1.0763 | 240000 | 0.0025 | 0.1298 | 1.1386 | 6.4300 | 620.1895 | 0.1420 | 1.1526 |
| 0.3101 | 1.0858 | 245000 | 0.0025 | 0.1298 | 1.1386 | 6.4223 | 615.4315 | 0.1420 | 1.1526 |
| 0.3081 | 1.0954 | 250000 | 0.0025 | 0.1298 | 1.1386 | 6.4166 | 611.9308 | 0.1420 | 1.1525 |
| 0.3277 | 1.1049 | 255000 | 0.0025 | 0.1298 | 1.1386 | 6.4105 | 608.1798 | 0.1420 | 1.1525 |
| 0.3083 | 1.1144 | 260000 | 0.0025 | 0.1298 | 1.1386 | 6.4047 | 604.6944 | 0.1419 | 1.1525 |
| 0.3162 | 2.0048 | 265000 | 0.0025 | 0.1298 | 1.1386 | 6.3984 | 600.9119 | 0.1419 | 1.1525 |
| 0.3118 | 2.0143 | 270000 | 0.0025 | 0.1298 | 1.1386 | 6.3924 | 597.3080 | 0.1419 | 1.1525 |
| 0.314 | 2.0238 | 275000 | 0.0025 | 0.1298 | 1.1386 | 6.3858 | 593.3590 | 0.1419 | 1.1525 |
| 0.3149 | 2.0334 | 280000 | 0.0025 | 0.1298 | 1.1386 | 6.3805 | 590.2321 | 0.1419 | 1.1525 |
| 0.3232 | 2.0429 | 285000 | 0.0025 | 0.1298 | 1.1386 | 6.3752 | 587.1187 | 0.1419 | 1.1524 |
| 0.3179 | 2.0525 | 290000 | 0.0025 | 0.1298 | 1.1386 | 6.3707 | 584.4872 | 0.1419 | 1.1524 |
| 0.3149 | 2.0620 | 295000 | 0.0025 | 0.1298 | 1.1386 | 6.3652 | 581.2809 | 0.1419 | 1.1524 |
| 0.3259 | 2.0715 | 300000 | 0.0025 | 0.1298 | 1.1386 | 6.3602 | 578.3447 | 0.1419 | 1.1524 |
| 0.3389 | 2.0811 | 305000 | 0.0025 | 0.1298 | 1.1386 | 6.3549 | 575.3080 | 0.1418 | 1.1524 |
| 0.3119 | 2.0906 | 310000 | 0.0025 | 0.1298 | 1.1386 | 6.3502 | 572.6125 | 0.1418 | 1.1524 |
| 0.3149 | 2.1001 | 315000 | 0.0025 | 0.1298 | 1.1386 | 6.3465 | 570.4726 | 0.1418 | 1.1524 |
| 0.3289 | 2.1097 | 320000 | 0.0025 | 0.1298 | 1.1386 | 6.3422 | 568.0665 | 0.1418 | 1.1524 |
| 0.3111 | 2.1192 | 325000 | 0.0025 | 0.1298 | 1.1386 | 6.3381 | 565.7143 | 0.1418 | 1.1524 |
| 0.3179 | 3.0095 | 330000 | 0.0025 | 0.1298 | 1.1386 | 6.3341 | 563.4519 | 0.1418 | 1.1524 |
| 0.3094 | 3.0191 | 335000 | 0.0025 | 0.1298 | 1.1386 | 6.3298 | 561.0462 | 0.1418 | 1.1523 |
| 0.3396 | 3.0286 | 340000 | 0.0025 | 0.1298 | 1.1386 | 6.3259 | 558.8331 | 0.1418 | 1.1523 |
| 0.314 | 3.0381 | 345000 | 0.0025 | 0.1298 | 1.1386 | 6.3223 | 556.8332 | 0.1418 | 1.1523 |
| 0.3392 | 3.0477 | 350000 | 0.0025 | 0.1298 | 1.1386 | 6.3190 | 555.0390 | 0.1418 | 1.1523 |
| 0.3319 | 3.0572 | 355000 | 0.0025 | 0.1298 | 1.1386 | 6.3157 | 553.2072 | 0.1418 | 1.1523 |
| 0.3169 | 3.0668 | 360000 | 0.0025 | 0.1298 | 1.1386 | 6.3122 | 551.2481 | 0.1418 | 1.1523 |
| 0.3166 | 3.0763 | 365000 | 0.0025 | 0.1298 | 1.1386 | 6.3088 | 549.3597 | 0.1418 | 1.1523 |
| 0.3087 | 3.0858 | 370000 | 0.0025 | 0.1298 | 1.1386 | 6.3053 | 547.4488 | 0.1417 | 1.1523 |
| 0.3074 | 3.0954 | 375000 | 0.0025 | 0.1298 | 1.1386 | 6.3028 | 546.1247 | 0.1417 | 1.1523 |
| 0.3288 | 3.1049 | 380000 | 0.0025 | 0.1298 | 1.1386 | 6.3002 | 544.7043 | 0.1417 | 1.1523 |
| 0.3082 | 3.1144 | 385000 | 0.0025 | 0.1298 | 1.1386 | 6.2977 | 543.3141 | 0.1417 | 1.1523 |
| 0.3171 | 4.0048 | 390000 | 0.0025 | 0.1298 | 1.1386 | 6.2950 | 541.8326 | 0.1417 | 1.1523 |
| 0.3132 | 4.0143 | 395000 | 0.0025 | 0.1298 | 1.1386 | 6.2925 | 540.4888 | 0.1417 | 1.1523 |
| 0.3153 | 4.0238 | 400000 | 0.0025 | 0.1298 | 1.1386 | 6.2896 | 538.9527 | 0.1417 | 1.1523 |
| 0.317 | 4.0334 | 405000 | 0.0025 | 0.1298 | 1.1386 | 6.2874 | 537.7680 | 0.1417 | 1.1522 |
| 0.3161 | 1.0095 | 410000 | 0.0025 | 0.1298 | 1.1386 | 6.2853 | 536.6438 | 0.1417 | 1.1522 |
| 0.31 | 1.0191 | 415000 | 0.0025 | 0.1298 | 1.1386 | 6.2831 | 535.4539 | 0.1417 | 1.1522 |
| 0.3412 | 1.0286 | 420000 | 0.0025 | 0.1298 | 1.1386 | 6.2811 | 534.3656 | 0.1417 | 1.1522 |
| 0.3148 | 1.0381 | 425000 | 0.0025 | 0.1298 | 1.1386 | 6.2793 | 533.3932 | 0.1417 | 1.1522 |
| 0.3387 | 1.0477 | 430000 | 0.0025 | 0.1298 | 1.1386 | 6.2776 | 532.5283 | 0.1417 | 1.1522 |
| 0.3316 | 1.0572 | 435000 | 0.0025 | 0.1298 | 1.1386 | 6.2760 | 531.6586 | 0.1417 | 1.1522 |
| 0.3168 | 1.0668 | 440000 | 0.0025 | 0.1298 | 1.1386 | 6.2742 | 530.7140 | 0.1417 | 1.1522 |
| 0.3173 | 1.0763 | 445000 | 0.0025 | 0.1298 | 1.1386 | 6.2726 | 529.8676 | 0.1417 | 1.1522 |
| 0.3098 | 1.0858 | 450000 | 0.0025 | 0.1298 | 1.1386 | 6.2710 | 529.0170 | 0.1417 | 1.1522 |
| 0.3078 | 1.0954 | 455000 | 0.0025 | 0.1298 | 1.1386 | 6.2700 | 528.4528 | 0.1417 | 1.1522 |
| 0.3274 | 1.1049 | 460000 | 0.0025 | 0.1298 | 1.1386 | 6.2688 | 527.8631 | 0.1417 | 1.1522 |
| 0.308 | 1.1144 | 465000 | 0.0025 | 0.1298 | 1.1386 | 6.2678 | 527.3422 | 0.1417 | 1.1522 |
| 0.3159 | 2.0048 | 470000 | 0.0025 | 0.1298 | 1.1386 | 6.2668 | 526.7932 | 0.1417 | 1.1522 |
| 0.3115 | 2.0143 | 475000 | 0.0025 | 0.1298 | 1.1386 | 6.2659 | 526.2932 | 0.1417 | 1.1522 |
| 0.3137 | 2.0238 | 480000 | 0.0025 | 0.1298 | 1.1386 | 6.2649 | 525.7652 | 0.1417 | 1.1522 |
| 0.3146 | 2.0334 | 485000 | 0.0025 | 0.1298 | 1.1386 | 6.2641 | 525.3872 | 0.1417 | 1.1522 |
| 0.323 | 2.0429 | 490000 | 0.0025 | 0.1298 | 1.1386 | 6.2635 | 525.0387 | 0.1417 | 1.1522 |
| 0.3177 | 2.0525 | 495000 | 0.0025 | 0.1298 | 1.1386 | 6.2630 | 524.7778 | 0.1417 | 1.1522 |
| 0.3147 | 2.0620 | 500000 | 0.0025 | 0.1298 | 1.1386 | 6.2624 | 524.4927 | 0.1417 | 1.1522 |
| 0.3257 | 2.0715 | 505000 | 0.0025 | 0.1298 | 1.1386 | 6.2620 | 524.2781 | 0.1417 | 1.1522 |
| 0.3387 | 2.0811 | 510000 | 0.0025 | 0.1298 | 1.1386 | 6.2617 | 524.1006 | 0.1417 | 1.1522 |
| 0.3117 | 2.0906 | 515000 | 0.0025 | 0.1298 | 1.1386 | 6.2615 | 523.9845 | 0.1417 | 1.1522 |
| 0.3147 | 2.1001 | 520000 | 0.0025 | 0.1298 | 1.1386 | 6.2614 | 523.9337 | 0.1417 | 1.1522 |

Framework versions

  • Transformers 4.57.0.dev0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
Model details

  • Format: Safetensors
  • Model size: 0.8B params
  • Tensor type: F32

Model tree for hrezaei/flan-t5laa2-large

No model tree is available: the card's base-model reference points back to this model itself.

Dataset used to train hrezaei/flan-t5laa2-large

  • HuggingFaceFW/fineweb (sample-350BT)
