# 9820033747fd2bc74e50b940454ea100
This model is a fine-tuned version of google/long-t5-local-large on the Helsinki-NLP/opus_books [it-nl] dataset. It achieves the following results on the evaluation set:
- Loss: 2.8058
- Data Size: 1.0
- Epoch Runtime: 32.4416 s
- Bleu: 0.1985
## Model description
More information needed
## Intended uses & limitations
More information needed
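Although the card does not document a usage recipe, the model can in principle be loaded like any other seq2seq checkpoint on the Hub. The sketch below is an assumption, not a documented interface: the task-prefix convention (`"translate Italian to Dutch: "`) is a common T5-style pattern and may not match how this model was actually fine-tuned, and generation settings are illustrative defaults.

```python
CHECKPOINT = "contemmcm/9820033747fd2bc74e50b940454ea100"


def build_input(text: str) -> str:
    # Assumed T5-style task prefix for the it->nl pair; the card does not
    # confirm whether a prefix was used during fine-tuning.
    return "translate Italian to Dutch: " + text


def translate(text: str, max_new_tokens: int = 64) -> str:
    # Imported lazily so the prefix helper above is usable without
    # transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
    model = AutoModelForSeq2SeqLM.from_pretrained(CHECKPOINT)
    inputs = tokenizer(build_input(text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(translate("Il gatto dorme sul divano."))
```

Given the final BLEU of roughly 0.20, translations should be treated as rough drafts rather than production-quality output.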
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
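The hyperparameters above can be expressed as a `Seq2SeqTrainingArguments` configuration. This is a reconstruction, not the original training script: the `output_dir` is a placeholder, and the 4-GPU distribution (which yields the effective batch size of 8 × 4 = 32) is handled by the launcher rather than by these arguments.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the documented hyperparameters; output_dir is hypothetical.
# The total batch size of 32 arises from per_device_train_batch_size=8
# replicated across num_devices=4 under multi-GPU training.
training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-large-opus-books-it-nl",
    learning_rate=5e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=50,
)
```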
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 220.5446 | 0 | 2.6730 | 0.0036 |
| No log | 1 | 58 | 204.5957 | 0.0078 | 4.3093 | 0.0036 |
| No log | 2 | 116 | 191.1346 | 0.0156 | 3.9760 | 0.0036 |
| No log | 3 | 174 | 165.1904 | 0.0312 | 5.5890 | 0.0029 |
| No log | 4 | 232 | 132.2721 | 0.0625 | 7.4449 | 0.0026 |
| No log | 5 | 290 | 88.7089 | 0.125 | 10.7634 | 0.0013 |
| 11.694 | 6 | 348 | 40.9796 | 0.25 | 14.3784 | 0.0011 |
| 8.7816 | 7 | 406 | 17.6196 | 0.5 | 19.4515 | 0.0038 |
| 20.3087 | 8.0 | 464 | 10.8403 | 1.0 | 33.8301 | 0.0051 |
| 17.5162 | 9.0 | 522 | 9.2789 | 1.0 | 30.4905 | 0.0295 |
| 14.7784 | 10.0 | 580 | 8.4976 | 1.0 | 31.4386 | 0.0104 |
| 13.0537 | 11.0 | 638 | 7.9470 | 1.0 | 30.9553 | 0.0278 |
| 11.9767 | 12.0 | 696 | 7.6253 | 1.0 | 31.7243 | 0.0265 |
| 10.465 | 13.0 | 754 | 6.7038 | 1.0 | 30.7987 | 0.0490 |
| 9.8629 | 14.0 | 812 | 6.1852 | 1.0 | 31.0795 | 0.0481 |
| 9.1462 | 15.0 | 870 | 5.7621 | 1.0 | 31.1096 | 0.0682 |
| 8.6466 | 16.0 | 928 | 5.5545 | 1.0 | 30.6564 | 0.0612 |
| 8.209 | 17.0 | 986 | 5.2784 | 1.0 | 30.7827 | 0.0320 |
| 7.8017 | 18.0 | 1044 | 5.1448 | 1.0 | 31.3435 | 0.0801 |
| 7.2001 | 19.0 | 1102 | 4.7046 | 1.0 | 30.7288 | 0.0777 |
| 6.8989 | 20.0 | 1160 | 4.5510 | 1.0 | 30.7220 | 0.0732 |
| 6.6344 | 21.0 | 1218 | 4.4494 | 1.0 | 30.8027 | 0.1000 |
| 6.42 | 22.0 | 1276 | 4.3321 | 1.0 | 30.9185 | 0.0775 |
| 6.1573 | 23.0 | 1334 | 4.1810 | 1.0 | 30.7196 | 0.0709 |
| 6.0025 | 24.0 | 1392 | 4.1812 | 1.0 | 30.7153 | 0.0762 |
| 5.637 | 25.0 | 1450 | 3.9501 | 1.0 | 32.2425 | 0.0785 |
| 5.5103 | 26.0 | 1508 | 3.7917 | 1.0 | 31.7518 | 0.1297 |
| 5.3807 | 27.0 | 1566 | 3.7109 | 1.0 | 32.4725 | 0.0926 |
| 5.2493 | 28.0 | 1624 | 3.6484 | 1.0 | 32.1762 | 0.0779 |
| 5.0971 | 29.0 | 1682 | 3.7100 | 1.0 | 32.0763 | 0.0838 |
| 4.9723 | 30.0 | 1740 | 3.3917 | 1.0 | 32.5480 | 0.0943 |
| 4.8582 | 31.0 | 1798 | 3.4487 | 1.0 | 31.8977 | 0.0952 |
| 4.6482 | 32.0 | 1856 | 3.3220 | 1.0 | 32.1636 | 0.1071 |
| 4.5502 | 33.0 | 1914 | 3.3602 | 1.0 | 31.9986 | 0.0914 |
| 4.4548 | 34.0 | 1972 | 3.2248 | 1.0 | 31.7632 | 0.1156 |
| 4.3663 | 35.0 | 2030 | 3.2001 | 1.0 | 31.7785 | 0.1274 |
| 4.2816 | 36.0 | 2088 | 3.1693 | 1.0 | 32.2818 | 0.0931 |
| 4.2056 | 37.0 | 2146 | 3.0939 | 1.0 | 32.5222 | 0.0870 |
| 4.0655 | 38.0 | 2204 | 3.2121 | 1.0 | 32.3663 | 0.0873 |
| 3.9848 | 39.0 | 2262 | 2.9937 | 1.0 | 32.6995 | 0.1050 |
| 3.9345 | 40.0 | 2320 | 2.9854 | 1.0 | 32.4806 | 0.1002 |
| 3.8787 | 41.0 | 2378 | 2.9914 | 1.0 | 32.3101 | 0.1151 |
| 3.7917 | 42.0 | 2436 | 2.9243 | 1.0 | 32.5237 | 0.1507 |
| 3.7642 | 43.0 | 2494 | 2.8550 | 1.0 | 32.1596 | 0.1668 |
| 3.66 | 44.0 | 2552 | 2.9300 | 1.0 | 32.1260 | 0.1921 |
| 3.6051 | 45.0 | 2610 | 2.8908 | 1.0 | 31.8458 | 0.1565 |
| 3.5778 | 46.0 | 2668 | 2.8238 | 1.0 | 32.6554 | 0.1940 |
| 3.534 | 47.0 | 2726 | 2.8003 | 1.0 | 31.9395 | 0.1799 |
| 3.4921 | 48.0 | 2784 | 2.8139 | 1.0 | 32.5989 | 0.1728 |
| 3.4553 | 49.0 | 2842 | 2.8279 | 1.0 | 33.0087 | 0.2065 |
| 3.3782 | 50.0 | 2900 | 2.8058 | 1.0 | 32.4416 | 0.1985 |
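The Bleu column above is on a 0–1 scale. As a reference for how such a score is computed, here is a minimal sentence-level BLEU in pure Python (uniform 4-gram weights with a brevity penalty). It is a didactic sketch, not the metric implementation used to produce the table, which typically applies corpus-level aggregation and smoothing.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    # All contiguous n-grams of a token list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def sentence_bleu(reference: str, hypothesis: str, max_n: int = 4) -> float:
    """Unsmoothed BLEU on a 0-1 scale: geometric mean of clipped n-gram
    precisions times a brevity penalty."""
    ref, hyp = reference.split(), hypothesis.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each hypothesis n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        precisions.append(overlap / max(sum(hyp_counts.values()), 1))
    if min(precisions) == 0:
        return 0.0  # any empty precision zeroes the geometric mean
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Penalize hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * geo_mean
```

On this scale, the final score of 0.1985 indicates modest n-gram overlap with the references, consistent with a from-scratch fine-tune on a single literary-domain corpus.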
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1