45708a0d059ed93530d491c789594362

This model is a fine-tuned version of google-t5/t5-large on the Helsinki-NLP/opus_books [fi-no] dataset. It achieves the following results on the evaluation set:

Loss: 2.1243
Data Size: 1.0
Epoch Runtime: 42.2031
Bleu: 1.2243
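
The card gives no usage snippet, so the following is a minimal sketch of how the checkpoint might be loaded for Finnish-to-Norwegian translation with the `transformers` Auto classes. The repository id is the one shown on this page. The task prefix is an assumption in the usual T5 style and is not documented here, and `build_prompt` and `translate` are hypothetical helper names.

```python
# Sketch only: the task prefix is an ASSUMPTION, not taken from the model card.
MODEL_ID = "contemmcm/45708a0d059ed93530d491c789594362"
ASSUMED_PREFIX = "translate Finnish to Norwegian: "  # hypothetical; verify before use


def build_prompt(text: str, prefix: str = ASSUMED_PREFIX) -> str:
    """Prepend the (assumed) T5 task prefix to the stripped source sentence."""
    return prefix + text.strip()


def translate(text: str, model_id: str = MODEL_ID, max_new_tokens: int = 64) -> str:
    """Load the checkpoint and translate a single sentence."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `translate("Hyvää huomenta.")` downloads the full weights on first use. Given the reported BLEU of 1.2243, the output may be of limited quality and is worth checking by hand.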

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: constant
num_epochs: 50
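
As a rough illustration, the values above can be written as keyword arguments using the parameter names of `Seq2SeqTrainingArguments` in `transformers`. The card does not say which arguments class was actually used, so that mapping is an assumption. The helper also checks how the listed total batch size follows from the per-device size and device count. Gradient accumulation is not listed, so a factor of 1 is assumed.

```python
# Values copied from the hyperparameter list above; the key names follow
# transformers' Seq2SeqTrainingArguments (an assumed mapping).
TRAINING_KWARGS = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}
NUM_DEVICES = 4  # num_devices from the list above


def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Examples consumed per optimizer step across all devices.

    grad_accum_steps defaults to 1 because the card lists no accumulation (assumption).
    """
    return per_device * num_devices * grad_accum_steps


total_train = effective_batch_size(TRAINING_KWARGS["per_device_train_batch_size"], NUM_DEVICES)
total_eval = effective_batch_size(TRAINING_KWARGS["per_device_eval_batch_size"], NUM_DEVICES)
print(total_train, total_eval)
```

Both totals come out to 32, matching total_train_batch_size and total_eval_batch_size. To reuse the settings, the dictionary could be passed as `Seq2SeqTrainingArguments(output_dir="out", **TRAINING_KWARGS)`, where `"out"` is a placeholder path.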

Training results

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Bleu
No log	0	0	3.9249	0	3.4669	0.1182
No log	1	85	3.6615	0.0078	4.2294	0.1103
No log	2	170	3.4019	0.0156	4.9947	0.1269
No log	3	255	3.2351	0.0312	7.2919	0.2225
No log	4	340	2.9875	0.0625	10.0975	0.4917
0.2103	5	425	2.8348	0.125	12.4218	0.4585
0.2103	6	510	2.6954	0.25	16.6129	0.5159
0.9022	7	595	2.5696	0.5	25.6759	0.5955
2.6758	8.0	680	2.4300	1.0	40.2808	0.6996
2.483	9.0	765	2.3450	1.0	40.1323	0.8415
2.3831	10.0	850	2.2836	1.0	40.2233	0.9187
2.312	11.0	935	2.2384	1.0	42.9298	0.9976
2.2102	12.0	1020	2.2046	1.0	41.1967	1.0499
2.1434	13.0	1105	2.1767	1.0	40.9961	1.0440
2.0855	14.0	1190	2.1685	1.0	43.2021	1.0962
2.0334	15.0	1275	2.1549	1.0	40.8731	1.1771
1.9746	16.0	1360	2.1352	1.0	40.2888	1.1626
1.9178	17.0	1445	2.1351	1.0	40.4238	1.1299
1.8783	18.0	1530	2.1240	1.0	41.0342	1.1383
1.832	19.0	1615	2.1213	1.0	42.7702	1.2087
1.7738	20.0	1700	2.1065	1.0	46.3673	1.1660
1.7315	21.0	1785	2.1132	1.0	41.6451	1.2002
1.7067	22.0	1870	2.1156	1.0	42.7100	1.1664
1.6444	23.0	1955	2.1311	1.0	47.6908	1.1957
1.6231	24.0	2040	2.1243	1.0	42.2031	1.2243
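
Validation loss and BLEU do not peak at the same epoch in the table above. The sketch below, using the last seven rows copied from the table, shows one way to pick an epoch by either criterion. The function names are illustrative, and the card does not say which checkpoint was ultimately kept.

```python
# (epoch, validation_loss, bleu) copied from the final rows of the results table.
ROWS = [
    (18, 2.1240, 1.1383),
    (19, 2.1213, 1.2087),
    (20, 2.1065, 1.1660),
    (21, 2.1132, 1.2002),
    (22, 2.1156, 1.1664),
    (23, 2.1311, 1.1957),
    (24, 2.1243, 1.2243),
]


def best_by_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


def best_by_bleu(rows):
    """Return the row with the highest BLEU score."""
    return max(rows, key=lambda row: row[2])


print("lowest validation loss:", best_by_loss(ROWS))
print("highest BLEU:", best_by_bleu(ROWS))
```

On these rows the lowest validation loss is at epoch 20 (2.1065) and the highest BLEU is at epoch 24 (1.2243). All earlier epochs in the table have higher loss and lower BLEU, so the same holds for the full log.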

Framework versions

Transformers 4.57.0
Pytorch 2.8.0+cu128
Datasets 4.2.0
Tokenizers 0.22.1

Model size: 0.8B params (Safetensors)
Tensor type: F32

Model tree for contemmcm/45708a0d059ed93530d491c789594362

Base model: google-t5/t5-large
