---
library_name: peft
license: apache-2.0
base_model: google/mt5-base
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: base-lora
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# base-lora

This model is a fine-tuned version of [google/mt5-base](https://huggingface.co/google/mt5-base) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 3.6392
- Rouge1: 7.4493
- Rouge2: 1.1542
- Rougel: 5.9231
- Rougelsum: 5.9247
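
The final evaluation loss can be read as a perplexity if it is the mean per-token cross-entropy in nats. That is the usual convention for sequence-to-sequence models trained with cross-entropy, but the card does not state it, so treat it as an assumption. A minimal sketch of the conversion:

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 3.6392

# Assumption: the loss is the mean per-token cross-entropy in nats,
# so perplexity is exp(loss).
perplexity = math.exp(eval_loss)

print(f"eval loss {eval_loss:.4f} -> perplexity {perplexity:.2f}")
```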

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: AdamW (`OptimizerNames.ADAMW_TORCH`) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 4
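
With `lr_scheduler_type: linear`, the learning rate decays linearly from its initial value to zero over the run. The sketch below illustrates the shape of that schedule under two assumptions that the card does not state: there are no warmup steps (none are listed), and the run has 984 optimizer steps in total. The step count is inferred from the training results table, where step 615 falls at epoch 2.5, giving 246 steps per epoch over 4 epochs. `linear_lr` is a hypothetical helper written for this illustration, not a library function.

```python
def linear_lr(step: int, base_lr: float = 1e-4, total_steps: int = 984,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step`: optional linear warmup, then linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup.
        return base_lr * (step / warmup_steps)
    # Decay from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Learning rate at the start of each epoch and at the end of training
# (246 steps per epoch is an inferred value, see above).
for step in (0, 246, 492, 738, 984):
    print(step, f"{linear_lr(step):.2e}")
```

Under these assumptions the learning rate is halved by the end of the second epoch. That lines up with the results table, where the validation loss changes only slightly after that point.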

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:------:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 18.055        | 0.0203 | 5    | 11.6119         | 3.9124 | 0.4535 | 3.3358 | 3.3389    |
| 18.0636       | 0.0407 | 10   | 11.6017         | 3.9202 | 0.4545 | 3.3537 | 3.3567    |
| 18.9841       | 0.0610 | 15   | 11.5957         | 3.9219 | 0.4510 | 3.3520 | 3.3526    |
| 17.1948       | 0.0813 | 20   | 11.5334         | 3.9691 | 0.4539 | 3.3957 | 3.3985    |
| 17.9028       | 0.1016 | 25   | 11.4994         | 3.9721 | 0.4727 | 3.4036 | 3.4074    |
| 16.9331       | 0.1220 | 30   | 11.4228         | 3.9728 | 0.4712 | 3.4087 | 3.4104    |
| 19.6095       | 0.1423 | 35   | 11.3819         | 3.9596 | 0.4660 | 3.3891 | 3.3898    |
| 16.8935       | 0.1626 | 40   | 11.2556         | 3.9354 | 0.4682 | 3.3794 | 3.3839    |
| 16.3195       | 0.1829 | 45   | 11.2026         | 3.9342 | 0.4632 | 3.3706 | 3.3735    |
| 16.091        | 0.2033 | 50   | 11.1314         | 3.9621 | 0.4624 | 3.3697 | 3.3749    |
| 17.8748       | 0.2236 | 55   | 11.0424         | 4.0045 | 0.4748 | 3.4006 | 3.4093    |
| 16.3976       | 0.2439 | 60   | 10.8993         | 4.0416 | 0.4779 | 3.4208 | 3.4330    |
| 15.8952       | 0.2642 | 65   | 10.7943         | 4.0830 | 0.4840 | 3.4525 | 3.4582    |
| 16.3455       | 0.2846 | 70   | 10.6194         | 4.0705 | 0.4937 | 3.4405 | 3.4496    |
| 15.957        | 0.3049 | 75   | 10.4643         | 4.1590 | 0.4948 | 3.5013 | 3.5069    |
| 16.3171       | 0.3252 | 80   | 10.2894         | 4.1872 | 0.4886 | 3.5251 | 3.5294    |
| 14.3489       | 0.3455 | 85   | 10.1070         | 4.0644 | 0.4394 | 3.4238 | 3.4249    |
| 14.9094       | 0.3659 | 90   | 9.9338          | 4.0472 | 0.4433 | 3.4151 | 3.4233    |
| 15.1002       | 0.3862 | 95   | 9.7629          | 4.0657 | 0.4444 | 3.4207 | 3.4287    |
| 13.5895       | 0.4065 | 100  | 9.5797          | 4.1563 | 0.4675 | 3.4797 | 3.4807    |
| 13.5211       | 0.4268 | 105  | 9.3524          | 4.1978 | 0.4748 | 3.5357 | 3.5418    |
| 14.1282       | 0.4472 | 110  | 9.1117          | 4.2016 | 0.4745 | 3.5392 | 3.5417    |
| 13.0696       | 0.4675 | 115  | 8.9185          | 4.1619 | 0.4581 | 3.5058 | 3.5053    |
| 13.5505       | 0.4878 | 120  | 8.7076          | 4.0617 | 0.4730 | 3.4863 | 3.4884    |
| 12.7191       | 0.5081 | 125  | 8.5103          | 4.1287 | 0.5343 | 3.5354 | 3.5418    |
| 11.9604       | 0.5285 | 130  | 8.3253          | 4.2676 | 0.5616 | 3.6476 | 3.6568    |
| 12.1359       | 0.5488 | 135  | 8.1128          | 4.2319 | 0.5450 | 3.6035 | 3.6020    |
| 12.6743       | 0.5691 | 140  | 7.9260          | 4.1931 | 0.4958 | 3.5669 | 3.5720    |
| 10.8556       | 0.5894 | 145  | 7.7804          | 4.3623 | 0.5388 | 3.6812 | 3.6801    |
| 10.6165       | 0.6098 | 150  | 7.6086          | 4.2635 | 0.4790 | 3.6205 | 3.6191    |
| 10.3137       | 0.6301 | 155  | 7.4471          | 4.3057 | 0.5036 | 3.6870 | 3.6775    |
| 10.8264       | 0.6504 | 160  | 7.3016          | 4.2296 | 0.4382 | 3.6120 | 3.6114    |
| 9.707         | 0.6707 | 165  | 7.1770          | 4.4023 | 0.4457 | 3.7598 | 3.7648    |
| 9.0948        | 0.6911 | 170  | 7.0761          | 4.2964 | 0.4772 | 3.7047 | 3.7047    |
| 8.9284        | 0.7114 | 175  | 6.9906          | 4.2649 | 0.4765 | 3.7074 | 3.7053    |
| 8.8882        | 0.7317 | 180  | 6.9178          | 4.0879 | 0.4358 | 3.5706 | 3.5620    |
| 8.4453        | 0.7520 | 185  | 6.8269          | 3.9639 | 0.4308 | 3.4378 | 3.4379    |
| 9.0233        | 0.7724 | 190  | 6.7421          | 3.9524 | 0.4346 | 3.4217 | 3.4278    |
| 8.6501        | 0.7927 | 195  | 6.6840          | 3.8078 | 0.4012 | 3.3565 | 3.3576    |
| 7.8102        | 0.8130 | 200  | 6.6259          | 3.6878 | 0.3628 | 3.2840 | 3.2853    |
| 8.1242        | 0.8333 | 205  | 6.5638          | 3.6267 | 0.3355 | 3.2284 | 3.2301    |
| 8.1765        | 0.8537 | 210  | 6.5169          | 3.5751 | 0.3065 | 3.1854 | 3.1917    |
| 7.8745        | 0.8740 | 215  | 6.4719          | 3.3769 | 0.2566 | 3.0335 | 3.0340    |
| 7.9458        | 0.8943 | 220  | 6.4389          | 3.3340 | 0.2057 | 3.0079 | 3.0027    |
| 7.7474        | 0.9146 | 225  | 6.4035          | 3.3362 | 0.1940 | 3.0041 | 3.0107    |
| 7.7163        | 0.9350 | 230  | 6.3660          | 3.2909 | 0.1677 | 3.0005 | 2.9987    |
| 7.8728        | 0.9553 | 235  | 6.3361          | 3.2088 | 0.1610 | 2.9360 | 2.9278    |
| 7.7745        | 0.9756 | 240  | 6.3165          | 3.1927 | 0.1512 | 2.9150 | 2.9103    |
| 7.2384        | 0.9959 | 245  | 6.2932          | 3.1423 | 0.1330 | 2.8801 | 2.8714    |
| 7.3485        | 1.0163 | 250  | 6.2603          | 3.1407 | 0.1053 | 2.8848 | 2.8752    |
| 7.3794        | 1.0366 | 255  | 6.2228          | 3.0919 | 0.1009 | 2.8536 | 2.8508    |
| 7.3728        | 1.0569 | 260  | 6.1948          | 3.1268 | 0.1007 | 2.8836 | 2.8856    |
| 7.0395        | 1.0772 | 265  | 6.1632          | 3.0872 | 0.1005 | 2.8595 | 2.8619    |
| 6.8601        | 1.0976 | 270  | 6.1160          | 3.1920 | 0.1046 | 2.9322 | 2.9324    |
| 6.9111        | 1.1179 | 275  | 6.0558          | 3.3381 | 0.1192 | 3.0477 | 3.0529    |
| 6.9279        | 1.1382 | 280  | 5.9855          | 3.4293 | 0.1258 | 3.1141 | 3.1218    |
| 6.7801        | 1.1585 | 285  | 5.9176          | 3.4325 | 0.1198 | 3.1306 | 3.1407    |
| 6.7719        | 1.1789 | 290  | 5.8450          | 3.4232 | 0.0948 | 3.1421 | 3.1511    |
| 6.4791        | 1.1992 | 295  | 5.7314          | 3.4339 | 0.0908 | 3.1751 | 3.1826    |
| 6.5032        | 1.2195 | 300  | 5.6052          | 3.5594 | 0.1161 | 3.2799 | 3.2848    |
| 6.3077        | 1.2398 | 305  | 5.4308          | 3.8246 | 0.1526 | 3.4515 | 3.4549    |
| 6.1347        | 1.2602 | 310  | 5.2257          | 4.1937 | 0.2871 | 3.6908 | 3.6980    |
| 6.3168        | 1.2805 | 315  | 5.0692          | 4.3388 | 0.3095 | 3.7995 | 3.8041    |
| 6.1956        | 1.3008 | 320  | 4.9395          | 4.5952 | 0.3955 | 3.9670 | 3.9778    |
| 6.1062        | 1.3211 | 325  | 4.8052          | 5.0407 | 0.5409 | 4.2583 | 4.2650    |
| 5.882         | 1.3415 | 330  | 4.6693          | 5.2838 | 0.5466 | 4.4464 | 4.4589    |
| 5.8539        | 1.3618 | 335  | 4.5528          | 5.5252 | 0.5439 | 4.5970 | 4.6022    |
| 5.4454        | 1.3821 | 340  | 4.4366          | 5.8765 | 0.5977 | 4.8259 | 4.8281    |
| 5.6131        | 1.4024 | 345  | 4.3597          | 6.1999 | 0.6578 | 5.0213 | 5.0229    |
| 5.3551        | 1.4228 | 350  | 4.3115          | 6.2530 | 0.6767 | 5.0826 | 5.0748    |
| 5.1463        | 1.4431 | 355  | 4.2637          | 6.2949 | 0.6428 | 5.0490 | 5.0560    |
| 5.1476        | 1.4634 | 360  | 4.2049          | 6.5195 | 0.7002 | 5.2668 | 5.2701    |
| 5.073         | 1.4837 | 365  | 4.1581          | 6.7561 | 0.8051 | 5.4618 | 5.4527    |
| 5.1701        | 1.5041 | 370  | 4.1223          | 6.7501 | 0.7622 | 5.5161 | 5.5117    |
| 5.1188        | 1.5244 | 375  | 4.0904          | 6.6304 | 0.7352 | 5.4031 | 5.3876    |
| 5.0913        | 1.5447 | 380  | 4.0580          | 6.5183 | 0.7388 | 5.3353 | 5.3260    |
| 5.1114        | 1.5650 | 385  | 4.0236          | 6.3151 | 0.7055 | 5.1585 | 5.1491    |
| 4.7303        | 1.5854 | 390  | 3.9958          | 6.3426 | 0.7244 | 5.1699 | 5.1643    |
| 4.9915        | 1.6057 | 395  | 3.9675          | 6.5585 | 0.7586 | 5.2496 | 5.2469    |
| 4.9144        | 1.6260 | 400  | 3.9462          | 6.5973 | 0.7327 | 5.2719 | 5.2706    |
| 4.7182        | 1.6463 | 405  | 3.9244          | 6.7598 | 0.8297 | 5.4072 | 5.3997    |
| 4.7528        | 1.6667 | 410  | 3.9015          | 6.6425 | 0.7974 | 5.4003 | 5.3979    |
| 4.6383        | 1.6870 | 415  | 3.8839          | 6.6935 | 0.7657 | 5.4553 | 5.4618    |
| 4.6681        | 1.7073 | 420  | 3.8663          | 6.6632 | 0.7815 | 5.4305 | 5.4345    |
| 4.5091        | 1.7276 | 425  | 3.8536          | 6.6817 | 0.8054 | 5.4240 | 5.4254    |
| 4.7573        | 1.7480 | 430  | 3.8444          | 6.7848 | 0.8458 | 5.3968 | 5.4036    |
| 4.5883        | 1.7683 | 435  | 3.8353          | 6.8255 | 0.8781 | 5.4295 | 5.4383    |
| 4.813         | 1.7886 | 440  | 3.8243          | 6.9426 | 0.8849 | 5.5732 | 5.5796    |
| 4.8158        | 1.8089 | 445  | 3.8155          | 6.9445 | 0.8937 | 5.6175 | 5.6261    |
| 4.8598        | 1.8293 | 450  | 3.8080          | 6.9469 | 0.9113 | 5.6107 | 5.6260    |
| 4.4413        | 1.8496 | 455  | 3.7979          | 7.0513 | 0.9088 | 5.6948 | 5.7109    |
| 4.5718        | 1.8699 | 460  | 3.7883          | 7.0571 | 0.8827 | 5.6223 | 5.6259    |
| 4.5612        | 1.8902 | 465  | 3.7809          | 7.2025 | 0.9906 | 5.7196 | 5.7318    |
| 4.4533        | 1.9106 | 470  | 3.7743          | 7.3264 | 1.0365 | 5.8895 | 5.8971    |
| 4.6964        | 1.9309 | 475  | 3.7670          | 7.2795 | 1.0343 | 5.8303 | 5.8344    |
| 4.4532        | 1.9512 | 480  | 3.7606          | 7.2770 | 1.0298 | 5.8453 | 5.8516    |
| 4.4845        | 1.9715 | 485  | 3.7572          | 7.1885 | 1.0745 | 5.8021 | 5.8125    |
| 4.5913        | 1.9919 | 490  | 3.7540          | 7.1857 | 1.0814 | 5.8054 | 5.8130    |
| 4.4704        | 2.0122 | 495  | 3.7503          | 7.2876 | 1.0921 | 5.9138 | 5.9150    |
| 4.4753        | 2.0325 | 500  | 3.7453          | 7.3807 | 1.1092 | 5.9596 | 5.9700    |
| 4.4391        | 2.0528 | 505  | 3.7409          | 7.4010 | 1.0870 | 5.9533 | 5.9628    |
| 4.4315        | 2.0732 | 510  | 3.7382          | 7.3382 | 1.1040 | 5.9099 | 5.9189    |
| 4.2783        | 2.0935 | 515  | 3.7349          | 7.2988 | 1.0887 | 5.8870 | 5.8919    |
| 4.3117        | 2.1138 | 520  | 3.7307          | 7.3069 | 1.0594 | 5.9009 | 5.9126    |
| 4.318         | 2.1341 | 525  | 3.7271          | 7.3261 | 1.0414 | 5.9149 | 5.9164    |
| 4.2801        | 2.1545 | 530  | 3.7242          | 7.3054 | 1.0500 | 5.8905 | 5.8894    |
| 4.6391        | 2.1748 | 535  | 3.7208          | 7.3645 | 1.0492 | 5.9397 | 5.9380    |
| 4.3956        | 2.1951 | 540  | 3.7156          | 7.4189 | 1.1015 | 5.9650 | 5.9696    |
| 4.421         | 2.2154 | 545  | 3.7123          | 7.3371 | 1.0676 | 5.8936 | 5.9028    |
| 4.4005        | 2.2358 | 550  | 3.7090          | 7.3309 | 1.0480 | 5.9009 | 5.8999    |
| 4.363         | 2.2561 | 555  | 3.7065          | 7.2813 | 1.0386 | 5.8488 | 5.8472    |
| 4.3898        | 2.2764 | 560  | 3.7041          | 7.2897 | 1.0231 | 5.8119 | 5.8203    |
| 4.4024        | 2.2967 | 565  | 3.7014          | 7.2729 | 0.9973 | 5.8061 | 5.8105    |
| 4.2522        | 2.3171 | 570  | 3.7001          | 7.3704 | 1.0285 | 5.8480 | 5.8460    |
| 4.2606        | 2.3374 | 575  | 3.6985          | 7.3538 | 1.0364 | 5.8260 | 5.8280    |
| 4.5745        | 2.3577 | 580  | 3.6977          | 7.3535 | 1.0351 | 5.8424 | 5.8435    |
| 4.5558        | 2.3780 | 585  | 3.6959          | 7.3292 | 1.0259 | 5.8504 | 5.8529    |
| 4.29          | 2.3984 | 590  | 3.6928          | 7.3560 | 1.0402 | 5.8870 | 5.8893    |
| 4.4577        | 2.4187 | 595  | 3.6897          | 7.3185 | 1.0389 | 5.8524 | 5.8577    |
| 4.3417        | 2.4390 | 600  | 3.6872          | 7.2916 | 1.0302 | 5.8332 | 5.8447    |
| 4.3844        | 2.4593 | 605  | 3.6861          | 7.2599 | 1.0272 | 5.8240 | 5.8345    |
| 4.2335        | 2.4797 | 610  | 3.6858          | 7.3136 | 1.0066 | 5.8726 | 5.8854    |
| 4.2669        | 2.5    | 615  | 3.6854          | 7.3474 | 1.0671 | 5.9062 | 5.9133    |
| 4.3353        | 2.5203 | 620  | 3.6842          | 7.3982 | 1.0725 | 5.9494 | 5.9494    |
| 4.1778        | 2.5407 | 625  | 3.6834          | 7.4003 | 1.0443 | 5.9376 | 5.9381    |
| 4.1977        | 2.5610 | 630  | 3.6823          | 7.4257 | 1.0581 | 5.9566 | 5.9641    |
| 4.1946        | 2.5813 | 635  | 3.6810          | 7.4195 | 1.0472 | 5.9493 | 5.9564    |
| 4.2247        | 2.6016 | 640  | 3.6798          | 7.4042 | 1.0559 | 5.9337 | 5.9440    |
| 4.0221        | 2.6220 | 645  | 3.6784          | 7.3767 | 1.0109 | 5.8896 | 5.8969    |
| 4.0861        | 2.6423 | 650  | 3.6780          | 7.3535 | 1.0284 | 5.8929 | 5.8947    |
| 4.5216        | 2.6626 | 655  | 3.6776          | 7.4155 | 1.0361 | 5.9195 | 5.9223    |
| 4.5452        | 2.6829 | 660  | 3.6759          | 7.3827 | 1.0340 | 5.8694 | 5.8700    |
| 4.2545        | 2.7033 | 665  | 3.6737          | 7.2863 | 1.0115 | 5.8254 | 5.8315    |
| 4.2745        | 2.7236 | 670  | 3.6717          | 7.3155 | 1.0193 | 5.8232 | 5.8272    |
| 4.0946        | 2.7439 | 675  | 3.6704          | 7.3085 | 0.9942 | 5.8310 | 5.8384    |
| 4.1751        | 2.7642 | 680  | 3.6698          | 7.2713 | 0.9714 | 5.7913 | 5.7933    |
| 4.2766        | 2.7846 | 685  | 3.6698          | 7.3288 | 0.9622 | 5.7999 | 5.8011    |
| 4.2975        | 2.8049 | 690  | 3.6691          | 7.3673 | 0.9866 | 5.8446 | 5.8462    |
| 4.259         | 2.8252 | 695  | 3.6677          | 7.3719 | 0.9800 | 5.8379 | 5.8402    |
| 4.1375        | 2.8455 | 700  | 3.6658          | 7.3096 | 0.9819 | 5.7909 | 5.7937    |
| 4.1123        | 2.8659 | 705  | 3.6639          | 7.3239 | 0.9947 | 5.7972 | 5.8001    |
| 4.3939        | 2.8862 | 710  | 3.6620          | 7.3352 | 1.0033 | 5.7864 | 5.7918    |
| 4.3558        | 2.9065 | 715  | 3.6605          | 7.2803 | 0.9570 | 5.7227 | 5.7274    |
| 4.2339        | 2.9268 | 720  | 3.6593          | 7.2910 | 0.9812 | 5.7466 | 5.7553    |
| 4.3709        | 2.9472 | 725  | 3.6581          | 7.3238 | 0.9818 | 5.7739 | 5.7823    |
| 4.2776        | 2.9675 | 730  | 3.6574          | 7.3445 | 1.0330 | 5.8094 | 5.8181    |
| 4.1297        | 2.9878 | 735  | 3.6567          | 7.3059 | 1.0122 | 5.7926 | 5.8050    |
| 4.0283        | 3.0081 | 740  | 3.6563          | 7.3800 | 1.0397 | 5.8484 | 5.8597    |
| 4.1927        | 3.0285 | 745  | 3.6557          | 7.4114 | 1.0440 | 5.8713 | 5.8851    |
| 4.211         | 3.0488 | 750  | 3.6550          | 7.4497 | 1.0687 | 5.8968 | 5.9149    |
| 4.2516        | 3.0691 | 755  | 3.6543          | 7.3851 | 1.0738 | 5.8845 | 5.8986    |
| 4.2483        | 3.0894 | 760  | 3.6537          | 7.3906 | 1.0650 | 5.8817 | 5.8902    |
| 4.1612        | 3.1098 | 765  | 3.6526          | 7.3874 | 1.0756 | 5.8775 | 5.8829    |
| 4.2832        | 3.1301 | 770  | 3.6523          | 7.3994 | 1.0785 | 5.8812 | 5.8828    |
| 4.1306        | 3.1504 | 775  | 3.6520          | 7.4652 | 1.0993 | 5.9060 | 5.9157    |
| 4.1866        | 3.1707 | 780  | 3.6514          | 7.4709 | 1.0999 | 5.9119 | 5.9281    |
| 4.2834        | 3.1911 | 785  | 3.6513          | 7.4435 | 1.1007 | 5.8919 | 5.9057    |
| 4.2565        | 3.2114 | 790  | 3.6509          | 7.4210 | 1.1029 | 5.8853 | 5.8970    |
| 4.1566        | 3.2317 | 795  | 3.6506          | 7.4071 | 1.0829 | 5.8674 | 5.8756    |
| 4.1584        | 3.2520 | 800  | 3.6506          | 7.4042 | 1.0822 | 5.8690 | 5.8797    |
| 4.2728        | 3.2724 | 805  | 3.6502          | 7.4338 | 1.0872 | 5.9021 | 5.9114    |
| 4.2469        | 3.2927 | 810  | 3.6495          | 7.4174 | 1.0871 | 5.8931 | 5.8974    |
| 4.1297        | 3.3130 | 815  | 3.6486          | 7.4335 | 1.0861 | 5.9093 | 5.9129    |
| 4.1833        | 3.3333 | 820  | 3.6479          | 7.3910 | 1.0857 | 5.8893 | 5.8916    |
| 4.1635        | 3.3537 | 825  | 3.6476          | 7.3714 | 1.0801 | 5.8781 | 5.8791    |
| 3.9877        | 3.3740 | 830  | 3.6473          | 7.4043 | 1.0756 | 5.8988 | 5.9025    |
| 4.1508        | 3.3943 | 835  | 3.6471          | 7.3886 | 1.1179 | 5.9008 | 5.9095    |
| 4.3126        | 3.4146 | 840  | 3.6462          | 7.4321 | 1.1149 | 5.9282 | 5.9401    |
| 4.0401        | 3.4350 | 845  | 3.6451          | 7.4416 | 1.1107 | 5.9291 | 5.9376    |
| 4.3478        | 3.4553 | 850  | 3.6443          | 7.4642 | 1.1471 | 5.9633 | 5.9683    |
| 4.2058        | 3.4756 | 855  | 3.6434          | 7.4925 | 1.1631 | 5.9774 | 5.9888    |
| 4.1838        | 3.4959 | 860  | 3.6429          | 7.4792 | 1.1740 | 5.9757 | 5.9832    |
| 4.2264        | 3.5163 | 865  | 3.6425          | 7.4400 | 1.1621 | 5.9491 | 5.9554    |
| 4.4029        | 3.5366 | 870  | 3.6423          | 7.4487 | 1.1656 | 5.9588 | 5.9652    |
| 4.2859        | 3.5569 | 875  | 3.6424          | 7.3773 | 1.1202 | 5.8960 | 5.8970    |
| 3.9724        | 3.5772 | 880  | 3.6421          | 7.4120 | 1.1085 | 5.8966 | 5.9021    |
| 4.1194        | 3.5976 | 885  | 3.6419          | 7.4185 | 1.1191 | 5.8907 | 5.8964    |
| 4.2119        | 3.6179 | 890  | 3.6418          | 7.4229 | 1.1225 | 5.8875 | 5.8906    |
| 4.31          | 3.6382 | 895  | 3.6417          | 7.4117 | 1.1135 | 5.8895 | 5.8924    |
| 4.1687        | 3.6585 | 900  | 3.6414          | 7.4160 | 1.1085 | 5.8995 | 5.9034    |
| 4.2521        | 3.6789 | 905  | 3.6410          | 7.4245 | 1.1163 | 5.9010 | 5.9052    |
| 4.2049        | 3.6992 | 910  | 3.6410          | 7.4805 | 1.1203 | 5.9386 | 5.9397    |
| 4.1337        | 3.7195 | 915  | 3.6410          | 7.4671 | 1.1161 | 5.9266 | 5.9346    |
| 4.2343        | 3.7398 | 920  | 3.6408          | 7.4772 | 1.1250 | 5.9271 | 5.9351    |
| 4.2839        | 3.7602 | 925  | 3.6408          | 7.4479 | 1.1431 | 5.9088 | 5.9203    |
| 4.0611        | 3.7805 | 930  | 3.6405          | 7.4513 | 1.1501 | 5.9157 | 5.9203    |
| 4.1894        | 3.8008 | 935  | 3.6402          | 7.4433 | 1.1523 | 5.9014 | 5.9078    |
| 4.3014        | 3.8211 | 940  | 3.6400          | 7.4433 | 1.1523 | 5.9014 | 5.9078    |
| 4.2223        | 3.8415 | 945  | 3.6397          | 7.4435 | 1.1523 | 5.9094 | 5.9158    |
| 4.344         | 3.8618 | 950  | 3.6396          | 7.4501 | 1.1584 | 5.9262 | 5.9271    |
| 4.1072        | 3.8821 | 955  | 3.6395          | 7.4460 | 1.1544 | 5.9222 | 5.9231    |
| 4.1193        | 3.9024 | 960  | 3.6394          | 7.4422 | 1.1544 | 5.9222 | 5.9231    |
| 4.0124        | 3.9228 | 965  | 3.6393          | 7.4291 | 1.1467 | 5.9093 | 5.9130    |
| 4.1924        | 3.9431 | 970  | 3.6393          | 7.4291 | 1.1467 | 5.9093 | 5.9130    |
| 4.2722        | 3.9634 | 975  | 3.6392          | 7.4493 | 1.1542 | 5.9231 | 5.9247    |
| 4.1369        | 3.9837 | 980  | 3.6392          | 7.4493 | 1.1542 | 5.9231 | 5.9247    |
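
To inspect the log programmatically, the Markdown rows can be parsed into records. The sketch below uses four rows copied verbatim from the table above; `parse_rows` and the `COLUMNS` names are hypothetical and chosen for this example. On this sample the validation loss is lowest at the last logged step, while ROUGE-1 is lowest near the end of the first epoch, so the two metrics do not move together early in training.

```python
# Four rows copied verbatim from the training results table.
ROWS = """\
| 18.055        | 0.0203 | 5    | 11.6119         | 3.9124 | 0.4535 | 3.3358 | 3.3389    |
| 7.2384        | 0.9959 | 245  | 6.2932          | 3.1423 | 0.1330 | 2.8801 | 2.8714    |
| 4.5913        | 1.9919 | 490  | 3.7540          | 7.1857 | 1.0814 | 5.8054 | 5.8130    |
| 4.1369        | 3.9837 | 980  | 3.6392          | 7.4493 | 1.1542 | 5.9231 | 5.9247    |
"""

# Hypothetical column names, in the same order as the table header.
COLUMNS = ["train_loss", "epoch", "step", "val_loss",
           "rouge1", "rouge2", "rougeL", "rougeLsum"]


def parse_rows(text):
    """Turn Markdown table rows into a list of dicts keyed by COLUMNS."""
    rows = []
    for line in text.strip().splitlines():
        # Drop the outer pipes, then split the remaining cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        row = dict(zip(COLUMNS, map(float, cells)))
        row["step"] = int(row["step"])
        rows.append(row)
    return rows


rows = parse_rows(ROWS)
best_loss = min(rows, key=lambda r: r["val_loss"])
worst_rouge1 = min(rows, key=lambda r: r["rouge1"])
print("lowest validation loss at step", best_loss["step"])
print("lowest ROUGE-1 at step", worst_rouge1["step"])
```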


### Framework versions

- PEFT 0.14.0
- Transformers 4.49.0
- Pytorch 2.6.0+cu124
- Datasets 3.3.2
- Tokenizers 0.21.0