|
2024-07-30 02:13:30,544 ---------------------------------------------------------------------------------------------------- |
|
2024-07-30 02:13:30,544 Training Model |
|
2024-07-30 02:13:30,544 ---------------------------------------------------------------------------------------------------- |
|
2024-07-30 02:13:30,544 Translator( |
|
(encoder): EncoderLSTM( |
|
(embedding): Embedding(107, 300, padding_idx=0) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(lstm): LSTM(300, 512, batch_first=True) |
|
) |
|
(decoder): DecoderLSTM( |
|
(embedding): Embedding(128, 300, padding_idx=0) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(lstm): LSTM(300, 512, batch_first=True) |
|
(attention): DotProductAttention( |
|
(softmax): Softmax(dim=-1) |
|
(combined2hidden): Sequential( |
|
(0): Linear(in_features=1024, out_features=512, bias=True) |
|
(1): ReLU() |
|
) |
|
) |
|
(hidden2vocab): Linear(in_features=512, out_features=128, bias=True) |
|
(log_softmax): LogSoftmax(dim=-1) |
|
) |
|
) |
|
2024-07-30 02:13:30,545 ---------------------------------------------------------------------------------------------------- |
|
2024-07-30 02:13:30,545 Training Hyperparameters: |
|
2024-07-30 02:13:30,545 - max_epochs: 10 |
|
2024-07-30 02:13:30,545 - learning_rate: 0.001 |
|
2024-07-30 02:13:30,545 - batch_size: 128 |
|
2024-07-30 02:13:30,545 - patience: 5 |
|
2024-07-30 02:13:30,545 - scheduler_patience: 3 |
|
2024-07-30 02:13:30,545 - teacher_forcing_ratio: 0.5 |
|
2024-07-30 02:13:30,545 ---------------------------------------------------------------------------------------------------- |
|
2024-07-30 02:13:30,545 Computational Parameters: |
|
2024-07-30 02:13:30,545 - num_workers: 4 |
|
2024-07-30 02:13:30,545 - device: device(type='cuda', index=0) |
|
2024-07-30 02:13:30,545 ---------------------------------------------------------------------------------------------------- |
|
2024-07-30 02:13:30,545 Dataset Splits: |
|
2024-07-30 02:13:30,545 - train: 85949 data points |
|
2024-07-30 02:13:30,545 - dev: 12279 data points |
|
2024-07-30 02:13:30,545 - test: 24557 data points |
|
2024-07-30 02:13:30,545 ---------------------------------------------------------------------------------------------------- |
|
2024-07-30 02:13:30,545 EPOCH 1 |
|
2024-07-30 02:15:42,182 batch 67/672 - loss 3.19545212 - lr 0.0010 - time 131.64s |
|
2024-07-30 02:17:40,099 batch 134/672 - loss 3.02495554 - lr 0.0010 - time 249.55s |
|
2024-07-30 02:19:39,521 batch 201/672 - loss 2.92257840 - lr 0.0010 - time 368.98s |
|
2024-07-30 02:22:13,372 batch 268/672 - loss 2.85199871 - lr 0.0010 - time 522.83s |
|
2024-07-30 02:24:16,980 batch 335/672 - loss 2.79420793 - lr 0.0010 - time 646.43s |
|
2024-07-30 02:26:25,476 batch 402/672 - loss 2.74788210 - lr 0.0010 - time 774.93s |
|
2024-07-30 02:28:37,609 batch 469/672 - loss 2.70773431 - lr 0.0010 - time 907.06s |
|
2024-07-30 02:30:34,985 batch 536/672 - loss 2.67195398 - lr 0.0010 - time 1024.44s |
|
2024-07-30 02:32:46,538 batch 603/672 - loss 2.64084060 - lr 0.0010 - time 1155.99s |
|
2024-07-30 02:34:55,748 batch 670/672 - loss 2.61186988 - lr 0.0010 - time 1285.20s |
|
2024-07-30 02:34:58,934 ---------------------------------------------------------------------------------------------------- |
|
2024-07-30 02:34:58,937 EPOCH 1 DONE |
|
2024-07-30 02:35:33,168 TRAIN Loss: 2.6108 |
|
2024-07-30 02:35:33,168 DEV Loss: 4.0377 |
|
2024-07-30 02:35:33,168 DEV Perplexity: 56.6981 |
|
2024-07-30 02:35:33,168 New best score! |
|
2024-07-30 02:35:33,170 ---------------------------------------------------------------------------------------------------- |
|
2024-07-30 02:35:33,170 EPOCH 2 |
|
2024-07-30 02:37:40,712 batch 67/672 - loss 2.32208106 - lr 0.0010 - time 127.54s |
|
2024-07-30 02:39:39,545 batch 134/672 - loss 2.30324291 - lr 0.0010 - time 246.38s |
|
2024-07-30 02:41:50,177 batch 201/672 - loss 2.29119577 - lr 0.0010 - time 377.01s |
|
2024-07-30 02:44:22,124 batch 268/672 - loss 2.27651633 - lr 0.0010 - time 528.95s |
|
2024-07-30 02:46:29,564 batch 335/672 - loss 2.26064277 - lr 0.0010 - time 656.39s |
|
2024-07-30 02:48:27,268 batch 402/672 - loss 2.24953536 - lr 0.0010 - time 774.10s |
|
2024-07-30 02:50:23,982 batch 469/672 - loss 2.23849808 - lr 0.0010 - time 890.81s |
|
2024-07-30 02:52:40,137 batch 536/672 - loss 2.22690770 - lr 0.0010 - time 1026.97s |
|
2024-07-30 02:54:52,910 batch 603/672 - loss 2.21315394 - lr 0.0010 - time 1159.74s |
|
2024-07-30 02:57:00,732 batch 670/672 - loss 2.19986962 - lr 0.0010 - time 1287.56s |
|
2024-07-30 02:57:03,825 ---------------------------------------------------------------------------------------------------- |
|
2024-07-30 02:57:03,828 EPOCH 2 DONE |
|
2024-07-30 02:57:38,031 TRAIN Loss: 2.1993 |
|
2024-07-30 02:57:38,032 DEV Loss: 4.1666 |
|
2024-07-30 02:57:38,033 DEV Perplexity: 64.4964 |
|
2024-07-30 02:57:38,033 No improvement for 1 epoch(s) |
|
2024-07-30 02:57:38,033 ---------------------------------------------------------------------------------------------------- |
|
2024-07-30 02:57:38,033 EPOCH 3 |
|
2024-07-30 02:59:46,882 batch 67/672 - loss 2.07768523 - lr 0.0010 - time 128.85s |
|
2024-07-30 03:02:13,067 batch 134/672 - loss 2.06771447 - lr 0.0010 - time 275.03s |
|
2024-07-30 03:04:13,352 batch 201/672 - loss 2.05206243 - lr 0.0010 - time 395.32s |
|
2024-07-30 03:06:15,924 batch 268/672 - loss 2.03767699 - lr 0.0010 - time 517.89s |
|
2024-07-30 03:08:29,454 batch 335/672 - loss 2.02756568 - lr 0.0010 - time 651.42s |
|
2024-07-30 03:10:36,938 batch 402/672 - loss 2.01690815 - lr 0.0010 - time 778.90s |
|
2024-07-30 03:12:44,576 batch 469/672 - loss 2.00959916 - lr 0.0010 - time 906.54s |
|
2024-07-30 03:14:42,904 batch 536/672 - loss 1.99967818 - lr 0.0010 - time 1024.87s |
|
2024-07-30 03:17:05,177 batch 603/672 - loss 1.99148476 - lr 0.0010 - time 1167.14s |
|
2024-07-30 03:19:10,961 batch 670/672 - loss 1.98160288 - lr 0.0010 - time 1292.93s |
|
2024-07-30 03:19:13,988 ---------------------------------------------------------------------------------------------------- |
|
2024-07-30 03:19:13,990 EPOCH 3 DONE |
|
2024-07-30 03:19:48,048 TRAIN Loss: 1.9813 |
|
2024-07-30 03:19:48,050 DEV Loss: 4.2504 |
|
2024-07-30 03:19:48,050 DEV Perplexity: 70.1329 |
|
2024-07-30 03:19:48,050 No improvement for 2 epoch(s) |
|
2024-07-30 03:19:48,050 ---------------------------------------------------------------------------------------------------- |
|
2024-07-30 03:19:48,050 EPOCH 4 |
|
2024-07-30 03:21:58,453 batch 67/672 - loss 1.87205641 - lr 0.0010 - time 130.40s |
|
2024-07-30 03:23:54,350 batch 134/672 - loss 1.87312458 - lr 0.0010 - time 246.30s |
|
2024-07-30 03:26:26,003 batch 201/672 - loss 1.86491152 - lr 0.0010 - time 397.95s |
|
2024-07-30 03:28:31,716 batch 268/672 - loss 1.85794664 - lr 0.0010 - time 523.67s |
|
2024-07-30 03:30:58,523 batch 335/672 - loss 1.85268306 - lr 0.0010 - time 670.47s |
|
2024-07-30 03:32:55,289 batch 402/672 - loss 1.84701065 - lr 0.0010 - time 787.24s |
|
2024-07-30 03:35:17,440 batch 469/672 - loss 1.83774444 - lr 0.0010 - time 929.39s |
|
2024-07-30 03:37:17,765 batch 536/672 - loss 1.83106400 - lr 0.0010 - time 1049.71s |
|
2024-07-30 03:39:24,224 batch 603/672 - loss 1.82428703 - lr 0.0010 - time 1176.17s |
|
2024-07-30 03:41:19,788 batch 670/672 - loss 1.81979131 - lr 0.0010 - time 1291.74s |
|
2024-07-30 03:41:22,695 ---------------------------------------------------------------------------------------------------- |
|
2024-07-30 03:41:22,699 EPOCH 4 DONE |
|
2024-07-30 03:41:56,808 TRAIN Loss: 1.8197 |
|
2024-07-30 03:41:56,809 DEV Loss: 4.5206 |
|
2024-07-30 03:41:56,809 DEV Perplexity: 91.8923 |
|
2024-07-30 03:41:56,809 No improvement for 3 epoch(s) |
|
2024-07-30 03:41:56,809 ---------------------------------------------------------------------------------------------------- |
|
2024-07-30 03:41:56,809 EPOCH 5 |
|
2024-07-30 03:44:08,647 batch 67/672 - loss 1.75557149 - lr 0.0010 - time 131.84s |
|
2024-07-30 03:46:06,957 batch 134/672 - loss 1.74974602 - lr 0.0010 - time 250.15s |
|
2024-07-30 03:48:35,604 batch 201/672 - loss 1.74676394 - lr 0.0010 - time 398.79s |
|
2024-07-30 03:50:38,242 batch 268/672 - loss 1.74127575 - lr 0.0010 - time 521.43s |
|
2024-07-30 03:53:07,602 batch 335/672 - loss 1.73835572 - lr 0.0010 - time 670.79s |
|
2024-07-30 03:55:13,545 batch 402/672 - loss 1.73563331 - lr 0.0010 - time 796.74s |
|
2024-07-30 03:57:11,578 batch 469/672 - loss 1.73164746 - lr 0.0010 - time 914.77s |
|
2024-07-30 03:59:22,636 batch 536/672 - loss 1.72733042 - lr 0.0010 - time 1045.83s |
|
2024-07-30 04:01:27,047 batch 603/672 - loss 1.72134435 - lr 0.0010 - time 1170.24s |
|
2024-07-30 04:03:30,004 batch 670/672 - loss 1.71633585 - lr 0.0010 - time 1293.20s |
|
2024-07-30 04:03:33,232 ---------------------------------------------------------------------------------------------------- |
|
2024-07-30 04:03:33,234 EPOCH 5 DONE |
|
2024-07-30 04:04:07,476 TRAIN Loss: 1.7158 |
|
2024-07-30 04:04:07,478 DEV Loss: 4.7345 |
|
2024-07-30 04:04:07,478 DEV Perplexity: 113.8115 |
|
2024-07-30 04:04:07,478 No improvement for 4 epoch(s) |
|
2024-07-30 04:04:07,478 ---------------------------------------------------------------------------------------------------- |
|
2024-07-30 04:04:07,478 EPOCH 6 |
|
2024-07-30 04:06:07,000 batch 67/672 - loss 1.62654531 - lr 0.0001 - time 119.52s |
|
2024-07-30 04:08:25,932 batch 134/672 - loss 1.62444705 - lr 0.0001 - time 258.45s |
|
2024-07-30 04:10:20,762 batch 201/672 - loss 1.62080814 - lr 0.0001 - time 373.28s |
|
2024-07-30 04:12:32,261 batch 268/672 - loss 1.62108705 - lr 0.0001 - time 504.78s |
|
2024-07-30 04:14:45,293 batch 335/672 - loss 1.61820102 - lr 0.0001 - time 637.81s |
|
2024-07-30 04:16:51,180 batch 402/672 - loss 1.61746165 - lr 0.0001 - time 763.70s |
|
2024-07-30 04:18:59,569 batch 469/672 - loss 1.61459681 - lr 0.0001 - time 892.09s |
|
2024-07-30 04:21:23,775 batch 536/672 - loss 1.61302190 - lr 0.0001 - time 1036.30s |
|
2024-07-30 04:23:36,945 batch 603/672 - loss 1.60916015 - lr 0.0001 - time 1169.47s |
|
2024-07-30 04:25:42,320 batch 670/672 - loss 1.60702325 - lr 0.0001 - time 1294.84s |
|
2024-07-30 04:25:44,877 ---------------------------------------------------------------------------------------------------- |
|
2024-07-30 04:25:44,881 EPOCH 6 DONE |
|
2024-07-30 04:26:19,147 TRAIN Loss: 1.6068 |
|
2024-07-30 04:26:19,148 DEV Loss: 4.7702 |
|
2024-07-30 04:26:19,148 DEV Perplexity: 117.9441 |
|
2024-07-30 04:26:19,148 No improvement for 5 epoch(s) |
|
2024-07-30 04:26:19,148 Patience reached: Terminating model training due to early stopping |
|
2024-07-30 04:26:19,148 ---------------------------------------------------------------------------------------------------- |
|
2024-07-30 04:26:19,148 Finished Training |
|
2024-07-30 04:27:25,619 TEST Perplexity: 56.6321 |
|
2024-07-30 04:34:41,478 TEST BLEU = 3.34 40.7/3.8/1.3/0.6 (BP = 1.000 ratio = 1.000 hyp_len = 81 ref_len = 81) |
|
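Note on the logged metrics: the DEV perplexity values above appear to be the exponential of the DEV loss (for example, exp(4.0377) is roughly 56.70). The sketch below recomputes perplexity from the logged losses under that assumption. The names `dev_log` and `perplexity` are hypothetical helpers introduced only for this check and are not taken from the training code, and the natural-log base is an assumption inferred from the numbers.

```python
import math

# (epoch, DEV loss, DEV perplexity) copied from the log above.
dev_log = [
    (1, 4.0377, 56.6981),
    (2, 4.1666, 64.4964),
    (3, 4.2504, 70.1329),
    (4, 4.5206, 91.8923),
    (5, 4.7345, 113.8115),
    (6, 4.7702, 117.9441),
]


def perplexity(loss: float) -> float:
    """Perplexity assuming the loss is a mean natural-log cross-entropy."""
    return math.exp(loss)


# Compare the recomputed value with the logged one for every epoch.
for epoch, loss, logged_ppl in dev_log:
    recomputed = perplexity(loss)
    print(f"epoch {epoch}: logged {logged_ppl:.4f}, recomputed {recomputed:.4f}")

# The epoch with the lowest DEV loss is the one marked "New best score!".
best_epoch = min(dev_log, key=lambda row: row[1])[0]
print("best epoch by DEV loss:", best_epoch)
```

Small differences between the logged and recomputed values are expected, because the log rounds the loss to four decimal places before it is exponentiated here.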
|