|
Starting training with learning rate of 1e-05 |
|
Epoch 1/200 training_loss: 8.1261 |
|
Epoch 1/200 clustering_loss: 9.4897 |
|
Epoch 1/200 target_entropy: 2.6206 |
|
Updated learning rate to: 5.95000000000005e-05 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_1.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_1.pt... |
|
Time cost: 34m26.5s |
|
--- |
|
Epoch 2/200 training_loss: 7.4766 |
|
Epoch 2/200 clustering_loss: 3.9344 |
|
Epoch 2/200 target_entropy: 2.5216 |
|
Updated learning rate to: 0.00010900000000000067 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_2.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_2.pt... |
|
Time cost: 32m40.8s |
|
--- |
|
Epoch 3/200 training_loss: 7.1787 |
|
Epoch 3/200 clustering_loss: 2.9921 |
|
Epoch 3/200 target_entropy: 2.1595 |
|
Updated learning rate to: 0.0001585000000000022 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_3.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_3.pt... |
|
Time cost: 32m46.2s |
|
--- |
|
Epoch 4/200 training_loss: 6.9961 |
|
Epoch 4/200 clustering_loss: 2.6688 |
|
Epoch 4/200 target_entropy: 1.8905 |
|
Updated learning rate to: 0.00020800000000000638 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_4.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_4.pt... |
|
Time cost: 32m48.6s |
|
--- |
|
Epoch 5/200 training_loss: 7.0223 |
|
Epoch 5/200 clustering_loss: 2.4584 |
|
Epoch 5/200 target_entropy: 1.7007 |
|
Updated learning rate to: 0.0002575000000000082 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_5.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_5.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_1.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_1_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_1.pt... |
|
Time cost: 32m45.9s |
|
--- |
|
Epoch 6/200 training_loss: 7.1984 |
|
Epoch 6/200 clustering_loss: 2.3618 |
|
Epoch 6/200 target_entropy: 1.5922 |
|
Updated learning rate to: 0.00030700000000000714 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_6.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_6.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_2.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_2_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_2.pt... |
|
Time cost: 32m45.1s |
|
--- |
|
Epoch 7/200 training_loss: 7.3917 |
|
Epoch 7/200 clustering_loss: 2.3260 |
|
Epoch 7/200 target_entropy: 1.5528 |
|
Updated learning rate to: 0.00035650000000000514 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_7.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_7.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_3.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_3_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_3.pt... |
|
Time cost: 32m49.7s |
|
--- |
|
Epoch 8/200 training_loss: 7.5620 |
|
Epoch 8/200 clustering_loss: 2.3115 |
|
Epoch 8/200 target_entropy: 1.5443 |
|
Updated learning rate to: 0.0004060000000000013 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_8.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_8.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_4.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_4_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_4.pt... |
|
Time cost: 32m48.4s |
|
--- |
|
Epoch 9/200 training_loss: 7.6907 |
|
Epoch 9/200 clustering_loss: 2.3062 |
|
Epoch 9/200 target_entropy: 1.5514 |
|
Updated learning rate to: 0.000455499999999997 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_9.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_9.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_5.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_5_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_5.pt... |
|
Time cost: 32m50.9s |
|
--- |
|
Epoch 10/200 training_loss: 7.7734 |
|
Epoch 10/200 clustering_loss: 2.3074 |
|
Epoch 10/200 target_entropy: 1.5652 |
|
Updated learning rate to: 0.0005049999999999972 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_10.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_10.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_6.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_6_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_6.pt... |
|
Time cost: 32m46.6s |
|
--- |
|
Epoch 11/200 training_loss: 7.8141 |
|
Epoch 11/200 clustering_loss: 2.3112 |
|
Epoch 11/200 target_entropy: 1.5798 |
|
Updated learning rate to: 0.0005544999999999904 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_11.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_11.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_7.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_7_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_7.pt... |
|
Time cost: 32m49.8s |
|
--- |
|
Epoch 12/200 training_loss: 7.8188 |
|
Epoch 12/200 clustering_loss: 2.3128 |
|
Epoch 12/200 target_entropy: 1.5934 |
|
Updated learning rate to: 0.0006039999999999798 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_12.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_12.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_8.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_8_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_8.pt... |
|
Time cost: 32m51.7s |
|
--- |
|
Epoch 13/200 training_loss: 7.7958 |
|
Epoch 13/200 clustering_loss: 2.3127 |
|
Epoch 13/200 target_entropy: 1.6056 |
|
Updated learning rate to: 0.0006534999999999674 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_13.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_13.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_9.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_9_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_9.pt... |
|
Time cost: 32m55.5s |
|
--- |
|
Epoch 14/200 training_loss: 7.7487 |
|
Epoch 14/200 clustering_loss: 2.3096 |
|
Epoch 14/200 target_entropy: 1.6114 |
|
Updated learning rate to: 0.0007029999999999472 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_14.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_14.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_10.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_10_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_10.pt... |
|
Time cost: 32m51.2s |
|
--- |
|
Epoch 15/200 training_loss: 7.6876 |
|
Epoch 15/200 clustering_loss: 2.3036 |
|
Epoch 15/200 target_entropy: 1.6106 |
|
Updated learning rate to: 0.0007524999999999336 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_15.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_15.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_11.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_11_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_11.pt... |
|
Time cost: 32m55.6s |
|
--- |
|
Epoch 16/200 training_loss: 7.6236 |
|
Epoch 16/200 clustering_loss: 2.2948 |
|
Epoch 16/200 target_entropy: 1.6045 |
|
Updated learning rate to: 0.0008019999999999188 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_16.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_16.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_12.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_12_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_12.pt... |
|
Time cost: 32m46.9s |
|
--- |
|
Epoch 17/200 training_loss: 7.5657 |
|
Epoch 17/200 clustering_loss: 2.2847 |
|
Epoch 17/200 target_entropy: 1.5952 |
|
Updated learning rate to: 0.0008514999999999057 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_17.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_17.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_13.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_13_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_13.pt... |
|
Time cost: 32m53.8s |
|
--- |
|
Epoch 18/200 training_loss: 7.5146 |
|
Epoch 18/200 clustering_loss: 2.2741 |
|
Epoch 18/200 target_entropy: 1.5852 |
|
Updated learning rate to: 0.0009009999999998955 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_18.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_18.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_14.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_14_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_14.pt... |
|
Time cost: 32m54.4s |
|
--- |
|
Epoch 19/200 training_loss: 7.4667 |
|
Epoch 19/200 clustering_loss: 2.2624 |
|
Epoch 19/200 target_entropy: 1.5733 |
|
Updated learning rate to: 0.0009504999999998828 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_19.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_19.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_15.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_15_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_15.pt... |
|
Time cost: 32m55.6s |
|
--- |
|
Epoch 20/200 training_loss: 7.4154 |
|
Epoch 20/200 clustering_loss: 2.2500 |
|
Epoch 20/200 target_entropy: 1.5607 |
|
Updated learning rate to: 0.001 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_20.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_20.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_16.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_16_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_16.pt... |
|
Time cost: 32m45.4s |
|
--- |
|
Epoch 21/200 training_loss: 7.3684 |
|
Epoch 21/200 clustering_loss: 2.2379 |
|
Epoch 21/200 target_entropy: 1.5490 |
|
Updated learning rate to: 0.000999924694227199 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_21.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_21.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_17.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_17_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_17.pt... |
|
Time cost: 32m57.5s |
|
--- |
|
Epoch 22/200 training_loss: 7.3258 |
|
Epoch 22/200 clustering_loss: 2.2265 |
|
Epoch 22/200 target_entropy: 1.5390 |
|
Updated learning rate to: 0.0009996987995949066 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_22.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_22.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_18.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_18_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_18.pt... |
|
Time cost: 33m00.0s |
|
--- |
|
Epoch 23/200 training_loss: 7.2837 |
|
Epoch 23/200 clustering_loss: 2.2139 |
|
Epoch 23/200 target_entropy: 1.5277 |
|
Updated learning rate to: 0.000999322384154599 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_23.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_23.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_19.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_19_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_19.pt... |
|
Time cost: 32m52.4s |
|
--- |
|
Epoch 24/200 training_loss: 7.2448 |
|
Epoch 24/200 clustering_loss: 2.2013 |
|
Epoch 24/200 target_entropy: 1.5161 |
|
Updated learning rate to: 0.00099879556130265 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_24.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_24.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_20.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_20_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_20.pt... |
|
Time cost: 32m56.8s |
|
--- |
|
Epoch 25/200 training_loss: 7.2082 |
|
Epoch 25/200 clustering_loss: 2.1867 |
|
Epoch 25/200 target_entropy: 1.5025 |
|
Updated learning rate to: 0.0009981184897461257 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_25.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_25.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_21.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_21_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_21.pt... |
|
Time cost: 32m52.0s |
|
--- |
|
Epoch 26/200 training_loss: 7.1700 |
|
Epoch 26/200 clustering_loss: 2.1710 |
|
Epoch 26/200 target_entropy: 1.4879 |
|
Updated learning rate to: 0.0009972913734550167 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_26.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_26.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_22.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_22_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_22.pt... |
|
Time cost: 32m47.9s |
|
--- |
|
Epoch 27/200 training_loss: 7.1307 |
|
Epoch 27/200 clustering_loss: 2.1540 |
|
Epoch 27/200 target_entropy: 1.4724 |
|
Updated learning rate to: 0.0009963144616007712 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_27.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_27.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_23.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_23_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_23.pt... |
|
Time cost: 32m51.6s |
|
--- |
|
Epoch 28/200 training_loss: 7.0912 |
|
Epoch 28/200 clustering_loss: 2.1361 |
|
Epoch 28/200 target_entropy: 1.4562 |
|
Updated learning rate to: 0.000995188048481222 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_28.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_28.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_24.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_24_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_24.pt... |
|
Time cost: 32m53.0s |
|
--- |
|
Epoch 29/200 training_loss: 7.0468 |
|
Epoch 29/200 clustering_loss: 2.1176 |
|
Epoch 29/200 target_entropy: 1.4384 |
|
Updated learning rate to: 0.0009939124734319395 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_29.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_29.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_25.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_25_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_25.pt... |
|
Time cost: 32m49.0s |
|
--- |
|
Epoch 30/200 training_loss: 7.0043 |
|
Epoch 30/200 clustering_loss: 2.0988 |
|
Epoch 30/200 target_entropy: 1.4201 |
|
Updated learning rate to: 0.000992488120724019 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_30.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_30.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_26.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_26_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_26.pt... |
|
Time cost: 32m49.3s |
|
--- |
|
Epoch 31/200 training_loss: 6.9610 |
|
Epoch 31/200 clustering_loss: 2.0794 |
|
Epoch 31/200 target_entropy: 1.4013 |
|
Updated learning rate to: 0.0009909154194482875 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_31.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_31.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_27.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_27_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_27.pt... |
|
Time cost: 32m50.5s |
|
--- |
|
Epoch 32/200 training_loss: 6.9211 |
|
Epoch 32/200 clustering_loss: 2.0616 |
|
Epoch 32/200 target_entropy: 1.3846 |
|
Updated learning rate to: 0.0009891948433860694 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_32.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_32.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_28.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_28_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_28.pt... |
|
Time cost: 32m46.0s |
|
--- |
|
Epoch 33/200 training_loss: 6.8787 |
|
Epoch 33/200 clustering_loss: 2.0458 |
|
Epoch 33/200 target_entropy: 1.3699 |
|
Updated learning rate to: 0.0009873269108664417 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_33.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_33.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_29.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_29_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_29.pt... |
|
Time cost: 32m46.5s |
|
--- |
|
Epoch 34/200 training_loss: 6.8365 |
|
Epoch 34/200 clustering_loss: 2.0278 |
|
Epoch 34/200 target_entropy: 1.3520 |
|
Updated learning rate to: 0.0009853121846100706 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_34.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_34.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_30.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_30_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_30.pt... |
|
Time cost: 32m44.9s |
|
--- |
|
Epoch 35/200 training_loss: 6.7977 |
|
Epoch 35/200 clustering_loss: 2.0109 |
|
Epoch 35/200 target_entropy: 1.3364 |
|
Updated learning rate to: 0.0009831512715597283 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_35.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_35.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_31.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_31_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_31.pt... |
|
Time cost: 32m49.3s |
|
--- |
|
Epoch 36/200 training_loss: 6.7625 |
|
Epoch 36/200 clustering_loss: 1.9944 |
|
Epoch 36/200 target_entropy: 1.3201 |
|
Updated learning rate to: 0.0009808448226974215 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_36.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_36.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_32.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_32_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_32.pt... |
|
Time cost: 32m57.9s |
|
--- |
|
Epoch 37/200 training_loss: 6.7270 |
|
Epoch 37/200 clustering_loss: 1.9801 |
|
Epoch 37/200 target_entropy: 1.3075 |
|
Updated learning rate to: 0.0009783935328482938 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_37.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_37.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_33.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_33_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_33.pt... |
|
Time cost: 32m46.0s |
|
--- |
|
Epoch 38/200 training_loss: 6.6932 |
|
Epoch 38/200 clustering_loss: 1.9670 |
|
Epoch 38/200 target_entropy: 1.2962 |
|
Updated learning rate to: 0.0009757981404712963 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_38.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_38.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_34.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_34_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_34.pt... |
|
Time cost: 32m51.8s |
|
--- |
|
Epoch 39/200 training_loss: 6.6603 |
|
Epoch 39/200 clustering_loss: 1.9545 |
|
Epoch 39/200 target_entropy: 1.2846 |
|
Updated learning rate to: 0.000973059427436721 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_39.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_39.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_35.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_35_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_35.pt... |
|
Time cost: 32m48.6s |
|
--- |
|
Epoch 40/200 training_loss: 6.6271 |
|
Epoch 40/200 clustering_loss: 1.9419 |
|
Epoch 40/200 target_entropy: 1.2724 |
|
Updated learning rate to: 0.0009701782187906851 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_40.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_40.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_36.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_36_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_36.pt... |
|
Time cost: 32m53.2s |
|
--- |
|
Epoch 41/200 training_loss: 6.5939 |
|
Epoch 41/200 clustering_loss: 1.9305 |
|
Epoch 41/200 target_entropy: 1.2625 |
|
Updated learning rate to: 0.0009671553825065633 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_41.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_41.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_37.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_37_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_37.pt... |
|
Time cost: 32m51.4s |
|
--- |
|
Epoch 42/200 training_loss: 6.5602 |
|
Epoch 42/200 clustering_loss: 1.9186 |
|
Epoch 42/200 target_entropy: 1.2518 |
|
Updated learning rate to: 0.0009639918292235034 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_42.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_42.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_38.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_38_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_38.pt... |
|
Time cost: 32m51.9s |
|
--- |
|
Epoch 43/200 training_loss: 6.5252 |
|
Epoch 43/200 clustering_loss: 1.9056 |
|
Epoch 43/200 target_entropy: 1.2392 |
|
Updated learning rate to: 0.0009606885119721099 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_43.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_43.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_39.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_39_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_39.pt... |
|
Time cost: 32m56.3s |
|
--- |
|
Epoch 44/200 training_loss: 6.4913 |
|
Epoch 44/200 clustering_loss: 1.8941 |
|
Epoch 44/200 target_entropy: 1.2289 |
|
Updated learning rate to: 0.0009572464258873344 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_44.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_44.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_40.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_40_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_40.pt... |
|
Time cost: 32m52.6s |
|
--- |
|
Epoch 45/200 training_loss: 6.4608 |
|
Epoch 45/200 clustering_loss: 1.8837 |
|
Epoch 45/200 target_entropy: 1.2193 |
|
Updated learning rate to: 0.0009536666079086723 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_45.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_45.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_41.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_41_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_41.pt... |
|
Time cost: 32m52.6s |
|
--- |
|
Epoch 46/200 training_loss: 6.4284 |
|
Epoch 46/200 clustering_loss: 1.8722 |
|
Epoch 46/200 target_entropy: 1.2082 |
|
Updated learning rate to: 0.000949950136467809 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_46.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_46.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_42.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_42_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_42.pt... |
|
Time cost: 32m50.6s |
|
--- |
|
Epoch 47/200 training_loss: 6.3976 |
|
Epoch 47/200 clustering_loss: 1.8617 |
|
Epoch 47/200 target_entropy: 1.1983 |
|
Updated learning rate to: 0.000946098131163723 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_47.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_47.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_43.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_43_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_43.pt... |
|
Time cost: 32m49.9s |
|
--- |
|
Epoch 48/200 training_loss: 6.3662 |
|
Epoch 48/200 clustering_loss: 1.8513 |
|
Epoch 48/200 target_entropy: 1.1888 |
|
Updated learning rate to: 0.000942111752425399 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_48.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_48.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_44.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_44_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_44.pt... |
|
Time cost: 32m49.7s |
|
--- |
|
Epoch 49/200 training_loss: 6.3298 |
|
Epoch 49/200 clustering_loss: 1.8407 |
|
Epoch 49/200 target_entropy: 1.1797 |
|
Updated learning rate to: 0.0009379922011622562 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_49.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_49.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_45.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_45_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_45.pt... |
|
Time cost: 32m44.8s |
|
--- |
|
Epoch 50/200 training_loss: 6.2932 |
|
Epoch 50/200 clustering_loss: 1.8305 |
|
Epoch 50/200 target_entropy: 1.1704 |
|
Updated learning rate to: 0.0009337407184023574 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_50.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_50.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_46.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_46_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_46.pt... |
|
Time cost: 32m41.2s |
|
--- |
|
Epoch 51/200 training_loss: 6.2595 |
|
Epoch 51/200 clustering_loss: 1.8189 |
|
Epoch 51/200 target_entropy: 1.1601 |
|
Updated learning rate to: 0.0009293585849185674 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_51.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_51.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_47.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_47_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_47.pt... |
|
Time cost: 32m48.9s |
|
--- |
|
Epoch 52/200 training_loss: 6.2249 |
|
Epoch 52/200 clustering_loss: 1.8090 |
|
Epoch 52/200 target_entropy: 1.1520 |
|
Updated learning rate to: 0.0009248471208426868 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_52.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_52.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_48.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_48_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_48.pt... |
|
Time cost: 32m43.2s |
|
--- |
|
Epoch 53/200 training_loss: 6.1902 |
|
Epoch 53/200 clustering_loss: 1.7985 |
|
Epoch 53/200 target_entropy: 1.1419 |
|
Updated learning rate to: 0.0009202076852677824 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_53.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_53.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_49.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_49_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_49.pt... |
|
Time cost: 32m40.9s |
|
--- |
|
Epoch 54/200 training_loss: 6.1581 |
|
Epoch 54/200 clustering_loss: 1.7882 |
|
Epoch 54/200 target_entropy: 1.1321 |
|
Updated learning rate to: 0.0009154416758387467 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_54.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_54.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_50.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_50_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_50.pt... |
|
Time cost: 32m43.9s |
|
--- |
|
Epoch 55/200 training_loss: 6.1263 |
|
Epoch 55/200 clustering_loss: 1.7771 |
|
Epoch 55/200 target_entropy: 1.1227 |
|
Updated learning rate to: 0.0009105505283312465 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_55.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_55.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_51.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_51_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_51.pt... |
|
Time cost: 32m46.3s |
|
--- |
|
Epoch 56/200 training_loss: 6.0939 |
|
Epoch 56/200 clustering_loss: 1.7677 |
|
Epoch 56/200 target_entropy: 1.1156 |
|
Updated learning rate to: 0.0009055357162192037 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_56.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_56.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_52.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_52_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_52.pt... |
|
Time cost: 32m53.9s |
|
--- |
|
Epoch 57/200 training_loss: 6.0627 |
|
Epoch 57/200 clustering_loss: 1.7583 |
|
Epoch 57/200 target_entropy: 1.1077 |
|
Updated learning rate to: 0.0009003987502308961 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_57.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_57.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_53.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_53_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_53.pt... |
|
Time cost: 32m47.7s |
|
--- |
|
Epoch 58/200 training_loss: 6.0349 |
|
Epoch 58/200 clustering_loss: 1.7496 |
|
Epoch 58/200 target_entropy: 1.1001 |
|
Updated learning rate to: 0.0008951411778938494 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_58.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_58.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_54.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_54_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_54.pt... |
|
Time cost: 32m49.9s |
|
--- |
|
Epoch 59/200 training_loss: 6.0045 |
|
Epoch 59/200 clustering_loss: 1.7423 |
|
Epoch 59/200 target_entropy: 1.0942 |
|
Updated learning rate to: 0.0008897645830686416 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_59.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_59.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_55.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_55_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_55.pt... |
|
Time cost: 32m48.7s |
|
--- |
|
Epoch 60/200 training_loss: 5.9759 |
|
Epoch 60/200 clustering_loss: 1.7339 |
|
Epoch 60/200 target_entropy: 1.0865 |
|
Updated learning rate to: 0.0008842705854717593 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_60.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_60.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_56.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_56_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_56.pt... |
|
Time cost: 32m47.5s |
|
--- |
|
Epoch 61/200 training_loss: 5.9496 |
|
Epoch 61/200 clustering_loss: 1.7259 |
|
Epoch 61/200 target_entropy: 1.0791 |
|
Updated learning rate to: 0.0008786608401876489 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_61.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_61.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_57.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_57_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_57.pt... |
|
Time cost: 32m49.1s |
|
--- |
|
Epoch 62/200 training_loss: 5.9221 |
|
Epoch 62/200 clustering_loss: 1.7184 |
|
Epoch 62/200 target_entropy: 1.0728 |
|
Updated learning rate to: 0.0008729370371701193 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_62.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_62.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_58.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_58_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_58.pt... |
|
Time cost: 32m46.4s |
|
--- |
|
Epoch 63/200 training_loss: 5.8921 |
|
Epoch 63/200 clustering_loss: 1.7093 |
|
Epoch 63/200 target_entropy: 1.0648 |
|
Updated learning rate to: 0.0008671009007332444 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_63.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_63.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_59.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_59_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_59.pt... |
|
Time cost: 32m47.4s |
|
--- |
|
Epoch 64/200 training_loss: 5.8614 |
|
Epoch 64/200 clustering_loss: 1.7010 |
|
Epoch 64/200 target_entropy: 1.0580 |
|
Updated learning rate to: 0.0008611541890318961 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_64.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_64.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_60.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_60_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_60.pt... |
|
Time cost: 32m51.3s |
|
--- |
|
Epoch 65/200 training_loss: 5.8343 |
|
Epoch 65/200 clustering_loss: 1.6930 |
|
Epoch 65/200 target_entropy: 1.0517 |
|
Updated learning rate to: 0.0008550986935321035 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_65.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_65.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_61.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_61_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_61.pt... |
|
Time cost: 32m48.6s |
|
--- |
|
Epoch 66/200 training_loss: 5.8052 |
|
Epoch 66/200 clustering_loss: 1.6836 |
|
Epoch 66/200 target_entropy: 1.0429 |
|
Updated learning rate to: 0.0008489362384713594 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_66.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_66.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_62.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_62_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_62.pt... |
|
Time cost: 32m43.5s |
|
--- |
|
Epoch 67/200 training_loss: 5.7791 |
|
Epoch 67/200 clustering_loss: 1.6755 |
|
Epoch 67/200 target_entropy: 1.0361 |
|
Updated learning rate to: 0.0008426686803090767 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_67.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_67.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_63.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_63_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_63.pt... |
|
Time cost: 32m50.4s |
|
--- |
|
Epoch 68/200 training_loss: 5.7531 |
|
Epoch 68/200 clustering_loss: 1.6695 |
|
Epoch 68/200 target_entropy: 1.0317 |
|
Updated learning rate to: 0.0008362979071673079 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_68.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_68.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_64.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_64_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_64.pt... |
|
Time cost: 32m49.1s |
|
--- |
|
Epoch 69/200 training_loss: 5.7266 |
|
Epoch 69/200 clustering_loss: 1.6634 |
|
Epoch 69/200 target_entropy: 1.0269 |
|
Updated learning rate to: 0.0008298258382619576 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_69.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_69.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_65.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_65_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_65.pt... |
|
Time cost: 32m49.9s |
|
--- |
|
Epoch 70/200 training_loss: 5.6989 |
|
Epoch 70/200 clustering_loss: 1.6552 |
|
Epoch 70/200 target_entropy: 1.0197 |
|
Updated learning rate to: 0.0008232544233245987 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_70.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_70.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_66.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_66_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_66.pt... |
|
Time cost: 32m47.9s |
|
--- |
|
Epoch 71/200 training_loss: 5.6738 |
|
Epoch 71/200 clustering_loss: 1.6476 |
|
Epoch 71/200 target_entropy: 1.0135 |
|
Updated learning rate to: 0.000816585642015118 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_71.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_71.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_67.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_67_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_67.pt... |
|
Time cost: 32m50.7s |
|
--- |
|
Epoch 72/200 training_loss: 5.6496 |
|
Epoch 72/200 clustering_loss: 1.6420 |
|
Epoch 72/200 target_entropy: 1.0090 |
|
Updated learning rate to: 0.0008098215033253394 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_72.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_72.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_68.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_68_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_68.pt... |
|
Time cost: 32m48.9s |
|
--- |
|
Epoch 73/200 training_loss: 5.6265 |
|
Epoch 73/200 clustering_loss: 1.6362 |
|
Epoch 73/200 target_entropy: 1.0047 |
|
Updated learning rate to: 0.0008029640449737957 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_73.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_73.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_69.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_69_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_69.pt... |
|
Time cost: 32m43.9s |
|
--- |
|
Epoch 74/200 training_loss: 5.6045 |
|
Epoch 74/200 clustering_loss: 1.6295 |
|
Epoch 74/200 target_entropy: 0.9993 |
|
Updated learning rate to: 0.0007960153327918694 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_74.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_74.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_70.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_70_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_70.pt... |
|
Time cost: 32m42.8s |
|
--- |
|
Epoch 75/200 training_loss: 5.5799 |
|
Epoch 75/200 clustering_loss: 1.6222 |
|
Epoch 75/200 target_entropy: 0.9929 |
|
Updated learning rate to: 0.0007889774601014634 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_75.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_75.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_71.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_71_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_71.pt... |
|
Time cost: 32m41.3s |
|
--- |
|
Epoch 76/200 training_loss: 5.5585 |
|
Epoch 76/200 clustering_loss: 1.6134 |
|
Epoch 76/200 target_entropy: 0.9850 |
|
Updated learning rate to: 0.0007818525470843588 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_76.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_76.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_72.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_72_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_72.pt... |
|
Time cost: 32m46.0s |
|
--- |
|
Epoch 77/200 training_loss: 5.5355 |
|
Epoch 77/200 clustering_loss: 1.6051 |
|
Epoch 77/200 target_entropy: 0.9783 |
|
Updated learning rate to: 0.000774642740143523 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_77.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_77.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_73.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_73_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_73.pt... |
|
Time cost: 32m40.6s |
|
--- |
|
Epoch 78/200 training_loss: 5.5163 |
|
Epoch 78/200 clustering_loss: 1.5971 |
|
Epoch 78/200 target_entropy: 0.9717 |
|
Updated learning rate to: 0.0007673502112564829 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_78.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_78.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_74.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_74_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_74.pt... |
|
Time cost: 32m45.5s |
|
--- |
|
Epoch 79/200 training_loss: 5.4933 |
|
Epoch 79/200 clustering_loss: 1.5894 |
|
Epoch 79/200 target_entropy: 0.9652 |
|
Updated learning rate to: 0.0007599771573210242 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_79.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_79.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_75.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_75_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_75.pt... |
|
Time cost: 32m40.3s |
|
--- |
|
Epoch 80/200 training_loss: 5.4753 |
|
Epoch 80/200 clustering_loss: 1.5846 |
|
Epoch 80/200 target_entropy: 0.9623 |
|
Updated learning rate to: 0.0007525257994933621 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_80.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_80.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_76.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_76_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_76.pt... |
|
Time cost: 32m42.5s |
|
--- |
|
Epoch 81/200 training_loss: 5.4581 |
|
Epoch 81/200 clustering_loss: 1.5785 |
|
Epoch 81/200 target_entropy: 0.9572 |
|
Updated learning rate to: 0.0007449983825190082 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_81.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_81.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_77.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_77_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_77.pt... |
|
Time cost: 32m48.3s |
|
--- |
|
Epoch 82/200 training_loss: 5.4399 |
|
Epoch 82/200 clustering_loss: 1.5714 |
|
Epoch 82/200 target_entropy: 0.9508 |
|
Updated learning rate to: 0.0007373971740565361 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_82.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_82.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_78.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_78_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_78.pt... |
|
Time cost: 32m45.7s |
|
--- |
|
Epoch 83/200 training_loss: 5.4244 |
|
Epoch 83/200 clustering_loss: 1.5651 |
|
Epoch 83/200 target_entropy: 0.9461 |
|
Updated learning rate to: 0.0007297244639944488 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_83.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_83.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_79.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_79_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_79.pt... |
|
Time cost: 32m41.1s |
|
--- |
|
Epoch 84/200 training_loss: 5.4067 |
|
Epoch 84/200 clustering_loss: 1.5590 |
|
Epoch 84/200 target_entropy: 0.9419 |
|
Updated learning rate to: 0.0007219825637613279 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_84.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_84.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_80.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_80_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_80.pt... |
|
Time cost: 32m45.9s |
|
--- |
|
Epoch 85/200 training_loss: 5.3857 |
|
Epoch 85/200 clustering_loss: 1.5531 |
|
Epoch 85/200 target_entropy: 0.9370 |
|
Updated learning rate to: 0.0007141738056295242 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_85.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_85.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_81.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_81_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_81.pt... |
|
Time cost: 32m41.0s |
|
--- |
|
Epoch 86/200 training_loss: 5.3673 |
|
Epoch 86/200 clustering_loss: 1.5452 |
|
Epoch 86/200 target_entropy: 0.9298 |
|
Updated learning rate to: 0.0007063005420125365 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_86.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_86.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_82.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_82_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_82.pt... |
|
Time cost: 32m43.3s |
|
--- |
|
Epoch 87/200 training_loss: 5.3494 |
|
Epoch 87/200 clustering_loss: 1.5374 |
|
Epoch 87/200 target_entropy: 0.9231 |
|
Updated learning rate to: 0.0006983651447563605 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_87.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_87.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_83.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_83_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_83.pt... |
|
Time cost: 32m43.7s |
|
--- |
|
Epoch 88/200 training_loss: 5.3332 |
|
Epoch 88/200 clustering_loss: 1.5325 |
|
Epoch 88/200 target_entropy: 0.9192 |
|
Updated learning rate to: 0.0006903700044249427 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_88.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_88.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_84.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_84_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_84.pt... |
|
Time cost: 32m45.6s |
|
--- |
|
Epoch 89/200 training_loss: 5.3172 |
|
Epoch 89/200 clustering_loss: 1.5269 |
|
Epoch 89/200 target_entropy: 0.9145 |
|
Updated learning rate to: 0.0006823175295800226 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_89.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_89.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_85.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_85_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_85.pt... |
|
Time cost: 32m49.0s |
|
--- |
|
Epoch 90/200 training_loss: 5.3011 |
|
Epoch 90/200 clustering_loss: 1.5228 |
|
Epoch 90/200 target_entropy: 0.9113 |
|
Updated learning rate to: 0.0006742101460555493 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_90.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_90.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_86.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_86_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_86.pt... |
|
Time cost: 32m42.8s |
|
--- |
|
Epoch 91/200 training_loss: 5.2841 |
|
Epoch 91/200 clustering_loss: 1.5174 |
|
Epoch 91/200 target_entropy: 0.9065 |
|
Updated learning rate to: 0.0006660502962268847 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_91.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_91.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_87.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_87_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_87.pt... |
|
Time cost: 32m41.5s |
|
--- |
|
Epoch 92/200 training_loss: 5.2683 |
|
Epoch 92/200 clustering_loss: 1.5130 |
|
Epoch 92/200 target_entropy: 0.9034 |
|
Updated learning rate to: 0.0006578404382750364 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_92.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_92.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_88.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_88_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_88.pt... |
|
Time cost: 32m46.2s |
|
--- |
|
Epoch 93/200 training_loss: 5.2526 |
|
Epoch 93/200 clustering_loss: 1.5077 |
|
Epoch 93/200 target_entropy: 0.8991 |
|
Updated learning rate to: 0.0006495830454461216 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_93.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_93.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_89.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_89_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_89.pt... |
|
Time cost: 32m44.1s |
|
--- |
|
Epoch 94/200 training_loss: 5.2367 |
|
Epoch 94/200 clustering_loss: 1.5020 |
|
Epoch 94/200 target_entropy: 0.8944 |
|
Updated learning rate to: 0.0006412806053062902 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_94.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_94.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_90.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_90_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_90.pt... |
|
Time cost: 32m34.5s |
|
--- |
|
Epoch 95/200 training_loss: 5.2201 |
|
Epoch 95/200 clustering_loss: 1.4970 |
|
Epoch 95/200 target_entropy: 0.8904 |
|
Updated learning rate to: 0.0006329356189923407 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_95.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_95.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_91.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_91_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_91.pt... |
|
Time cost: 32m41.7s |
|
--- |
|
Epoch 96/200 training_loss: 5.2060 |
|
Epoch 96/200 clustering_loss: 1.4939 |
|
Epoch 96/200 target_entropy: 0.8882 |
|
Updated learning rate to: 0.0006245506004582433 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_96.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_96.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_92.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_92_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_92.pt... |
|
Time cost: 32m39.5s |
|
--- |
|
Epoch 97/200 training_loss: 5.1900 |
|
Epoch 97/200 clustering_loss: 1.4904 |
|
Epoch 97/200 target_entropy: 0.8858 |
|
Updated learning rate to: 0.0006161280757177955 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_97.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_97.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_93.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_93_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_93.pt... |
|
Time cost: 32m33.8s |
|
--- |
|
Epoch 98/200 training_loss: 5.1746 |
|
Epoch 98/200 clustering_loss: 1.4861 |
|
Epoch 98/200 target_entropy: 0.8821 |
|
Updated learning rate to: 0.0006076705820836695 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_98.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_98.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_94.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_94_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_94.pt... |
|
Time cost: 32m43.7s |
|
--- |
|
Epoch 99/200 training_loss: 5.1612 |
|
Epoch 99/200 clustering_loss: 1.4820 |
|
Epoch 99/200 target_entropy: 0.8790 |
|
Updated learning rate to: 0.0005991806674030302 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_99.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_99.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_95.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_95_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_95.pt... |
|
Time cost: 32m41.1s |
|
--- |
|
Epoch 100/200 training_loss: 5.1454 |
|
Epoch 100/200 clustering_loss: 1.4776 |
|
Epoch 100/200 target_entropy: 0.8756 |
|
Updated learning rate to: 0.0005906608892899771 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_100.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_100.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_96.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_96_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_96.pt... |
|
Time cost: 32m41.4s |
|
--- |
|
Epoch 101/200 training_loss: 5.1332 |
|
Epoch 101/200 clustering_loss: 1.4733 |
|
Epoch 101/200 target_entropy: 0.8712 |
|
Updated learning rate to: 0.0005821138143550737 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_101.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_101.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_97.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_97_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_97.pt... |
|
Time cost: 32m44.1s |
|
--- |
|
Epoch 102/200 training_loss: 5.1228 |
|
Epoch 102/200 clustering_loss: 1.4695 |
|
Epoch 102/200 target_entropy: 0.8681 |
|
Updated learning rate to: 0.0005735420174321354 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_102.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_102.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_98.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_98_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_98.pt... |
|
Time cost: 32m46.7s |
|
--- |
|
Epoch 103/200 training_loss: 5.1139 |
|
Epoch 103/200 clustering_loss: 1.4660 |
|
Epoch 103/200 target_entropy: 0.8656 |
|
Updated learning rate to: 0.0005649480808025555 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_103.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_103.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_99.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_99_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_99.pt... |
|
Time cost: 32m45.2s |
|
--- |
|
Epoch 104/200 training_loss: 5.1051 |
|
Epoch 104/200 clustering_loss: 1.4636 |
|
Epoch 104/200 target_entropy: 0.8640 |
|
Updated learning rate to: 0.0005563345934173908 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_104.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_104.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_100.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_100_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_100.pt... |
|
Time cost: 32m50.9s |
|
--- |
|
Epoch 105/200 training_loss: 5.0939 |
|
Epoch 105/200 clustering_loss: 1.4606 |
|
Epoch 105/200 target_entropy: 0.8617 |
|
Updated learning rate to: 0.0005477041501174173 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_105.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_105.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_101.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_101_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_101.pt... |
|
Time cost: 32m49.5s |
|
--- |
|
Epoch 106/200 training_loss: 5.0848 |
|
Epoch 106/200 clustering_loss: 1.4586 |
|
Epoch 106/200 target_entropy: 0.8605 |
|
Updated learning rate to: 0.0005390593508514405 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_106.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_106.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_102.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_102_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_102.pt... |
|
Time cost: 32m42.9s |
|
--- |
|
Epoch 107/200 training_loss: 5.0760 |
|
Epoch 107/200 clustering_loss: 1.4559 |
|
Epoch 107/200 target_entropy: 0.8584 |
|
Updated learning rate to: 0.0005304027998930416 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_107.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_107.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_103.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_103_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_103.pt... |
|
Time cost: 32m48.3s |
|
--- |
|
Epoch 108/200 training_loss: 5.0666 |
|
Epoch 108/200 clustering_loss: 1.4531 |
|
Epoch 108/200 target_entropy: 0.8561 |
|
Updated learning rate to: 0.0005217371050560449 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_108.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_108.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_104.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_104_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_104.pt... |
|
Time cost: 32m38.6s |
|
--- |
|
Epoch 109/200 training_loss: 5.0573 |
|
Epoch 109/200 clustering_loss: 1.4504 |
|
Epoch 109/200 target_entropy: 0.8542 |
|
Updated learning rate to: 0.0005130648769088946 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_109.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_109.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_105.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_105_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_105.pt... |
|
Time cost: 32m47.2s |
|
--- |
|
Epoch 110/200 training_loss: 5.0511 |
|
Epoch 110/200 clustering_loss: 1.4485 |
|
Epoch 110/200 target_entropy: 0.8535 |
|
Updated learning rate to: 0.0005043887279882134 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_110.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_110.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_106.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_106_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_106.pt... |
|
Time cost: 32m44.1s |
|
--- |
|
Epoch 111/200 training_loss: 5.0434 |
|
Epoch 111/200 clustering_loss: 1.4464 |
|
Epoch 111/200 target_entropy: 0.8518 |
|
Updated learning rate to: 0.0004957112720117694 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_111.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_111.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_107.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_107_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_107.pt... |
|
Time cost: 32m41.4s |
|
--- |
|
Epoch 112/200 training_loss: 5.0358 |
|
Epoch 112/200 clustering_loss: 1.4435 |
|
Epoch 112/200 target_entropy: 0.8490 |
|
Updated learning rate to: 0.0004870351230910882 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_112.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_112.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_108.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_108_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_108.pt... |
|
Time cost: 32m47.2s |
|
--- |
|
Epoch 113/200 training_loss: 5.0317 |
|
Epoch 113/200 clustering_loss: 1.4416 |
|
Epoch 113/200 target_entropy: 0.8478 |
|
Updated learning rate to: 0.000478362894943939 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_113.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_113.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_109.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_109_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_109.pt... |
|
Time cost: 32m47.5s |
|
--- |
|
Epoch 114/200 training_loss: 5.0276 |
|
Epoch 114/200 clustering_loss: 1.4402 |
|
Epoch 114/200 target_entropy: 0.8469 |
|
Updated learning rate to: 0.0004696972001069459 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_114.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_114.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_110.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_110_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_110.pt... |
|
Time cost: 32m49.7s |
|
--- |
|
Epoch 115/200 training_loss: 5.0219 |
|
Epoch 115/200 clustering_loss: 1.4366 |
|
Epoch 115/200 target_entropy: 0.8440 |
|
Updated learning rate to: 0.00046104064914854873 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_115.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_115.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_111.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_111_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_111.pt... |
|
Time cost: 32m49.9s |
|
--- |
|
Epoch 116/200 training_loss: 5.0205 |
|
Epoch 116/200 clustering_loss: 1.4350 |
|
Epoch 116/200 target_entropy: 0.8430 |
|
Updated learning rate to: 0.0004523958498825716 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_116.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_116.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_112.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_112_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_112.pt... |
|
Time cost: 32m42.6s |
|
--- |
|
Epoch 117/200 training_loss: 5.0190 |
|
Epoch 117/200 clustering_loss: 1.4340 |
|
Epoch 117/200 target_entropy: 0.8424 |
|
Updated learning rate to: 0.0004437654065825967 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_117.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_117.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_113.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_113_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_113.pt... |
|
Time cost: 32m48.1s |
|
--- |
|
Epoch 118/200 training_loss: 5.0165 |
|
Epoch 118/200 clustering_loss: 1.4329 |
|
Epoch 118/200 target_entropy: 0.8417 |
|
Updated learning rate to: 0.00043515191919743203 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_118.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_118.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_114.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_114_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_114.pt... |
|
Time cost: 32m44.7s |
|
--- |
|
Epoch 119/200 training_loss: 5.0129 |
|
Epoch 119/200 clustering_loss: 1.4313 |
|
Epoch 119/200 target_entropy: 0.8407 |
|
Updated learning rate to: 0.0004265579825678553 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_119.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_119.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_115.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_115_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_115.pt... |
|
Time cost: 32m47.2s |
|
--- |
|
Epoch 120/200 training_loss: 5.0107 |
|
Epoch 120/200 clustering_loss: 1.4295 |
|
Epoch 120/200 target_entropy: 0.8397 |
|
Updated learning rate to: 0.0004179861856449166 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_120.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_120.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_116.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_116_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_116.pt... |
|
Time cost: 32m49.1s |
|
--- |
|
Epoch 121/200 training_loss: 5.0057 |
|
Epoch 121/200 clustering_loss: 1.4275 |
|
Epoch 121/200 target_entropy: 0.8385 |
|
Updated learning rate to: 0.0004094391107100128 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_121.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_121.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_117.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_117_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_117.pt... |
|
Time cost: 32m46.1s |
|
--- |
|
Epoch 122/200 training_loss: 5.0024 |
|
Epoch 122/200 clustering_loss: 1.4253 |
|
Epoch 122/200 target_entropy: 0.8366 |
|
Updated learning rate to: 0.0004009193325969589 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_122.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_122.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_118.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_118_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_118.pt... |
|
Time cost: 32m46.5s |
|
--- |
|
Epoch 123/200 training_loss: 4.9997 |
|
Epoch 123/200 clustering_loss: 1.4227 |
|
Epoch 123/200 target_entropy: 0.8344 |
|
Updated learning rate to: 0.000392429417916316 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_123.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_123.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_119.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_119_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_119.pt... |
|
Time cost: 32m45.7s |
|
--- |
|
Epoch 124/200 training_loss: 4.9964 |
|
Epoch 124/200 clustering_loss: 1.4198 |
|
Epoch 124/200 target_entropy: 0.8326 |
|
Updated learning rate to: 0.00038397192428219114 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_124.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_124.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_120.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_120_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_120.pt... |
|
Time cost: 32m45.7s |
|
--- |
|
Epoch 125/200 training_loss: 4.9932 |
|
Epoch 125/200 clustering_loss: 1.4181 |
|
Epoch 125/200 target_entropy: 0.8313 |
|
Updated learning rate to: 0.00037554939954174326 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_125.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_125.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_121.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_121_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_121.pt... |
|
Time cost: 32m42.8s |
|
--- |
|
Epoch 126/200 training_loss: 4.9890 |
|
Epoch 126/200 clustering_loss: 1.4155 |
|
Epoch 126/200 target_entropy: 0.8289 |
|
Updated learning rate to: 0.0003671643810076428 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_126.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_126.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_122.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_122_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_122.pt... |
|
Time cost: 32m48.7s |
|
--- |
|
Epoch 127/200 training_loss: 4.9887 |
|
Epoch 127/200 clustering_loss: 1.4142 |
|
Epoch 127/200 target_entropy: 0.8279 |
|
Updated learning rate to: 0.000358819394693693 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_127.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_127.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_123.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_123_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_123.pt... |
|
Time cost: 32m42.6s |
|
--- |
|
Epoch 128/200 training_loss: 4.9887 |
|
Epoch 128/200 clustering_loss: 1.4139 |
|
Epoch 128/200 target_entropy: 0.8283 |
|
Updated learning rate to: 0.00035051695455386297 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_128.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_128.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_124.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_124_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_124.pt... |
|
Time cost: 32m43.8s |
|
--- |
|
Epoch 129/200 training_loss: 4.9855 |
|
Epoch 129/200 clustering_loss: 1.4123 |
|
Epoch 129/200 target_entropy: 0.8272 |
|
Updated learning rate to: 0.0003422595617249509 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_129.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_129.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_125.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_125_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_125.pt... |
|
Time cost: 32m50.0s |
|
--- |
|
Epoch 130/200 training_loss: 4.9820 |
|
Epoch 130/200 clustering_loss: 1.4097 |
|
Epoch 130/200 target_entropy: 0.8253 |
|
Updated learning rate to: 0.0003340497037731065 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_130.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_130.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_126.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_126_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_126.pt... |
|
Time cost: 32m43.7s |
|
--- |
|
Epoch 131/200 training_loss: 4.9802 |
|
Epoch 131/200 clustering_loss: 1.4082 |
|
Epoch 131/200 target_entropy: 0.8238 |
|
Updated learning rate to: 0.00032588985394444353 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_131.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_131.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_127.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_127_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_127.pt... |
|
Time cost: 32m46.5s |
|
--- |
|
Epoch 132/200 training_loss: 4.9792 |
|
Epoch 132/200 clustering_loss: 1.4073 |
|
Epoch 132/200 target_entropy: 0.8233 |
|
Updated learning rate to: 0.0003177824704199702 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_132.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_132.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_128.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_128_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_128.pt... |
|
Time cost: 32m45.5s |
|
--- |
|
Epoch 133/200 training_loss: 4.9782 |
|
Epoch 133/200 clustering_loss: 1.4062 |
|
Epoch 133/200 target_entropy: 0.8224 |
|
Updated learning rate to: 0.00030972999557505004 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_133.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_133.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_129.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_129_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_129.pt... |
|
Time cost: 32m48.9s |
|
--- |
|
Epoch 134/200 training_loss: 4.9762 |
|
Epoch 134/200 clustering_loss: 1.4049 |
|
Epoch 134/200 target_entropy: 0.8214 |
|
Updated learning rate to: 0.00030173485524363216 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_134.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_134.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_130.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_130_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_130.pt... |
|
Time cost: 32m50.9s |
|
--- |
|
Epoch 135/200 training_loss: 4.9720 |
|
Epoch 135/200 clustering_loss: 1.4025 |
|
Epoch 135/200 target_entropy: 0.8194 |
|
Updated learning rate to: 0.0002937994579874572 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_135.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_135.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_131.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_131_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_131.pt... |
|
Time cost: 32m47.4s |
|
--- |
|
Epoch 136/200 training_loss: 4.9689 |
|
Epoch 136/200 clustering_loss: 1.4009 |
|
Epoch 136/200 target_entropy: 0.8184 |
|
Updated learning rate to: 0.00028592619437047245 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_136.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_136.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_132.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_132_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_132.pt... |
|
Time cost: 32m41.7s |
|
--- |
|
Epoch 137/200 training_loss: 4.9651 |
|
Epoch 137/200 clustering_loss: 1.3996 |
|
Epoch 137/200 target_entropy: 0.8174 |
|
Updated learning rate to: 0.00027811743623866735 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_137.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_137.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_133.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_133_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_133.pt... |
|
Time cost: 32m49.5s |
|
--- |
|
Epoch 138/200 training_loss: 4.9619 |
|
Epoch 138/200 clustering_loss: 1.3984 |
|
Epoch 138/200 target_entropy: 0.8165 |
|
Updated learning rate to: 0.0002703755360055489 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_138.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_138.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_134.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_134_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_134.pt... |
|
Time cost: 32m48.9s |
|
--- |
|
Epoch 139/200 training_loss: 4.9601 |
|
Epoch 139/200 clustering_loss: 1.3979 |
|
Epoch 139/200 target_entropy: 0.8165 |
|
Updated learning rate to: 0.0002627028259434617 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_139.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_139.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_135.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_135_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_135.pt... |
|
Time cost: 32m48.0s |
|
--- |
|
Epoch 140/200 training_loss: 4.9573 |
|
Epoch 140/200 clustering_loss: 1.3967 |
|
Epoch 140/200 target_entropy: 0.8155 |
|
Updated learning rate to: 0.0002551016174809902 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_140.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_140.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_136.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_136_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_136.pt... |
|
Time cost: 32m49.3s |
|
--- |
|
Epoch 141/200 training_loss: 4.9558 |
|
Epoch 141/200 clustering_loss: 1.3961 |
|
Epoch 141/200 target_entropy: 0.8150 |
|
Updated learning rate to: 0.0002475742005066348 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_141.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_141.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_137.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_137_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_137.pt... |
|
Time cost: 32m43.4s |
|
--- |
|
Epoch 142/200 training_loss: 4.9526 |
|
Epoch 142/200 clustering_loss: 1.3943 |
|
Epoch 142/200 target_entropy: 0.8134 |
|
Updated learning rate to: 0.00024012284267897229 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_142.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_142.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_138.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_138_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_138.pt... |
|
Time cost: 32m44.1s |
|
--- |
|
Epoch 143/200 training_loss: 4.9500 |
|
Epoch 143/200 clustering_loss: 1.3927 |
|
Epoch 143/200 target_entropy: 0.8120 |
|
Updated learning rate to: 0.00023274978874351465 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_143.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_143.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_139.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_139_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_139.pt... |
|
Time cost: 32m44.7s |
|
--- |
|
Epoch 144/200 training_loss: 4.9466 |
|
Epoch 144/200 clustering_loss: 1.3915 |
|
Epoch 144/200 target_entropy: 0.8111 |
|
Updated learning rate to: 0.00022545725985647556 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_144.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_144.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_140.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_140_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_140.pt... |
|
Time cost: 32m48.9s |
|
--- |
|
Epoch 145/200 training_loss: 4.9457 |
|
Epoch 145/200 clustering_loss: 1.3912 |
|
Epoch 145/200 target_entropy: 0.8111 |
|
Updated learning rate to: 0.0002182474529156374 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_145.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_145.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_141.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_141_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_141.pt... |
|
Time cost: 32m54.0s |
|
--- |
|
Epoch 146/200 training_loss: 4.9433 |
|
Epoch 146/200 clustering_loss: 1.3901 |
|
Epoch 146/200 target_entropy: 0.8104 |
|
Updated learning rate to: 0.00021112253989853376 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_146.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_146.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_142.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_142_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_142.pt... |
|
Time cost: 32m48.7s |
|
--- |
|
Epoch 147/200 training_loss: 4.9407 |
|
Epoch 147/200 clustering_loss: 1.3894 |
|
Epoch 147/200 target_entropy: 0.8098 |
|
Updated learning rate to: 0.0002040846672081273 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_147.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_147.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_143.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_143_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_143.pt... |
|
Time cost: 32m44.2s |
|
--- |
|
Epoch 148/200 training_loss: 4.9380 |
|
Epoch 148/200 clustering_loss: 1.3884 |
|
Epoch 148/200 target_entropy: 0.8089 |
|
Updated learning rate to: 0.0001971359550262029 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_148.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_148.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_144.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_144_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_144.pt... |
|
Time cost: 32m44.0s |
|
--- |
|
Epoch 149/200 training_loss: 4.9358 |
|
Epoch 149/200 clustering_loss: 1.3876 |
|
Epoch 149/200 target_entropy: 0.8084 |
|
Updated learning rate to: 0.00019027849667465672 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_149.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_149.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_145.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_145_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_145.pt... |
|
Time cost: 32m51.5s |
|
--- |
|
Epoch 150/200 training_loss: 4.9334 |
|
Epoch 150/200 clustering_loss: 1.3869 |
|
Epoch 150/200 target_entropy: 0.8078 |
|
Updated learning rate to: 0.00018351435798487573 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_150.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_150.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_146.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_146_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_146.pt... |
|
Time cost: 32m45.6s |
|
--- |
|
Epoch 151/200 training_loss: 4.9307 |
|
Epoch 151/200 clustering_loss: 1.3860 |
|
Epoch 151/200 target_entropy: 0.8070 |
|
Updated learning rate to: 0.00017684557667539566 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_151.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_151.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_147.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_147_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_147.pt... |
|
Time cost: 32m48.8s |
|
--- |
|
Epoch 152/200 training_loss: 4.9271 |
|
Epoch 152/200 clustering_loss: 1.3849 |
|
Epoch 152/200 target_entropy: 0.8060 |
|
Updated learning rate to: 0.00017027416173803704 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_152.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_152.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_148.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_148_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_148.pt... |
|
Time cost: 32m49.8s |
|
--- |
|
Epoch 153/200 training_loss: 4.9238 |
|
Epoch 153/200 clustering_loss: 1.3836 |
|
Epoch 153/200 target_entropy: 0.8050 |
|
Updated learning rate to: 0.0001638020928326858 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_153.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_153.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_149.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_149_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_149.pt... |
|
Time cost: 32m40.6s |
|
--- |
|
Epoch 154/200 training_loss: 4.9214 |
|
Epoch 154/200 clustering_loss: 1.3826 |
|
Epoch 154/200 target_entropy: 0.8042 |
|
Updated learning rate to: 0.00015743131969091803 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_154.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_154.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_150.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_150_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_150.pt... |
|
Time cost: 32m49.4s |
|
--- |
|
Epoch 155/200 training_loss: 4.9195 |
|
Epoch 155/200 clustering_loss: 1.3819 |
|
Epoch 155/200 target_entropy: 0.8037 |
|
Updated learning rate to: 0.00015116376152863475 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_155.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_155.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_151.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_151_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_151.pt... |
|
Time cost: 32m48.8s |
|
--- |
|
Epoch 156/200 training_loss: 4.9171 |
|
Epoch 156/200 clustering_loss: 1.3810 |
|
Epoch 156/200 target_entropy: 0.8029 |
|
Updated learning rate to: 0.0001450013064678913 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_156.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_156.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_152.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_152_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_152.pt... |
|
Time cost: 32m44.7s |
|
--- |
|
Epoch 157/200 training_loss: 4.9140 |
|
Epoch 157/200 clustering_loss: 1.3801 |
|
Epoch 157/200 target_entropy: 0.8022 |
|
Updated learning rate to: 0.00013894581096809722 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_157.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_157.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_153.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_153_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_153.pt... |
|
Time cost: 32m48.8s |
|
--- |
|
Epoch 158/200 training_loss: 4.9107 |
|
Epoch 158/200 clustering_loss: 1.3791 |
|
Epoch 158/200 target_entropy: 0.8013 |
|
Updated learning rate to: 0.0001329990992667496 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_158.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_158.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_154.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_154_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_154.pt... |
|
Time cost: 32m46.4s |
|
--- |
|
Epoch 159/200 training_loss: 4.9075 |
|
Epoch 159/200 clustering_loss: 1.3785 |
|
Epoch 159/200 target_entropy: 0.8010 |
|
Updated learning rate to: 0.00012716296282987442 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_159.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_159.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_155.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_155_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_155.pt... |
|
Time cost: 32m46.1s |
|
--- |
|
Epoch 160/200 training_loss: 4.9051 |
|
Epoch 160/200 clustering_loss: 1.3780 |
|
Epoch 160/200 target_entropy: 0.8006 |
|
Updated learning rate to: 0.00012143915981234681 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_160.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_160.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_156.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_156_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_156.pt... |
|
Time cost: 32m51.7s |
|
--- |
|
Epoch 161/200 training_loss: 4.9027 |
|
Epoch 161/200 clustering_loss: 1.3774 |
|
Epoch 161/200 target_entropy: 0.8001 |
|
Updated learning rate to: 0.00011582941452823614 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_161.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_161.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_157.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_157_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_157.pt... |
|
Time cost: 32m51.4s |
|
--- |
|
Epoch 162/200 training_loss: 4.9000 |
|
Epoch 162/200 clustering_loss: 1.3765 |
|
Epoch 162/200 target_entropy: 0.7993 |
|
Updated learning rate to: 0.00011033541693135373 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_162.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_162.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_158.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_158_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_158.pt... |
|
Time cost: 32m48.7s |
|
--- |
|
Epoch 163/200 training_loss: 4.8970 |
|
Epoch 163/200 clustering_loss: 1.3760 |
|
Epoch 163/200 target_entropy: 0.7989 |
|
Updated learning rate to: 0.00010495882210614648 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_163.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_163.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_159.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_159_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_159.pt... |
|
Time cost: 32m51.3s |
|
--- |
|
Epoch 164/200 training_loss: 4.8941 |
|
Epoch 164/200 clustering_loss: 1.3757 |
|
Epoch 164/200 target_entropy: 0.7985 |
|
Updated learning rate to: 9.970124976909917e-05 |
|
Saving model checkpoint models/capi_rope_vitreg4_b14_164.pt... |
|
Saving model checkpoint models/rope_vitreg4_b14_capi_164.pt... |
|
Removing checkpoint models/capi_rope_vitreg4_b14_160.pt... |
|
Removing checkpoint states models/capi_rope_vitreg4_b14_160_states... |
|
Removing checkpoint models/rope_vitreg4_b14_capi_160.pt... |
|
Time cost: 32m49.1s |
|
--- |
|
|