rope_vit_reg4_b14_capi / training.log
Starting training with learning rate of 1e-05
Epoch 1/200 training_loss: 8.1261
Epoch 1/200 clustering_loss: 9.4897
Epoch 1/200 target_entropy: 2.6206
Updated learning rate to: 5.95000000000005e-05
Saving model checkpoint models/capi_rope_vitreg4_b14_1.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_1.pt...
Time cost: 34m26.5s
---
Epoch 2/200 training_loss: 7.4766
Epoch 2/200 clustering_loss: 3.9344
Epoch 2/200 target_entropy: 2.5216
Updated learning rate to: 0.00010900000000000067
Saving model checkpoint models/capi_rope_vitreg4_b14_2.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_2.pt...
Time cost: 32m40.8s
---
Epoch 3/200 training_loss: 7.1787
Epoch 3/200 clustering_loss: 2.9921
Epoch 3/200 target_entropy: 2.1595
Updated learning rate to: 0.0001585000000000022
Saving model checkpoint models/capi_rope_vitreg4_b14_3.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_3.pt...
Time cost: 32m46.2s
---
Epoch 4/200 training_loss: 6.9961
Epoch 4/200 clustering_loss: 2.6688
Epoch 4/200 target_entropy: 1.8905
Updated learning rate to: 0.00020800000000000638
Saving model checkpoint models/capi_rope_vitreg4_b14_4.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_4.pt...
Time cost: 32m48.6s
---
Epoch 5/200 training_loss: 7.0223
Epoch 5/200 clustering_loss: 2.4584
Epoch 5/200 target_entropy: 1.7007
Updated learning rate to: 0.0002575000000000082
Saving model checkpoint models/capi_rope_vitreg4_b14_5.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_5.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_1.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_1_states...
Removing checkpoint models/rope_vitreg4_b14_capi_1.pt...
Time cost: 32m45.9s
---
Epoch 6/200 training_loss: 7.1984
Epoch 6/200 clustering_loss: 2.3618
Epoch 6/200 target_entropy: 1.5922
Updated learning rate to: 0.00030700000000000714
Saving model checkpoint models/capi_rope_vitreg4_b14_6.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_6.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_2.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_2_states...
Removing checkpoint models/rope_vitreg4_b14_capi_2.pt...
Time cost: 32m45.1s
---
Epoch 7/200 training_loss: 7.3917
Epoch 7/200 clustering_loss: 2.3260
Epoch 7/200 target_entropy: 1.5528
Updated learning rate to: 0.00035650000000000514
Saving model checkpoint models/capi_rope_vitreg4_b14_7.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_7.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_3.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_3_states...
Removing checkpoint models/rope_vitreg4_b14_capi_3.pt...
Time cost: 32m49.7s
---
Epoch 8/200 training_loss: 7.5620
Epoch 8/200 clustering_loss: 2.3115
Epoch 8/200 target_entropy: 1.5443
Updated learning rate to: 0.0004060000000000013
Saving model checkpoint models/capi_rope_vitreg4_b14_8.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_8.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_4.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_4_states...
Removing checkpoint models/rope_vitreg4_b14_capi_4.pt...
Time cost: 32m48.4s
---
Epoch 9/200 training_loss: 7.6907
Epoch 9/200 clustering_loss: 2.3062
Epoch 9/200 target_entropy: 1.5514
Updated learning rate to: 0.000455499999999997
Saving model checkpoint models/capi_rope_vitreg4_b14_9.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_9.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_5.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_5_states...
Removing checkpoint models/rope_vitreg4_b14_capi_5.pt...
Time cost: 32m50.9s
---
Epoch 10/200 training_loss: 7.7734
Epoch 10/200 clustering_loss: 2.3074
Epoch 10/200 target_entropy: 1.5652
Updated learning rate to: 0.0005049999999999972
Saving model checkpoint models/capi_rope_vitreg4_b14_10.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_10.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_6.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_6_states...
Removing checkpoint models/rope_vitreg4_b14_capi_6.pt...
Time cost: 32m46.6s
---
Epoch 11/200 training_loss: 7.8141
Epoch 11/200 clustering_loss: 2.3112
Epoch 11/200 target_entropy: 1.5798
Updated learning rate to: 0.0005544999999999904
Saving model checkpoint models/capi_rope_vitreg4_b14_11.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_11.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_7.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_7_states...
Removing checkpoint models/rope_vitreg4_b14_capi_7.pt...
Time cost: 32m49.8s
---
Epoch 12/200 training_loss: 7.8188
Epoch 12/200 clustering_loss: 2.3128
Epoch 12/200 target_entropy: 1.5934
Updated learning rate to: 0.0006039999999999798
Saving model checkpoint models/capi_rope_vitreg4_b14_12.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_12.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_8.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_8_states...
Removing checkpoint models/rope_vitreg4_b14_capi_8.pt...
Time cost: 32m51.7s
---
Epoch 13/200 training_loss: 7.7958
Epoch 13/200 clustering_loss: 2.3127
Epoch 13/200 target_entropy: 1.6056
Updated learning rate to: 0.0006534999999999674
Saving model checkpoint models/capi_rope_vitreg4_b14_13.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_13.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_9.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_9_states...
Removing checkpoint models/rope_vitreg4_b14_capi_9.pt...
Time cost: 32m55.5s
---
Epoch 14/200 training_loss: 7.7487
Epoch 14/200 clustering_loss: 2.3096
Epoch 14/200 target_entropy: 1.6114
Updated learning rate to: 0.0007029999999999472
Saving model checkpoint models/capi_rope_vitreg4_b14_14.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_14.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_10.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_10_states...
Removing checkpoint models/rope_vitreg4_b14_capi_10.pt...
Time cost: 32m51.2s
---
Epoch 15/200 training_loss: 7.6876
Epoch 15/200 clustering_loss: 2.3036
Epoch 15/200 target_entropy: 1.6106
Updated learning rate to: 0.0007524999999999336
Saving model checkpoint models/capi_rope_vitreg4_b14_15.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_15.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_11.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_11_states...
Removing checkpoint models/rope_vitreg4_b14_capi_11.pt...
Time cost: 32m55.6s
---
Epoch 16/200 training_loss: 7.6236
Epoch 16/200 clustering_loss: 2.2948
Epoch 16/200 target_entropy: 1.6045
Updated learning rate to: 0.0008019999999999188
Saving model checkpoint models/capi_rope_vitreg4_b14_16.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_16.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_12.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_12_states...
Removing checkpoint models/rope_vitreg4_b14_capi_12.pt...
Time cost: 32m46.9s
---
Epoch 17/200 training_loss: 7.5657
Epoch 17/200 clustering_loss: 2.2847
Epoch 17/200 target_entropy: 1.5952
Updated learning rate to: 0.0008514999999999057
Saving model checkpoint models/capi_rope_vitreg4_b14_17.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_17.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_13.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_13_states...
Removing checkpoint models/rope_vitreg4_b14_capi_13.pt...
Time cost: 32m53.8s
---
Epoch 18/200 training_loss: 7.5146
Epoch 18/200 clustering_loss: 2.2741
Epoch 18/200 target_entropy: 1.5852
Updated learning rate to: 0.0009009999999998955
Saving model checkpoint models/capi_rope_vitreg4_b14_18.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_18.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_14.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_14_states...
Removing checkpoint models/rope_vitreg4_b14_capi_14.pt...
Time cost: 32m54.4s
---
Epoch 19/200 training_loss: 7.4667
Epoch 19/200 clustering_loss: 2.2624
Epoch 19/200 target_entropy: 1.5733
Updated learning rate to: 0.0009504999999998828
Saving model checkpoint models/capi_rope_vitreg4_b14_19.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_19.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_15.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_15_states...
Removing checkpoint models/rope_vitreg4_b14_capi_15.pt...
Time cost: 32m55.6s
---
Epoch 20/200 training_loss: 7.4154
Epoch 20/200 clustering_loss: 2.2500
Epoch 20/200 target_entropy: 1.5607
Updated learning rate to: 0.001
Saving model checkpoint models/capi_rope_vitreg4_b14_20.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_20.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_16.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_16_states...
Removing checkpoint models/rope_vitreg4_b14_capi_16.pt...
Time cost: 32m45.4s
---
Epoch 21/200 training_loss: 7.3684
Epoch 21/200 clustering_loss: 2.2379
Epoch 21/200 target_entropy: 1.5490
Updated learning rate to: 0.000999924694227199
Saving model checkpoint models/capi_rope_vitreg4_b14_21.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_21.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_17.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_17_states...
Removing checkpoint models/rope_vitreg4_b14_capi_17.pt...
Time cost: 32m57.5s
---
Epoch 22/200 training_loss: 7.3258
Epoch 22/200 clustering_loss: 2.2265
Epoch 22/200 target_entropy: 1.5390
Updated learning rate to: 0.0009996987995949066
Saving model checkpoint models/capi_rope_vitreg4_b14_22.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_22.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_18.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_18_states...
Removing checkpoint models/rope_vitreg4_b14_capi_18.pt...
Time cost: 33m00.0s
---
Epoch 23/200 training_loss: 7.2837
Epoch 23/200 clustering_loss: 2.2139
Epoch 23/200 target_entropy: 1.5277
Updated learning rate to: 0.000999322384154599
Saving model checkpoint models/capi_rope_vitreg4_b14_23.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_23.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_19.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_19_states...
Removing checkpoint models/rope_vitreg4_b14_capi_19.pt...
Time cost: 32m52.4s
---
Epoch 24/200 training_loss: 7.2448
Epoch 24/200 clustering_loss: 2.2013
Epoch 24/200 target_entropy: 1.5161
Updated learning rate to: 0.00099879556130265
Saving model checkpoint models/capi_rope_vitreg4_b14_24.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_24.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_20.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_20_states...
Removing checkpoint models/rope_vitreg4_b14_capi_20.pt...
Time cost: 32m56.8s
---
Epoch 25/200 training_loss: 7.2082
Epoch 25/200 clustering_loss: 2.1867
Epoch 25/200 target_entropy: 1.5025
Updated learning rate to: 0.0009981184897461257
Saving model checkpoint models/capi_rope_vitreg4_b14_25.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_25.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_21.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_21_states...
Removing checkpoint models/rope_vitreg4_b14_capi_21.pt...
Time cost: 32m52.0s
---
Epoch 26/200 training_loss: 7.1700
Epoch 26/200 clustering_loss: 2.1710
Epoch 26/200 target_entropy: 1.4879
Updated learning rate to: 0.0009972913734550167
Saving model checkpoint models/capi_rope_vitreg4_b14_26.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_26.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_22.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_22_states...
Removing checkpoint models/rope_vitreg4_b14_capi_22.pt...
Time cost: 32m47.9s
---
Epoch 27/200 training_loss: 7.1307
Epoch 27/200 clustering_loss: 2.1540
Epoch 27/200 target_entropy: 1.4724
Updated learning rate to: 0.0009963144616007712
Saving model checkpoint models/capi_rope_vitreg4_b14_27.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_27.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_23.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_23_states...
Removing checkpoint models/rope_vitreg4_b14_capi_23.pt...
Time cost: 32m51.6s
---
Epoch 28/200 training_loss: 7.0912
Epoch 28/200 clustering_loss: 2.1361
Epoch 28/200 target_entropy: 1.4562
Updated learning rate to: 0.000995188048481222
Saving model checkpoint models/capi_rope_vitreg4_b14_28.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_28.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_24.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_24_states...
Removing checkpoint models/rope_vitreg4_b14_capi_24.pt...
Time cost: 32m53.0s
---
Epoch 29/200 training_loss: 7.0468
Epoch 29/200 clustering_loss: 2.1176
Epoch 29/200 target_entropy: 1.4384
Updated learning rate to: 0.0009939124734319395
Saving model checkpoint models/capi_rope_vitreg4_b14_29.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_29.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_25.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_25_states...
Removing checkpoint models/rope_vitreg4_b14_capi_25.pt...
Time cost: 32m49.0s
---
Epoch 30/200 training_loss: 7.0043
Epoch 30/200 clustering_loss: 2.0988
Epoch 30/200 target_entropy: 1.4201
Updated learning rate to: 0.000992488120724019
Saving model checkpoint models/capi_rope_vitreg4_b14_30.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_30.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_26.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_26_states...
Removing checkpoint models/rope_vitreg4_b14_capi_26.pt...
Time cost: 32m49.3s
---
Epoch 31/200 training_loss: 6.9610
Epoch 31/200 clustering_loss: 2.0794
Epoch 31/200 target_entropy: 1.4013
Updated learning rate to: 0.0009909154194482875
Saving model checkpoint models/capi_rope_vitreg4_b14_31.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_31.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_27.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_27_states...
Removing checkpoint models/rope_vitreg4_b14_capi_27.pt...
Time cost: 32m50.5s
---
Epoch 32/200 training_loss: 6.9211
Epoch 32/200 clustering_loss: 2.0616
Epoch 32/200 target_entropy: 1.3846
Updated learning rate to: 0.0009891948433860694
Saving model checkpoint models/capi_rope_vitreg4_b14_32.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_32.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_28.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_28_states...
Removing checkpoint models/rope_vitreg4_b14_capi_28.pt...
Time cost: 32m46.0s
---
Epoch 33/200 training_loss: 6.8787
Epoch 33/200 clustering_loss: 2.0458
Epoch 33/200 target_entropy: 1.3699
Updated learning rate to: 0.0009873269108664417
Saving model checkpoint models/capi_rope_vitreg4_b14_33.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_33.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_29.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_29_states...
Removing checkpoint models/rope_vitreg4_b14_capi_29.pt...
Time cost: 32m46.5s
---
Epoch 34/200 training_loss: 6.8365
Epoch 34/200 clustering_loss: 2.0278
Epoch 34/200 target_entropy: 1.3520
Updated learning rate to: 0.0009853121846100706
Saving model checkpoint models/capi_rope_vitreg4_b14_34.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_34.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_30.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_30_states...
Removing checkpoint models/rope_vitreg4_b14_capi_30.pt...
Time cost: 32m44.9s
---
Epoch 35/200 training_loss: 6.7977
Epoch 35/200 clustering_loss: 2.0109
Epoch 35/200 target_entropy: 1.3364
Updated learning rate to: 0.0009831512715597283
Saving model checkpoint models/capi_rope_vitreg4_b14_35.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_35.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_31.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_31_states...
Removing checkpoint models/rope_vitreg4_b14_capi_31.pt...
Time cost: 32m49.3s
---
Epoch 36/200 training_loss: 6.7625
Epoch 36/200 clustering_loss: 1.9944
Epoch 36/200 target_entropy: 1.3201
Updated learning rate to: 0.0009808448226974215
Saving model checkpoint models/capi_rope_vitreg4_b14_36.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_36.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_32.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_32_states...
Removing checkpoint models/rope_vitreg4_b14_capi_32.pt...
Time cost: 32m57.9s
---
Epoch 37/200 training_loss: 6.7270
Epoch 37/200 clustering_loss: 1.9801
Epoch 37/200 target_entropy: 1.3075
Updated learning rate to: 0.0009783935328482938
Saving model checkpoint models/capi_rope_vitreg4_b14_37.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_37.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_33.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_33_states...
Removing checkpoint models/rope_vitreg4_b14_capi_33.pt...
Time cost: 32m46.0s
---
Epoch 38/200 training_loss: 6.6932
Epoch 38/200 clustering_loss: 1.9670
Epoch 38/200 target_entropy: 1.2962
Updated learning rate to: 0.0009757981404712963
Saving model checkpoint models/capi_rope_vitreg4_b14_38.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_38.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_34.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_34_states...
Removing checkpoint models/rope_vitreg4_b14_capi_34.pt...
Time cost: 32m51.8s
---
Epoch 39/200 training_loss: 6.6603
Epoch 39/200 clustering_loss: 1.9545
Epoch 39/200 target_entropy: 1.2846
Updated learning rate to: 0.000973059427436721
Saving model checkpoint models/capi_rope_vitreg4_b14_39.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_39.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_35.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_35_states...
Removing checkpoint models/rope_vitreg4_b14_capi_35.pt...
Time cost: 32m48.6s
---
Epoch 40/200 training_loss: 6.6271
Epoch 40/200 clustering_loss: 1.9419
Epoch 40/200 target_entropy: 1.2724
Updated learning rate to: 0.0009701782187906851
Saving model checkpoint models/capi_rope_vitreg4_b14_40.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_40.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_36.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_36_states...
Removing checkpoint models/rope_vitreg4_b14_capi_36.pt...
Time cost: 32m53.2s
---
Epoch 41/200 training_loss: 6.5939
Epoch 41/200 clustering_loss: 1.9305
Epoch 41/200 target_entropy: 1.2625
Updated learning rate to: 0.0009671553825065633
Saving model checkpoint models/capi_rope_vitreg4_b14_41.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_41.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_37.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_37_states...
Removing checkpoint models/rope_vitreg4_b14_capi_37.pt...
Time cost: 32m51.4s
---
Epoch 42/200 training_loss: 6.5602
Epoch 42/200 clustering_loss: 1.9186
Epoch 42/200 target_entropy: 1.2518
Updated learning rate to: 0.0009639918292235034
Saving model checkpoint models/capi_rope_vitreg4_b14_42.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_42.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_38.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_38_states...
Removing checkpoint models/rope_vitreg4_b14_capi_38.pt...
Time cost: 32m51.9s
---
Epoch 43/200 training_loss: 6.5252
Epoch 43/200 clustering_loss: 1.9056
Epoch 43/200 target_entropy: 1.2392
Updated learning rate to: 0.0009606885119721099
Saving model checkpoint models/capi_rope_vitreg4_b14_43.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_43.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_39.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_39_states...
Removing checkpoint models/rope_vitreg4_b14_capi_39.pt...
Time cost: 32m56.3s
---
Epoch 44/200 training_loss: 6.4913
Epoch 44/200 clustering_loss: 1.8941
Epoch 44/200 target_entropy: 1.2289
Updated learning rate to: 0.0009572464258873344
Saving model checkpoint models/capi_rope_vitreg4_b14_44.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_44.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_40.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_40_states...
Removing checkpoint models/rope_vitreg4_b14_capi_40.pt...
Time cost: 32m52.6s
---
Epoch 45/200 training_loss: 6.4608
Epoch 45/200 clustering_loss: 1.8837
Epoch 45/200 target_entropy: 1.2193
Updated learning rate to: 0.0009536666079086723
Saving model checkpoint models/capi_rope_vitreg4_b14_45.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_45.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_41.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_41_states...
Removing checkpoint models/rope_vitreg4_b14_capi_41.pt...
Time cost: 32m52.6s
---
Epoch 46/200 training_loss: 6.4284
Epoch 46/200 clustering_loss: 1.8722
Epoch 46/200 target_entropy: 1.2082
Updated learning rate to: 0.000949950136467809
Saving model checkpoint models/capi_rope_vitreg4_b14_46.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_46.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_42.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_42_states...
Removing checkpoint models/rope_vitreg4_b14_capi_42.pt...
Time cost: 32m50.6s
---
Epoch 47/200 training_loss: 6.3976
Epoch 47/200 clustering_loss: 1.8617
Epoch 47/200 target_entropy: 1.1983
Updated learning rate to: 0.000946098131163723
Saving model checkpoint models/capi_rope_vitreg4_b14_47.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_47.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_43.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_43_states...
Removing checkpoint models/rope_vitreg4_b14_capi_43.pt...
Time cost: 32m49.9s
---
Epoch 48/200 training_loss: 6.3662
Epoch 48/200 clustering_loss: 1.8513
Epoch 48/200 target_entropy: 1.1888
Updated learning rate to: 0.000942111752425399
Saving model checkpoint models/capi_rope_vitreg4_b14_48.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_48.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_44.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_44_states...
Removing checkpoint models/rope_vitreg4_b14_capi_44.pt...
Time cost: 32m49.7s
---
Epoch 49/200 training_loss: 6.3298
Epoch 49/200 clustering_loss: 1.8407
Epoch 49/200 target_entropy: 1.1797
Updated learning rate to: 0.0009379922011622562
Saving model checkpoint models/capi_rope_vitreg4_b14_49.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_49.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_45.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_45_states...
Removing checkpoint models/rope_vitreg4_b14_capi_45.pt...
Time cost: 32m44.8s
---
Epoch 50/200 training_loss: 6.2932
Epoch 50/200 clustering_loss: 1.8305
Epoch 50/200 target_entropy: 1.1704
Updated learning rate to: 0.0009337407184023574
Saving model checkpoint models/capi_rope_vitreg4_b14_50.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_50.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_46.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_46_states...
Removing checkpoint models/rope_vitreg4_b14_capi_46.pt...
Time cost: 32m41.2s
---
Epoch 51/200 training_loss: 6.2595
Epoch 51/200 clustering_loss: 1.8189
Epoch 51/200 target_entropy: 1.1601
Updated learning rate to: 0.0009293585849185674
Saving model checkpoint models/capi_rope_vitreg4_b14_51.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_51.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_47.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_47_states...
Removing checkpoint models/rope_vitreg4_b14_capi_47.pt...
Time cost: 32m48.9s
---
Epoch 52/200 training_loss: 6.2249
Epoch 52/200 clustering_loss: 1.8090
Epoch 52/200 target_entropy: 1.1520
Updated learning rate to: 0.0009248471208426868
Saving model checkpoint models/capi_rope_vitreg4_b14_52.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_52.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_48.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_48_states...
Removing checkpoint models/rope_vitreg4_b14_capi_48.pt...
Time cost: 32m43.2s
---
Epoch 53/200 training_loss: 6.1902
Epoch 53/200 clustering_loss: 1.7985
Epoch 53/200 target_entropy: 1.1419
Updated learning rate to: 0.0009202076852677824
Saving model checkpoint models/capi_rope_vitreg4_b14_53.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_53.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_49.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_49_states...
Removing checkpoint models/rope_vitreg4_b14_capi_49.pt...
Time cost: 32m40.9s
---
Epoch 54/200 training_loss: 6.1581
Epoch 54/200 clustering_loss: 1.7882
Epoch 54/200 target_entropy: 1.1321
Updated learning rate to: 0.0009154416758387467
Saving model checkpoint models/capi_rope_vitreg4_b14_54.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_54.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_50.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_50_states...
Removing checkpoint models/rope_vitreg4_b14_capi_50.pt...
Time cost: 32m43.9s
---
Epoch 55/200 training_loss: 6.1263
Epoch 55/200 clustering_loss: 1.7771
Epoch 55/200 target_entropy: 1.1227
Updated learning rate to: 0.0009105505283312465
Saving model checkpoint models/capi_rope_vitreg4_b14_55.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_55.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_51.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_51_states...
Removing checkpoint models/rope_vitreg4_b14_capi_51.pt...
Time cost: 32m46.3s
---
Epoch 56/200 training_loss: 6.0939
Epoch 56/200 clustering_loss: 1.7677
Epoch 56/200 target_entropy: 1.1156
Updated learning rate to: 0.0009055357162192037
Saving model checkpoint models/capi_rope_vitreg4_b14_56.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_56.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_52.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_52_states...
Removing checkpoint models/rope_vitreg4_b14_capi_52.pt...
Time cost: 32m53.9s
---
Epoch 57/200 training_loss: 6.0627
Epoch 57/200 clustering_loss: 1.7583
Epoch 57/200 target_entropy: 1.1077
Updated learning rate to: 0.0009003987502308961
Saving model checkpoint models/capi_rope_vitreg4_b14_57.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_57.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_53.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_53_states...
Removing checkpoint models/rope_vitreg4_b14_capi_53.pt...
Time cost: 32m47.7s
---
Epoch 58/200 training_loss: 6.0349
Epoch 58/200 clustering_loss: 1.7496
Epoch 58/200 target_entropy: 1.1001
Updated learning rate to: 0.0008951411778938494
Saving model checkpoint models/capi_rope_vitreg4_b14_58.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_58.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_54.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_54_states...
Removing checkpoint models/rope_vitreg4_b14_capi_54.pt...
Time cost: 32m49.9s
---
Epoch 59/200 training_loss: 6.0045
Epoch 59/200 clustering_loss: 1.7423
Epoch 59/200 target_entropy: 1.0942
Updated learning rate to: 0.0008897645830686416
Saving model checkpoint models/capi_rope_vitreg4_b14_59.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_59.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_55.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_55_states...
Removing checkpoint models/rope_vitreg4_b14_capi_55.pt...
Time cost: 32m48.7s
---
Epoch 60/200 training_loss: 5.9759
Epoch 60/200 clustering_loss: 1.7339
Epoch 60/200 target_entropy: 1.0865
Updated learning rate to: 0.0008842705854717593
Saving model checkpoint models/capi_rope_vitreg4_b14_60.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_60.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_56.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_56_states...
Removing checkpoint models/rope_vitreg4_b14_capi_56.pt...
Time cost: 32m47.5s
---
Epoch 61/200 training_loss: 5.9496
Epoch 61/200 clustering_loss: 1.7259
Epoch 61/200 target_entropy: 1.0791
Updated learning rate to: 0.0008786608401876489
Saving model checkpoint models/capi_rope_vitreg4_b14_61.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_61.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_57.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_57_states...
Removing checkpoint models/rope_vitreg4_b14_capi_57.pt...
Time cost: 32m49.1s
---
Epoch 62/200 training_loss: 5.9221
Epoch 62/200 clustering_loss: 1.7184
Epoch 62/200 target_entropy: 1.0728
Updated learning rate to: 0.0008729370371701193
Saving model checkpoint models/capi_rope_vitreg4_b14_62.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_62.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_58.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_58_states...
Removing checkpoint models/rope_vitreg4_b14_capi_58.pt...
Time cost: 32m46.4s
---
Epoch 63/200 training_loss: 5.8921
Epoch 63/200 clustering_loss: 1.7093
Epoch 63/200 target_entropy: 1.0648
Updated learning rate to: 0.0008671009007332444
Saving model checkpoint models/capi_rope_vitreg4_b14_63.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_63.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_59.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_59_states...
Removing checkpoint models/rope_vitreg4_b14_capi_59.pt...
Time cost: 32m47.4s
---
Epoch 64/200 training_loss: 5.8614
Epoch 64/200 clustering_loss: 1.7010
Epoch 64/200 target_entropy: 1.0580
Updated learning rate to: 0.0008611541890318961
Saving model checkpoint models/capi_rope_vitreg4_b14_64.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_64.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_60.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_60_states...
Removing checkpoint models/rope_vitreg4_b14_capi_60.pt...
Time cost: 32m51.3s
---
Epoch 65/200 training_loss: 5.8343
Epoch 65/200 clustering_loss: 1.6930
Epoch 65/200 target_entropy: 1.0517
Updated learning rate to: 0.0008550986935321035
Saving model checkpoint models/capi_rope_vitreg4_b14_65.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_65.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_61.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_61_states...
Removing checkpoint models/rope_vitreg4_b14_capi_61.pt...
Time cost: 32m48.6s
---
Epoch 66/200 training_loss: 5.8052
Epoch 66/200 clustering_loss: 1.6836
Epoch 66/200 target_entropy: 1.0429
Updated learning rate to: 0.0008489362384713594
Saving model checkpoint models/capi_rope_vitreg4_b14_66.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_66.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_62.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_62_states...
Removing checkpoint models/rope_vitreg4_b14_capi_62.pt...
Time cost: 32m43.5s
---
Epoch 67/200 training_loss: 5.7791
Epoch 67/200 clustering_loss: 1.6755
Epoch 67/200 target_entropy: 1.0361
Updated learning rate to: 0.0008426686803090767
Saving model checkpoint models/capi_rope_vitreg4_b14_67.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_67.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_63.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_63_states...
Removing checkpoint models/rope_vitreg4_b14_capi_63.pt...
Time cost: 32m50.4s
---
Epoch 68/200 training_loss: 5.7531
Epoch 68/200 clustering_loss: 1.6695
Epoch 68/200 target_entropy: 1.0317
Updated learning rate to: 0.0008362979071673079
Saving model checkpoint models/capi_rope_vitreg4_b14_68.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_68.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_64.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_64_states...
Removing checkpoint models/rope_vitreg4_b14_capi_64.pt...
Time cost: 32m49.1s
---
Epoch 69/200 training_loss: 5.7266
Epoch 69/200 clustering_loss: 1.6634
Epoch 69/200 target_entropy: 1.0269
Updated learning rate to: 0.0008298258382619576
Saving model checkpoint models/capi_rope_vitreg4_b14_69.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_69.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_65.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_65_states...
Removing checkpoint models/rope_vitreg4_b14_capi_65.pt...
Time cost: 32m49.9s
---
Epoch 70/200 training_loss: 5.6989
Epoch 70/200 clustering_loss: 1.6552
Epoch 70/200 target_entropy: 1.0197
Updated learning rate to: 0.0008232544233245987
Saving model checkpoint models/capi_rope_vitreg4_b14_70.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_70.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_66.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_66_states...
Removing checkpoint models/rope_vitreg4_b14_capi_66.pt...
Time cost: 32m47.9s
---
Epoch 71/200 training_loss: 5.6738
Epoch 71/200 clustering_loss: 1.6476
Epoch 71/200 target_entropy: 1.0135
Updated learning rate to: 0.000816585642015118
Saving model checkpoint models/capi_rope_vitreg4_b14_71.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_71.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_67.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_67_states...
Removing checkpoint models/rope_vitreg4_b14_capi_67.pt...
Time cost: 32m50.7s
---
Epoch 72/200 training_loss: 5.6496
Epoch 72/200 clustering_loss: 1.6420
Epoch 72/200 target_entropy: 1.0090
Updated learning rate to: 0.0008098215033253394
Saving model checkpoint models/capi_rope_vitreg4_b14_72.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_72.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_68.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_68_states...
Removing checkpoint models/rope_vitreg4_b14_capi_68.pt...
Time cost: 32m48.9s
---
Epoch 73/200 training_loss: 5.6265
Epoch 73/200 clustering_loss: 1.6362
Epoch 73/200 target_entropy: 1.0047
Updated learning rate to: 0.0008029640449737957
Saving model checkpoint models/capi_rope_vitreg4_b14_73.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_73.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_69.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_69_states...
Removing checkpoint models/rope_vitreg4_b14_capi_69.pt...
Time cost: 32m43.9s
---
Epoch 74/200 training_loss: 5.6045
Epoch 74/200 clustering_loss: 1.6295
Epoch 74/200 target_entropy: 0.9993
Updated learning rate to: 0.0007960153327918694
Saving model checkpoint models/capi_rope_vitreg4_b14_74.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_74.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_70.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_70_states...
Removing checkpoint models/rope_vitreg4_b14_capi_70.pt...
Time cost: 32m42.8s
---
Epoch 75/200 training_loss: 5.5799
Epoch 75/200 clustering_loss: 1.6222
Epoch 75/200 target_entropy: 0.9929
Updated learning rate to: 0.0007889774601014634
Saving model checkpoint models/capi_rope_vitreg4_b14_75.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_75.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_71.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_71_states...
Removing checkpoint models/rope_vitreg4_b14_capi_71.pt...
Time cost: 32m41.3s
---
Epoch 76/200 training_loss: 5.5585
Epoch 76/200 clustering_loss: 1.6134
Epoch 76/200 target_entropy: 0.9850
Updated learning rate to: 0.0007818525470843588
Saving model checkpoint models/capi_rope_vitreg4_b14_76.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_76.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_72.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_72_states...
Removing checkpoint models/rope_vitreg4_b14_capi_72.pt...
Time cost: 32m46.0s
---
Epoch 77/200 training_loss: 5.5355
Epoch 77/200 clustering_loss: 1.6051
Epoch 77/200 target_entropy: 0.9783
Updated learning rate to: 0.000774642740143523
Saving model checkpoint models/capi_rope_vitreg4_b14_77.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_77.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_73.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_73_states...
Removing checkpoint models/rope_vitreg4_b14_capi_73.pt...
Time cost: 32m40.6s
---
Epoch 78/200 training_loss: 5.5163
Epoch 78/200 clustering_loss: 1.5971
Epoch 78/200 target_entropy: 0.9717
Updated learning rate to: 0.0007673502112564829
Saving model checkpoint models/capi_rope_vitreg4_b14_78.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_78.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_74.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_74_states...
Removing checkpoint models/rope_vitreg4_b14_capi_74.pt...
Time cost: 32m45.5s
---
Epoch 79/200 training_loss: 5.4933
Epoch 79/200 clustering_loss: 1.5894
Epoch 79/200 target_entropy: 0.9652
Updated learning rate to: 0.0007599771573210242
Saving model checkpoint models/capi_rope_vitreg4_b14_79.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_79.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_75.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_75_states...
Removing checkpoint models/rope_vitreg4_b14_capi_75.pt...
Time cost: 32m40.3s
---
Epoch 80/200 training_loss: 5.4753
Epoch 80/200 clustering_loss: 1.5846
Epoch 80/200 target_entropy: 0.9623
Updated learning rate to: 0.0007525257994933621
Saving model checkpoint models/capi_rope_vitreg4_b14_80.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_80.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_76.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_76_states...
Removing checkpoint models/rope_vitreg4_b14_capi_76.pt...
Time cost: 32m42.5s
---
Epoch 81/200 training_loss: 5.4581
Epoch 81/200 clustering_loss: 1.5785
Epoch 81/200 target_entropy: 0.9572
Updated learning rate to: 0.0007449983825190082
Saving model checkpoint models/capi_rope_vitreg4_b14_81.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_81.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_77.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_77_states...
Removing checkpoint models/rope_vitreg4_b14_capi_77.pt...
Time cost: 32m48.3s
---
Epoch 82/200 training_loss: 5.4399
Epoch 82/200 clustering_loss: 1.5714
Epoch 82/200 target_entropy: 0.9508
Updated learning rate to: 0.0007373971740565361
Saving model checkpoint models/capi_rope_vitreg4_b14_82.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_82.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_78.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_78_states...
Removing checkpoint models/rope_vitreg4_b14_capi_78.pt...
Time cost: 32m45.7s
---
Epoch 83/200 training_loss: 5.4244
Epoch 83/200 clustering_loss: 1.5651
Epoch 83/200 target_entropy: 0.9461
Updated learning rate to: 0.0007297244639944488
Saving model checkpoint models/capi_rope_vitreg4_b14_83.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_83.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_79.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_79_states...
Removing checkpoint models/rope_vitreg4_b14_capi_79.pt...
Time cost: 32m41.1s
---
Epoch 84/200 training_loss: 5.4067
Epoch 84/200 clustering_loss: 1.5590
Epoch 84/200 target_entropy: 0.9419
Updated learning rate to: 0.0007219825637613279
Saving model checkpoint models/capi_rope_vitreg4_b14_84.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_84.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_80.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_80_states...
Removing checkpoint models/rope_vitreg4_b14_capi_80.pt...
Time cost: 32m45.9s
---
Epoch 85/200 training_loss: 5.3857
Epoch 85/200 clustering_loss: 1.5531
Epoch 85/200 target_entropy: 0.9370
Updated learning rate to: 0.0007141738056295242
Saving model checkpoint models/capi_rope_vitreg4_b14_85.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_85.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_81.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_81_states...
Removing checkpoint models/rope_vitreg4_b14_capi_81.pt...
Time cost: 32m41.0s
---
Epoch 86/200 training_loss: 5.3673
Epoch 86/200 clustering_loss: 1.5452
Epoch 86/200 target_entropy: 0.9298
Updated learning rate to: 0.0007063005420125365
Saving model checkpoint models/capi_rope_vitreg4_b14_86.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_86.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_82.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_82_states...
Removing checkpoint models/rope_vitreg4_b14_capi_82.pt...
Time cost: 32m43.3s
---
Epoch 87/200 training_loss: 5.3494
Epoch 87/200 clustering_loss: 1.5374
Epoch 87/200 target_entropy: 0.9231
Updated learning rate to: 0.0006983651447563605
Saving model checkpoint models/capi_rope_vitreg4_b14_87.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_87.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_83.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_83_states...
Removing checkpoint models/rope_vitreg4_b14_capi_83.pt...
Time cost: 32m43.7s
---
Epoch 88/200 training_loss: 5.3332
Epoch 88/200 clustering_loss: 1.5325
Epoch 88/200 target_entropy: 0.9192
Updated learning rate to: 0.0006903700044249427
Saving model checkpoint models/capi_rope_vitreg4_b14_88.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_88.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_84.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_84_states...
Removing checkpoint models/rope_vitreg4_b14_capi_84.pt...
Time cost: 32m45.6s
---
Epoch 89/200 training_loss: 5.3172
Epoch 89/200 clustering_loss: 1.5269
Epoch 89/200 target_entropy: 0.9145
Updated learning rate to: 0.0006823175295800226
Saving model checkpoint models/capi_rope_vitreg4_b14_89.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_89.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_85.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_85_states...
Removing checkpoint models/rope_vitreg4_b14_capi_85.pt...
Time cost: 32m49.0s
---
Epoch 90/200 training_loss: 5.3011
Epoch 90/200 clustering_loss: 1.5228
Epoch 90/200 target_entropy: 0.9113
Updated learning rate to: 0.0006742101460555493
Saving model checkpoint models/capi_rope_vitreg4_b14_90.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_90.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_86.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_86_states...
Removing checkpoint models/rope_vitreg4_b14_capi_86.pt...
Time cost: 32m42.8s
---
Epoch 91/200 training_loss: 5.2841
Epoch 91/200 clustering_loss: 1.5174
Epoch 91/200 target_entropy: 0.9065
Updated learning rate to: 0.0006660502962268847
Saving model checkpoint models/capi_rope_vitreg4_b14_91.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_91.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_87.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_87_states...
Removing checkpoint models/rope_vitreg4_b14_capi_87.pt...
Time cost: 32m41.5s
---
Epoch 92/200 training_loss: 5.2683
Epoch 92/200 clustering_loss: 1.5130
Epoch 92/200 target_entropy: 0.9034
Updated learning rate to: 0.0006578404382750364
Saving model checkpoint models/capi_rope_vitreg4_b14_92.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_92.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_88.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_88_states...
Removing checkpoint models/rope_vitreg4_b14_capi_88.pt...
Time cost: 32m46.2s
---
Epoch 93/200 training_loss: 5.2526
Epoch 93/200 clustering_loss: 1.5077
Epoch 93/200 target_entropy: 0.8991
Updated learning rate to: 0.0006495830454461216
Saving model checkpoint models/capi_rope_vitreg4_b14_93.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_93.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_89.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_89_states...
Removing checkpoint models/rope_vitreg4_b14_capi_89.pt...
Time cost: 32m44.1s
---
Epoch 94/200 training_loss: 5.2367
Epoch 94/200 clustering_loss: 1.5020
Epoch 94/200 target_entropy: 0.8944
Updated learning rate to: 0.0006412806053062902
Saving model checkpoint models/capi_rope_vitreg4_b14_94.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_94.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_90.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_90_states...
Removing checkpoint models/rope_vitreg4_b14_capi_90.pt...
Time cost: 32m34.5s
---
Epoch 95/200 training_loss: 5.2201
Epoch 95/200 clustering_loss: 1.4970
Epoch 95/200 target_entropy: 0.8904
Updated learning rate to: 0.0006329356189923407
Saving model checkpoint models/capi_rope_vitreg4_b14_95.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_95.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_91.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_91_states...
Removing checkpoint models/rope_vitreg4_b14_capi_91.pt...
Time cost: 32m41.7s
---
Epoch 96/200 training_loss: 5.2060
Epoch 96/200 clustering_loss: 1.4939
Epoch 96/200 target_entropy: 0.8882
Updated learning rate to: 0.0006245506004582433
Saving model checkpoint models/capi_rope_vitreg4_b14_96.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_96.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_92.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_92_states...
Removing checkpoint models/rope_vitreg4_b14_capi_92.pt...
Time cost: 32m39.5s
---
Epoch 97/200 training_loss: 5.1900
Epoch 97/200 clustering_loss: 1.4904
Epoch 97/200 target_entropy: 0.8858
Updated learning rate to: 0.0006161280757177955
Saving model checkpoint models/capi_rope_vitreg4_b14_97.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_97.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_93.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_93_states...
Removing checkpoint models/rope_vitreg4_b14_capi_93.pt...
Time cost: 32m33.8s
---
Epoch 98/200 training_loss: 5.1746
Epoch 98/200 clustering_loss: 1.4861
Epoch 98/200 target_entropy: 0.8821
Updated learning rate to: 0.0006076705820836695
Saving model checkpoint models/capi_rope_vitreg4_b14_98.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_98.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_94.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_94_states...
Removing checkpoint models/rope_vitreg4_b14_capi_94.pt...
Time cost: 32m43.7s
---
Epoch 99/200 training_loss: 5.1612
Epoch 99/200 clustering_loss: 1.4820
Epoch 99/200 target_entropy: 0.8790
Updated learning rate to: 0.0005991806674030302
Saving model checkpoint models/capi_rope_vitreg4_b14_99.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_99.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_95.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_95_states...
Removing checkpoint models/rope_vitreg4_b14_capi_95.pt...
Time cost: 32m41.1s
---
Epoch 100/200 training_loss: 5.1454
Epoch 100/200 clustering_loss: 1.4776
Epoch 100/200 target_entropy: 0.8756
Updated learning rate to: 0.0005906608892899771
Saving model checkpoint models/capi_rope_vitreg4_b14_100.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_100.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_96.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_96_states...
Removing checkpoint models/rope_vitreg4_b14_capi_96.pt...
Time cost: 32m41.4s
---
Epoch 101/200 training_loss: 5.1332
Epoch 101/200 clustering_loss: 1.4733
Epoch 101/200 target_entropy: 0.8712
Updated learning rate to: 0.0005821138143550737
Saving model checkpoint models/capi_rope_vitreg4_b14_101.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_101.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_97.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_97_states...
Removing checkpoint models/rope_vitreg4_b14_capi_97.pt...
Time cost: 32m44.1s
---
Epoch 102/200 training_loss: 5.1228
Epoch 102/200 clustering_loss: 1.4695
Epoch 102/200 target_entropy: 0.8681
Updated learning rate to: 0.0005735420174321354
Saving model checkpoint models/capi_rope_vitreg4_b14_102.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_102.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_98.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_98_states...
Removing checkpoint models/rope_vitreg4_b14_capi_98.pt...
Time cost: 32m46.7s
---
Epoch 103/200 training_loss: 5.1139
Epoch 103/200 clustering_loss: 1.4660
Epoch 103/200 target_entropy: 0.8656
Updated learning rate to: 0.0005649480808025555
Saving model checkpoint models/capi_rope_vitreg4_b14_103.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_103.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_99.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_99_states...
Removing checkpoint models/rope_vitreg4_b14_capi_99.pt...
Time cost: 32m45.2s
---
Epoch 104/200 training_loss: 5.1051
Epoch 104/200 clustering_loss: 1.4636
Epoch 104/200 target_entropy: 0.8640
Updated learning rate to: 0.0005563345934173908
Saving model checkpoint models/capi_rope_vitreg4_b14_104.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_104.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_100.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_100_states...
Removing checkpoint models/rope_vitreg4_b14_capi_100.pt...
Time cost: 32m50.9s
---
Epoch 105/200 training_loss: 5.0939
Epoch 105/200 clustering_loss: 1.4606
Epoch 105/200 target_entropy: 0.8617
Updated learning rate to: 0.0005477041501174173
Saving model checkpoint models/capi_rope_vitreg4_b14_105.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_105.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_101.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_101_states...
Removing checkpoint models/rope_vitreg4_b14_capi_101.pt...
Time cost: 32m49.5s
---
Epoch 106/200 training_loss: 5.0848
Epoch 106/200 clustering_loss: 1.4586
Epoch 106/200 target_entropy: 0.8605
Updated learning rate to: 0.0005390593508514405
Saving model checkpoint models/capi_rope_vitreg4_b14_106.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_106.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_102.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_102_states...
Removing checkpoint models/rope_vitreg4_b14_capi_102.pt...
Time cost: 32m42.9s
---
Epoch 107/200 training_loss: 5.0760
Epoch 107/200 clustering_loss: 1.4559
Epoch 107/200 target_entropy: 0.8584
Updated learning rate to: 0.0005304027998930416
Saving model checkpoint models/capi_rope_vitreg4_b14_107.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_107.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_103.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_103_states...
Removing checkpoint models/rope_vitreg4_b14_capi_103.pt...
Time cost: 32m48.3s
---
Epoch 108/200 training_loss: 5.0666
Epoch 108/200 clustering_loss: 1.4531
Epoch 108/200 target_entropy: 0.8561
Updated learning rate to: 0.0005217371050560449
Saving model checkpoint models/capi_rope_vitreg4_b14_108.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_108.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_104.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_104_states...
Removing checkpoint models/rope_vitreg4_b14_capi_104.pt...
Time cost: 32m38.6s
---
Epoch 109/200 training_loss: 5.0573
Epoch 109/200 clustering_loss: 1.4504
Epoch 109/200 target_entropy: 0.8542
Updated learning rate to: 0.0005130648769088946
Saving model checkpoint models/capi_rope_vitreg4_b14_109.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_109.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_105.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_105_states...
Removing checkpoint models/rope_vitreg4_b14_capi_105.pt...
Time cost: 32m47.2s
---
Epoch 110/200 training_loss: 5.0511
Epoch 110/200 clustering_loss: 1.4485
Epoch 110/200 target_entropy: 0.8535
Updated learning rate to: 0.0005043887279882134
Saving model checkpoint models/capi_rope_vitreg4_b14_110.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_110.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_106.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_106_states...
Removing checkpoint models/rope_vitreg4_b14_capi_106.pt...
Time cost: 32m44.1s
---
Epoch 111/200 training_loss: 5.0434
Epoch 111/200 clustering_loss: 1.4464
Epoch 111/200 target_entropy: 0.8518
Updated learning rate to: 0.0004957112720117694
Saving model checkpoint models/capi_rope_vitreg4_b14_111.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_111.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_107.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_107_states...
Removing checkpoint models/rope_vitreg4_b14_capi_107.pt...
Time cost: 32m41.4s
---
Epoch 112/200 training_loss: 5.0358
Epoch 112/200 clustering_loss: 1.4435
Epoch 112/200 target_entropy: 0.8490
Updated learning rate to: 0.0004870351230910882
Saving model checkpoint models/capi_rope_vitreg4_b14_112.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_112.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_108.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_108_states...
Removing checkpoint models/rope_vitreg4_b14_capi_108.pt...
Time cost: 32m47.2s
---
Epoch 113/200 training_loss: 5.0317
Epoch 113/200 clustering_loss: 1.4416
Epoch 113/200 target_entropy: 0.8478
Updated learning rate to: 0.000478362894943939
Saving model checkpoint models/capi_rope_vitreg4_b14_113.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_113.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_109.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_109_states...
Removing checkpoint models/rope_vitreg4_b14_capi_109.pt...
Time cost: 32m47.5s
---
Epoch 114/200 training_loss: 5.0276
Epoch 114/200 clustering_loss: 1.4402
Epoch 114/200 target_entropy: 0.8469
Updated learning rate to: 0.0004696972001069459
Saving model checkpoint models/capi_rope_vitreg4_b14_114.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_114.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_110.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_110_states...
Removing checkpoint models/rope_vitreg4_b14_capi_110.pt...
Time cost: 32m49.7s
---
Epoch 115/200 training_loss: 5.0219
Epoch 115/200 clustering_loss: 1.4366
Epoch 115/200 target_entropy: 0.8440
Updated learning rate to: 0.00046104064914854873
Saving model checkpoint models/capi_rope_vitreg4_b14_115.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_115.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_111.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_111_states...
Removing checkpoint models/rope_vitreg4_b14_capi_111.pt...
Time cost: 32m49.9s
---
Epoch 116/200 training_loss: 5.0205
Epoch 116/200 clustering_loss: 1.4350
Epoch 116/200 target_entropy: 0.8430
Updated learning rate to: 0.0004523958498825716
Saving model checkpoint models/capi_rope_vitreg4_b14_116.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_116.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_112.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_112_states...
Removing checkpoint models/rope_vitreg4_b14_capi_112.pt...
Time cost: 32m42.6s
---
Epoch 117/200 training_loss: 5.0190
Epoch 117/200 clustering_loss: 1.4340
Epoch 117/200 target_entropy: 0.8424
Updated learning rate to: 0.0004437654065825967
Saving model checkpoint models/capi_rope_vitreg4_b14_117.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_117.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_113.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_113_states...
Removing checkpoint models/rope_vitreg4_b14_capi_113.pt...
Time cost: 32m48.1s
---
Epoch 118/200 training_loss: 5.0165
Epoch 118/200 clustering_loss: 1.4329
Epoch 118/200 target_entropy: 0.8417
Updated learning rate to: 0.00043515191919743203
Saving model checkpoint models/capi_rope_vitreg4_b14_118.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_118.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_114.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_114_states...
Removing checkpoint models/rope_vitreg4_b14_capi_114.pt...
Time cost: 32m44.7s
---
Epoch 119/200 training_loss: 5.0129
Epoch 119/200 clustering_loss: 1.4313
Epoch 119/200 target_entropy: 0.8407
Updated learning rate to: 0.0004265579825678553
Saving model checkpoint models/capi_rope_vitreg4_b14_119.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_119.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_115.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_115_states...
Removing checkpoint models/rope_vitreg4_b14_capi_115.pt...
Time cost: 32m47.2s
---
Epoch 120/200 training_loss: 5.0107
Epoch 120/200 clustering_loss: 1.4295
Epoch 120/200 target_entropy: 0.8397
Updated learning rate to: 0.0004179861856449166
Saving model checkpoint models/capi_rope_vitreg4_b14_120.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_120.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_116.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_116_states...
Removing checkpoint models/rope_vitreg4_b14_capi_116.pt...
Time cost: 32m49.1s
---
Epoch 121/200 training_loss: 5.0057
Epoch 121/200 clustering_loss: 1.4275
Epoch 121/200 target_entropy: 0.8385
Updated learning rate to: 0.0004094391107100128
Saving model checkpoint models/capi_rope_vitreg4_b14_121.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_121.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_117.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_117_states...
Removing checkpoint models/rope_vitreg4_b14_capi_117.pt...
Time cost: 32m46.1s
---
Epoch 122/200 training_loss: 5.0024
Epoch 122/200 clustering_loss: 1.4253
Epoch 122/200 target_entropy: 0.8366
Updated learning rate to: 0.0004009193325969589
Saving model checkpoint models/capi_rope_vitreg4_b14_122.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_122.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_118.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_118_states...
Removing checkpoint models/rope_vitreg4_b14_capi_118.pt...
Time cost: 32m46.5s
---
Epoch 123/200 training_loss: 4.9997
Epoch 123/200 clustering_loss: 1.4227
Epoch 123/200 target_entropy: 0.8344
Updated learning rate to: 0.000392429417916316
Saving model checkpoint models/capi_rope_vitreg4_b14_123.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_123.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_119.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_119_states...
Removing checkpoint models/rope_vitreg4_b14_capi_119.pt...
Time cost: 32m45.7s
---
Epoch 124/200 training_loss: 4.9964
Epoch 124/200 clustering_loss: 1.4198
Epoch 124/200 target_entropy: 0.8326
Updated learning rate to: 0.00038397192428219114
Saving model checkpoint models/capi_rope_vitreg4_b14_124.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_124.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_120.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_120_states...
Removing checkpoint models/rope_vitreg4_b14_capi_120.pt...
Time cost: 32m45.7s
---
Epoch 125/200 training_loss: 4.9932
Epoch 125/200 clustering_loss: 1.4181
Epoch 125/200 target_entropy: 0.8313
Updated learning rate to: 0.00037554939954174326
Saving model checkpoint models/capi_rope_vitreg4_b14_125.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_125.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_121.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_121_states...
Removing checkpoint models/rope_vitreg4_b14_capi_121.pt...
Time cost: 32m42.8s
---
Epoch 126/200 training_loss: 4.9890
Epoch 126/200 clustering_loss: 1.4155
Epoch 126/200 target_entropy: 0.8289
Updated learning rate to: 0.0003671643810076428
Saving model checkpoint models/capi_rope_vitreg4_b14_126.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_126.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_122.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_122_states...
Removing checkpoint models/rope_vitreg4_b14_capi_122.pt...
Time cost: 32m48.7s
---
Epoch 127/200 training_loss: 4.9887
Epoch 127/200 clustering_loss: 1.4142
Epoch 127/200 target_entropy: 0.8279
Updated learning rate to: 0.000358819394693693
Saving model checkpoint models/capi_rope_vitreg4_b14_127.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_127.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_123.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_123_states...
Removing checkpoint models/rope_vitreg4_b14_capi_123.pt...
Time cost: 32m42.6s
---
Epoch 128/200 training_loss: 4.9887
Epoch 128/200 clustering_loss: 1.4139
Epoch 128/200 target_entropy: 0.8283
Updated learning rate to: 0.00035051695455386297
Saving model checkpoint models/capi_rope_vitreg4_b14_128.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_128.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_124.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_124_states...
Removing checkpoint models/rope_vitreg4_b14_capi_124.pt...
Time cost: 32m43.8s
---
Epoch 129/200 training_loss: 4.9855
Epoch 129/200 clustering_loss: 1.4123
Epoch 129/200 target_entropy: 0.8272
Updated learning rate to: 0.0003422595617249509
Saving model checkpoint models/capi_rope_vitreg4_b14_129.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_129.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_125.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_125_states...
Removing checkpoint models/rope_vitreg4_b14_capi_125.pt...
Time cost: 32m50.0s
---
Epoch 130/200 training_loss: 4.9820
Epoch 130/200 clustering_loss: 1.4097
Epoch 130/200 target_entropy: 0.8253
Updated learning rate to: 0.0003340497037731065
Saving model checkpoint models/capi_rope_vitreg4_b14_130.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_130.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_126.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_126_states...
Removing checkpoint models/rope_vitreg4_b14_capi_126.pt...
Time cost: 32m43.7s
---
Epoch 131/200 training_loss: 4.9802
Epoch 131/200 clustering_loss: 1.4082
Epoch 131/200 target_entropy: 0.8238
Updated learning rate to: 0.00032588985394444353
Saving model checkpoint models/capi_rope_vitreg4_b14_131.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_131.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_127.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_127_states...
Removing checkpoint models/rope_vitreg4_b14_capi_127.pt...
Time cost: 32m46.5s
---
Epoch 132/200 training_loss: 4.9792
Epoch 132/200 clustering_loss: 1.4073
Epoch 132/200 target_entropy: 0.8233
Updated learning rate to: 0.0003177824704199702
Saving model checkpoint models/capi_rope_vitreg4_b14_132.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_132.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_128.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_128_states...
Removing checkpoint models/rope_vitreg4_b14_capi_128.pt...
Time cost: 32m45.5s
---
Epoch 133/200 training_loss: 4.9782
Epoch 133/200 clustering_loss: 1.4062
Epoch 133/200 target_entropy: 0.8224
Updated learning rate to: 0.00030972999557505004
Saving model checkpoint models/capi_rope_vitreg4_b14_133.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_133.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_129.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_129_states...
Removing checkpoint models/rope_vitreg4_b14_capi_129.pt...
Time cost: 32m48.9s
---
Epoch 134/200 training_loss: 4.9762
Epoch 134/200 clustering_loss: 1.4049
Epoch 134/200 target_entropy: 0.8214
Updated learning rate to: 0.00030173485524363216
Saving model checkpoint models/capi_rope_vitreg4_b14_134.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_134.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_130.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_130_states...
Removing checkpoint models/rope_vitreg4_b14_capi_130.pt...
Time cost: 32m50.9s
---
Epoch 135/200 training_loss: 4.9720
Epoch 135/200 clustering_loss: 1.4025
Epoch 135/200 target_entropy: 0.8194
Updated learning rate to: 0.0002937994579874572
Saving model checkpoint models/capi_rope_vitreg4_b14_135.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_135.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_131.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_131_states...
Removing checkpoint models/rope_vitreg4_b14_capi_131.pt...
Time cost: 32m47.4s
---
Epoch 136/200 training_loss: 4.9689
Epoch 136/200 clustering_loss: 1.4009
Epoch 136/200 target_entropy: 0.8184
Updated learning rate to: 0.00028592619437047245
Saving model checkpoint models/capi_rope_vitreg4_b14_136.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_136.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_132.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_132_states...
Removing checkpoint models/rope_vitreg4_b14_capi_132.pt...
Time cost: 32m41.7s
---
Epoch 137/200 training_loss: 4.9651
Epoch 137/200 clustering_loss: 1.3996
Epoch 137/200 target_entropy: 0.8174
Updated learning rate to: 0.00027811743623866735
Saving model checkpoint models/capi_rope_vitreg4_b14_137.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_137.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_133.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_133_states...
Removing checkpoint models/rope_vitreg4_b14_capi_133.pt...
Time cost: 32m49.5s
---
Epoch 138/200 training_loss: 4.9619
Epoch 138/200 clustering_loss: 1.3984
Epoch 138/200 target_entropy: 0.8165
Updated learning rate to: 0.0002703755360055489
Saving model checkpoint models/capi_rope_vitreg4_b14_138.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_138.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_134.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_134_states...
Removing checkpoint models/rope_vitreg4_b14_capi_134.pt...
Time cost: 32m48.9s
---
Epoch 139/200 training_loss: 4.9601
Epoch 139/200 clustering_loss: 1.3979
Epoch 139/200 target_entropy: 0.8165
Updated learning rate to: 0.0002627028259434617
Saving model checkpoint models/capi_rope_vitreg4_b14_139.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_139.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_135.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_135_states...
Removing checkpoint models/rope_vitreg4_b14_capi_135.pt...
Time cost: 32m48.0s
---
Epoch 140/200 training_loss: 4.9573
Epoch 140/200 clustering_loss: 1.3967
Epoch 140/200 target_entropy: 0.8155
Updated learning rate to: 0.0002551016174809902
Saving model checkpoint models/capi_rope_vitreg4_b14_140.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_140.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_136.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_136_states...
Removing checkpoint models/rope_vitreg4_b14_capi_136.pt...
Time cost: 32m49.3s
---
Epoch 141/200 training_loss: 4.9558
Epoch 141/200 clustering_loss: 1.3961
Epoch 141/200 target_entropy: 0.8150
Updated learning rate to: 0.0002475742005066348
Saving model checkpoint models/capi_rope_vitreg4_b14_141.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_141.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_137.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_137_states...
Removing checkpoint models/rope_vitreg4_b14_capi_137.pt...
Time cost: 32m43.4s
---
Epoch 142/200 training_loss: 4.9526
Epoch 142/200 clustering_loss: 1.3943
Epoch 142/200 target_entropy: 0.8134
Updated learning rate to: 0.00024012284267897229
Saving model checkpoint models/capi_rope_vitreg4_b14_142.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_142.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_138.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_138_states...
Removing checkpoint models/rope_vitreg4_b14_capi_138.pt...
Time cost: 32m44.1s
---
Epoch 143/200 training_loss: 4.9500
Epoch 143/200 clustering_loss: 1.3927
Epoch 143/200 target_entropy: 0.8120
Updated learning rate to: 0.00023274978874351465
Saving model checkpoint models/capi_rope_vitreg4_b14_143.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_143.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_139.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_139_states...
Removing checkpoint models/rope_vitreg4_b14_capi_139.pt...
Time cost: 32m44.7s
---
Epoch 144/200 training_loss: 4.9466
Epoch 144/200 clustering_loss: 1.3915
Epoch 144/200 target_entropy: 0.8111
Updated learning rate to: 0.00022545725985647556
Saving model checkpoint models/capi_rope_vitreg4_b14_144.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_144.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_140.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_140_states...
Removing checkpoint models/rope_vitreg4_b14_capi_140.pt...
Time cost: 32m48.9s
---
Epoch 145/200 training_loss: 4.9457
Epoch 145/200 clustering_loss: 1.3912
Epoch 145/200 target_entropy: 0.8111
Updated learning rate to: 0.0002182474529156374
Saving model checkpoint models/capi_rope_vitreg4_b14_145.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_145.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_141.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_141_states...
Removing checkpoint models/rope_vitreg4_b14_capi_141.pt...
Time cost: 32m54.0s
---
Epoch 146/200 training_loss: 4.9433
Epoch 146/200 clustering_loss: 1.3901
Epoch 146/200 target_entropy: 0.8104
Updated learning rate to: 0.00021112253989853376
Saving model checkpoint models/capi_rope_vitreg4_b14_146.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_146.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_142.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_142_states...
Removing checkpoint models/rope_vitreg4_b14_capi_142.pt...
Time cost: 32m48.7s
---
Epoch 147/200 training_loss: 4.9407
Epoch 147/200 clustering_loss: 1.3894
Epoch 147/200 target_entropy: 0.8098
Updated learning rate to: 0.0002040846672081273
Saving model checkpoint models/capi_rope_vitreg4_b14_147.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_147.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_143.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_143_states...
Removing checkpoint models/rope_vitreg4_b14_capi_143.pt...
Time cost: 32m44.2s
---
Epoch 148/200 training_loss: 4.9380
Epoch 148/200 clustering_loss: 1.3884
Epoch 148/200 target_entropy: 0.8089
Updated learning rate to: 0.0001971359550262029
Saving model checkpoint models/capi_rope_vitreg4_b14_148.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_148.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_144.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_144_states...
Removing checkpoint models/rope_vitreg4_b14_capi_144.pt...
Time cost: 32m44.0s
---
Epoch 149/200 training_loss: 4.9358
Epoch 149/200 clustering_loss: 1.3876
Epoch 149/200 target_entropy: 0.8084
Updated learning rate to: 0.00019027849667465672
Saving model checkpoint models/capi_rope_vitreg4_b14_149.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_149.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_145.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_145_states...
Removing checkpoint models/rope_vitreg4_b14_capi_145.pt...
Time cost: 32m51.5s
---
Epoch 150/200 training_loss: 4.9334
Epoch 150/200 clustering_loss: 1.3869
Epoch 150/200 target_entropy: 0.8078
Updated learning rate to: 0.00018351435798487573
Saving model checkpoint models/capi_rope_vitreg4_b14_150.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_150.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_146.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_146_states...
Removing checkpoint models/rope_vitreg4_b14_capi_146.pt...
Time cost: 32m45.6s
---
Epoch 151/200 training_loss: 4.9307
Epoch 151/200 clustering_loss: 1.3860
Epoch 151/200 target_entropy: 0.8070
Updated learning rate to: 0.00017684557667539566
Saving model checkpoint models/capi_rope_vitreg4_b14_151.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_151.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_147.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_147_states...
Removing checkpoint models/rope_vitreg4_b14_capi_147.pt...
Time cost: 32m48.8s
---
Epoch 152/200 training_loss: 4.9271
Epoch 152/200 clustering_loss: 1.3849
Epoch 152/200 target_entropy: 0.8060
Updated learning rate to: 0.00017027416173803704
Saving model checkpoint models/capi_rope_vitreg4_b14_152.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_152.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_148.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_148_states...
Removing checkpoint models/rope_vitreg4_b14_capi_148.pt...
Time cost: 32m49.8s
---
Epoch 153/200 training_loss: 4.9238
Epoch 153/200 clustering_loss: 1.3836
Epoch 153/200 target_entropy: 0.8050
Updated learning rate to: 0.0001638020928326858
Saving model checkpoint models/capi_rope_vitreg4_b14_153.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_153.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_149.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_149_states...
Removing checkpoint models/rope_vitreg4_b14_capi_149.pt...
Time cost: 32m40.6s
---
Epoch 154/200 training_loss: 4.9214
Epoch 154/200 clustering_loss: 1.3826
Epoch 154/200 target_entropy: 0.8042
Updated learning rate to: 0.00015743131969091803
Saving model checkpoint models/capi_rope_vitreg4_b14_154.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_154.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_150.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_150_states...
Removing checkpoint models/rope_vitreg4_b14_capi_150.pt...
Time cost: 32m49.4s
---
Epoch 155/200 training_loss: 4.9195
Epoch 155/200 clustering_loss: 1.3819
Epoch 155/200 target_entropy: 0.8037
Updated learning rate to: 0.00015116376152863475
Saving model checkpoint models/capi_rope_vitreg4_b14_155.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_155.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_151.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_151_states...
Removing checkpoint models/rope_vitreg4_b14_capi_151.pt...
Time cost: 32m48.8s
---
Epoch 156/200 training_loss: 4.9171
Epoch 156/200 clustering_loss: 1.3810
Epoch 156/200 target_entropy: 0.8029
Updated learning rate to: 0.0001450013064678913
Saving model checkpoint models/capi_rope_vitreg4_b14_156.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_156.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_152.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_152_states...
Removing checkpoint models/rope_vitreg4_b14_capi_152.pt...
Time cost: 32m44.7s
---
Epoch 157/200 training_loss: 4.9140
Epoch 157/200 clustering_loss: 1.3801
Epoch 157/200 target_entropy: 0.8022
Updated learning rate to: 0.00013894581096809722
Saving model checkpoint models/capi_rope_vitreg4_b14_157.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_157.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_153.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_153_states...
Removing checkpoint models/rope_vitreg4_b14_capi_153.pt...
Time cost: 32m48.8s
---
Epoch 158/200 training_loss: 4.9107
Epoch 158/200 clustering_loss: 1.3791
Epoch 158/200 target_entropy: 0.8013
Updated learning rate to: 0.0001329990992667496
Saving model checkpoint models/capi_rope_vitreg4_b14_158.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_158.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_154.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_154_states...
Removing checkpoint models/rope_vitreg4_b14_capi_154.pt...
Time cost: 32m46.4s
---
Epoch 159/200 training_loss: 4.9075
Epoch 159/200 clustering_loss: 1.3785
Epoch 159/200 target_entropy: 0.8010
Updated learning rate to: 0.00012716296282987442
Saving model checkpoint models/capi_rope_vitreg4_b14_159.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_159.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_155.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_155_states...
Removing checkpoint models/rope_vitreg4_b14_capi_155.pt...
Time cost: 32m46.1s
---
Epoch 160/200 training_loss: 4.9051
Epoch 160/200 clustering_loss: 1.3780
Epoch 160/200 target_entropy: 0.8006
Updated learning rate to: 0.00012143915981234681
Saving model checkpoint models/capi_rope_vitreg4_b14_160.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_160.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_156.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_156_states...
Removing checkpoint models/rope_vitreg4_b14_capi_156.pt...
Time cost: 32m51.7s
---
Epoch 161/200 training_loss: 4.9027
Epoch 161/200 clustering_loss: 1.3774
Epoch 161/200 target_entropy: 0.8001
Updated learning rate to: 0.00011582941452823614
Saving model checkpoint models/capi_rope_vitreg4_b14_161.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_161.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_157.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_157_states...
Removing checkpoint models/rope_vitreg4_b14_capi_157.pt...
Time cost: 32m51.4s
---
Epoch 162/200 training_loss: 4.9000
Epoch 162/200 clustering_loss: 1.3765
Epoch 162/200 target_entropy: 0.7993
Updated learning rate to: 0.00011033541693135373
Saving model checkpoint models/capi_rope_vitreg4_b14_162.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_162.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_158.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_158_states...
Removing checkpoint models/rope_vitreg4_b14_capi_158.pt...
Time cost: 32m48.7s
---
Epoch 163/200 training_loss: 4.8970
Epoch 163/200 clustering_loss: 1.3760
Epoch 163/200 target_entropy: 0.7989
Updated learning rate to: 0.00010495882210614648
Saving model checkpoint models/capi_rope_vitreg4_b14_163.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_163.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_159.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_159_states...
Removing checkpoint models/rope_vitreg4_b14_capi_159.pt...
Time cost: 32m51.3s
---
Epoch 164/200 training_loss: 4.8941
Epoch 164/200 clustering_loss: 1.3757
Epoch 164/200 target_entropy: 0.7985
Updated learning rate to: 9.970124976909917e-05
Saving model checkpoint models/capi_rope_vitreg4_b14_164.pt...
Saving model checkpoint models/rope_vitreg4_b14_capi_164.pt...
Removing checkpoint models/capi_rope_vitreg4_b14_160.pt...
Removing checkpoint states models/capi_rope_vitreg4_b14_160_states...
Removing checkpoint models/rope_vitreg4_b14_capi_160.pt...
Time cost: 32m49.1s
---