metadata
license: mit
tags:
  - generated_from_keras_callback
model-index:
  - name: gpt_image_clef2
    results: []

gpt_image_clef2

This model is a fine-tuned version of gpt2 on an unknown dataset. At the final logged epoch it reports the following results on the training and validation sets:

  • Train Loss: 1.2611
  • Train Rouge: 0.4475
  • Validation Loss: 1.1578
  • Validation Rouge: 0.3944
  • Epoch: 25

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0005, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0005, 'decay_steps': 2554800, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'passive_serialization': True}, 'warmup_steps': 1000, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.99, 'epsilon': 0.2, 'amsgrad': False, 'weight_decay_rate': 0.01}
  • training_precision: float32
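The nested optimizer config above describes a learning-rate schedule: a warmup phase of 1000 steps followed by a polynomial decay with power 1.0 (that is, linear) from 0.0005 down to 0.0 over 2554800 steps. The following is a minimal pure-Python sketch of that schedule, not the training code itself. It assumes the usual behaviour of the `WarmUp` wrapper in Transformers, where the decay schedule is evaluated at `step - warmup_steps`, and of Keras `PolynomialDecay` with `cycle=False`, where the step is clamped at `decay_steps`. The function name `learning_rate` is hypothetical.

```python
# Values copied from the optimizer config in the model card.
INITIAL_LR = 0.0005
WARMUP_STEPS = 1000
DECAY_STEPS = 2554800
END_LR = 0.0
POWER = 1.0  # used by both the warmup and the decay; 1.0 means linear


def learning_rate(step: int) -> float:
    """Approximate learning rate at a given optimizer step (assumed semantics)."""
    if step < WARMUP_STEPS:
        # Warmup: scale the initial rate by the fraction of warmup completed.
        return INITIAL_LR * (step / WARMUP_STEPS) ** POWER
    # Decay: the step counter is shifted by the warmup length and clamped,
    # so the rate stays at END_LR once the decay has finished.
    decay_step = min(step - WARMUP_STEPS, DECAY_STEPS)
    remaining = 1.0 - decay_step / DECAY_STEPS
    return (INITIAL_LR - END_LR) * remaining ** POWER + END_LR


if __name__ == "__main__":
    for step in (0, 500, 1000, 1_278_400, 2_555_800):
        print(step, learning_rate(step))
```

Under these assumptions the rate peaks at 0.0005 at step 1000 and reaches 0.0 at step 2555800 (warmup plus decay steps).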

Training results

| Train Loss | Train Rouge | Validation Loss | Validation Rouge | Epoch |
|---|---|---|---|---|
| 1.5255 | 0.4213 | 1.0251 | 0.4284 | 0 |
| 1.1805 | 0.4673 | 0.9779 | 0.4442 | 1 |
| 1.1394 | 0.4800 | 0.9561 | 0.4509 | 2 |
| 1.1168 | 0.4871 | 0.9369 | 0.4595 | 3 |
| 1.1036 | 0.4915 | 0.9314 | 0.4623 | 4 |
| 1.0971 | 0.4936 | 0.9283 | 0.4624 | 5 |
| 1.0946 | 0.4947 | 0.9315 | 0.4617 | 6 |
| 1.0962 | 0.4947 | 0.9323 | 0.4614 | 7 |
| 1.1001 | 0.4943 | 0.9405 | 0.4586 | 8 |
| 1.1065 | 0.4933 | 0.9501 | 0.4560 | 9 |
| 1.1146 | 0.4913 | 0.9614 | 0.4498 | 10 |
| 1.1240 | 0.4890 | 0.9726 | 0.4471 | 11 |
| 1.1341 | 0.4864 | 0.9852 | 0.4429 | 12 |
| 1.1451 | 0.4836 | 0.9982 | 0.4389 | 13 |
| 1.1564 | 0.4799 | 1.0160 | 0.4319 | 14 |
| 1.1680 | 0.4766 | 1.0273 | 0.4296 | 15 |
| 1.1793 | 0.4732 | 1.0405 | 0.4267 | 16 |
| 1.1901 | 0.4699 | 1.0556 | 0.4235 | 17 |
| 1.2007 | 0.4666 | 1.0692 | 0.4184 | 18 |
| 1.2108 | 0.4632 | 1.0796 | 0.4168 | 19 |
| 1.2207 | 0.4603 | 1.0998 | 0.4093 | 20 |
| 1.2299 | 0.4574 | 1.1135 | 0.4057 | 21 |
| 1.2386 | 0.4547 | 1.1297 | 0.4026 | 22 |
| 1.2469 | 0.4519 | 1.1396 | 0.4013 | 23 |
| 1.2540 | 0.4497 | 1.1467 | 0.3960 | 24 |
| 1.2611 | 0.4475 | 1.1578 | 0.3944 | 25 |
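In the results above, validation loss stops improving early and then rises, so the final-epoch numbers are not the best ones logged. As an illustration only, the sketch below finds the epoch with the lowest validation loss from the values in the table. The helper names `best_epoch` and `epochs_since_best` are hypothetical and are not part of the original training setup.

```python
# Validation loss for epochs 0-25, copied from the training results.
VAL_LOSS = [
    1.0251, 0.9779, 0.9561, 0.9369, 0.9314, 0.9283, 0.9315, 0.9323,
    0.9405, 0.9501, 0.9614, 0.9726, 0.9852, 0.9982, 1.0160, 1.0273,
    1.0405, 1.0556, 1.0692, 1.0796, 1.0998, 1.1135, 1.1297, 1.1396,
    1.1467, 1.1578,
]


def best_epoch(losses):
    """Return the index (epoch) of the smallest loss value."""
    return min(range(len(losses)), key=losses.__getitem__)


def epochs_since_best(losses):
    """Return how many epochs were logged after the best one."""
    return len(losses) - 1 - best_epoch(losses)


if __name__ == "__main__":
    epoch = best_epoch(VAL_LOSS)
    print(f"lowest validation loss {VAL_LOSS[epoch]} at epoch {epoch}")
    print(f"epochs logged after the best one: {epochs_since_best(VAL_LOSS)}")
```

On these numbers the minimum is 0.9283 at epoch 5, which is also where Validation Rouge peaks (0.4624). A run like this is the kind of case where an early-stopping rule keyed on validation loss might be worth considering.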

Framework versions

  • Transformers 4.28.1
  • TensorFlow 2.10.1
  • Datasets 2.11.0
  • Tokenizers 0.13.3