amolinao87 committed on
Commit d4054d2 · verified · 1 Parent(s): e5d3907

Model save
Files changed (2)
  1. README.md +5 -6
  2. tokenizer_config.json +7 -0
README.md CHANGED
@@ -19,7 +19,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [bigcode/starcoder2-3b](https://huggingface.co/bigcode/starcoder2-3b) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.4160
+- Loss: 0.4791
 
 ## Model description
 
@@ -38,22 +38,21 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 5e-05
+- learning_rate: 3e-05
 - train_batch_size: 2
 - eval_batch_size: 2
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- num_epochs: 3
+- num_epochs: 2
 - mixed_precision_training: Native AMP
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 1.3303        | 1.0   | 60   | 0.4534          |
-| 1.2079        | 2.0   | 120  | 0.4264          |
-| 1.2022        | 3.0   | 180  | 0.4160          |
+| 1.9751        | 1.0   | 60   | 0.4862          |
+| 1.4621        | 2.0   | 120  | 0.4791          |
 
 
 ### Framework versions
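The updated hyperparameters are consistent with the step counts in the training-results table. A minimal pure-Python sanity check, assuming a single device and no gradient accumulation (neither detail is stated in the commit):

```python
# Relate the updated hyperparameters to the training-results table.
# Assumptions (not in the commit): one device, no gradient accumulation.
train_batch_size = 2
num_epochs = 2
steps_per_epoch = 60          # from the table: evaluations at steps 60 and 120

total_steps = steps_per_epoch * num_epochs
train_examples = steps_per_epoch * train_batch_size

print(total_steps)      # matches the final step in the table
print(train_examples)   # implied training-set size under the assumptions above
```

If gradient accumulation were in use, `train_examples` would scale by the accumulation factor, so treat it as a lower bound.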
tokenizer_config.json CHANGED
@@ -350,9 +350,16 @@
   "clean_up_tokenization_spaces": true,
   "eos_token": "<|endoftext|>",
   "extra_special_tokens": {},
+  "max_length": 1024,
   "model_max_length": 1000000000000000019884624838656,
+  "pad_to_multiple_of": null,
   "pad_token": "<|endoftext|>",
+  "pad_token_type_id": 0,
+  "padding_side": "right",
+  "stride": 0,
   "tokenizer_class": "GPT2Tokenizer",
+  "truncation_side": "right",
+  "truncation_strategy": "longest_first",
   "unk_token": "<|endoftext|>",
   "vocab_size": 49152
 }
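The seven keys this commit adds to `tokenizer_config.json` can be applied as a plain dictionary merge; a minimal pure-Python sketch, with values copied from the diff (the surrounding config shown here is abbreviated for illustration, not the full file):

```python
import json

# Keys added by this commit, values taken verbatim from the diff.
added_keys = {
    "max_length": 1024,
    "pad_to_multiple_of": None,   # JSON null
    "pad_token_type_id": 0,
    "padding_side": "right",
    "stride": 0,
    "truncation_side": "right",
    "truncation_strategy": "longest_first",
}

# Illustrative merge into a trimmed-down existing config (not the real file).
config = {"tokenizer_class": "GPT2Tokenizer", "vocab_size": 49152}
config.update(added_keys)
print(json.dumps(config, indent=2, sort_keys=True))
```

Note that `None` serializes back to JSON `null`, so round-tripping through `json.dumps`/`json.loads` preserves the `pad_to_multiple_of` entry as it appears in the diff.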