Safetensors · English · bert_hash · custom_code
davidmezzetti committed · verified
Commit 7dd49e2 · Parent(s): 00820a8

Update README.md

Files changed (1)
README.md +1 -1
README.md CHANGED
@@ -23,7 +23,7 @@ Hash token embeddings:
 
 These models are pre-trained on the same training corpus as BERT (with a copy of Wikipedia from 2025) as recommended in the paper [Well-Read Students Learn Better: On the Importance of Pre-training Compact Models](https://arxiv.org/abs/1908.08962).
 
-Here is a subset of GLUE scores on the dev set using the [script provided by Hugging Face Transformers](https://github.com/huggingface/transformers/blob/main/examples/pytorch/text-classification/run_glue.py) with the following parameters.
+Below is a subset of GLUE scores on the dev set using the [script provided by Hugging Face Transformers](https://github.com/huggingface/transformers/blob/main/examples/pytorch/text-classification/run_glue.py) with the following parameters.
 
 ```bash
 python run_glue.py --model_name_or_path <model path> --task_name <task name> --do_train --do_eval --max_seq_length 128 --per_device_train_batch_size 32 --learning_rate 1e-4 --num_train_epochs 4 --output_dir outputs --trust-remote-code True
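
Since the model carries the custom_code tag and the run_glue.py command above passes a trust-remote-code flag, loading it through Transformers likewise requires opting into remote code. Below is a minimal sketch of that loading step, assuming a placeholder checkpoint path (the `<model path>` stand-in is copied from the README command, not a real repo id):

```python
# Minimal sketch (not part of the commit): loading a model that ships
# custom modeling code requires trust_remote_code=True, mirroring the
# trust-remote-code flag used in the run_glue.py command above.
from transformers import AutoModel, AutoTokenizer

path = "<model path>"  # placeholder, same as in the README command

tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModel.from_pretrained(path, trust_remote_code=True)
```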