Commit · f4d2ea4
Parent(s): 62fa2c9

Update README.md

README.md CHANGED
@@ -96,4 +96,4 @@ Answer: To make an omelette, start by cracking two eggs into a bowl and whisking
 
 The model was trained on the [`Dahoas/synthetic-instruct-gptj-pairwise`](https://huggingface.co/datasets/Dahoas/synthetic-instruct-gptj-pairwise) dataset. We split the original dataset into the train (first 32000 examples) and validation (the remaining 1144 examples) subsets.
 
-We finetune the model for 4 epochs. This took 5 hours on 8x A100 80GB GPUs, where we set `batch_size_per_gpu` to `2` (so the global batch size is 16) and the learning rate to `0.00001` (with linear decay to zero at the last training step). You can find a Weights and Biases record [here](https://wandb.ai/chuanli11/ft-synthetic-instruct-gptj-pairwise-pythia2
+We finetune the model for 4 epochs. This took 5 hours on 8x A100 80GB GPUs, where we set `batch_size_per_gpu` to `2` (so the global batch size is 16) and the learning rate to `0.00001` (with linear decay to zero at the last training step). You can find a Weights and Biases record [here](https://wandb.ai/chuanli11/public-ft-synthetic-instruct-gptj-pairwise-pythia2-8b).
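The hyperparameters in the changed paragraph imply some simple arithmetic (global batch size, total optimizer steps, linear learning-rate decay). The sketch below is not from the repository; it only illustrates the numbers stated above, and assumes no gradient accumulation and that only the 32000 train examples drive the step count.

```python
# Assumed setup from the README paragraph: 8 GPUs, batch_size_per_gpu = 2.
num_gpus = 8
batch_size_per_gpu = 2
global_batch_size = num_gpus * batch_size_per_gpu  # 16, matching the README

# Dataset split described above: first 32000 examples train, remaining 1144 validation.
train_examples = 32000
val_examples = 1144

# 4 epochs over the train split at the global batch size.
steps_per_epoch = train_examples // global_batch_size
total_steps = steps_per_epoch * 4

def lr_at_step(step, total_steps, base_lr=1e-5):
    """Linear decay from base_lr to zero at the last training step."""
    return base_lr * (1 - step / total_steps)

print(global_batch_size)                     # 16
print(total_steps)                           # 8000
print(lr_at_step(0, total_steps))            # 1e-05 at the first step
print(lr_at_step(total_steps, total_steps))  # 0.0 at the last step
```

In practice a trainer (e.g. a linear scheduler from a deep-learning framework) would compute this schedule; the point here is only that the stated `batch_size_per_gpu` of 2 on 8 GPUs yields the global batch size of 16, and that the learning rate reaches zero exactly at the final step.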