mars_130m_1 / README.md

Model Card

Best configuration

| Hyperparameter | Value |
| --- | --- |
| beta1 | 0.9 |
| beta2 | 0.95 |
| epsilon | 9.999999999999999e-26 |
| gamma | 0.025 |
| learning_rate | 0.016 |
| max_grad_norm | 1.0 |
| min_lr_ratio | 0 |
| train_batch_size | 128 |
| warmup | 2000 |
| weight_decay | 0.1 |
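
The listed values can be captured as a plain Python dictionary for reuse in a training script. The sketch below also shows one possible reading of `learning_rate`, `warmup`, and `min_lr_ratio` as a learning-rate schedule. Several things here are assumptions not stated in this card: that `warmup` is measured in optimizer steps, that warmup is linear, that the decay after warmup is cosine-shaped, and that the floor is `learning_rate * min_lr_ratio`. The names `BEST_CONFIG`, `lr_at`, and `total_steps` are hypothetical and do not come from the training code.

```python
import math

# Values copied verbatim from the "Best configuration" list.
BEST_CONFIG = {
    "beta1": 0.9,
    "beta2": 0.95,
    "epsilon": 9.999999999999999e-26,
    "gamma": 0.025,
    "learning_rate": 0.016,
    "max_grad_norm": 1.0,
    "min_lr_ratio": 0,
    "train_batch_size": 128,
    "warmup": 2000,
    "weight_decay": 0.1,
}


def lr_at(step, total_steps, cfg=BEST_CONFIG):
    """Hypothetical schedule: linear warmup, then cosine decay.

    Assumptions (not confirmed by the model card): warmup is counted in
    steps, the decay shape is cosine, and total_steps is chosen by the user.
    """
    peak = cfg["learning_rate"]
    floor = peak * cfg["min_lr_ratio"]  # final learning rate after decay
    warmup = cfg["warmup"]
    if step < warmup:
        # Ramp linearly from 0 up to the peak learning rate.
        return peak * step / warmup
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    progress = min(1.0, (step - warmup) / max(1, total_steps - warmup))
    return floor + 0.5 * (peak - floor) * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions, `lr_at(1000, 10000)` is halfway through warmup and returns half of the peak rate, and the rate reaches the floor of 0 at the final step because `min_lr_ratio` is 0.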