rjurney
Latest training run: just 4 epochs, all optimizations removed except FP16, save and eval every epoch to avoid overfitting
e659e59 unverified