Add IFEval score to metrics
#2 by lewtun (HF Staff) - opened
This adds the prompt-level-loose accuracy metric from Google's IFEval benchmark: https://arxiv.org/abs/2311.07911
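For readers unfamiliar with the metric, here is a minimal sketch of how a prompt-level loose accuracy can be computed, based on the description in the IFEval paper. It is not the code in this PR or the official implementation; the names `loose_variants`, `follows_loose`, and `prompt_level_loose_accuracy` are hypothetical, and each instruction is assumed to be representable as a function from a response string to a boolean. A prompt counts as passed only if every one of its instructions is followed, and "loose" means an instruction is accepted if any lightly transformed variant of the response satisfies it.

```python
from typing import Callable, List

# Assumption: each verifiable instruction is modeled as a check on the response text.
Check = Callable[[str], bool]


def loose_variants(response: str) -> List[str]:
    """Return the response plus relaxed variants: first and/or last line
    dropped, each with and without markdown asterisks."""
    lines = response.split("\n")
    base = [
        response,
        "\n".join(lines[1:]).strip(),    # drop first line (e.g. "Sure, here it is:")
        "\n".join(lines[:-1]).strip(),   # drop last line (e.g. a closing remark)
        "\n".join(lines[1:-1]).strip(),  # drop both
    ]
    return base + [v.replace("*", "") for v in base]


def follows_loose(response: str, check: Check) -> bool:
    """An instruction is followed loosely if any non-empty variant passes."""
    return any(v.strip() and check(v) for v in loose_variants(response))


def prompt_level_loose_accuracy(
    responses: List[str], checks_per_prompt: List[List[Check]]
) -> float:
    """Fraction of prompts whose response follows all of its instructions."""
    passed = [
        all(follows_loose(resp, check) for check in checks)
        for resp, checks in zip(responses, checks_per_prompt)
    ]
    return sum(passed) / len(passed) if passed else 0.0
```

The prompt-level score is stricter than an instruction-level one: a single missed instruction fails the whole prompt.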
Thanks for adding this. It would be nice to see a reference point (whether that score is good or bad) compared to models of similar size.
abacaj changed pull request status to merged