Search and filtering do not work
Still the same. It is awfully quiet here, and on GitHub too. It seems to me they are not interested in any community interaction, just in updating the leaderboard with the newest models.
We are working on it. The new version should be out soon.
@forrest-vectara Yes, great. Thanks for letting us know! Looking forward to it.
Are you able to add some more info about how the models score on something like LLM arena? Or is it up to us to make that connection between hallucination and usefulness?
It is fixed now. https://huggingface.co/spaces/vectara/leaderboard
Right now we want to focus on hallucination, but we can add links to other leaderboards for each LLM.
Nice work!
If I wanted to use the data for analysis, I would add the scores from different benchmarks, and I would like to know the test date. That matters especially now that companies like OpenAI are changing the LLMs behind the names, as with the GPT-4o update that went badly.
Maybe I will use the data myself for this kind of analysis. Can I run this benchmark myself?
Thank you!