Wandb runs: https://wandb.ai/eleutherai/pythia-rlhf/runs/s0qdwbg6?workspace=user-yongzx

Evaluation results:

| Task           | Version | Filter | Metric     |    Value |   |  Stderr |
|----------------|---------|--------|------------|---------:|---|--------:|
| arc_challenge  | Yaml    | none   | acc        |   0.1758 | ± |  0.0111 |
|                |         | none   | acc_norm   |   0.2176 | ± |  0.0121 |
| arc_easy       | Yaml    | none   | acc        |   0.3742 | ± |  0.0099 |
|                |         | none   | acc_norm   |   0.3565 | ± |  0.0098 |
| logiqa         | Yaml    | none   | acc        |   0.2058 | ± |  0.0159 |
|                |         | none   | acc_norm   |   0.2412 | ± |  0.0168 |
| piqa           | Yaml    | none   | acc        |   0.5958 | ± |  0.0114 |
|                |         | none   | acc_norm   |   0.5941 | ± |  0.0115 |
| sciq           | Yaml    | none   | acc        |   0.5930 | ± |  0.0155 |
|                |         | none   | acc_norm   |   0.5720 | ± |  0.0157 |
| winogrande     | Yaml    | none   | acc        |   0.5154 | ± |  0.0140 |
| wsc            | Yaml    | none   | acc        |   0.3654 | ± |  0.0474 |
| lambada_openai | Yaml    | none   | perplexity | 730.2552 | ± | 46.8739 |
|                |         | none   | acc        |   0.1316 | ± |  0.0047 |