|
INFO: 2024-07-12 18:01:30,693: llmtf.base.evaluator: Starting eval on ['darumeru/multiq', 'darumeru/parus', 'darumeru/rcb', 'darumeru/ruopenbookqa', 'darumeru/rutie', 'darumeru/ruworldtree', 'darumeru/rwsd', 'darumeru/use', 'russiannlp/rucola_custom'] |
|
INFO: 2024-07-12 18:01:30,694: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [128009, 198, 271] |
|
INFO: 2024-07-12 18:01:30,694: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 18:01:32,050: llmtf.base.evaluator: Starting eval on ['darumeru/rummlu'] |
|
INFO: 2024-07-12 18:01:32,050: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [128009, 198, 271] |
|
INFO: 2024-07-12 18:01:32,050: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 18:01:34,410: llmtf.base.evaluator: Starting eval on ['nlpcoreteam/rummlu'] |
|
INFO: 2024-07-12 18:01:34,410: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [128009] |
|
INFO: 2024-07-12 18:01:34,410: llmtf.base.hfmodel: Updated generation_config.stop_strings: [] |
|
INFO: 2024-07-12 18:01:36,446: llmtf.base.evaluator: Starting eval on ['nlpcoreteam/enmmlu'] |
|
INFO: 2024-07-12 18:01:36,447: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [128009] |
|
INFO: 2024-07-12 18:01:36,447: llmtf.base.hfmodel: Updated generation_config.stop_strings: [] |
|
INFO: 2024-07-12 18:01:38,409: llmtf.base.evaluator: Starting eval on ['daru/treewayabstractive'] |
|
INFO: 2024-07-12 18:01:38,410: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [128009, 198, 271] |
|
INFO: 2024-07-12 18:01:38,410: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 18:01:40,005: llmtf.base.evaluator: Starting eval on ['daru/treewayextractive'] |
|
INFO: 2024-07-12 18:01:40,005: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [128009] |
|
INFO: 2024-07-12 18:01:40,005: llmtf.base.hfmodel: Updated generation_config.stop_strings: [] |
|
INFO: 2024-07-12 18:01:42,158: llmtf.base.evaluator: Starting eval on ['darumeru/cp_sent_ru', 'darumeru/cp_sent_en', 'darumeru/cp_para_ru', 'darumeru/cp_para_en'] |
|
INFO: 2024-07-12 18:01:42,159: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [128009, 198, 271] |
|
INFO: 2024-07-12 18:01:42,159: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 18:01:46,371: llmtf.base.darumeru/cp_sent_ru: Loading Dataset: 4.21s |
|
INFO: 2024-07-12 18:01:51,991: llmtf.base.daru/treewayextractive: Loading Dataset: 11.98s |
|
INFO: 2024-07-12 18:01:52,097: llmtf.base.darumeru/MultiQ: Loading Dataset: 21.40s |
|
INFO: 2024-07-12 18:01:56,033: llmtf.base.daru/treewayabstractive: Loading Dataset: 17.62s |
|
INFO: 2024-07-12 18:03:01,203: llmtf.base.darumeru/ruMMLU: Loading Dataset: 89.15s |
|
INFO: 2024-07-12 18:05:11,610: llmtf.base.nlpcoreteam/enMMLU: Loading Dataset: 215.16s |
|
INFO: 2024-07-12 18:05:59,560: llmtf.base.nlpcoreteam/ruMMLU: Loading Dataset: 265.15s |
|
INFO: 2024-07-12 18:09:58,897: llmtf.base.darumeru/MultiQ: Processing Dataset: 486.80s |
|
INFO: 2024-07-12 18:09:58,899: llmtf.base.darumeru/MultiQ: Results for darumeru/MultiQ: |
|
INFO: 2024-07-12 18:09:58,904: llmtf.base.darumeru/MultiQ: {'f1': 0.4751424669445708, 'em': 0.361376673040153} |
|
INFO: 2024-07-12 18:09:58,915: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [128009, 198, 271] |
|
INFO: 2024-07-12 18:09:58,915: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 18:10:02,091: llmtf.base.darumeru/PARus: Loading Dataset: 3.17s |
|
INFO: 2024-07-12 18:10:16,926: llmtf.base.darumeru/PARus: Processing Dataset: 14.83s |
|
INFO: 2024-07-12 18:10:16,928: llmtf.base.darumeru/PARus: Results for darumeru/PARus: |
|
INFO: 2024-07-12 18:10:16,955: llmtf.base.darumeru/PARus: {'acc': 0.85} |
|
INFO: 2024-07-12 18:10:16,957: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [128009, 198, 271] |
|
INFO: 2024-07-12 18:10:16,957: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 18:10:20,545: llmtf.base.darumeru/RCB: Loading Dataset: 3.59s |
|
INFO: 2024-07-12 18:10:44,092: llmtf.base.darumeru/RCB: Processing Dataset: 23.53s |
|
INFO: 2024-07-12 18:10:44,095: llmtf.base.darumeru/RCB: Results for darumeru/RCB: |
|
INFO: 2024-07-12 18:10:44,101: llmtf.base.darumeru/RCB: {'acc': 0.5363636363636364, 'f1_macro': 0.44535519125683054} |
|
INFO: 2024-07-12 18:10:44,103: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [128009, 198, 271] |
|
INFO: 2024-07-12 18:10:44,103: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 18:10:58,615: llmtf.base.darumeru/ruOpenBookQA: Loading Dataset: 14.51s |
|
INFO: 2024-07-12 18:11:48,068: llmtf.base.darumeru/cp_sent_ru: Processing Dataset: 601.70s |
|
INFO: 2024-07-12 18:11:48,071: llmtf.base.darumeru/cp_sent_ru: Results for darumeru/cp_sent_ru: |
|
INFO: 2024-07-12 18:11:48,089: llmtf.base.darumeru/cp_sent_ru: {'symbol_per_token': 2.8297086173143398, 'len': 0.9954845547335207, 'lcs': 0.9794343032397163} |
|
INFO: 2024-07-12 18:11:48,092: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [128009, 198, 271] |
|
INFO: 2024-07-12 18:11:48,092: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 18:11:52,723: llmtf.base.darumeru/cp_sent_en: Loading Dataset: 4.63s |
|
INFO: 2024-07-12 18:13:29,039: llmtf.base.darumeru/ruOpenBookQA: Processing Dataset: 150.41s |
|
INFO: 2024-07-12 18:13:29,041: llmtf.base.darumeru/ruOpenBookQA: Results for darumeru/ruOpenBookQA: |
|
INFO: 2024-07-12 18:13:29,068: llmtf.base.darumeru/ruOpenBookQA: {'acc': 0.772766323024055, 'f1_macro': 0.7722841236005651} |
|
INFO: 2024-07-12 18:13:29,084: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [128009, 198, 271] |
|
INFO: 2024-07-12 18:13:29,085: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 18:13:36,311: llmtf.base.darumeru/ruTiE: Loading Dataset: 7.23s |
|
INFO: 2024-07-12 18:18:09,770: llmtf.base.darumeru/ruTiE: Processing Dataset: 273.46s |
|
INFO: 2024-07-12 18:18:09,786: llmtf.base.darumeru/ruTiE: Results for darumeru/ruTiE: |
|
INFO: 2024-07-12 18:18:09,815: llmtf.base.darumeru/ruTiE: {'acc': 0.4441860465116279} |
|
INFO: 2024-07-12 18:18:09,819: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [128009, 198, 271] |
|
INFO: 2024-07-12 18:18:09,819: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 18:18:12,807: llmtf.base.darumeru/ruWorldTree: Loading Dataset: 2.99s |
|
INFO: 2024-07-12 18:18:21,237: llmtf.base.darumeru/ruWorldTree: Processing Dataset: 8.43s |
|
INFO: 2024-07-12 18:18:21,239: llmtf.base.darumeru/ruWorldTree: Results for darumeru/ruWorldTree: |
|
INFO: 2024-07-12 18:18:21,245: llmtf.base.darumeru/ruWorldTree: {'acc': 0.8761904761904762, 'f1_macro': 0.873910411622276} |
|
INFO: 2024-07-12 18:18:21,247: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [128009, 198, 271] |
|
INFO: 2024-07-12 18:18:21,247: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 18:18:26,163: llmtf.base.darumeru/RWSD: Loading Dataset: 4.92s |
|
INFO: 2024-07-12 18:18:46,629: llmtf.base.darumeru/RWSD: Processing Dataset: 20.46s |
|
INFO: 2024-07-12 18:18:46,632: llmtf.base.darumeru/RWSD: Results for darumeru/RWSD: |
|
INFO: 2024-07-12 18:18:46,636: llmtf.base.darumeru/RWSD: {'acc': 0.5980392156862745} |
|
INFO: 2024-07-12 18:18:46,638: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [128009, 198, 271] |
|
INFO: 2024-07-12 18:18:46,638: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 18:19:01,474: llmtf.base.darumeru/USE: Loading Dataset: 14.84s |
|
INFO: 2024-07-12 18:20:03,268: llmtf.base.darumeru/cp_sent_en: Processing Dataset: 490.54s |
|
INFO: 2024-07-12 18:20:03,284: llmtf.base.darumeru/cp_sent_en: Results for darumeru/cp_sent_en: |
|
INFO: 2024-07-12 18:20:03,288: llmtf.base.darumeru/cp_sent_en: {'symbol_per_token': 4.423187111501028, 'len': 0.9993497445495427, 'lcs': 0.989692627049547} |
|
INFO: 2024-07-12 18:20:03,291: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [128009, 198, 271] |
|
INFO: 2024-07-12 18:20:03,291: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 18:20:06,654: llmtf.base.darumeru/cp_para_ru: Loading Dataset: 3.36s |
|
INFO: 2024-07-12 18:22:42,618: llmtf.base.daru/treewayextractive: Processing Dataset: 1250.63s |
|
INFO: 2024-07-12 18:22:42,622: llmtf.base.daru/treewayextractive: Results for daru/treewayextractive: |
|
INFO: 2024-07-12 18:22:42,871: llmtf.base.daru/treewayextractive: {'r-prec': 0.4058313131313131} |
|
INFO: 2024-07-12 18:22:43,266: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-07-12 18:22:43,275: llmtf.base.evaluator: |
|
mean daru/treewayextractive darumeru/MultiQ darumeru/PARus darumeru/RCB darumeru/RWSD darumeru/cp_sent_en darumeru/cp_sent_ru darumeru/ruOpenBookQA darumeru/ruTiE darumeru/ruWorldTree |
|
0.685 0.406 0.418 0.850 0.491 0.598 0.999 0.995 0.773 0.444 0.875 |
|
INFO: 2024-07-12 18:23:20,386: llmtf.base.darumeru/ruMMLU: Processing Dataset: 1219.18s |
|
INFO: 2024-07-12 18:23:20,389: llmtf.base.darumeru/ruMMLU: Results for darumeru/ruMMLU: |
|
INFO: 2024-07-12 18:23:20,398: llmtf.base.darumeru/ruMMLU: {'acc': 0.5095280854035718} |
|
INFO: 2024-07-12 18:23:20,482: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-07-12 18:23:20,493: llmtf.base.evaluator: |
|
mean daru/treewayextractive darumeru/MultiQ darumeru/PARus darumeru/RCB darumeru/RWSD darumeru/cp_sent_en darumeru/cp_sent_ru darumeru/ruMMLU darumeru/ruOpenBookQA darumeru/ruTiE darumeru/ruWorldTree |
|
0.669 0.406 0.418 0.850 0.491 0.598 0.999 0.995 0.510 0.773 0.444 0.875 |
|
INFO: 2024-07-12 18:23:57,588: llmtf.base.nlpcoreteam/enMMLU: Processing Dataset: 1125.98s |
|
INFO: 2024-07-12 18:23:57,591: llmtf.base.nlpcoreteam/enMMLU: Results for nlpcoreteam/enMMLU: |
|
INFO: 2024-07-12 18:23:57,631: llmtf.base.nlpcoreteam/enMMLU: metric |
|
subject |
|
abstract_algebra 0.380000 |
|
anatomy 0.674074 |
|
astronomy 0.756579 |
|
business_ethics 0.720000 |
|
clinical_knowledge 0.762264 |
|
college_biology 0.798611 |
|
college_chemistry 0.490000 |
|
college_computer_science 0.580000 |
|
college_mathematics 0.380000 |
|
college_medicine 0.664740 |
|
college_physics 0.490196 |
|
computer_security 0.760000 |
|
conceptual_physics 0.570213 |
|
econometrics 0.543860 |
|
electrical_engineering 0.662069 |
|
elementary_mathematics 0.447090 |
|
formal_logic 0.515873 |
|
global_facts 0.430000 |
|
high_school_biology 0.812903 |
|
high_school_chemistry 0.532020 |
|
high_school_computer_science 0.710000 |
|
high_school_european_history 0.763636 |
|
high_school_geography 0.853535 |
|
high_school_government_and_politics 0.927461 |
|
high_school_macroeconomics 0.684615 |
|
high_school_mathematics 0.403704 |
|
high_school_microeconomics 0.785714 |
|
high_school_physics 0.397351 |
|
high_school_psychology 0.849541 |
|
high_school_statistics 0.569444 |
|
high_school_us_history 0.833333 |
|
high_school_world_history 0.848101 |
|
human_aging 0.730942 |
|
human_sexuality 0.770992 |
|
international_law 0.834711 |
|
jurisprudence 0.768519 |
|
logical_fallacies 0.736196 |
|
machine_learning 0.526786 |
|
management 0.844660 |
|
marketing 0.897436 |
|
medical_genetics 0.840000 |
|
miscellaneous 0.841635 |
|
moral_disputes 0.742775 |
|
moral_scenarios 0.509497 |
|
nutrition 0.751634 |
|
philosophy 0.717042 |
|
prehistory 0.740741 |
|
professional_accounting 0.500000 |
|
professional_law 0.471969 |
|
professional_medicine 0.727941 |
|
professional_psychology 0.712418 |
|
public_relations 0.700000 |
|
security_studies 0.734694 |
|
sociology 0.855721 |
|
us_foreign_policy 0.830000 |
|
virology 0.512048 |
|
world_religions 0.807018 |
|
INFO: 2024-07-12 18:23:57,639: llmtf.base.nlpcoreteam/enMMLU: metric |
|
subject |
|
STEM 0.570387 |
|
humanities 0.714570 |
|
other (business, health, misc.) 0.706955 |
|
social sciences 0.770713 |
|
INFO: 2024-07-12 18:23:57,646: llmtf.base.nlpcoreteam/enMMLU: {'acc': 0.6906562565670807} |
|
INFO: 2024-07-12 18:23:57,720: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-07-12 18:23:57,732: llmtf.base.evaluator: |
|
mean daru/treewayextractive darumeru/MultiQ darumeru/PARus darumeru/RCB darumeru/RWSD darumeru/cp_sent_en darumeru/cp_sent_ru darumeru/ruMMLU darumeru/ruOpenBookQA darumeru/ruTiE darumeru/ruWorldTree nlpcoreteam/enMMLU |
|
0.671 0.406 0.418 0.850 0.491 0.598 0.999 0.995 0.510 0.773 0.444 0.875 0.691 |
|
INFO: 2024-07-12 18:24:22,499: llmtf.base.darumeru/USE: Processing Dataset: 321.02s |
|
INFO: 2024-07-12 18:24:22,502: llmtf.base.darumeru/USE: Results for darumeru/USE: |
|
INFO: 2024-07-12 18:24:22,507: llmtf.base.darumeru/USE: {'grade_norm': 0.1784313725490196} |
|
INFO: 2024-07-12 18:24:22,513: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [128009] |
|
INFO: 2024-07-12 18:24:22,513: llmtf.base.hfmodel: Updated generation_config.stop_strings: [] |
|
INFO: 2024-07-12 18:24:42,930: llmtf.base.russiannlp/rucola_custom: Loading Dataset: 20.42s |
|
INFO: 2024-07-12 18:27:31,417: llmtf.base.daru/treewayabstractive: Processing Dataset: 1535.38s |
|
INFO: 2024-07-12 18:27:31,420: llmtf.base.daru/treewayabstractive: Results for daru/treewayabstractive: |
|
INFO: 2024-07-12 18:27:31,441: llmtf.base.daru/treewayabstractive: {'rouge1': 0.3622126549632883, 'rouge2': 0.13608707005583875} |
|
INFO: 2024-07-12 18:27:31,446: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-07-12 18:27:31,480: llmtf.base.evaluator: |
|
mean daru/treewayabstractive daru/treewayextractive darumeru/MultiQ darumeru/PARus darumeru/RCB darumeru/RWSD darumeru/USE darumeru/cp_sent_en darumeru/cp_sent_ru darumeru/ruMMLU darumeru/ruOpenBookQA darumeru/ruTiE darumeru/ruWorldTree nlpcoreteam/enMMLU |
|
0.606 0.249 0.406 0.418 0.850 0.491 0.598 0.178 0.999 0.995 0.510 0.773 0.444 0.875 0.691 |
|
INFO: 2024-07-12 18:28:08,813: llmtf.base.russiannlp/rucola_custom: Processing Dataset: 205.88s |
|
INFO: 2024-07-12 18:28:08,814: llmtf.base.russiannlp/rucola_custom: Results for russiannlp/rucola_custom: |
|
INFO: 2024-07-12 18:28:08,826: llmtf.base.russiannlp/rucola_custom: {'acc': 0.725870111230714, 'mcc': 0.33130646993388546} |
|
INFO: 2024-07-12 18:28:08,837: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-07-12 18:28:08,846: llmtf.base.evaluator: |
|
mean daru/treewayabstractive daru/treewayextractive darumeru/MultiQ darumeru/PARus darumeru/RCB darumeru/RWSD darumeru/USE darumeru/cp_sent_en darumeru/cp_sent_ru darumeru/ruMMLU darumeru/ruOpenBookQA darumeru/ruTiE darumeru/ruWorldTree nlpcoreteam/enMMLU russiannlp/rucola_custom |
|
0.600 0.249 0.406 0.418 0.850 0.491 0.598 0.178 0.999 0.995 0.510 0.773 0.444 0.875 0.691 0.529 |
|
INFO: 2024-07-12 18:32:03,833: llmtf.base.nlpcoreteam/ruMMLU: Processing Dataset: 1564.27s |
|
INFO: 2024-07-12 18:32:03,851: llmtf.base.nlpcoreteam/ruMMLU: Results for nlpcoreteam/ruMMLU: |
|
INFO: 2024-07-12 18:32:03,891: llmtf.base.nlpcoreteam/ruMMLU: metric |
|
subject |
|
abstract_algebra 0.330000 |
|
anatomy 0.488889 |
|
astronomy 0.671053 |
|
business_ethics 0.680000 |
|
clinical_knowledge 0.573585 |
|
college_biology 0.548611 |
|
college_chemistry 0.420000 |
|
college_computer_science 0.500000 |
|
college_mathematics 0.370000 |
|
college_medicine 0.502890 |
|
college_physics 0.343137 |
|
computer_security 0.710000 |
|
conceptual_physics 0.540426 |
|
econometrics 0.447368 |
|
electrical_engineering 0.551724 |
|
elementary_mathematics 0.370370 |
|
formal_logic 0.404762 |
|
global_facts 0.400000 |
|
high_school_biology 0.667742 |
|
high_school_chemistry 0.399015 |
|
high_school_computer_science 0.680000 |
|
high_school_european_history 0.769697 |
|
high_school_geography 0.691919 |
|
high_school_government_and_politics 0.678756 |
|
high_school_macroeconomics 0.558974 |
|
high_school_mathematics 0.362963 |
|
high_school_microeconomics 0.563025 |
|
high_school_physics 0.370861 |
|
high_school_psychology 0.680734 |
|
high_school_statistics 0.449074 |
|
high_school_us_history 0.676471 |
|
high_school_world_history 0.729958 |
|
human_aging 0.569507 |
|
human_sexuality 0.641221 |
|
international_law 0.776860 |
|
jurisprudence 0.601852 |
|
logical_fallacies 0.539877 |
|
machine_learning 0.392857 |
|
management 0.650485 |
|
marketing 0.722222 |
|
medical_genetics 0.670000 |
|
miscellaneous 0.652618 |
|
moral_disputes 0.618497 |
|
moral_scenarios 0.316201 |
|
nutrition 0.611111 |
|
philosophy 0.607717 |
|
prehistory 0.589506 |
|
professional_accounting 0.368794 |
|
professional_law 0.380704 |
|
professional_medicine 0.500000 |
|
professional_psychology 0.522876 |
|
public_relations 0.590909 |
|
security_studies 0.681633 |
|
sociology 0.686567 |
|
us_foreign_policy 0.760000 |
|
virology 0.451807 |
|
world_religions 0.701754 |
|
INFO: 2024-07-12 18:32:03,898: llmtf.base.nlpcoreteam/ruMMLU: metric |
|
subject |
|
STEM 0.482102 |
|
humanities 0.593374 |
|
other (business, health, misc.) 0.560136 |
|
social sciences 0.625332 |
|
INFO: 2024-07-12 18:32:03,906: llmtf.base.nlpcoreteam/ruMMLU: {'acc': 0.5652359229033166} |
|
INFO: 2024-07-12 18:32:03,987: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-07-12 18:32:04,050: llmtf.base.evaluator: |
|
mean daru/treewayabstractive daru/treewayextractive darumeru/MultiQ darumeru/PARus darumeru/RCB darumeru/RWSD darumeru/USE darumeru/cp_sent_en darumeru/cp_sent_ru darumeru/ruMMLU darumeru/ruOpenBookQA darumeru/ruTiE darumeru/ruWorldTree nlpcoreteam/enMMLU nlpcoreteam/ruMMLU russiannlp/rucola_custom |
|
0.598 0.249 0.406 0.418 0.850 0.491 0.598 0.178 0.999 0.995 0.510 0.773 0.444 0.875 0.691 0.565 0.529 |
|
INFO: 2024-07-12 18:33:10,002: llmtf.base.darumeru/cp_para_ru: Processing Dataset: 783.35s |
|
INFO: 2024-07-12 18:33:10,005: llmtf.base.darumeru/cp_para_ru: Results for darumeru/cp_para_ru: |
|
INFO: 2024-07-12 18:33:10,010: llmtf.base.darumeru/cp_para_ru: {'symbol_per_token': 2.9701257991830787, 'len': 0.9961165277352745, 'lcs': 0.9405669328599758} |
|
INFO: 2024-07-12 18:33:10,012: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [128009, 198, 271] |
|
INFO: 2024-07-12 18:33:10,012: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 18:33:14,554: llmtf.base.darumeru/cp_para_en: Loading Dataset: 4.54s |
|
INFO: 2024-07-12 18:43:55,911: llmtf.base.darumeru/cp_para_en: Processing Dataset: 641.36s |
|
INFO: 2024-07-12 18:43:55,933: llmtf.base.darumeru/cp_para_en: Results for darumeru/cp_para_en: |
|
INFO: 2024-07-12 18:43:55,938: llmtf.base.darumeru/cp_para_en: {'symbol_per_token': 4.485251430232633, 'len': 0.9993046186982656, 'lcs': 0.9638998501661067} |
|
INFO: 2024-07-12 18:43:55,940: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-07-12 18:43:55,954: llmtf.base.evaluator: |
|
mean daru/treewayabstractive daru/treewayextractive darumeru/MultiQ darumeru/PARus darumeru/RCB darumeru/RWSD darumeru/USE darumeru/cp_para_en darumeru/cp_para_ru darumeru/cp_sent_en darumeru/cp_sent_ru darumeru/ruMMLU darumeru/ruOpenBookQA darumeru/ruTiE darumeru/ruWorldTree nlpcoreteam/enMMLU nlpcoreteam/ruMMLU russiannlp/rucola_custom |
|
0.638 0.249 0.406 0.418 0.850 0.491 0.598 0.178 0.964 0.941 0.999 0.995 0.510 0.773 0.444 0.875 0.691 0.565 0.529 |
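
The final summary row above is consistent with the `mean` column being an unweighted arithmetic mean of the 18 per-task scores. The following is a minimal sketch to check that, assuming only the rounded values printed in the log. The `scores` dict and the `summary_mean` helper are illustrative names and are not part of llmtf; how llmtf itself computes the column is not shown in this log.

```python
# Per-task scores copied from the final summary row of the log (rounded to 3 digits).
scores = {
    "daru/treewayabstractive": 0.249,
    "daru/treewayextractive": 0.406,
    "darumeru/MultiQ": 0.418,
    "darumeru/PARus": 0.850,
    "darumeru/RCB": 0.491,
    "darumeru/RWSD": 0.598,
    "darumeru/USE": 0.178,
    "darumeru/cp_para_en": 0.964,
    "darumeru/cp_para_ru": 0.941,
    "darumeru/cp_sent_en": 0.999,
    "darumeru/cp_sent_ru": 0.995,
    "darumeru/ruMMLU": 0.510,
    "darumeru/ruOpenBookQA": 0.773,
    "darumeru/ruTiE": 0.444,
    "darumeru/ruWorldTree": 0.875,
    "nlpcoreteam/enMMLU": 0.691,
    "nlpcoreteam/ruMMLU": 0.565,
    "russiannlp/rucola_custom": 0.529,
}


def summary_mean(task_scores):
    """Unweighted arithmetic mean of the per-task scores (hypothetical helper)."""
    return sum(task_scores.values()) / len(task_scores)


# Compare against the `mean` column of the final summary row.
print(f"{summary_mean(scores):.3f}")  # 0.638
```

Because the inputs are already rounded, the check only confirms agreement to three decimal places; it does not say how each per-task score is derived from that task's individual metrics.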
|
|