|
INFO: 2024-07-12 12:48:42,054: llmtf.base.evaluator: Starting eval on ['darumeru/multiq', 'darumeru/parus', 'darumeru/rcb', 'darumeru/ruopenbookqa', 'darumeru/rutie', 'darumeru/ruworldtree', 'darumeru/rwsd', 'darumeru/use', 'russiannlp/rucola_custom'] |
|
INFO: 2024-07-12 12:48:42,055: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [32000, 13] |
|
INFO: 2024-07-12 12:48:42,055: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 12:48:43,444: llmtf.base.evaluator: Starting eval on ['darumeru/rummlu'] |
|
INFO: 2024-07-12 12:48:43,444: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [32000, 13] |
|
INFO: 2024-07-12 12:48:43,445: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 12:48:46,084: llmtf.base.evaluator: Starting eval on ['nlpcoreteam/rummlu'] |
|
INFO: 2024-07-12 12:48:46,084: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [32000] |
|
INFO: 2024-07-12 12:48:46,084: llmtf.base.hfmodel: Updated generation_config.stop_strings: [] |
|
INFO: 2024-07-12 12:48:47,926: llmtf.base.evaluator: Starting eval on ['nlpcoreteam/enmmlu'] |
|
INFO: 2024-07-12 12:48:47,926: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [32000] |
|
INFO: 2024-07-12 12:48:47,927: llmtf.base.hfmodel: Updated generation_config.stop_strings: [] |
|
INFO: 2024-07-12 12:48:49,072: llmtf.base.evaluator: Starting eval on ['daru/treewayabstractive'] |
|
INFO: 2024-07-12 12:48:49,072: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [32000, 13] |
|
INFO: 2024-07-12 12:48:49,072: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 12:48:51,325: llmtf.base.evaluator: Starting eval on ['daru/treewayextractive'] |
|
INFO: 2024-07-12 12:48:51,325: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [32000] |
|
INFO: 2024-07-12 12:48:51,325: llmtf.base.hfmodel: Updated generation_config.stop_strings: [] |
|
INFO: 2024-07-12 12:48:53,579: llmtf.base.evaluator: Starting eval on ['darumeru/cp_sent_ru', 'darumeru/cp_sent_en', 'darumeru/cp_para_ru', 'darumeru/cp_para_en'] |
|
INFO: 2024-07-12 12:48:53,580: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [32000, 13] |
|
INFO: 2024-07-12 12:48:53,580: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 12:48:55,110: llmtf.base.darumeru/MultiQ: Loading Dataset: 13.06s |
|
INFO: 2024-07-12 12:48:56,870: llmtf.base.daru/treewayabstractive: Loading Dataset: 7.80s |
|
INFO: 2024-07-12 12:48:57,066: llmtf.base.darumeru/cp_sent_ru: Loading Dataset: 3.49s |
|
INFO: 2024-07-12 12:48:59,493: llmtf.base.daru/treewayextractive: Loading Dataset: 8.17s |
|
INFO: 2024-07-12 12:49:35,613: llmtf.base.darumeru/ruMMLU: Loading Dataset: 52.17s |
|
INFO: 2024-07-12 12:52:06,068: llmtf.base.nlpcoreteam/enMMLU: Loading Dataset: 198.14s |
|
INFO: 2024-07-12 12:52:12,454: llmtf.base.nlpcoreteam/ruMMLU: Loading Dataset: 206.37s |
|
INFO: 2024-07-12 12:57:36,064: llmtf.base.darumeru/MultiQ: Processing Dataset: 520.95s |
|
INFO: 2024-07-12 12:57:36,065: llmtf.base.darumeru/MultiQ: Results for darumeru/MultiQ: |
|
INFO: 2024-07-12 12:57:36,070: llmtf.base.darumeru/MultiQ: {'f1': 0.5698089335328944, 'em': 0.5019120458891013} |
|
INFO: 2024-07-12 12:57:36,081: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [32000, 13] |
|
INFO: 2024-07-12 12:57:36,081: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 12:57:38,886: llmtf.base.darumeru/PARus: Loading Dataset: 2.80s |
|
INFO: 2024-07-12 12:57:56,888: llmtf.base.darumeru/PARus: Processing Dataset: 18.00s |
|
INFO: 2024-07-12 12:57:56,890: llmtf.base.darumeru/PARus: Results for darumeru/PARus: |
|
INFO: 2024-07-12 12:57:56,902: llmtf.base.darumeru/PARus: {'acc': 0.83} |
|
INFO: 2024-07-12 12:57:56,904: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [32000, 13] |
|
INFO: 2024-07-12 12:57:56,904: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 12:58:00,330: llmtf.base.darumeru/RCB: Loading Dataset: 3.43s |
|
INFO: 2024-07-12 12:58:26,702: llmtf.base.darumeru/RCB: Processing Dataset: 26.37s |
|
INFO: 2024-07-12 12:58:26,716: llmtf.base.darumeru/RCB: Results for darumeru/RCB: |
|
INFO: 2024-07-12 12:58:26,730: llmtf.base.darumeru/RCB: {'acc': 0.5318181818181819, 'f1_macro': 0.4819804386277897} |
|
INFO: 2024-07-12 12:58:26,731: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [32000, 13] |
|
INFO: 2024-07-12 12:58:26,732: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 12:58:35,277: llmtf.base.darumeru/ruOpenBookQA: Loading Dataset: 8.54s |
|
INFO: 2024-07-12 13:00:31,609: llmtf.base.darumeru/cp_sent_ru: Processing Dataset: 694.54s |
|
INFO: 2024-07-12 13:00:31,613: llmtf.base.darumeru/cp_sent_ru: Results for darumeru/cp_sent_ru: |
|
INFO: 2024-07-12 13:00:31,645: llmtf.base.darumeru/cp_sent_ru: {'symbol_per_token': 2.3698479637205674, 'len': 0.998929777089993, 'lcs': 0.9815584658287272} |
|
INFO: 2024-07-12 13:00:31,648: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [32000, 13] |
|
INFO: 2024-07-12 13:00:31,648: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 13:00:34,944: llmtf.base.darumeru/cp_sent_en: Loading Dataset: 3.30s |
|
INFO: 2024-07-12 13:01:29,607: llmtf.base.darumeru/ruOpenBookQA: Processing Dataset: 174.33s |
|
INFO: 2024-07-12 13:01:29,610: llmtf.base.darumeru/ruOpenBookQA: Results for darumeru/ruOpenBookQA: |
|
INFO: 2024-07-12 13:01:29,622: llmtf.base.darumeru/ruOpenBookQA: {'acc': 0.7538659793814433, 'f1_macro': 0.7551200071805053} |
|
INFO: 2024-07-12 13:01:29,638: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [32000, 13] |
|
INFO: 2024-07-12 13:01:29,639: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 13:01:33,958: llmtf.base.darumeru/ruTiE: Loading Dataset: 4.32s |
|
INFO: 2024-07-12 13:02:27,340: llmtf.base.daru/treewayextractive: Processing Dataset: 807.84s |
|
INFO: 2024-07-12 13:02:27,342: llmtf.base.daru/treewayextractive: Results for daru/treewayextractive: |
|
INFO: 2024-07-12 13:02:27,599: llmtf.base.daru/treewayextractive: {'r-prec': 0.4038567821067821} |
|
INFO: 2024-07-12 13:02:28,175: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-07-12 13:02:28,182: llmtf.base.evaluator: |
|
mean 0.672 |
|
daru/treewayextractive 0.404 |
|
darumeru/MultiQ 0.536 |
|
darumeru/PARus 0.830 |
|
darumeru/RCB 0.507 |
|
darumeru/cp_sent_ru 0.999 |
|
darumeru/ruOpenBookQA 0.754 |
|
INFO: 2024-07-12 13:05:57,127: llmtf.base.darumeru/ruTiE: Processing Dataset: 263.17s |
|
INFO: 2024-07-12 13:05:57,131: llmtf.base.darumeru/ruTiE: Results for darumeru/ruTiE: |
|
INFO: 2024-07-12 13:05:57,160: llmtf.base.darumeru/ruTiE: {'acc': 0.5395348837209303} |
|
INFO: 2024-07-12 13:05:57,163: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [32000, 13] |
|
INFO: 2024-07-12 13:05:57,163: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 13:05:59,613: llmtf.base.darumeru/ruWorldTree: Loading Dataset: 2.45s |
|
INFO: 2024-07-12 13:06:09,846: llmtf.base.darumeru/ruWorldTree: Processing Dataset: 10.23s |
|
INFO: 2024-07-12 13:06:09,848: llmtf.base.darumeru/ruWorldTree: Results for darumeru/ruWorldTree: |
|
INFO: 2024-07-12 13:06:09,854: llmtf.base.darumeru/ruWorldTree: {'acc': 0.8761904761904762, 'f1_macro': 0.8761420630173862} |
|
INFO: 2024-07-12 13:06:09,855: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [32000, 13] |
|
INFO: 2024-07-12 13:06:09,855: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 13:06:12,444: llmtf.base.darumeru/RWSD: Loading Dataset: 2.59s |
|
INFO: 2024-07-12 13:06:36,116: llmtf.base.darumeru/RWSD: Processing Dataset: 23.67s |
|
INFO: 2024-07-12 13:06:36,132: llmtf.base.darumeru/RWSD: Results for darumeru/RWSD: |
|
INFO: 2024-07-12 13:06:36,136: llmtf.base.darumeru/RWSD: {'acc': 0.6078431372549019} |
|
INFO: 2024-07-12 13:06:36,138: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [32000, 13] |
|
INFO: 2024-07-12 13:06:36,138: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 13:06:43,790: llmtf.base.darumeru/USE: Loading Dataset: 7.65s |
|
INFO: 2024-07-12 13:09:40,078: llmtf.base.darumeru/cp_sent_en: Processing Dataset: 545.13s |
|
INFO: 2024-07-12 13:09:40,098: llmtf.base.darumeru/cp_sent_en: Results for darumeru/cp_sent_en: |
|
INFO: 2024-07-12 13:09:40,103: llmtf.base.darumeru/cp_sent_en: {'symbol_per_token': 3.8994152226580563, 'len': 0.9995035620835028, 'lcs': 0.9936840637058483} |
|
INFO: 2024-07-12 13:09:40,105: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [32000, 13] |
|
INFO: 2024-07-12 13:09:40,106: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 13:09:42,693: llmtf.base.darumeru/cp_para_ru: Loading Dataset: 2.59s |
|
INFO: 2024-07-12 13:12:52,217: llmtf.base.darumeru/ruMMLU: Processing Dataset: 1396.60s |
|
INFO: 2024-07-12 13:12:52,219: llmtf.base.darumeru/ruMMLU: Results for darumeru/ruMMLU: |
|
INFO: 2024-07-12 13:12:52,228: llmtf.base.darumeru/ruMMLU: {'acc': 0.48737902823505935} |
|
INFO: 2024-07-12 13:12:52,316: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-07-12 13:12:52,364: llmtf.base.evaluator: |
|
mean 0.685 |
|
daru/treewayextractive 0.404 |
|
darumeru/MultiQ 0.536 |
|
darumeru/PARus 0.830 |
|
darumeru/RCB 0.507 |
|
darumeru/RWSD 0.608 |
|
darumeru/cp_sent_en 1.000 |
|
darumeru/cp_sent_ru 0.999 |
|
darumeru/ruMMLU 0.487 |
|
darumeru/ruOpenBookQA 0.754 |
|
darumeru/ruTiE 0.540 |
|
darumeru/ruWorldTree 0.876 |
|
INFO: 2024-07-12 13:13:11,907: llmtf.base.darumeru/USE: Processing Dataset: 388.12s |
|
INFO: 2024-07-12 13:13:11,911: llmtf.base.darumeru/USE: Results for darumeru/USE: |
|
INFO: 2024-07-12 13:13:11,929: llmtf.base.darumeru/USE: {'grade_norm': 0.11764705882352941} |
|
INFO: 2024-07-12 13:13:11,936: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [32000] |
|
INFO: 2024-07-12 13:13:11,936: llmtf.base.hfmodel: Updated generation_config.stop_strings: [] |
|
INFO: 2024-07-12 13:13:21,392: llmtf.base.nlpcoreteam/enMMLU: Processing Dataset: 1275.32s |
|
INFO: 2024-07-12 13:13:21,395: llmtf.base.nlpcoreteam/enMMLU: Results for nlpcoreteam/enMMLU: |
|
INFO: 2024-07-12 13:13:21,436: llmtf.base.nlpcoreteam/enMMLU: metric |
|
subject |
|
abstract_algebra 0.310000 |
|
anatomy 0.644444 |
|
astronomy 0.677632 |
|
business_ethics 0.650000 |
|
clinical_knowledge 0.720755 |
|
college_biology 0.763889 |
|
college_chemistry 0.480000 |
|
college_computer_science 0.570000 |
|
college_mathematics 0.400000 |
|
college_medicine 0.676301 |
|
college_physics 0.372549 |
|
computer_security 0.760000 |
|
conceptual_physics 0.591489 |
|
econometrics 0.473684 |
|
electrical_engineering 0.551724 |
|
elementary_mathematics 0.396825 |
|
formal_logic 0.492063 |
|
global_facts 0.320000 |
|
high_school_biology 0.780645 |
|
high_school_chemistry 0.487685 |
|
high_school_computer_science 0.680000 |
|
high_school_european_history 0.806061 |
|
high_school_geography 0.787879 |
|
high_school_government_and_politics 0.891192 |
|
high_school_macroeconomics 0.643590 |
|
high_school_mathematics 0.355556 |
|
high_school_microeconomics 0.663866 |
|
high_school_physics 0.364238 |
|
high_school_psychology 0.834862 |
|
high_school_statistics 0.486111 |
|
high_school_us_history 0.838235 |
|
high_school_world_history 0.835443 |
|
human_aging 0.708520 |
|
human_sexuality 0.763359 |
|
international_law 0.809917 |
|
jurisprudence 0.750000 |
|
logical_fallacies 0.791411 |
|
machine_learning 0.491071 |
|
management 0.834951 |
|
marketing 0.880342 |
|
medical_genetics 0.740000 |
|
miscellaneous 0.822478 |
|
moral_disputes 0.728324 |
|
moral_scenarios 0.271508 |
|
nutrition 0.722222 |
|
philosophy 0.710611 |
|
prehistory 0.762346 |
|
professional_accounting 0.492908 |
|
professional_law 0.481095 |
|
professional_medicine 0.713235 |
|
professional_psychology 0.638889 |
|
public_relations 0.645455 |
|
security_studies 0.742857 |
|
sociology 0.840796 |
|
us_foreign_policy 0.840000 |
|
virology 0.530120 |
|
world_religions 0.818713 |
|
INFO: 2024-07-12 13:13:21,443: llmtf.base.nlpcoreteam/enMMLU: metric |
|
subject |
|
STEM 0.528856 |
|
humanities 0.699671 |
|
other (business, health, misc.) 0.675448 |
|
social sciences 0.730536 |
|
INFO: 2024-07-12 13:13:21,460: llmtf.base.nlpcoreteam/enMMLU: {'acc': 0.6586279387925342} |
|
INFO: 2024-07-12 13:13:21,531: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-07-12 13:13:21,577: llmtf.base.evaluator: |
|
mean 0.640 |
|
daru/treewayextractive 0.404 |
|
darumeru/MultiQ 0.536 |
|
darumeru/PARus 0.830 |
|
darumeru/RCB 0.507 |
|
darumeru/RWSD 0.608 |
|
darumeru/USE 0.118 |
|
darumeru/cp_sent_en 1.000 |
|
darumeru/cp_sent_ru 0.999 |
|
darumeru/ruMMLU 0.487 |
|
darumeru/ruOpenBookQA 0.754 |
|
darumeru/ruTiE 0.540 |
|
darumeru/ruWorldTree 0.876 |
|
nlpcoreteam/enMMLU 0.659 |
|
INFO: 2024-07-12 13:13:23,597: llmtf.base.russiannlp/rucola_custom: Loading Dataset: 11.66s |
|
INFO: 2024-07-12 13:17:32,181: llmtf.base.russiannlp/rucola_custom: Processing Dataset: 248.58s |
|
INFO: 2024-07-12 13:17:32,186: llmtf.base.russiannlp/rucola_custom: Results for russiannlp/rucola_custom: |
|
INFO: 2024-07-12 13:17:32,198: llmtf.base.russiannlp/rucola_custom: {'acc': 0.736275565123789, 'mcc': 0.37026925316854403} |
|
INFO: 2024-07-12 13:17:32,210: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-07-12 13:17:32,235: llmtf.base.evaluator: |
|
mean 0.634 |
|
daru/treewayextractive 0.404 |
|
darumeru/MultiQ 0.536 |
|
darumeru/PARus 0.830 |
|
darumeru/RCB 0.507 |
|
darumeru/RWSD 0.608 |
|
darumeru/USE 0.118 |
|
darumeru/cp_sent_en 1.000 |
|
darumeru/cp_sent_ru 0.999 |
|
darumeru/ruMMLU 0.487 |
|
darumeru/ruOpenBookQA 0.754 |
|
darumeru/ruTiE 0.540 |
|
darumeru/ruWorldTree 0.876 |
|
nlpcoreteam/enMMLU 0.659 |
|
russiannlp/rucola_custom 0.553 |
|
INFO: 2024-07-12 13:22:29,666: llmtf.base.nlpcoreteam/ruMMLU: Processing Dataset: 1817.21s |
|
INFO: 2024-07-12 13:22:29,672: llmtf.base.nlpcoreteam/ruMMLU: Results for nlpcoreteam/ruMMLU: |
|
INFO: 2024-07-12 13:22:29,713: llmtf.base.nlpcoreteam/ruMMLU: metric |
|
subject |
|
abstract_algebra 0.300000 |
|
anatomy 0.392593 |
|
astronomy 0.565789 |
|
business_ethics 0.560000 |
|
clinical_knowledge 0.554717 |
|
college_biology 0.465278 |
|
college_chemistry 0.410000 |
|
college_computer_science 0.500000 |
|
college_mathematics 0.370000 |
|
college_medicine 0.560694 |
|
college_physics 0.333333 |
|
computer_security 0.580000 |
|
conceptual_physics 0.472340 |
|
econometrics 0.403509 |
|
electrical_engineering 0.503448 |
|
elementary_mathematics 0.362434 |
|
formal_logic 0.357143 |
|
global_facts 0.320000 |
|
high_school_biology 0.609677 |
|
high_school_chemistry 0.389163 |
|
high_school_computer_science 0.640000 |
|
high_school_european_history 0.672727 |
|
high_school_geography 0.671717 |
|
high_school_government_and_politics 0.652850 |
|
high_school_macroeconomics 0.515385 |
|
high_school_mathematics 0.318519 |
|
high_school_microeconomics 0.521008 |
|
high_school_physics 0.337748 |
|
high_school_psychology 0.656881 |
|
high_school_statistics 0.430556 |
|
high_school_us_history 0.725490 |
|
high_school_world_history 0.691983 |
|
human_aging 0.520179 |
|
human_sexuality 0.610687 |
|
international_law 0.710744 |
|
jurisprudence 0.592593 |
|
logical_fallacies 0.503067 |
|
machine_learning 0.446429 |
|
management 0.669903 |
|
marketing 0.735043 |
|
medical_genetics 0.540000 |
|
miscellaneous 0.607918 |
|
moral_disputes 0.580925 |
|
moral_scenarios 0.188827 |
|
nutrition 0.611111 |
|
philosophy 0.575563 |
|
prehistory 0.527778 |
|
professional_accounting 0.397163 |
|
professional_law 0.365059 |
|
professional_medicine 0.437500 |
|
professional_psychology 0.493464 |
|
public_relations 0.545455 |
|
security_studies 0.595918 |
|
sociology 0.681592 |
|
us_foreign_policy 0.680000 |
|
virology 0.433735 |
|
world_religions 0.748538 |
|
INFO: 2024-07-12 13:22:29,720: llmtf.base.nlpcoreteam/ruMMLU: metric |
|
subject |
|
STEM 0.446373 |
|
humanities 0.556957 |
|
other (business, health, misc.) 0.524325 |
|
social sciences 0.585705 |
|
INFO: 2024-07-12 13:22:29,744: llmtf.base.nlpcoreteam/ruMMLU: {'acc': 0.5283401236619901} |
|
INFO: 2024-07-12 13:22:29,827: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-07-12 13:22:29,843: llmtf.base.evaluator: |
|
mean 0.627 |
|
daru/treewayextractive 0.404 |
|
darumeru/MultiQ 0.536 |
|
darumeru/PARus 0.830 |
|
darumeru/RCB 0.507 |
|
darumeru/RWSD 0.608 |
|
darumeru/USE 0.118 |
|
darumeru/cp_sent_en 1.000 |
|
darumeru/cp_sent_ru 0.999 |
|
darumeru/ruMMLU 0.487 |
|
darumeru/ruOpenBookQA 0.754 |
|
darumeru/ruTiE 0.540 |
|
darumeru/ruWorldTree 0.876 |
|
nlpcoreteam/enMMLU 0.659 |
|
nlpcoreteam/ruMMLU 0.528 |
|
russiannlp/rucola_custom 0.553 |
|
INFO: 2024-07-12 13:24:30,400: llmtf.base.daru/treewayabstractive: Processing Dataset: 2133.53s |
|
INFO: 2024-07-12 13:24:30,406: llmtf.base.daru/treewayabstractive: Results for daru/treewayabstractive: |
|
INFO: 2024-07-12 13:24:30,411: llmtf.base.daru/treewayabstractive: {'rouge1': 0.35648633247803135, 'rouge2': 0.13258370390182936} |
|
INFO: 2024-07-12 13:24:30,414: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-07-12 13:24:30,426: llmtf.base.evaluator: |
|
mean 0.603 |
|
daru/treewayabstractive 0.245 |
|
daru/treewayextractive 0.404 |
|
darumeru/MultiQ 0.536 |
|
darumeru/PARus 0.830 |
|
darumeru/RCB 0.507 |
|
darumeru/RWSD 0.608 |
|
darumeru/USE 0.118 |
|
darumeru/cp_sent_en 1.000 |
|
darumeru/cp_sent_ru 0.999 |
|
darumeru/ruMMLU 0.487 |
|
darumeru/ruOpenBookQA 0.754 |
|
darumeru/ruTiE 0.540 |
|
darumeru/ruWorldTree 0.876 |
|
nlpcoreteam/enMMLU 0.659 |
|
nlpcoreteam/ruMMLU 0.528 |
|
russiannlp/rucola_custom 0.553 |
|
INFO: 2024-07-12 13:25:00,587: llmtf.base.darumeru/cp_para_ru: Processing Dataset: 917.89s |
|
INFO: 2024-07-12 13:25:00,603: llmtf.base.darumeru/cp_para_ru: Results for darumeru/cp_para_ru: |
|
INFO: 2024-07-12 13:25:00,607: llmtf.base.darumeru/cp_para_ru: {'symbol_per_token': 2.470731796239884, 'len': 0.9979824104186845, 'lcs': 0.959932364013627} |
|
INFO: 2024-07-12 13:25:00,609: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [32000, 13] |
|
INFO: 2024-07-12 13:25:00,609: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-07-12 13:25:03,462: llmtf.base.darumeru/cp_para_en: Loading Dataset: 2.85s |
|
INFO: 2024-07-12 13:36:47,842: llmtf.base.darumeru/cp_para_en: Processing Dataset: 704.38s |
|
INFO: 2024-07-12 13:36:47,846: llmtf.base.darumeru/cp_para_en: Results for darumeru/cp_para_en: |
|
INFO: 2024-07-12 13:36:47,850: llmtf.base.darumeru/cp_para_en: {'symbol_per_token': 3.960763996832381, 'len': 0.9995281850843424, 'lcs': 0.9811766452032213} |
|
INFO: 2024-07-12 13:36:47,852: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-07-12 13:36:47,881: llmtf.base.evaluator: |
|
mean 0.644 |
|
daru/treewayabstractive 0.245 |
|
daru/treewayextractive 0.404 |
|
darumeru/MultiQ 0.536 |
|
darumeru/PARus 0.830 |
|
darumeru/RCB 0.507 |
|
darumeru/RWSD 0.608 |
|
darumeru/USE 0.118 |
|
darumeru/cp_para_en 0.981 |
|
darumeru/cp_para_ru 0.960 |
|
darumeru/cp_sent_en 1.000 |
|
darumeru/cp_sent_ru 0.999 |
|
darumeru/ruMMLU 0.487 |
|
darumeru/ruOpenBookQA 0.754 |
|
darumeru/ruTiE 0.540 |
|
darumeru/ruWorldTree 0.876 |
|
nlpcoreteam/enMMLU 0.659 |
|
nlpcoreteam/ruMMLU 0.528 |
|
russiannlp/rucola_custom 0.553 |
|
|
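The `mean` value in the final summary table can be cross-checked by hand. The sketch below is not taken from llmtf; it assumes the mean is an unweighted average over the 18 per-task scores, and that a task logging two metrics (here MultiQ with `f1` and `em`) is summarised by their average. Both assumptions reproduce the logged numbers, but the evaluator's actual aggregation code is not shown in this log. The names `final_scores` and `unweighted_mean` are illustrative only.

```python
# Per-task scores copied from the final summary table in the log above.
final_scores = {
    "daru/treewayabstractive": 0.245,
    "daru/treewayextractive": 0.404,
    "darumeru/MultiQ": 0.536,
    "darumeru/PARus": 0.830,
    "darumeru/RCB": 0.507,
    "darumeru/RWSD": 0.608,
    "darumeru/USE": 0.118,
    "darumeru/cp_para_en": 0.981,
    "darumeru/cp_para_ru": 0.960,
    "darumeru/cp_sent_en": 1.000,
    "darumeru/cp_sent_ru": 0.999,
    "darumeru/ruMMLU": 0.487,
    "darumeru/ruOpenBookQA": 0.754,
    "darumeru/ruTiE": 0.540,
    "darumeru/ruWorldTree": 0.876,
    "nlpcoreteam/enMMLU": 0.659,
    "nlpcoreteam/ruMMLU": 0.528,
    "russiannlp/rucola_custom": 0.553,
}


def unweighted_mean(scores):
    """Arithmetic mean over tasks, each task weighted equally (assumption)."""
    return sum(scores.values()) / len(scores)


overall = unweighted_mean(final_scores)

# Assumption: the MultiQ table entry is the average of its logged f1 and em.
multiq_score = (0.5698089335328944 + 0.5019120458891013) / 2

print(f"overall mean: {overall:.3f}")
print(f"MultiQ score: {multiq_score:.3f}")
```

Because the inputs are already rounded to three decimals, the recomputed mean can differ from the evaluator's value in the last digit for other runs.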