kaiwenw
·
AI & ML interests
Reinforcement Learning
Organizations
kaiwenw/dec9_sp1_pref_jdpo_all_reject_first • 4.64k • 1
kaiwenw/dec9_sp1_pref_jdpo_all_chosen_first • 3.39k • 4
kaiwenw/dec9_sp1_pref_jdpo • 7.64k • 3
kaiwenw/dec9_sp1_pref_jdpo_n_5_temp_0.9 • 7.29k • 3
3.64k • 3
kaiwenw/dec8_aft_pref_judge_actor_temp_0.9_5_responses • 3.64k • 10
kaiwenw/dec7_aft_pref_judge_temp_0.9 • 20 • 5
kaiwenw/dec7_aft_llama8b_1.1 • 3.64k • 2
kaiwenw/dec7_aft_llama8b_1.0 • 3.64k • 7
kaiwenw/dec7_aft_llama8b_0.9 • 3.64k • 3
kaiwenw/nov18_oasst_pref_jdpo_llama8b_0.9_chosen_75_reject_25 • 8.71k • 1
kaiwenw/nov18_oasst_pref_jdpo_llama8b_0.9_chosen_25_reject_75 • 8.71k • 1
kaiwenw/nov18_oasst_pref_jdpo_llama8b_0.9_chosen_50_reject_50 • 12.7k • 2
kaiwenw/nov18_oasst_pref_jdpo_llama8b_0.9_all_chosen_first • 6.39k • 1
kaiwenw/nov18_oasst_pref_jdpo_llama8b_0.9_all_reject_first • 7.95k • 1
kaiwenw/nov18_oasst_pref_jdpo_llama8b_0.9 • 14.3k • 1
kaiwenw/nov18_oasst_pref_jdpo_llama8b_0.9_n_9_temp_0.9 • 14.7k • 3
kaiwenw/nov18_oasst_mini_pref_jdpo_llama8b_1.0 • 938 • 1
kaiwenw/nov18_oasst_mini_pref_jdpo_llama8b_1.0_n_9_temp_1.0 • 790 • 1
kaiwenw/nov14_oasst_pref_jdpo_llama8b • 17.7k • 2
kaiwenw/nov14_oasst_pref_jdpo_llama8b_9_judges • 14.7k • 1
kaiwenw/nov13_oasst_pref_jdpo_llama70b • 5.21k • 3
kaiwenw/nov13_oasst_pref_jdpo_llama70b_9_judges • 14.7k • 10
kaiwenw/nov13_oasst_mini_pref_jdpo_llama70b • 302 • 2
kaiwenw/nov13_oasst_mini_pref_jdpo_llama70b_9_judges • 790 • 2
kaiwenw/nov12_oasst_pref_jdpo_llama70b • 2.61k • 3
kaiwenw/nov12_oasst_pref_jdpo_llama70b_9_judges • 14.7k • 1
kaiwenw/nov12_oasst_mini_pref_jdpo_llama70b_A1_try2 • 139 • 1
kaiwenw/nov12_oasst_mini_pref_jdpo_llama70b_A1_try2_9_judges • 790 • 1
kaiwenw/nov11_oasst_pref_jdpo_gpt4o • 1.8k