RLHFlow
university
AI & ML interests
Workflow of Reinforcement Learning from Human Feedback (RLHF). Blog: https://rlhflow.github.io/
Team members: 8
RLHFlow's datasets (83), most recently updated first
RLHFlow/DS-GSM8K-Test-Result-of-Mistral-ORM • Updated Nov 8, 2024 • 1.32k • 5
RLHFlow/DS-GSM8K-Test-Result-of-DS-ORM • Updated Nov 8, 2024 • 1.32k • 6
RLHFlow/DS-MATH500-Test-Result-of-DS-ORM • Updated Nov 8, 2024 • 500 • 7
RLHFlow/Mistral-PRM-Data-No-Final-Ans • Updated Nov 6, 2024 • 273k • 6
RLHFlow/Deepseek-ORM-Data-Pairwise • Updated Nov 4, 2024 • 36k • 13 • 1
RLHFlow/Mistral-ORM-Data-Pairwise • Updated Nov 3, 2024 • 37.9k • 6
RLHFlow/Deepseek-GSM8K-Test • Updated Nov 3, 2024 • 1.32k • 9
RLHFlow/RLHFlow-SFT-Dataset-ver2 • Updated Nov 2, 2024 • 2.32M • 105 • 5
RLHFlow/Mistral-GSM8K-Test • Updated Nov 2, 2024 • 1.32k • 27
RLHFlow/ultrafeedback_all • Updated Oct 29, 2024 • 59.6k • 9
RLHFlow/Llama3-SFT-RAFT-Ultrafeedback-iter1 • Updated Sep 21, 2024 • 20k • 104
RLHFlow/ultrafeedback_iter3 • Updated Sep 19, 2024 • 19.6k • 128
RLHFlow/ultrafeedback_iter2 • Updated Sep 19, 2024 • 20k • 130
RLHFlow/ultrafeedback_iter1 • Updated Sep 19, 2024 • 20k • 144
RLHFlow/pair-preference-Skywork-80K-v0.1 • Updated Sep 9, 2024 • 82k • 5
RLHFlow/ArmoRM-Multi-Objective-Data-v0.2 • Updated Sep 7, 2024 • 555k • 10
RLHFlow/ArmoRM-Multi-Objective-Data-v0.1 • Updated Sep 7, 2024 • 569k • 72 • 3
RLHFlow/pair_data_v2_80K_wsafety_short • Updated Aug 24, 2024 • 790k • 17
RLHFlow/pair_data_v2_78_wo_safety • Updated Jul 26, 2024 • 777k • 13
RLHFlow/pair_data_v2_80K_wsafety • Updated Jul 26, 2024 • 803k • 123 • 3
RLHFlow/preference_data_v2_80K_wsafety • Updated Jul 26, 2024 • 803k • 8
RLHFlow/preference_data_v2_78K • Updated Jul 26, 2024 • 777k • 10
RLHFlow/lmsys-delete-tie-standard • Updated Jul 22, 2024 • 39.7k • 4
RLHFlow/Helpsteer2-standard • Updated Jul 22, 2024 • 8.05k • 9 • 1
RLHFlow/iterative-prompt-v1-iter9-20K • Updated Jun 12, 2024 • 19.9k • 8 • 1
RLHFlow/iterative-prompt-v1-iter8-20K • Updated Jun 12, 2024 • 20k • 8
RLHFlow/iterative-prompt-v1-iter7-20K • Updated Jun 12, 2024 • 20k • 9
RLHFlow/iterative-prompt-v1-iter6-20K • Updated Jun 12, 2024 • 20k • 8
RLHFlow/iterative-prompt-v1-iter5-20K • Updated Jun 12, 2024 • 20k • 20
RLHFlow/iterative-prompt-v1-iter4-20K • Updated Jun 12, 2024 • 20k • 20