RLHF-And-Friends
community
models 27
RLHF-And-Friends/RM-TLDR-TLDR-Qwen2-0.5B-SmallSFT-lr-1e-5 • Text Classification • 0.5B
RLHF-And-Friends/RM-TLDR-TLDR-Qwen2-0.5B-SmallSFT • Text Classification • 0.5B
RLHF-And-Friends/TLDR-Qwen2-0.5B-SmallSFT • Text Generation • 0.5B • 5
RLHF-And-Friends/TLDR-Llama-3.2-1B-SmallSFT-RM • Text Classification • 1B
RLHF-And-Friends/TLDR-Llama-3.2-1B-SmallSFT • Text Generation • 1B • 3
RLHF-And-Friends/Wiki-Lingua-Llama-3.2-3B-RM • Text Classification • 3B • 1
RLHF-And-Friends/TLDR-Llama-3.2-3B-SmallSFT-RM • Text Classification • 3B • 1
RLHF-And-Friends/TLDR-Llama-3.2-3B-SmallSFT-RM-lr-1e-5 • Text Classification • 3B
RLHF-And-Friends/TLDR-Llama-3.2-3B-SmallSFT-lr-1e-5 • Text Generation • 3B • 1
RLHF-And-Friends/TLDR-Llama-3.2-3B-SmallSFT • Text Generation • 3B
datasets 13
RLHF-And-Friends/alpaca-cleaned • 51.8k • 4
RLHF-And-Friends/tldr-thematic • 130k • 148
RLHF-And-Friends/wiki-lingua-ppo • 493k • 2
RLHF-And-Friends/wiki-lingua-reward • 77k • 2
RLHF-And-Friends/wiki-lingua-preference • 77k • 9
RLHF-And-Friends/wiki-lingua-paired • 77k • 2
RLHF-And-Friends/wiki-lingua • 742k • 18
RLHF-And-Friends/helpsteer3-multilingual • 8.06k • 7
RLHF-And-Friends/helpsteer3-code • 8.86k • 27 • 2
RLHF-And-Friends/tldr-ppo • 113k • 2