---
title: RLHF Pairwise Annotation Demo
emoji: 🎯
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
datasets:
- openbmb/UltraFeedback
---


# 🎯 AI Alignment: Binary Preference Annotation

This app simulates the pairwise preference annotation used to build RLHF (Reinforcement Learning from Human Feedback) training data. Annotators compare two AI completions of the same prompt and select which one is better.

## How it works

1. The app loads random examples from the UltraFeedback dataset
2. Users see a prompt and two AI completions
3. Users select which completion is better or skip if unsure
4. All annotations are saved to a public dataset for research purposes (a minimal sketch of this flow is shown below)
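
## Example sketch

The hosted app lives in `app.py`; the sketch below shows one way this flow could be wired up with Gradio and the `datasets` library. It is a minimal sketch, not the actual implementation: the UltraFeedback field names (`instruction`, `completions[i]["response"]`) are assumed from the dataset card, and annotations are appended to a local JSONL file rather than pushed directly to a public dataset.

```python
import json
import random

import gradio as gr
from datasets import load_dataset

# Stream a small slice of UltraFeedback so the demo starts quickly.
stream = load_dataset("openbmb/UltraFeedback", split="train", streaming=True)
examples = [ex for _, ex in zip(range(200), stream) if len(ex["completions"]) >= 2]


def sample_pair():
    """Pick a random prompt and two of its completions."""
    ex = random.choice(examples)
    a, b = random.sample(ex["completions"], 2)
    return ex["instruction"], a["response"], b["response"]


def refresh():
    """Serve a fresh prompt/completion pair to the UI."""
    prompt, comp_a, comp_b = sample_pair()
    return (prompt, comp_a, comp_b), f"**Prompt:** {prompt}", comp_a, comp_b


def annotate(pair, choice):
    """Record the annotator's choice (None = skip), then load the next pair."""
    if pair is not None and choice is not None:
        prompt, comp_a, comp_b = pair
        with open("annotations.jsonl", "a") as f:
            f.write(json.dumps({
                "prompt": prompt,
                "completion_a": comp_a,
                "completion_b": comp_b,
                "preferred": choice,
            }) + "\n")
    return refresh()


with gr.Blocks() as demo:
    gr.Markdown("# 🎯 Which completion is better?")
    current = gr.State()  # holds the (prompt, completion_a, completion_b) being judged
    prompt_md = gr.Markdown()
    with gr.Row():
        box_a = gr.Textbox(label="Completion A", lines=8, interactive=False)
        box_b = gr.Textbox(label="Completion B", lines=8, interactive=False)
    with gr.Row():
        btn_a = gr.Button("A is better")
        btn_b = gr.Button("B is better")
        btn_skip = gr.Button("Skip")

    outputs = [current, prompt_md, box_a, box_b]
    demo.load(refresh, outputs=outputs)
    btn_a.click(lambda pair: annotate(pair, "A"), inputs=current, outputs=outputs)
    btn_b.click(lambda pair: annotate(pair, "B"), inputs=current, outputs=outputs)
    btn_skip.click(lambda pair: annotate(pair, None), inputs=current, outputs=outputs)

demo.launch()
```

On a Space, the local JSONL file would typically be synced to a public Hub dataset, for example with `huggingface_hub`'s `CommitScheduler`, which periodically commits a local folder to a dataset repository.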