---
title: RLHF Pairwise Annotation Demo
emoji: 🎯
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
datasets:
- openbmb/UltraFeedback
---
# 🎯 AI Alignment: Binary Preference Annotation
This app simulates the data annotation process used in RLHF (Reinforcement Learning from Human Feedback) training. Users compare two AI completions and select which one is better.
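Each comparison ultimately becomes a single preference record. Below is a minimal sketch of what one collected annotation might look like as a Python dict; the field names are illustrative assumptions, not this app's exact schema:

```python
# Illustrative shape of one pairwise preference annotation.
# Field names are assumptions, not the app's actual schema.
annotation = {
    "prompt": "Explain photosynthesis to a 10-year-old.",
    "completion_a": "Plants are like tiny chefs that cook their own food using sunlight...",
    "completion_b": "Photosynthesis is the process by which chloroplasts convert light energy...",
    "preference": "a",  # "a", "b", or "skip"
}
```

Records like this are what reward-model training pipelines typically consume as (chosen, rejected) pairs.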
## How it works
1. The app loads random examples from the UltraFeedback dataset
2. Users see a prompt and two AI completions
3. Users select which completion is better, or skip if unsure
4. All annotations are saved to a public dataset for research purposes (see the sketch below)
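
The sketch below shows one way steps 1–4 could be wired together with Gradio and the `datasets` library. It is a minimal illustration, not this Space's actual `app.py`: the UltraFeedback field names (`instruction`, `completions`, `response`), the local `annotations.jsonl` file, and the helper functions are all assumptions.

```python
import json
import random
import uuid
from pathlib import Path

import gradio as gr
from datasets import load_dataset

# Step 1: stream a small slice of UltraFeedback and keep prompts
# that have at least two completions to compare.
stream = load_dataset("openbmb/UltraFeedback", split="train", streaming=True)
examples = [ex for _, ex in zip(range(200), stream) if len(ex["completions"]) >= 2]

def sample_pair():
    """Step 2: pick a random prompt and two of its completions."""
    ex = random.choice(examples)
    a, b = random.sample(ex["completions"], 2)
    return ex["instruction"], a["response"], b["response"]

def record_choice(prompt, comp_a, comp_b, choice):
    """Steps 3-4: append the selection to a local file, then serve the next pair."""
    record = {"id": str(uuid.uuid4()), "prompt": prompt,
              "completion_a": comp_a, "completion_b": comp_b, "preference": choice}
    with Path("annotations.jsonl").open("a") as f:
        f.write(json.dumps(record) + "\n")
    return sample_pair()

with gr.Blocks() as demo:
    prompt_box = gr.Textbox(label="Prompt", interactive=False)
    comp_a_box = gr.Textbox(label="Completion A", interactive=False)
    comp_b_box = gr.Textbox(label="Completion B", interactive=False)
    with gr.Row():
        btn_a = gr.Button("A is better")
        btn_b = gr.Button("B is better")
        btn_skip = gr.Button("Skip")
    boxes = [prompt_box, comp_a_box, comp_b_box]
    btn_a.click(lambda p, a, b: record_choice(p, a, b, "a"), inputs=boxes, outputs=boxes)
    btn_b.click(lambda p, a, b: record_choice(p, a, b, "b"), inputs=boxes, outputs=boxes)
    btn_skip.click(lambda p, a, b: record_choice(p, a, b, "skip"), inputs=boxes, outputs=boxes)
    demo.load(sample_pair, outputs=boxes)

if __name__ == "__main__":
    demo.launch()
```

To publish annotations as a public dataset (step 4), a Space would typically sync `annotations.jsonl` to a Hub dataset repository, for example with `huggingface_hub.CommitScheduler`.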