TPO

community

AI & ML interests

Alignment, Preference Optimization, RLHF

Papers

Triple Preference Optimization: Achieving Better Alignment with Less Data in a Single Step Optimization

tpo-alignment's models (11)

tpo-alignment/Instruct-Llama-3-8B-TPO-L-y2

8B • Updated Feb 19, 2025 • 6

tpo-alignment/Instruct-Llama-3-8B-TPO-y2

8B • Updated Feb 19, 2025 • 2

tpo-alignment/Instruct-Llama-3-8B-TPO-y4

8B • Updated Feb 19, 2025 • 1

tpo-alignment/Instruct-Llama-3-8B-TPO-y3

8B • Updated Feb 19, 2025 • 11

tpo-alignment/Mistral-Instruct-7B-TPO-y2-v0.2

7B • Updated Feb 19, 2025 • 3

tpo-alignment/Mistral-Instruct-7B-TPO-y2-v0.1

7B • Updated Feb 19, 2025 • 4

tpo-alignment/Mistral-Instruct-7B-TPO-y4

7B • Updated Feb 19, 2025 • 3

tpo-alignment/Mistral-Instruct-7B-TPO-y3

7B • Updated Feb 19, 2025 • 3

tpo-alignment/Llama-3-8B-TPO-L-40k

8B • Updated Feb 19, 2025

tpo-alignment/Mistral-7B-TPO-40k

7B • Updated Feb 19, 2025 • 3

tpo-alignment/Llama-3-8B-TPO-40k

8B • Updated Feb 19, 2025