Data artifacts related to the paper "ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning".
AI & ML interests
Natural language processing group at Columbia University
Organization Card
Columbia University - NLP
models 20
Columbia-NLP/LION-Gemma-2b-sft-v1.0 • Text Generation • 3B • 5
Columbia-NLP/LION-Gemma-2b-dpo-v1.0 • Text Generation • 3B • 6
Columbia-NLP/LION-Gemma-2b-odpo-v1.0 • Text Generation • 3B • 5 • 4
Columbia-NLP/LION-LLaMA-3-8b-sft-v1.0 • Text Generation • 8B • 9
Columbia-NLP/LION-LLaMA-3-8b-dpo-v1.0 • Text Generation • 8B • 10 • 2
Columbia-NLP/LION-LLaMA-3-8b-odpo-v1.0 • Text Generation • 8B • 8 • 2
Columbia-NLP/llama3-8b-instruct-rewriting-r-Decor • Text Generation • 8B
Columbia-NLP/llama3-8b-instruct-rewriting-nr-Decor • Text Generation • 8B
Columbia-NLP/llama2-7b-rewriting-r-Decor • Text Generation • 7B
Columbia-NLP/llama2-7b-rewriting-nr-Decor • Text Generation • 7B
datasets 19
Columbia-NLP/PUPA • 901 • 2.13k • 2
Columbia-NLP/ExACT-VWA • 176 • 4
Columbia-NLP/DPO-hh-rlhf • 169k • 49
Columbia-NLP/DPO-PKU-SafeRLHF • 136k • 26 • 2
Columbia-NLP/DPO-HelpSteer • 9.17k • 22
Columbia-NLP/DPO-tldr-summarisation-preferences • 177k • 48 • 1
Columbia-NLP/DPO-py-dpo-v0.1 • 9.47k • 6
Columbia-NLP/DPO-UltraFeedback_binarized • 62.7k • 15
Columbia-NLP/DPO-distilabel-intel-orca-dpo-pairs_cleaned • 12.8k • 9
Columbia-NLP/DPO-distilabel-capybara-dpo-7k-binarized • 7.56k • 6