amirabdullah19852020/pythia-70m_utility_reward Reinforcement Learning • 0.1B • Updated Feb 10, 2024 • 11
amirabdullah19852020/gpt-j-6b-sharded-bf16_sentiment_reward Reinforcement Learning • Updated Sep 23, 2023