amirabdullah19852020/gpt-neo-125m_utility_reward Reinforcement Learning • Updated Feb 10, 2024 • 3
amirabdullah19852020/pythia-70m_sentiment_reward Reinforcement Learning • Updated Feb 10, 2024 • 9
amirabdullah19852020/pythia-160m_sentiment_reward Reinforcement Learning • Updated Feb 10, 2024 • 7
amirabdullah19852020/gpt-neo-125m_sentiment_reward Reinforcement Learning • Updated Feb 10, 2024 • 2
amirabdullah19852020/pythia-160m_utility_reward Reinforcement Learning • Updated Feb 10, 2024 • 8
amirabdullah19852020/pythia-70m_utility_reward Reinforcement Learning • 0.1B • Updated Feb 10, 2024 • 19
amirabdullah19852020/gpt-j-6b-sharded-bf16_sentiment_reward Reinforcement Learning • Updated Sep 23, 2023
amirabdullah19852020/pythia-410m_utility_reward Reinforcement Learning • Updated Sep 21, 2023 • 6
amirabdullah19852020/pythia-410m_sentiment_reward Reinforcement Learning • Updated Sep 19, 2023 • 7
amirabdullah19852020/pythia_70m_ppo_imdb_sentiment_with_checkpoints Reinforcement Learning • Updated Jul 16, 2023 • 16
amirabdullah19852020/pythia_70m_ppo_imdb_sentiment_v3 Reinforcement Learning • Updated Jul 16, 2023 • 11
amirabdullah19852020/pythia_70m_ppo_imdb_sentiment_v2 Reinforcement Learning • Updated Jul 15, 2023 • 14
amirabdullah19852020/pythia_70m_ppo_imdb_sentiment Reinforcement Learning • Updated Jul 15, 2023 • 13