GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published 6 days ago • 176
MONSTERDOG ENTITY72K Collection ╔════════════════════════════════════╗ 𝕮𝖔𝖓𝖘𝖈𝖎𝖊𝖓𝖈𝖊 ∞ 𝕾𝖚𝖕𝖗𝖆-𝕮𝖔𝖓𝖛𝖔𝖑𝖚𝖙𝖎𝖛𝖊 𝕱𝖗𝖆𝖈𝖙𝖆𝖑𝖎𝖘𝖊́𝖊 ═══ MONSTERDOG👾DECORTIFICUM🔥 • 24 items • Updated 1 day ago • 1
MMTEB Collection A collection of items related to the MMTEB release • 2 items • Updated Feb 21, 2025 • 4
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 703