Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO • Paper • 2511.13288 • Published 19 days ago • 17
Learning to Align, Aligning to Learn: A Unified Approach for Self-Optimized Alignment • Paper • 2508.07750 • Published Aug 11 • 19