BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution • Paper 2510.08697 • Published Oct 9, 2025
Training Language Model Agents to Find Vulnerabilities with CTF-Dojo • Paper 2508.18370 • Published Aug 25, 2025
Cyber-Zero: Training Cybersecurity Agents without Runtime • Paper 2508.00910 • Published Jul 29, 2025