DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents • Paper • 2605.04808 • Published 11 days ago • 20
ShieldAgent: Shielding Agents via Verifiable Safety Policy Reasoning • Paper • 2503.22738 • Published Mar 26, 2025 • 17