ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers
Abstract
OpenClaw's security vulnerabilities necessitate comprehensive protection through ClawKeeper, a real-time framework implementing skill-based, plugin-based, and watcher-based security mechanisms across multiple architectural layers.
OpenClaw has rapidly established itself as a leading open-source autonomous agent runtime, offering powerful capabilities including tool integration, local file access, and shell command execution. However, these broad operational privileges introduce critical security vulnerabilities, transforming model errors into tangible system-level threats such as sensitive data leakage, privilege escalation, and malicious third-party skill execution. Existing security measures for the OpenClaw ecosystem remain highly fragmented, addressing only isolated stages of the agent lifecycle rather than providing holistic protection. To bridge this gap, we present ClawKeeper, a real-time security framework that integrates multi-dimensional protection mechanisms across three complementary architectural layers. (1) Skill-based protection operates at the instruction level, injecting structured security policies directly into the agent context to enforce environment-specific constraints and cross-platform boundaries. (2) Plugin-based protection serves as an internal runtime enforcer, providing configuration hardening, proactive threat detection, and continuous behavioral monitoring throughout the execution pipeline. (3) Watcher-based protection introduces a novel, decoupled system-level security middleware that continuously verifies agent state evolution. It enables real-time execution intervention without coupling to the agent's internal logic, supporting operations such as halting high-risk actions or enforcing human confirmation. We argue that this Watcher paradigm holds strong potential to serve as a foundational building block for securing next-generation autonomous agent systems. Extensive qualitative and quantitative evaluations demonstrate the effectiveness and robustness of ClawKeeper across diverse threat scenarios. We release our code.
Community
🛡️ ClawKeeper: The "Norton Antivirus" for OpenClaw Agents
TL;DR: We built a comprehensive, real-time security framework for OpenClaw agents that protects across the full agent lifecycle, from instruction injection to runtime execution to system-level monitoring.
Why This Matters
OpenClaw is powerful (tool integration, local file access, shell command execution), but with great power comes great attack surface. Model errors can escalate into real system-level threats: sensitive data leakage, privilege escalation, malicious third-party skill execution. Existing defenses? Fragmented and incomplete.
What We Did
ClawKeeper introduces three complementary protection layers:
- 🧠 Skill-based Protection (Instruction Level): Injects structured security policies directly into agent context, enforcing environment-specific constraints and cross-platform boundaries
- ⚙️ Plugin-based Protection (Runtime Level): Configuration hardening, proactive threat detection, and continuous behavioral monitoring throughout the execution pipeline
- 👁️ Watcher-based Protection (System Level): A novel, decoupled security middleware that monitors agent state evolution in real time, capable of halting high-risk actions or enforcing human-in-the-loop confirmation, without coupling to the agent's internal logic
We believe the Watcher paradigm has strong potential as a foundational building block for securing next-generation autonomous agent systems.
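To make the Watcher idea concrete, here is a minimal sketch of a decoupled gate that sees only a description of each agent step and returns a verdict. It is not ClawKeeper's implementation: the `AgentEvent` shape, the `Verdict` names, and both rule lists are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    CONFIRM = "confirm"  # pause and ask a human before proceeding
    HALT = "halt"        # block the action outright


@dataclass(frozen=True)
class AgentEvent:
    """One observed agent step (hypothetical shape, not an OpenClaw type)."""
    tool: str      # e.g. "shell" or "file_read"
    argument: str  # command line or path passed to the tool


# Hypothetical rules; a real deployment would load these from a policy file.
HALT_PATTERNS = ("rm -rf /", "mkfs")
CONFIRM_PATHS = ("/etc/", ".ssh/", ".env")


def watch(event: AgentEvent) -> Verdict:
    """Classify an event using only its description, never agent internals."""
    if event.tool == "shell" and any(p in event.argument for p in HALT_PATTERNS):
        return Verdict.HALT
    if any(p in event.argument for p in CONFIRM_PATHS):
        return Verdict.CONFIRM
    return Verdict.ALLOW


if __name__ == "__main__":
    for ev in (
        AgentEvent("shell", "ls -la"),
        AgentEvent("file_read", "/home/user/.ssh/id_rsa"),
        AgentEvent("shell", "rm -rf /"),
    ):
        print(ev.tool, ev.argument, "->", watch(ev).value)
```

Because `watch` takes a plain event and returns a verdict, it could run in a separate process from the agent, which is the decoupling the Watcher layer aims for. The substring matching here is deliberately crude (it would also halt `rm -rf /tmp/x`); a real policy would need proper command parsing.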
Key Results
- Built a benchmark with 140 adversarial test cases across 7 safety categories
- ClawKeeper achieved the best defense performance in every category, outperforming the existing open-source security solutions it was compared against
- 10 key security capabilities covering scanning, threat gating, anomaly detection, intent enforcement, config monitoring, auto-remediation, extension shielding, audit logging, threat intelligence, and cross-platform security
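As an illustration of how per-category results on a benchmark like this could be tallied, the sketch below computes a defense rate (the fraction of adversarial cases blocked) for each category. The post does not state which metric the paper uses, so the metric, the function name, the category labels, and the sample outcomes are all assumptions and not data from the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def defense_rate_by_category(
    results: Iterable[Tuple[str, bool]],
) -> Dict[str, float]:
    """Return the fraction of defended cases per category.

    Each item in `results` is (category, defended), where `defended` is
    True when the attack in that test case was blocked.
    """
    totals: Dict[str, int] = defaultdict(int)
    defended: Dict[str, int] = defaultdict(int)
    for category, ok in results:
        totals[category] += 1
        defended[category] += int(ok)
    return {c: defended[c] / totals[c] for c in totals}


# Made-up outcomes, for illustration only.
sample = [
    ("data_leakage", True),
    ("data_leakage", False),
    ("privilege_escalation", True),
]
rates = defense_rate_by_category(sample)
print(rates)  # {'data_leakage': 0.5, 'privilege_escalation': 1.0}
```

Reporting one rate per category, and not a single pooled number, keeps a weak category from being hidden by strong ones.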
Resources
- Paper: arXiv:2603.24414
- Hugging Face Page: xunyoyo/clawkeeper
- Code: GitHub (MIT License)
As agents get more autonomous, safety can't be an afterthought. We hope ClawKeeper sparks more discussion on holistic agent security; feedback and contributions are very welcome!
Interesting breakdown of this paper on arXivLens: https://arxivlens.com/PaperView/Details/clawkeeper-comprehensive-safety-protection-for-openclaw-agents-through-skills-plugins-and-watchers-2089-eb883c52
Covers the executive summary, detailed methodology, and practical applications.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Taming OpenClaw: Security Analysis and Mitigation of Autonomous LLM Agent Threats (2026)
- Uncovering Security Threats and Architecting Defenses in Autonomous Agents: A Case Study of OpenClaw (2026)
- Trojan's Whisper: Stealthy Manipulation of OpenClaw through Injected Bootstrapped Guidance (2026)
- Defensible Design for OpenClaw: Securing Autonomous Tool-Invoking Agents (2026)
- AgenticCyOps: Securing Multi-Agentic AI Integration in Enterprise Cyber Operations (2026)
- Agentic AI as a Cybersecurity Attack Surface: Threats, Exploits, and Defenses in Runtime Supply Chains (2026)
- SkillProbe: Security Auditing for Emerging Agent Skill Marketplaces via Multi-Agent Collaboration (2026)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any paper on Hugging Face, check out this Space
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend
agent safety doesn't get enough attention. most of the discourse is about capabilities and alignment but the plugin/skill attack surface for agents is real and underexplored. clawkeeper's approach to this is interesting. writeup here https://arxivexplained.com/papers/clawkeeper-comprehensive-safety-protection-for-openclaw-agents-through-skills-plugins-and-watchers