arxiv:2603.24414

ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers

Published on Mar 25

· Submitted by

Chenxu Wang on Apr 2

Upvote

169

Authors:

Chenxu Wang ,

Abstract

OpenClaw's security vulnerabilities necessitate comprehensive protection through ClawKeeper, a real-time framework implementing skill-based, plugin-based, and watcher-based security mechanisms across multiple architectural layers.

AI-generated summary

OpenClaw has rapidly established itself as a leading open-source autonomous agent runtime, offering powerful capabilities including tool integration, local file access, and shell command execution. However, these broad operational privileges introduce critical security vulnerabilities, transforming model errors into tangible system-level threats such as sensitive data leakage, privilege escalation, and malicious third-party skill execution. Existing security measures for the OpenClaw ecosystem remain highly fragmented, addressing only isolated stages of the agent lifecycle rather than providing holistic protection. To bridge this gap, we present ClawKeeper, a real-time security framework that integrates multi-dimensional protection mechanisms across three complementary architectural layers. (1) Skill-based protection operates at the instruction level, injecting structured security policies directly into the agent context to enforce environment-specific constraints and cross-platform boundaries. (2) Plugin-based protection serves as an internal runtime enforcer, providing configuration hardening, proactive threat detection, and continuous behavioral monitoring throughout the execution pipeline. (3) Watcher-based protection introduces a novel, decoupled system-level security middleware that continuously verifies agent state evolution. It enables real-time execution intervention without coupling to the agent's internal logic, supporting operations such as halting high-risk actions or enforcing human confirmation. We argue that this Watcher paradigm holds strong potential to serve as a foundational building block for securing next-generation autonomous agent systems. Extensive qualitative and quantitative evaluations demonstrate the effectiveness and robustness of ClawKeeper across diverse threat scenarios. We release our code.

View arXiv page View PDF Project page GitHub 290 Add to collection

Community

xunyoyo

Paper author Paper submitter 1 day ago

🛡️ ClawKeeper: The "Norton Antivirus" for OpenClaw Agents

TL;DR: We built a comprehensive, real-time security framework for OpenClaw agents that protects across the full agent lifecycle — from instruction injection to runtime execution to system-level monitoring.

Why This Matters

OpenClaw is powerful — tool integration, local file access, shell command execution — but with great power comes great attack surface. Model errors can escalate into real system-level threats: sensitive data leakage, privilege escalation, malicious third-party skill execution. Existing defenses? Fragmented and incomplete.

What We Did

ClawKeeper introduces three complementary protection layers:

🧠 Skill-based Protection (Instruction Level) — Injects structured security policies directly into agent context, enforcing environment-specific constraints and cross-platform boundaries
⚙️ Plugin-based Protection (Runtime Level) — Configuration hardening, proactive threat detection, and continuous behavioral monitoring throughout the execution pipeline
👁️ Watcher-based Protection (System Level) — A novel, decoupled security middleware that monitors agent state evolution in real-time, capable of halting high-risk actions or enforcing human-in-the-loop confirmation — without coupling to the agent's internal logic

We believe the Watcher paradigm has strong potential as a foundational building block for securing next-generation autonomous agent systems.

Key Results

📊 Built a benchmark with 140 adversarial test cases across 7 safety categories
🏆 ClawKeeper achieved optimal defense performance across ALL categories, outperforming existing open-source security solutions
10 key security capabilities covering scanning, threat gating, anomaly detection, intent enforcement, config monitoring, auto-remediation, extension shielding, audit logging, threat intelligence, and cross-platform security

Resources

📄 Paper: arXiv:2603.24414
💾 Huggingface Page: xunyoyo/clawkeeper
🔧 Code: GitHub (MIT License)

As agents get more autonomous, safety can't be an afterthought. We hope ClawKeeper sparks more discussion on holistic agent security — feedback and contributions are very welcome! 🙌

avahal

1 day ago

Interesting breakdown of this paper on arXivLens: https://arxivlens.com/PaperView/Details/clawkeeper-comprehensive-safety-protection-for-openclaw-agents-through-skills-plugins-and-watchers-2089-eb883c52
Covers the executive summary, detailed methodology, and practical applications.

librarian-bot

about 14 hours ago

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any Paper on Hugging Face checkout this Space

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend

grantsing

about 11 hours ago

agent safety doesnt get enough attention. most of the discourse is about capabilities and alignment but the plugin/skill attack surface for agents is real and underexplored. clawkeeper's approach to this is interesting. writeup here https://arxivexplained.com/papers/clawkeeper-comprehensive-safety-protection-for-openclaw-agents-through-skills-plugins-and-watchers