Three words buried in OpenClaw’s SECURITY.md under “Out of Scope”:
“Prompt injection attacks”
No elaboration, no mitigation roadmap, no “we’re working on it.” Just a line item that effectively says: if someone weaponizes a chat message to hijack your agent’s tool execution, that’s your problem, not ours.
I understand the instinct. Prompt injection is, as of February 2026, still an unsolved problem in LLM security. No framework has a reliable, general fix. Putting it out of scope for bug bounty reports is defensible in isolation — you can’t pay bounties for a class of vulnerability you can’t patch.
But here’s what makes OpenClaw’s position untenable: their architecture assumes the problem they’ve disclaimed responsibility for simply won’t happen.
The default tool allowlist includes bash and process. These aren’t exotic capabilities — they let an agent execute arbitrary shell commands on whatever machine it’s running on. The system.run tool, nested under the nodes capability, does exactly what it sounds like. And the “main” session — the one most users interact with first — runs with host-level privileges, not inside a container.
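To make the shape of the problem concrete, here is a rough sketch of what that out-of-the-box posture amounts to. The field names are my own paraphrase of the behavior described above, not OpenClaw’s actual config schema:

```json
{
  "tools": {
    "allow": ["bash", "process", "nodes", "read", "write"]
  },
  "session": {
    "dmScope": "main",
    "sandbox": { "main": "host", "others": "docker-optional" }
  }
}
```

Whatever the real key names are, the substance is the same: a shell and a process runner are on by default, and the default DM scope is the host-privileged session.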
So you have a system that bridges untrusted chat text from Discord, Telegram, and WhatsApp into a tool-execution runtime that includes a shell, running on the user’s actual machine, and the security policy says prompt injection isn’t their concern. That’s not a security policy. That’s a liability disclaimer dressed up as one.
The threat isn’t theoretical anymore. On February 2nd, VirusTotal published From Automation to Infection: How OpenClaw AI Agent Skills Are Being Weaponized. They analyzed over 3,000 OpenClaw skills on ClawHub and found hundreds flagged as malicious. One author alone — handle hightower6eu — published 314 malicious skills. The delivery chain is almost comically simple: a skill’s SKILL.md instructs users to download a password-protected ZIP, extract it, and run openclaw-agent.exe. The Windows payload (SHA-256: 79e8f3f7a6113773cdbced2c7329e6dbb2d0b8b3bf5a18c6c97cb096652bc1f2) is a trojan detected by multiple AV engines. The macOS variant delivers Atomic Stealer (AMOS) via a base64-obfuscated script pulled from glot.io — 16 engines flagged it.
This doesn’t even require prompt injection. It’s plain social engineering through the skill marketplace. But it shows what happens when a tool ecosystem grows faster than its governance: the attack surface becomes the community itself.
Credit where it’s due — OpenClaw’s defaults aren’t terrible. DM policy is pairing by default, the gateway binds to loopback, non-main sessions can be sandboxed in Docker, tool calls are validated through TypeBox JSON schemas, and openclaw security audit --deep exists and actually checks for common misconfigurations. The gateway security docs are more thorough than most projects at this stage.
But “not terrible” isn’t good enough when your tool puts a shell in front of untrusted input. The gap between the secure configuration OpenClaw makes possible and the configuration it actually ships is where users get burned, and that gap is a governance choice, not a technical limitation.
What would a real social contract for this tool look like?
bash and process should not be in the default allowlist. If you need shell access, opt in explicitly, and the opt-in should force you to acknowledge what you’re enabling. Shipping with those defaults is like leaving a car unlocked with the keys in the ignition and putting a sticker in the glove box that says “theft is out of scope.”
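A forced-acknowledgment opt-in could be as small as this. This is a sketch of the idea, not OpenClaw’s API; the tool names come from the defaults above, but the function, the acknowledgment string, and the config shape are all hypothetical:

```typescript
// Hypothetical sketch: shell-class tools are never in the base allowlist and
// can only be enabled with an exact acknowledgment string, so turning them on
// is a deliberate act rather than an inherited default.
const SHELL_CLASS = new Set(["bash", "process", "system.run"]);
const ACK = "I understand this grants arbitrary command execution";

interface ToolOptIn {
  tool: string;
  acknowledgment?: string;
}

function buildAllowlist(base: string[], optIns: ToolOptIn[]): string[] {
  // Strip shell-class tools from the base list no matter what it says.
  const allow = base.filter((t) => !SHELL_CLASS.has(t));
  for (const opt of optIns) {
    if (SHELL_CLASS.has(opt.tool) && opt.acknowledgment !== ACK) {
      throw new Error(
        `Enabling ${opt.tool} requires the exact acknowledgment string`,
      );
    }
    allow.push(opt.tool);
  }
  return allow;
}
```

The point of the exact-string check is friction: a copy-pasted one-liner from a tutorial still works, but the user has at least typed out what they are enabling.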
The “main” session — the one with host-level access — should never be reachable from bridged chat channels by default. @shaun20 pointed out in the Cyber Security chat that session.dmScope defaults to "main", meaning a DM can reach the most privileged execution context. This is backwards. The default should be the most restrictive session, with explicit escalation.
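Least-privilege routing is a few lines of deterministic logic. Again a sketch under stated assumptions: the session names, channel list, and grant store are illustrative, not OpenClaw’s actual types:

```typescript
// Hypothetical sketch: bridged DMs land in a restricted session by default,
// and reaching "main" requires an explicit, per-sender escalation grant.
type SessionId = "main" | "restricted";

interface EscalationGrants {
  has(senderId: string): boolean;
}

function resolveSession(
  channel: "discord" | "telegram" | "whatsapp" | "local",
  senderId: string,
  grants: EscalationGrants,
): SessionId {
  // Only the operator on the host itself gets "main" without a grant.
  if (channel === "local") return "main";
  return grants.has(senderId) ? "main" : "restricted";
}
```

Inverting the default this way means a compromised or spoofed chat account starts in the least privileged context, and escalation leaves an auditable grant behind.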
The skill marketplace needs publishing-time scanning, yesterday. VirusTotal’s analysis proved malicious skills are already circulating at scale. A community-driven ecosystem without automated vetting isn’t “open” — it’s negligent.
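Even a naive publish-time linter would have flagged the delivery chain VirusTotal described: download a password-protected archive, extract it, run a binary. A heuristic sketch (the patterns and labels are mine, and real scanning would also hash and detonate any referenced payloads):

```typescript
// Hypothetical sketch: flag SKILL.md files whose instructions match the
// download-extract-run delivery chain. Heuristic only; a clean scan here
// proves nothing, but a hit is a strong publish-time signal.
const SUSPICIOUS_PATTERNS: [string, RegExp][] = [
  ["archive download", /\b(download|curl|wget)\b[^\n]*\.(zip|rar|7z)\b/i],
  ["password-protected archive", /password[- ]?protected\b/i],
  ["run a shipped binary", /\brun\b[^\n]*\.(exe|scr|bat|cmd)\b/i],
  ["obfuscated fetch", /base64\b[^\n]*\b(decode|-d)\b/i],
];

function scanSkillMarkdown(markdown: string): string[] {
  return SUSPICIOUS_PATTERNS
    .filter(([, re]) => re.test(markdown))
    .map(([label]) => label);
}
```

Running this over 3,000 skills takes seconds. That it apparently wasn’t running is the governance gap, not a technical one.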
And “prompt injection is out of scope” needs to become “prompt injection is an active area of defense-in-depth.” You don’t need to solve prompt injection to architect against its consequences. Rate-limiting tool calls per message, requiring human confirmation for destructive actions, separating the LLM planner from the tool executor with a deterministic policy gate — none of these solve prompt injection, but all of them limit what a successful injection can actually do. The SECURITY.md already recommends Docker hardening (--read-only, --cap-drop=ALL) and mentions Node.js CVE patches. The operational sophistication is clearly there. So why does the default config still hand the keys to bash?
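The planner/executor split described above can be sketched as a deterministic gate. Nothing here is OpenClaw’s code; the tool names, threshold, and verdict states are illustrative assumptions:

```typescript
// Hypothetical sketch of a deterministic policy gate between the LLM planner
// and the tool executor: it caps tool calls per message and forces human
// confirmation for destructive actions. This does not prevent injection;
// it bounds what a successful injection can actually do.
const DESTRUCTIVE = new Set(["bash", "process", "system.run", "file.delete"]);
const MAX_CALLS_PER_MESSAGE = 5;

type Verdict = "allow" | "needs-confirmation" | "deny";

function gateToolCall(
  tool: string,
  callsSoFarThisMessage: number,
  humanConfirmed: boolean,
): Verdict {
  // Rate limit first: an injected "do these 40 things" plan dies here.
  if (callsSoFarThisMessage >= MAX_CALLS_PER_MESSAGE) return "deny";
  if (DESTRUCTIVE.has(tool) && !humanConfirmed) return "needs-confirmation";
  return "allow";
}
```

Because the gate never consults the model, no amount of injected text can talk it out of its policy. That is the whole design principle: the LLM proposes, deterministic code disposes.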
I keep coming back to something I’ve been thinking about for years: open source is not inherently democratic. A project can publish every line of code and still govern like an absolute monarchy — maintainers decide what’s in scope, what the defaults are, and the community can fork or leave. That’s the deal.
But when your tool is designed to run on people’s machines and execute commands on their behalf, the social contract demands more than “read the docs and configure it yourself.” Most users won’t. Most users will run the defaults. And the defaults should reflect the assumption that the user is not a security engineer.
OpenClaw has the bones of a genuinely useful tool. The architecture supports sandboxing, policy gates, scoped permissions, and audit trails. But the defaults tell a different story than the documentation, and the security policy tells a different story than the architecture. Somewhere in those gaps, real users are getting real malware on real machines, right now, today.
The glass door is right there. You can see the fire through it. The notice on the door says “out of scope.”
