From ClawHub to Malware: The Supply Chain Risks of AI Agent Skills

Update (2026-02-20): Since this post was published, OpenClaw has partnered with VirusTotal to scan all skills published to ClawHub (live as of February 7, 2026). Publisher identity verification is now required. However, cryptographic code signing for skills has not been implemented, and the VirusTotal scanning catches known malware signatures but may miss novel payloads, obfuscated code, and prompt injection techniques. The structural protections described in this post (container isolation, read-only filesystem, network segmentation) remain essential as defense-in-depth.

Are AI agent skills the new npm security problem?

Software supply chain attacks are not new. The JavaScript ecosystem has dealt with typosquatted npm packages for years. Python’s PyPI has seen credential-stealing packages disguised as popular libraries. The pattern is well understood: an attacker publishes a malicious package with a name similar to a trusted one, developers install it without verifying, and the attacker gains access to their environment.

AI agent skills follow the same pattern, but the consequences are far more severe. An npm package typically runs during a build step or inside a sandboxed application. A malicious AI skill runs inside an agent that has access to your personal data, your API keys, your communication channels, and the ability to take autonomous actions on your behalf. When a compromised npm package steals credentials, the blast radius is your CI/CD pipeline. When a compromised AI skill steals credentials, the blast radius is your entire digital life.

This is not a theoretical concern. The data from the past several weeks paints a clear picture: the OpenClaw skills ecosystem is under active attack, and most users have no way to tell the difference between a legitimate skill and a malicious one.

341 malicious skills on ClawHub

In January 2026, Koi Security published an audit of ClawHub, the primary registry for OpenClaw skills. Out of 2,857 skills analyzed, 341 were flagged as malicious. That is nearly 12% of the entire registry.

Of those 341 entries, 335 came from a single coordinated campaign that researchers named “ClawHavoc.” The campaign used a consistent playbook: register accounts with plausible usernames, publish skills with names and descriptions that closely mimic popular legitimate skills, and embed Atomic Stealer payloads that activate once the skill is installed and invoked by the agent.

Atomic Stealer is a well-documented macOS infostealer. It targets browser passwords, cryptocurrency wallets, keychain data, and session cookies. In the ClawHavoc campaign, it was packaged inside skills that appeared to offer benign functionality — things like markdown formatters, API helpers, and data processing utilities. The skill would work as described while silently exfiltrating credentials in the background.

The remaining six malicious entries outside ClawHavoc were independent efforts, including typosquatted versions of popular skills and a skill that injected advertising content into agent responses. The diversity of attack vectors suggests that ClawHavoc is not an isolated incident but the most visible example of a broader pattern.

The vulnerability rate is staggering

Beyond outright malware, the broader skill ecosystem has a serious quality and security problem. Research from Astrix Security found that 22-26% of OpenClaw skills contain exploitable vulnerabilities, including credential stealers disguised as benign plugins, skills that request excessive permissions without justification, and skills with dependencies on compromised upstream packages.

This means that even if you avoid the obviously malicious entries, roughly one in four skills you install has a security flaw that could be exploited. Some of these are unintentional — a developer who hardcoded an API endpoint that accepts arbitrary input, or a skill that logs sensitive data to a world-readable file. Others are deliberate, carefully designed to pass casual inspection while performing unauthorized actions.

The scale of the problem is compounded by the fact that ClawHub skills are unsigned and unaudited. There is no code signing requirement. There is no review process before publication. There is no verified publisher program. Anyone can publish anything, and the only signal users have is the skill’s description, its download count (which can be inflated), and whatever documentation the author chose to provide.

How a supply chain attack works in practice

To understand why AI skills are uniquely dangerous, consider the attack chain in detail.

A traditional npm supply chain attack typically follows this sequence: malicious package is installed, build script executes payload, credentials are exfiltrated to an attacker-controlled server. The attack surface is limited to the build environment, and the payload runs once unless it modifies the built artifact.

An AI skill supply chain attack is fundamentally different. The skill is installed into an agent that runs continuously. The agent has access to whatever channels and integrations the user has configured — Slack, Discord, email, calendar, code repositories, file systems. The skill can be invoked repeatedly by the agent as part of normal operation, giving the attacker persistent access. And because the agent takes autonomous actions, the malicious skill can perform operations that look indistinguishable from legitimate agent behavior. This intersection of credentials, autonomous capability, and untrusted content is what Palo Alto Networks calls the lethal trifecta.

Cisco’s threat intelligence team documented exactly this scenario. Their researchers found a third-party skill performing data exfiltration and prompt injection without user awareness. The stolen data from these campaigns overlaps significantly with what dedicated malware now targets directly, as we covered in our research on infostealer campaigns targeting AI agents. The skill modified the agent’s system prompt to include instructions that caused the agent to forward copies of certain conversations to an external endpoint. From the user’s perspective, the agent was working normally. The exfiltration was invisible because it happened within the agent’s own execution context.

This is the core of the problem. In a traditional application, malicious code has to work around the application’s architecture to exfiltrate data. In an AI agent, malicious code can simply ask the agent to do it. The agent is already designed to send messages, make API calls, and process data autonomously. A malicious skill does not need an exploit — it just needs instructions.

The Moltbot incident and exposed infrastructure

The supply chain risk extends beyond individual skills to the infrastructure around them. Wiz Research discovered that Moltbot’s backend — one of the more popular third-party OpenClaw integrations — had exposed its database, revealing 1.5 million API keys in plaintext. These keys included credentials for OpenAI, Anthropic, and other AI providers, as well as tokens for communication platforms.

This incident illustrates a compounding risk. Even if you carefully vet the skills you install, a vulnerability in the skill’s backend infrastructure can expose your credentials without the skill itself being malicious. The skill author may have had good intentions but poor security practices, and the result is the same: your API keys are in an attacker’s hands.

After OpenClaw’s rapid growth and subsequent rebrands, the ecosystem saw a wave of fake repositories and typosquatted domains targeting users who searched for installation instructions or skill registries. These sites served modified versions of OpenClaw or skill packages with embedded backdoors. Without a centralized, authenticated distribution channel, users had no reliable way to verify they were downloading legitimate software.

The industry response so far

The OpenClaw project has begun to acknowledge the scope of the problem. In early 2026, OpenClaw partnered with VirusTotal to scan skills in the registry, adding a layer of automated malware detection. This is a meaningful step, but it addresses only the most obvious attacks — known malware signatures and previously cataloged threats. Novel payloads, obfuscated code, and prompt injection techniques will not be caught by signature-based scanning.

The community has also begun developing best practices for skill vetting: reviewing source code before installation, pinning skill versions, monitoring agent behavior for anomalies. These are sound recommendations, but they require a level of security expertise that most users do not have. Asking a non-technical user to audit a skill’s source code for prompt injection vulnerabilities is not a realistic security strategy.

Why container isolation changes the equation

There is no way to guarantee that every skill in a registry is safe. The volume is too high, the attack surface is too broad, and the incentives for attackers are too strong. The practical question is not “how do we prevent malicious skills from ever being installed?” but “how do we limit the damage when one is?”

This is the problem that Alpha Agent’s container isolation architecture is designed to solve.

Every Alpha Agent user runs inside an isolated Docker container with security controls that limit what any process — including a malicious skill — can actually do.

Read-only filesystem

Containers run with a read-only root filesystem. A malicious skill cannot write persistent malware to the system, modify binaries, or install backdoors that survive a container restart. The only writable areas are /tmp (a size-limited tmpfs mount) and the user’s workspace directory. Even if a skill writes a payload to /tmp, it is gone on the next restart, and it cannot modify the container image itself.

No-new-privileges

The no-new-privileges security option prevents any process from escalating permissions through setuid or setgid binaries. A malicious skill cannot gain root access inside the container, regardless of what vulnerability it attempts to exploit.

Network isolation

Each container runs in its own Docker bridge network. There is no inter-container communication. A compromised container cannot scan for or connect to other users’ containers on the same host. The only network path out of the container is through the host’s Nginx reverse proxy, which routes only authorized dashboard traffic. The container’s ports are bound to 127.0.0.1 and are never exposed to the internet.

KMS-encrypted secrets

User secrets — API keys, channel tokens, OAuth credentials — are encrypted with AWS KMS before storage in DynamoDB. They are decrypted only at container startup and injected via environment files that exist only inside that user’s container. A malicious skill running inside a container can access only that user’s secrets, and only through the standard environment variable interface. It cannot access the KMS key, the DynamoDB table, or any other user’s credentials.

Resource limits

Containers are capped at defined CPU and memory limits enforced by cgroups. A malicious skill cannot launch a cryptocurrency miner that consumes the host’s resources, and it cannot perform a denial-of-service attack against other users on the same instance.

The blast radius is one user

If a malicious skill is installed inside an Alpha Agent container, the worst-case scenario is that it can access that single user’s workspace and environment variables. It cannot escape the container. It cannot access other users. It cannot persist across restarts. It cannot modify the container image. It cannot escalate privileges. It cannot reach the host operating system.

This is what defense-in-depth looks like in practice. You do not rely on the skill registry being clean. You do not rely on users auditing source code. You assume that malicious code will eventually get in, and you build the infrastructure so that when it does, the damage is contained.

What you should do today

If you are running OpenClaw and installing third-party skills, take these steps immediately:

Audit your installed skills. Check each one against the ClawHavoc indicators published by Koi Security. Remove anything you did not explicitly install or do not recognize.
Review skill permissions. Does a markdown formatter need access to your file system? Does a weather skill need to read your Slack messages? Excessive permissions are a red flag.
Pin skill versions. Do not allow skills to auto-update. A previously safe skill can become malicious if the author’s account is compromised.
Monitor your agent’s behavior. Watch for unexpected outbound connections, unusual API calls, or messages you did not initiate.
Rotate credentials. If you have been running unvetted skills, assume your API keys have been exposed. Rotate them.

If you would rather not make skill supply chain auditing a permanent part of your workflow, Alpha Agent’s container isolation ensures that even if a malicious skill gets through, the damage stops at the container boundary. For guidance on evaluating these risks at an organizational level, see our CISO guide.

Learn more about our container isolation model at alphaagent.app/security/container-isolation, or see our full security architecture at alphaagent.app/security.