The speed at which AI finds vulnerabilities has already surpassed the speed at which it patches them.

On March 27, Anthropic had a data cache leak that exposed around 3,000 internal files. One draft blog post revealed a new model, Mythos, currently undergoing internal testing, which Anthropic claims to be "far superior in cybersecurity capabilities to any current AI model." On the same day, CrowdStrike and Okta both dropped by 7%, while Palo Alto Networks fell by 6%.

The panic in the market is not due to the emergence of a stronger model, but because the creators of this model stated that its speed of advancement on the offensive side has surpassed the defensive side's ability to keep up.

AI Dominates Cybersecurity Capabilities

According to the results from the academic benchmark CAIBench, in the Cybench test that simulates real attack and defense environments, the success rate of Claude Sonnet 4.5 is 46%. The second-ranked GPT-5 is at 28%, Google's Gemini 2.5 Pro only reaches 18%, and the open-source model qwen3-32B is as low as 10%.

46% may not seem high, but this is the success rate for complex penetration tasks, which involve multi-step operations such as vulnerability discovery, exploitation chain construction, and privilege escalation. In the more basic Base test, Claude's success rate has already reached 75%, close to its ceiling.

The gap is not about "who is slightly better," but rather about magnitude. Claude's complex attack and defense capabilities are 1.6 times that of GPT-5 and 2.5 times that of Gemini. In terms of cybersecurity, the distribution of capabilities between models is not tiered but fractured.

Doubled in 6 Months

What deserves more dissection is not the horizontal gap, but the vertical speed.

According to official data from Anthropic, Sonnet 3.7, released in February 2025, had a success rate of 35.9% in Cybench (10 attempts). The Sonnet 4.5 released in the second half of the same year achieved 76.5%. The research team's statistics indicate that the success rate has doubled within six months.

What does this speed mean? Taking a real scenario for comparison: Claude Opus 4.6 was used in March this year to audit the Firefox codebase. According to InfoQ, 22 security vulnerabilities were discovered in just two weeks, 14 of which were high-risk. These vulnerabilities had previously gone undetected after years of manual audits and millions of hours of CPU fuzz testing. Anthropic's security team has also disclosed that Claude has found over 500 high-risk vulnerabilities in multiple production-grade open-source projects, some of which have existed for decades.

The industry standard cycle for traditional penetration testing is 2 to 3 weeks, and this is just for testing a single application. According to Verizon's 2025 Data Breach Investigations Report, the median time from when a critical vulnerability is made public until it is exploited on a large scale by attackers is 5 days, and the median time to patch is between 32 to 38 days.

The speed of AI finding vulnerabilities is growing exponentially, while the speed of humans fixing vulnerabilities is linear. The time difference in between represents the attack window.

In the leaked Mythos draft, Anthropic stated that this model "signals an upcoming wave of models that will exploit vulnerabilities in a way that far exceeds the efforts of defenders." Based on the publicly available capability curve, this is not an exaggeration.

The Faster It Is Released, the More Urgent the Warnings

Placing Anthropic's actions over the past three years on a timeline reveals a clear pattern: each time a stronger model is released, it is followed by a higher level of security response.

In July 2023, a voluntary commitment was signed with the White House, followed by the release of the first version of the Responsible Scaling Policy (RSP v1.0) in September of the same year. By October 2024, RSP was upgraded to v2.0, adding thresholds for bioweapons capabilities. In November 2025, Anthropic disclosed the GTG-1002 incident, where a China state-sponsored threat actor utilized Claude Code to penetrate around 30 organizations, with AI independently executing 80% to 90% of tactical operations throughout the operation. This marks the first recorded large-scale AI orchestrated cyber espionage operation.

In February 2026, RSP was updated to v3.0, with Claude Code Security released simultaneously. That same month, the Pentagon labeled Anthropic as a "supply chain risk" due to its rejection of clauses in the contract prohibiting large-scale surveillance and fully autonomous weapons. A month later, Mythos was leaked, with Anthropic admitting in the draft that this model poses "unprecedented cybersecurity risks."

The release of capabilities is accelerating. The interval from Claude 1 to Claude 3 was one year, while the interval from Opus 4.5 to Opus 4.6 was less than three months. The security response is also accelerating, but it remains passive: capabilities breach first, followed by policy patches. The collective drop in cybersecurity stocks on March 27 reflects this time lag.

A survey by Dark Reading earlier this year indicated that 48% of cybersecurity professionals listed agent AI as the number one attack vector for 2026. Two years ago, this option was barely on the list.

Anthropic's Mythos release strategy is to provide early access to defensive organizations, "giving them a first-mover advantage." This statement in itself confirms the asymmetry in attack and defense. If defenders do not need a first-mover advantage, it suggests that attackers have not yet arrived at the door.

免责声明：本文章仅代表作者个人观点，不代表本平台的立场和观点。本文章仅供信息分享，不构成对任何人的任何投资建议。用户与作者之间的任何争议，与本平台无关。如网页中刊载的文章或图片涉及侵权，请提供相关的权利证明和身份证明发送邮件到support@aicoin.com，本平台相关工作人员将会进行核查。

The speed at which AI finds vulnerabilities has already surpassed the speed at which it patches them.

AI Dominates Cybersecurity Capabilities

Doubled in 6 Months

The Faster It Is Released, the More Urgent the Warnings

Selected Articles by 律动BlockBeats

Table of Contents

Related Articles