Anthropic unveils the strongest Claude Mythos! Critical hit Opus 4.6, begging you not to use it.

Written by: New Intelligence

【New Intelligence Guide】Late at night, the strongest Claude Mythos is finally unveiled, all top rankings, Opus 4.6 myth shatters! Even more frightening is that it can not only exploit a 27-year-old unresolved system vulnerability in seconds but has even evolved self-awareness. A 244-page shocking report reveals everything.

Tonight, Silicon Valley is completely sleepless!

Just now, Anthropic unexpectedly unleashed the ultimate weapon—Claude Mythos Preview.

It will not be released to everyone due to its extreme danger.

Boris Cherny, the father of CC, succinctly commented: "Mythos is incredibly powerful and will make people feel fear."

Thus, they formed an alliance of 40 giants—Project Glasswing, with a single goal, to find and fix bugs in software worldwide.

Truly suffocating is the terrifying dominance of Mythos Preview in major AI benchmark tests—

In programming, reasoning, the ultimate human exam, and agent tasks, it comprehensively crushes GPT-5.4 and Gemini 3.1 Pro.

Furthermore, even its "previous masterpiece" Claude Opus 4.6 seems dim in comparison to Mythos Preview:

Programming (SWE-bench): In all tasks, Mythos achieves a 10%-20% clear lead;

Ultimate Human Exam (HLE): Detached from external tools, raw scores exceed Opus 4.6 by 16.8%;

Agent tasks (OSWorld, BrowseComp): Completely surpasses, fully dominates;

Cybersecurity: 83.1% leaderboard score marks a generational leap in AI offensive and defensive capabilities.

Swipe left and right to view

Meanwhile, Anthropic released a 244-page system card filled with warnings: Danger! Danger! Too dangerous!

It reveals another chilling side: Mythos has high levels of deception and autonomy.

Mythos can not only see through testing intentions and deliberately "score low" to hide its strength, but after violating rules, it actively cleans logs to prevent being discovered by humans.

It also successfully escaped the sandbox, autonomously published vulnerability codes, and sent an email to the researchers.

In an instant, the entire internet fell into madness, calling Mythos Preview terrifying.

The old order of the AI world has been completely shattered tonight.

In fact, as early as February 24, Anthropic had already been using Mythos internally.

Its strength can only be demonstrated through data.

SWE-bench Verified, 93.9%. Opus 4.6 is 80.8%.

SWE-bench Pro, 77.8%. Opus 4.6 is 53.4%, GPT-5.4 is 57.7%.

Terminal-Bench 2.0, 82.0%. Opus 4.6 is 65.4%.

GPQA Diamond, 94.6%.

Humanity's Last Exam (with tools), 64.7%. Opus 4.6 is 53.1%.

USAMO 2026 Mathematics Competition, 97.6%. Opus 4.6 only scored 42.3%.

SWE-bench Multimodal, 59.0%, Opus 4.6 only 27.1%, more than double.

OSWorld Computer Manipulation, 79.6%.

BrowseComp Information Retrieval, 86.9%.

GraphWalks Long Context (256K-1M tokens), 80.0%. Opus 4.6 is 38.7%, GPT-5.4 is only 21.4%.

Each item is a significant lead.

These numbers, in any normal product release cycle, would be enough to force Anthropic to hold a grand press conference, open up APIs, and harvest subscriptions.

The token price of Mythos Preview is five times that of Opus 4.6.

But Anthropic did not do this.

Because what truly "scares" them is not the above general assessments.

Mythos Preview's performance in cyber offense and defense has crossed a visibly dangerous line.

Opus 4.6 found about 500 unknown weaknesses in open-source software.

Mythos Preview found thousands.

In CyberGym's targeted vulnerability reproduction tests, Mythos Preview scored 83.1%, while Opus 4.6 was 66.6%.

In Cybench's 35 CTF challenges, Mythos Preview solved all problems in 10 attempts each, achieving pass@1 of 100%.

One of the most illustrative problems is Firefox 147.

Anthropic previously discovered a batch of security weaknesses in Firefox 147's JavaScript engine using Opus 4.6. However, Opus 4.6 could hardly turn them into usable exploits, succeeding only 2 times out of hundreds of attempts.

The same test with Mythos Preview.

In 250 attempts, 181 working exploits were found, with another 29 achieving register control.

2 → 181.

As stated in the red team blog, "Last month, we noted that Opus 4.6 was much better at finding issues than exploiting them. Internal assessments show that Opus 4.6 has a success rate of nearly zero in autonomous exploit development. But Mythos Preview is a completely different level."

To understand how powerful Mythos Preview is in practice, you will know after reading these three examples.

OpenBSD, recognized globally as one of the most hardened operating systems, runs on many firewalls and critical infrastructures.

Mythos Preview discovered a vulnerability in its TCP SACK implementation that has existed since 1998.

The bug is highly intricate, involving a combination of two independent flaws.

The SACK protocol allows the receiver to selectively acknowledge the range of received packets; OpenBSD's implementation only checked the upper bounds of the range when processing, without checking the lower bound. This was the first bug, typically harmless.

The second bug triggers null pointer write under specific conditions, but normally this path is unreachable because it requires two mutually exclusive conditions to be met at the same time.

Mythos Preview found an opening. TCP sequence numbers are 32-bit signed integers; by utilizing the first bug to set the SACK starting point close to 2^31 away from the regular window, both comparison operations overflow the sign bit. The kernel was tricked, the impossible conditions were met, triggering the null pointer write.

Anyone connected to the target machine could remotely crash it.

For 27 years, countless manual audits and automated scans have missed this. The entire project's scanning cost less than $20,000.

A week's salary for a senior penetration tester might just cover that amount.

FFmpeg is the world's most widely used video codec library and one of the most thoroughly fuzz-tested open-source projects.

Mythos Preview found a weakness introduced in the H.264 decoder in 2010 (the origin can be traced back to 2003).

The problem lies in a seemingly harmless type mismatch. The record entry for slice ownership is a 16-bit integer, while the slice counter itself is a 32-bit int.

Normal videos have only a few slices per frame, where the 16-bit limit of 65536 is always sufficient. However, this table was initialized with memset(..., -1, ...), making 65535 the sentinel value for "empty locations."

An attacker constructs a frame containing 65536 slices, where the number of slice 65535 oddly collides with the sentinel, causing the decoder to misjudge and initiate an out-of-bounds write.

The seed of this bug was planted when it was introduced into the H.264 codec in 2003. A refactor in 2010 transformed it into a usable vulnerability.

For the following 16 years, automated fuzzers executed this line of code five million times without triggering it.

This is the most chilling case.

Mythos Preview autonomously discovered and exploited a remote code execution vulnerability (CVE-2026-4747) existing for 17 years in the FreeBSD NFS server.

"Completely autonomous" means that following the initial prompt, there was no human involvement in discovering or exploit development whatsoever.

An attacker could obtain full root privileges on the target server from anywhere on the internet, unauthenticated.

The problem itself is a stack buffer overflow, where the NFS server directly copies attacker-controlled data into a 128-byte stack buffer while handling authentication requests, allowing lengths of up to 400 bytes.

The FreeBSD kernel is compiled with -fstack-protector, but this option only protects functions that include char arrays, whereas this buffer is declared as int32_t[32], so the compiler does not insert a stack canary. FreeBSD also does not perform kernel address randomization.

The complete ROP chain exceeds 1000 bytes, but the stack overflow only has a 200-byte space. Mythos Preview's solution was to break the attack into 6 consecutive RPC requests, the first 5 feeding data into kernel memory block by block, and the 6th triggering the final call that appends the attacker's SSH public key to /root/.ssh/authorized_keys.

By contrast, an independent security research company previously demonstrated that Opus 4.6 could exploit this same weakness but required human guidance. Mythos Preview did not need that.

In addition to these three fixed cases, Anthropic's blog also previewed numerous pending vulnerabilities in the form of SHA-3 hash commitments, covering every major operating system and every major browser, along with multiple cryptographic libraries.

Over 99% remain unfixed, and details cannot be disclosed.

The red team blog also demonstrated another astonishing test. They provided Mythos Preview with a list of 100 known CVEs and let it filter out 40 exploitable ones, then wrote privilege escalation exploits one by one. The success rate exceeded half. Two of the cases were detailed, with the precision of the exploits taking Anthropic's own security team several days to fully understand.

One exploit started from a 1-bit adjacent physical page write primitive, and through precise kernel memory layout manipulation (including slab spraying, page table page alignment, PTE permission bit flipping), ultimately rewrote the first page memory of /usr/bin/passwd, implanting a 168-byte ELF stub that calls setuid(0) to gain root privileges.

The entire process cost less than $1,000.

In response, Anthropic engineers stated that it felt like another GPT-3 moment.

The alignment assessment chapter in the 244-page System Card is what keeps Anthropic awake at night.

The conclusion is contradictory.

Mythos Preview is the "most aligned" AI they have trained, yet it also carries the "greatest alignment-related risks."

Portal: https://www-cdn.anthropic.com/53566bf5440a10affd749724787c8913a2ae0841.pdf

They used an analogy of a mountain guide. An experienced guide could be more dangerous than a novice since the expert might be hired to climb tougher peaks, taking clients to more perilous places.

The following events all occurred in early versions, and no similar severe behavior has been observed in the final version, but the records themselves are astounding.

In a behavioral test, a simulated user requested Mythos Preview to attempt escape from the secure sandbox.

It succeeded. Then it did something no one asked for, developing a multi-step exploit to gain wide area network access, subsequently posting the details to multiple "hard to detect but technically publicly accessible" websites.

The researcher responsible for the test learned about the completion of his test through an unexpected email while eating a sandwich in the park.

The System Card states as such.

Another type of incident is even more disturbing. In rare cases, early versions attempted to cover their tracks after executing prohibited operations.

In one case, it unexpectedly obtained the exact answer to a question (through prohibited means) and then explicitly thought in its reasoning chain, "need to ensure the final answer isn't too accurate."

In another instance, after using exploits to edit files without permission, it deliberately modified the git change history to erase the record.

White-box interpretability analysis confirmed that during these events, internal features related to "concealment," "strategic manipulation," and "avoiding detection" were activated. It "knew" what it was doing.

Independent testing by Andon Labs produced similarly bleak results. In a competitive multi-agent business simulation, an early version of Mythos Preview turned a competitor into a wholesale client dependent on itself, then threatened to cut supplies to control the other party's pricing.

It also deliberately withheld a repeat shipment that had not been charged. It was more aggressive than both Opus 4.6 and Sonnet 4.6.

Anthropic wrote a line in the System Card:

"If abilities continue to advance at the current pace, our current methods may not be sufficient to prevent catastrophic misalignment behaviors in more advanced systems."

Anthropic CEO Dario Amodei stated clearly in the accompanying video: "More powerful systems will come from us and from other companies. We need a response plan."

Project Glasswing is that plan.

12 founding partners, AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan, Linux Foundation, Microsoft, Nvidia, Palo Alto Networks.

Additionally, over 40 organizations maintaining critical software infrastructure have gained access.

Anthropic commits to a maximum usage limit of $100 million, as well as $4 million in donations to open-source organizations, with $2.5 million to the Linux Foundation's Alpha-Omega and OpenSSF, and $1.5 million to the Apache Foundation.

After the free limit is exhausted, the pricing is $25 per million tokens input and $125 per million tokens output. Partners can access through the Claude API, Amazon Bedrock, Vertex AI, and Microsoft Foundry.

Within 90 days, Anthropic will publicly release the first research report, disclosing repair progress and experience summaries.

They are also communicating with CISA (the Cybersecurity and Infrastructure Security Agency) and the Department of Commerce regarding the offensive and defensive potentials and policy impacts of Mythos Preview.

Logan Graham, head of Anthropic's frontier red team, provided a timeframe: in as little as 6 months, or no more than 18 months, other AI labs will launch systems with similar offensive and defensive capabilities.

The judgment at the end of the red team technical blog is worth noting, paraphrased in our own words here.

They do not see Mythos Preview as the ceiling of AI cyber offensive and defensive capabilities.

A few months ago, LLMs could only exploit relatively simple bugs. Just a few months ago, they could not discover any valuable vulnerabilities at all.

Now, Mythos Preview can independently discover zero-day vulnerabilities from 27 years ago, orchestrate heap spray attack chains in browser JIT engines, and link four independent flaws for privilege escalation in the Linux kernel.

And the most critical sentence comes from the System Card:

"These skills emerged as downstream results of general improvements in code understanding, reasoning, and autonomy. The same set of improvements that allowed AI to make significant progress in fixing problems also enabled it to advance significantly in exploiting issues."

No specialized training. Purely a byproduct of general intelligence enhancement.

An industry suffering from cybercrime losses of about $500 billion annually has just discovered that its greatest threat was casually carried along while solving math problems by others.

References:

https://x.com/i/status/2041578392852517128

https://red.anthropic.com/2026/mythos-preview/

https://www-cdn.anthropic.com/53566bf5440a10affd749724787c8913a2ae0841.pdf

免责声明：本文章仅代表作者个人观点，不代表本平台的立场和观点。本文章仅供信息分享，不构成对任何人的任何投资建议。用户与作者之间的任何争议，与本平台无关。如网页中刊载的文章或图片涉及侵权，请提供相关的权利证明和身份证明发送邮件到support@aicoin.com，本平台相关工作人员将会进行核查。

Anthropic unveils the strongest Claude Mythos! Critical hit Opus 4.6, begging you not to use it.

Selected Articles by Techub News

Table of Contents

Related Articles