19 Comments
Mikey B:

WOW!

Reads like a murder mystery.

Amazing information as provided by one or more brilliant AI minds.

What great fun to read and learn from.

Thank you Sir Devansh (Dev<3)

Devansh:

I'm glad you enjoyed this

Vaibhav:

If scaffolding is at play here I wonder what they would do with analysis tools like joern in place.

Devansh:

That's a good question

Gabe:

Nice article Dev

Devansh:

Thank you <3. Put a lot into it

Gabe:

I know!! I have to read it again, but it calms my curiosity itch 😁 Thank you for the great effort 🙏 love it

Devansh:

Nothing but the best for you

Leon Liao:

The key is to separate three things: the model, the narrative, and the workflow.

Mythos is not some magical fully autonomous hacker that suddenly appeared out of nowhere. A big part of what people are reacting to is actually a high-capability model embedded inside a much broader security research pipeline. So the real shift is not “one model woke up,” but that strong models plus strong security workflows are starting to produce genuinely useful offensive and defensive results.

That also means the moat is narrower than the hype suggests. Anthropic’s edge seems to be more about exploit development, multi-step reasoning, tool use, and reliability, not that only Anthropic can even detect these classes of vulnerabilities.

But the correction should not swing too far in the other direction. Even after adjusting for hype, the underlying capability jump is real. Frontier models are moving beyond just helping read CVEs or draft PoCs and toward something much closer to operational usefulness in software understanding and attack-path exploration. That is the real boundary, and also the real issue.

Devansh:

Very true. That's why I said that Mythos is impressive.

James:

Incredible analysis!

Devansh:

Thank you.

ScienceGrump:

Thanks for an incredibly thorough, informative dive into the hype. I'm curious whether Anthropic published the false positive rate for Mythos prior to *any* human assessment. When they say x% validated, did any curation happen first? My suspicion is that model performance is not so much jagged as purely stochastic; there is no rhyme or reason for when a given model will seem brilliant or fall on its face. The high false positive rate in AISLE might not be so different from Mythos.

Devansh:

They did no research.

Tanel:

Thank you for this excellent exposé. Way too many people think Anthropic are the "good guys".

Devansh:

Yeah. I love that they stood against surveillance systems, but they have a long history of sketchy behavior.

Austin Morrissey:

When are we getting the post on constructing vulnerability research pipelines?

L. rufus:

Am showing this to one of our clients on Monday. Their CIO called frantically after the Claude Mythos story “broke”. He was adamant we “do something”. After some discussion I offered to disconnect their network from the Internet and then we could go hit a bucket of balls at the nearby golf course. I think he actually considered disconnection (yikes).

ToeKneeEi:

Shhh… The Mythos sky-is-falling hype is mostly a play to spook the White House into kissing and making up with Anthropic, because national security, if you buy the hype, trumps the Hegseth feud. It's bought Dario a meet with Susie Wiles and some words of defusement from all. Look to see Anthropic's federal contracts restored if Pennsylvania Avenue doesn't read Devansh's post too carefully.