5/11/2026 at 7:33:34 AM
Quote:"My personal conclusion can however not end up with anything else than that the big hype around this model so far was primarily marketing. I see no evidence that this setup finds issues to any particular higher or more advanced degree than the other tools have done before Mythos. Maybe this model is a little bit better, but even if it is, it is not better to a degree that seems to make a significant dent in code analyzing."
It's a good reminder for us all that the competition in this space is rough and lots of more or less subtle marketing is involved.
by rzmmm
5/11/2026 at 9:34:01 AM
Anthropic using marketing to convince people their models are more advanced, better built, or that AI is a threat that needs to be regulated because only they have the answer? I’m shocked.
More seriously, so far I haven’t seen much indication that Mythos is more than Opus with a security-focused code analysis harness. That said, the fact it can find these bugs in an automated fashion is the more important takeaway outside of the hype.
I’m curious what the error rate is on the detections, because none of that means much if it is wrong 90% of the time and we are only hearing about the examples that are useful marketing.
by therealpygon
5/11/2026 at 9:59:22 AM
>> Anthropic using marketing to convince people their models are more advanced, better built, or that AI is a threat that needs to be regulated because only they have the answer? I’m shocked.
I remember when OpenAI was saying GPT-2 was too dangerous to release.
by johnbarron
5/11/2026 at 10:44:41 AM
I remember when there was a guy at Google a few years ago who was convinced that they had an internal, sentient creature in their labs (I think maybe 4 years ago?)
If I’m not mistaken, after the media cycle, he lost his job for breaking confidentiality.
That was the opposite of marketing, Google really didn’t get how to turn this into a product until ChatGPT happened.
by stingraycharles
5/11/2026 at 1:00:05 PM
They most likely understood that it wasn't viable for anything. OpenAI just yolo'd it and now we're dealing with the fallout. I'm fairly certain that any management layer at Google isn't going to say yes to the "invest 5 billion to make 10 million" scheme that OpenAI and Anthropic are currently running.
by jabwd
5/11/2026 at 2:22:17 PM
I for one can't wait for the 10 million to go all the way to zero
by gofreddygo
5/11/2026 at 5:52:34 PM
"ChatGPT has over 900 million weekly active users worldwide. ... ChatGPT Plus has around 50 million paying subscribers"
by mistrial9
5/11/2026 at 7:55:20 PM
What you have typed does not address anything the person you are responding to said.
With those 50 million subscribers, how much do they pay and how much do they cost? That is the only relevant piece of information when discussing the investment and returns of OpenAI.
by muwtyhg
5/11/2026 at 9:03:51 PM
> "invest 5 billion to make 10 million"
business is contextual, and is a game of numbers? If you agree, then there is a difference between "I made money selling lemon drinks at my driveway, but I sold a car to make room" .. versus "I have recurring revenue of 50 million x $80 USD per month, and it is growing, and I am using cheap credit to build that" .. Numbers have a meaning, and the larger dollar recurring revenue cannot be matched in any way, no matter how much I spend. IIRC ChatGPT is the fastest adopted software in the history of the Internet.
by mistrial9
5/11/2026 at 11:25:02 PM
Is it growing?
Don't they report annualized revenue AKA the best month times 12? How is that comparable?
by JetSpiegel
5/11/2026 at 10:28:32 PM
They have no moat.
by jabwd
5/11/2026 at 2:25:27 PM
Google is the leader, they really don't want AI to be a success, it only comes with a risk of disruption. They probably don't even really believe it's going to be that big of a deal. They are only in that game to hedge; sure they have wasted a trillion dollars if AI doesn't come through, but they will earn that back in 3-5 years. So why would they need to do deranged marketing stunts and sacrifice their credibility for that?
If OpenAI or Anthropic doesn't turn this into a trillion dollar industry FAST, they are cooked. The strategy of building up fear around your product is risky, but necessary. There is simply no way to grow the AI business fast enough if they can't talk directly to the CEOs and bypass input from the employees, and baba yaga stories are perfect for that. Every time the CEO hears an employee say that the AI isn't working great for him, he hears an employee that's scared for his job or for his life, dismisses it, and sends out a mandate that everyone needs to prompt an AI every time they as much as need to go to the toilet.
by MadxX79
5/11/2026 at 11:17:19 AM
[dead]
by player1234
5/11/2026 at 3:03:52 PM
Context from 2019: https://en.wikipedia.org/wiki/GPT-2
>While previous OpenAI models had been made immediately available to the public, OpenAI initially refused to make a public release of GPT-2's source code when announcing it in February, citing the risk of malicious use;[8][5] limited access to the model (i.e. an interface that allowed input and provided output, not the source code itself) was allowed for selected press outlets on announcement.[8] One commonly-cited justification was that, since generated text was usually completely novel, it could be used by spammers to evade automated filters; OpenAI demonstrated a version of GPT-2 fine-tuned to "generate infinite positive – or negative – reviews of products".[8]
>Another justification was that GPT-2 could be used to generate text that was obscene or racist. Researchers such as Jeremy Howard warned of "the technology to totally fill Twitter, email, and the web up with reasonable-sounding, context-appropriate prose, which would drown out all other speech and be impossible to filter".[18] ...
by neuronexmachina
5/11/2026 at 5:19:32 PM
It's kind of funny watching the behavior on the forum of different groups with different beliefs.
"AI can't do anything harmful at all, kick this shit up to 11. It's all marketing, bla bla"
and
"My grandma gave away all her money to AI bots and is now starving in the street. My uncle murdered his wife and is trying to get married to GPT-4o. He thinks they are going to elope to a data center on a tropical island and live happily ever after".
I think the "AI can do no harm, it's marketing" people are really disconnected from reality and that any other product that behaved in the same manner would have been banned in most places.
by pixl97
5/11/2026 at 6:38:05 PM
Related: https://youtu.be/Ykvf3MunGf8?si=UEIMRdrMWUFF6V8Q
AI chatbots have caused real harm. They have tragically convinced and encouraged a number of people to commit suicide, to say nothing about scams. They are having a real effect on the social fabric of our society.
I don't understand what point the people who blame the dangers of AI on marketing are trying to make.
by abustamam
5/12/2026 at 4:55:32 AM
The sociocultural dangers weren't the danger they were referring to. Claude Mythos was purported to be so powerful that if released to the public it would result in all software being 0-dayed and so they could only give select important groups access. Curl's analysis said ehh, it didn't really seem that much better.
Now people who are getting negatively affected because they think AI is more real and more intelligent than it actually is and get tricked by it, well that is dangerous but for different reasons.
by robotbikes
5/11/2026 at 8:57:30 PM
> I remember when OpenAI was saying GPT-2 was too dangerous to release.
The world didn’t end yet - but did it improve?
by DANmode
5/11/2026 at 10:37:11 AM
"it can almost like write 2 paragraphs!" "It might be conscious" "this is basically AGI, we had to fire someone who spilled the beans"
by 2ndorderthought
5/11/2026 at 11:33:27 AM
I always thought he was fired for making crackpot statements to the press in reference to his professional capacity, and thus creating bad PR and embarrassing spectacle for his employer. Seems like legitimate reasons to me.
by etiam
5/11/2026 at 11:38:26 AM
An interesting question now is whether he had standard mental health issues, or if he was an early example of AI psychosis or whatever we call people who are falling in love with their AI chatbots because they tell them how smart they are.
by ZeroGravitas
5/11/2026 at 12:11:43 PM
Considering Richard Dawkins has recently succumbed to the same delusion it is a reminder that no matter how intelligent someone may otherwise be, we are all human and have certain tendencies and blind spots; anthropomorphizing non-entities being one of those.
by paradox242
5/11/2026 at 12:18:24 PM
Richard Dawkins is 85 to be fair, just like Bernie Sanders was 84 when he made similar comments.
The other guy worked on Google's AI safety team where one would expect he'd have a basic grasp of how the technology works before making outlandish claims.
by dmix
5/11/2026 at 1:10:17 PM
One phenomenon that spooks me is when intelligent people believe in idiotic things.
It makes me wonder if there's a wrong turn in the road that I too might fall in the same pit.
by surgical_fire
5/11/2026 at 1:54:39 PM
Vigilance is warranted, I think.
I can't find it right now, but something came up a few years ago (probably on HN) about highly intelligent people being more adept at making up arguments to rationalize beliefs and actions that they had taken for other reasons entirely.
Sort of makes sense that wielding a more complex mind would offer more complex ways to go wrong, doesn't it?
by etiam
5/11/2026 at 2:23:32 PM
And on balance, it also can mean that they make connections and see truth where others only see the facade. Both statements can be (and are) true because highly intelligent people are still just people. Some people’s “delusions” are absolutely correct, and others’ “facts” are nothing more than anecdotes told to convince themselves of what they want to believe.
Sounds more like “intelligence” isn’t the only defining metric for such behavior to occur in people, because that describes a lot of less intelligent people too. Though, I suspect highly intelligent people are at least somewhat more likely to end up on the “correct” side of the facts.
by therealpygon
5/11/2026 at 1:56:37 PM
As someone who watched one of their heroes fall for some stupid cult-like thing ten years ago and wondered the same thing. Then many years later fell for some dumb stuff. The answer is you probably will. Try to stay intellectually flexible, it'll be okay.
by 2ndorderthought
5/11/2026 at 2:42:44 PM
I am afraid of that, I wasn't joking.
I have seen people I consider as much smarter than me fall for some very idiotic things. I certainly don't consider myself immune.
I think that the advice to try being intellectually flexible is a good one. Strive to learn new things, expose yourself earnestly to ideas that challenge your beliefs, exercise empathy, etc
by surgical_fire
5/11/2026 at 2:03:53 PM
Good point.
Optimization on "Human Feedback", early exposure to high-effort experimental systems... I wouldn't be surprised if that turns into a bigger field than is generally recognized today.
Looking at it from the outside, I think it's still pretty hard to see how he came to end up in that position, but with a bit of individual vulnerability, arbitrary time to boil the frog slowly, and a fairly large number of people exposed, maybe it would be stranger not to have the event occur with someone.
by etiam
5/11/2026 at 4:52:45 PM
And Anthropic was founded by former, high ranking OpenAI employees so they were accustomed to the classic "it's so dangerous we can't release it" trope.
It sounds like Mythos is good but none of us know exactly how good since they haven't released it yet. It also sounds like Anthropic is compute starved, which is probably the biggest reason it hasn't had a public release
by slipnslider
5/11/2026 at 12:10:20 PM
This is roughly what I was assuming but of course the big caveat here is that they were already using the existing LLM driven tooling on an extensively audited codebase.
So while anthropic's marketing may be hype there just wasn't much left to find, a point he makes in the blog post.
Whether it's a big step forward for other kinds of projects is difficult to tell, but this highlights that everybody should be using AI code review tools to audit their existing code today, and not everybody is.
by JeremyNT
5/11/2026 at 12:44:08 PM
None of those other LLM tooling made the claims they're too dangerous to be released and used though, unlike Anthropic did with Mythos.
What it highlights, is that Mythos doesn't seem so much better than other LLM driven tooling at finding security issues, which was the strongest claim Anthropic made in the first place.
by embedding-shape
5/11/2026 at 1:20:13 PM
People love defending Anthropic's shortcomings…
“Mythos isn’t supposed to be that good at security, because actually Anthropic was referring more about running llms than mythos specifically”
“The opus model is worse because they have no compute because they are training mythos. The degraded performance is justified!”
“All the bugs in Claude code is just because the models are so good they are just looping and are shipping fast”
Constantly see people crawl out of the woodwork to defend a trillion-dollar company overhyping every press release it gives
by iterateoften
5/11/2026 at 6:17:39 PM
If politicians can buy online supporters to manipulate perception during their election campaigns, I'd expect private corporations would too. Of course people can become very biased on their own, but an online PR/Marketing/Influencing campaign might encourage them to be more vocal.
by ASalazarMX
5/11/2026 at 7:59:28 PM
It's silly to act like they've got mud on their face when Mythos and Opus are apparently some of the very best models. Anyone that has found value out of previous LLMs is likely to find more value out of the newest ones. The only thing Mythos looks bad against is the very tall bar some people have imagined. People are putting too much weight on marketing and then reaction to marketing.
by AgentME
5/11/2026 at 8:04:11 PM
> People are putting too much weight on marketing and then reaction to marketing.
No, what others are doing, which I've done myself in the past too, is to evaluate how much their marketing matches up with reality, then share our experience about that. Very different than just "putting too much weight on marketing".
by embedding-shape
5/11/2026 at 7:52:27 PM
It's important to keep in mind that very, very few projects are as rigorously tested as curl, so while it's interesting to hear this feedback I think curl would be a torture test for any security scanning. I'd be more interested to hear about other random libraries that aren't as thoroughly analyzed as curl; show me some results for GnuTLS, for example, or dpkg/rpm/apt/dnf/pacman/etc.
by danudey
5/11/2026 at 9:40:14 PM
I think one of the points of TFA was that other AI tools found many vulnerabilities; after having fixed those, mythos did find another vulnerability the others missed, but that seems to imply this model is only marginally better than the competition instead of being in a different league altogether like it's marketed. Paraphrasing the author: sure mythos will find lots of security issues in gnutls, but so will gpt or opus (they acknowledge explicitly that all those tools are getting very good).
by p91paul
5/11/2026 at 5:00:19 PM
> None of those other LLM tooling made the claims they're too dangerous to be released and used though, unlike Anthropic did with Mythos.
I do think they've said similar things in the past, but regardless Anthropic's BS marketing is something to behold and viewing it with extreme skepticism is smart.
> What it highlights, is that Mythos doesn't seem so much better than other LLM driven tooling at finding security issues, which was the strongest claim Anthropic made in the first place.
That's the conclusion Daniel makes and it definitely seems plausible, his opinion absolutely carries a lot of weight with me for sure.
But I hedge a little because we don't really know how much human labor was required to supplement those earlier LLM-assisted reviews of curl, nor do we know how easy it was for the person who used Mythos to generate the new batch. So the kind of bug hunting that might be "possible but still labor intensive" via current tooling might be far easier to accomplish with less skilled developers using Mythos.
And who knows, maybe Mythos is better on worse codebases, curl benefits from being very good to start from :)
by JeremyNT
5/11/2026 at 12:55:17 PM
Actually, OpenAI made a similar claim about one of their GPT models a while ago…
Funnily enough that was while Dario Amodei was their research director.
by hug
5/11/2026 at 3:01:56 PM
If you're referring to gpt-2 in 2019, that was primarily about concerns with it being used by spammers and fake content generators. In retrospect, that was a totally valid concern.
by neuronexmachina
5/11/2026 at 6:47:53 PM
They had a reddit with GPT2 back and forth; I have to say I got suckered into a conversation before I figured it out -- it was definitely the OG Moltbook of non sequiturs
by jimmySixDOF
5/11/2026 at 8:18:17 PM
Too dangerous to be released, right after the Department of Defense* dropped them
by stirfish
5/11/2026 at 4:28:51 PM
Everyone should be using exclusively a proof assistant (Lean/Agda/Rocq/Isabelle) and proving their code correct, but they're not.
Do you see how ridiculous the zealotry sounds when it's not your personal kind of zealotry?
by voxl
5/11/2026 at 10:08:55 AM
Curl simply isn't a good data point. It's one of the most picked-over codebases in existence with extensive security testing practices. All the researchers using not-quite-Mythos models have had plenty of time to report bugs up to this point. Daniel may be right that Mythos hasn't been a game changer for curl but the preconditions are different for virtually any other codebase. Perhaps the real marketing here is his own modesty about curl's maturity.
by thombles
5/11/2026 at 10:40:34 AM
To me, it is a very good data point.
Curl uses all sorts of tools, including AI tools, to find bugs. These tools, according to the article, found hundreds of bugs including a dozen CVEs.
Mythos found one vulnerability. It means Mythos is just another tool, not the revolution it claims to be.
It is common that when a new tool is introduced a bunch of bugs are found, with diminishing returns. Mythos finding one vulnerability is consistent with what I would expect for a major update to an existing tool, which Mythos is over existing LLM-based solutions.
by GuB-42
5/11/2026 at 2:02:47 PM
I had a totally different take. The fact that Mythos found only one vulnerability is testament to how solid curl is, not how bad Mythos is.
Look at the Firefox blog post where they found something like 400 (or more) findings.
I have no doubt Mythos is very good at this, but I also don't think it's something unattainable by other labs within the next few months, with focus.
by atonse
5/11/2026 at 4:06:49 PM
The point is that Anthropic claims it’s a huge leap over everything else. But it isn’t.
by skywhopper
5/11/2026 at 4:26:24 PM
This depends on the actual number of undiscovered bugs still in curl. If there is nothing to find then even a 10x better Mythos will find nothing. Also I think the quality of the codebase matters a lot when it comes to finding bugs. It's possible that curl is so well written that it is relatively straightforward for existing ai tools to find bugs.
by rohit89
5/11/2026 at 6:05:30 PM
But both things can be true. It could be a huge leap (see Firefox’s example) but also find almost nothing in an already well maintained and audited codebase, and that could mean there isn’t much to find.
by atonse
5/11/2026 at 7:23:19 PM
Okay, but how do we know that all 400 plus hits were actual vulnerabilities? I didn't read too deeply into it so I might've missed something but did someone test and validate each of those vulns to confirm that they were actually vulns?
by ethin
5/12/2026 at 3:52:05 AM
You can see the details here: https://hacks.mozilla.org/2026/05/behind-the-scenes-hardenin...
by atonse
5/11/2026 at 7:35:24 PM
There is no way to tell until we find examples of vulnerabilities that mythos missed. For all we know curl has 0 vulnerabilities right now
by HDThoreaun
5/11/2026 at 10:53:56 AM
The question is how many security vulnerabilities are actually left in the code after all the recent AI attention. Either Mythos is a nothingburger, or it's substantially more powerful but there's nothing left to do. Even a large amount of C can be correct eventually. Curl has the _potential_ to become a good data point maybe 6-12 months from now - if researchers and new tools find many more vulnerabilities then Mythos is proved to be hype. If they don't, then maybe Mythos is overkill for today's curl and its capabilities are better deployed elsewhere (like Firefox, apparently).
by thombles
5/11/2026 at 11:35:19 AM
I have a hard time believing that Mythos found the only remaining Curl vulnerability. It is possible, but highly improbable.
And it is not overkill, the proof is that it found that vulnerability. It is like saying the new version of some static analyzer with some new rules is "overkill" because it only found one more bug than the previous version. Deciding whether it is overkill or not is more about context. Using a very expensive model like Mythos for some little used non-critical software is overkill, but for Curl, it absolutely isn't.
If Mythos found loads of vulnerabilities in Firefox but not in Curl, I wouldn't say that's because Mythos is so good, but rather that with the release of Mythos, they did some testing that could have been done before using the same tools Curl has used.
by GuB-42
5/11/2026 at 11:46:41 AM
We will see. As for "testing that could have been done before", Mozilla's posts indicate otherwise. Use of Opus 4.6 led to 22 security-sensitive bugs vs Mythos' 271 (https://blog.mozilla.org/en/privacy-security/ai-security-zer...). They already had the methodology in place when the more powerful model came along (https://hacks.mozilla.org/2026/05/behind-the-scenes-hardenin...):
> Once the end-to-end pipeline is in place, it’s trivial to swap in different models when they become available. Building this pipeline early helped us find a number of serious bugs using publicly-available models, and it also helped us hit the ground running when we had the opportunity to evaluate Claude Mythos Preview. In our experience, model upgrades increase the effectiveness of the entire pipeline: the system gets simultaneously better at finding potential bugs, creating proof-of-concept test cases to demonstrate them, and articulating their pathology and impact.
by thombles
5/11/2026 at 1:35:14 PM
False dichotomy
by sitkack
5/11/2026 at 1:55:04 PM
It's not, really. Curl is an extraordinarily high value target that has already been picked over by well funded security researchers and state-sponsored groups using state of the art tooling for decades. That is not the target for which Mythos is a threat.
The threat isn't high value targets, which already had sophisticated folks picking over the code base using state of the art tools and tests, it's medium to low value targets which can now be picked over by random hackers who barely know anything about security themselves at a cost of a few dollars.
by empath75
5/11/2026 at 10:59:13 AM
that makes it a good data point, because it is better able to illustrate the incremental capabilities of Mythos compared to previous tooling
that helps us to understand how much of Mythos is hype and how much is real
by spongebobstoes
5/11/2026 at 10:32:48 AM
We see this exact hypetrain every time a new model is released. Mythos simply hasn't lived up to the "we're all gunna die from the flood of vulnerabilities" hype even slightly. It's slightly better than previous models by all accounts, cool stuff
I've seen literally near word-for-word this exact chain of events multiple times previously
by 20k
5/11/2026 at 2:35:43 PM
Is Mozilla marketing on Anthropic's behalf?
As part of our continued collaboration with Anthropic, we had the opportunity to apply an early version of Claude Mythos Preview to Firefox. This week’s release of Firefox 150 includes fixes for 271 vulnerabilities identified during this initial evaluation.
As these capabilities reach the hands of more defenders, many other teams are now experiencing the same vertigo we did when the findings first came into focus. For a hardened target, just one such bug would have been red-alert in 2025, and so many at once makes you stop to wonder whether it’s even possible to keep up.
https://blog.mozilla.org/en/privacy-security/ai-security-zer...
by orblivion
5/11/2026 at 3:44:14 PM
There are three things happening simultaneously: 1st a new model, codenamed "Mythos", 2nd a lightweight harness built for finding vulnerabilities, and 3rd a push by Anthropic to collaborate with various Open Source projects and companies to use 1 and 2 to find vulnerabilities
We know that the combination of all three results in finding lots of security vulnerabilities. That's what Mozilla is talking about. The quote from the curl story states that just 2 and 3, but with just regular SotA models, would have produced very similar results
Which is really the crux of all this hype around Mythos: would the results really be different if they used Claude Opus instead of Claude Mythos? How much is the model, how much the harness, and how much is just because Anthropic is running a big campaign systematically trying to find vulnerabilities?
by wongarsu
5/11/2026 at 3:51:01 PM
Not to discredit anything that was said in any particular blog post.
Folks also need to remember that a lot of blog posts are written by engineers or managers that have their own agendas and careers and often external blog posts can be a form of self marketing or idea marketing that an engineer or director has been pushing internally.
I have no idea if this happened in Mozilla's case but the person that wrote it seemed to talk about their own internal harness / fuzz testing framework quite a bit, and I imagine it was probably a big part of that person's scope / accomplishments and will probably show up at their end of year review and on their resume.
by sporkland
5/11/2026 at 4:51:31 PM
Also, the people at Mozilla who helped achieve a highly visible collaboration with the hottest AI company in the zeitgeist that included a lot of expensive data center time to harden their flagship product are definitely going to be happy/excited/proud about pulling it off successfully.
There's a lot of kneejerk "so you're accusing Mozilla of a conspiracy to boost Anthropic?" which is an overly simplistic lens. Particularly when it involves groups of individual humans with different motivations and emotional investment in their own contributions to the collaboration.
by toraway
5/11/2026 at 10:26:13 PM
Okay so supposing everybody is acting in a benign manner, following their incentives and passions, not meaning to mislead anybody. Do you think that this results in writing a misleading blog post? Because the blog post makes Mythos out to be a big friggin deal. (It had certainly convinced me).
by orblivion
5/11/2026 at 3:06:45 PM
It is difficult to compare these two accounts since Daniel Stenberg didn't get access to Mythos himself, and we have no information about how it was run compared to the other AI models that have been used on curl. It is possible that Mythos is not much better than these other models, but it is also possible that the curl team simply made better use of the other models.
Part of what made Mythos so effective for Mozilla was the integrated agentic workflow where it not only looked for bugs, but then created an exploit to demonstrate them, and ran that exploit while dynamic analysis was enabled, verifying that invalid memory access occurred. In this case it is hard to know how much of their success was because they put more effort into the harness compared to previous tools (we know they did), or if Mythos was more suitable for this sort of workflow to begin with.
Not many apples-to-apples comparisons to be made with Mythos at this point.
by pavon
5/11/2026 at 3:19:43 PM
> then created an exploit to demonstrate them, and ran that exploit while dynamic analysis was enabled verifying that invalid memory access occurred
Four years ago that would have sounded like science fiction. Right now, I think that even Gemini Flash might be able to do that, given a couple of attempts.
by esperent
5/11/2026 at 2:47:23 PM
Yep! The industry term is "co-marketing" and it's hard to avoid seeing once you spot it.
by spenczar5
5/11/2026 at 3:41:55 PM
I'll wear the dunce cap: how are you so certain this is co-marketing? I'm not saying you are wrong, but it doesn't seem obviously like marketing copy to me (which is of course what they'd want but that's nevertheless not in any way evidence one way or the other).
by HelloMcFly
5/11/2026 at 3:46:28 PM
It starts with the words "As part of our continued collaboration with Anthropic"
Once these words are used you can assume there is a contract stating how that collaboration works, and that this includes some sentences about how much each side is allowed to or required to say about it
by wongarsu
5/11/2026 at 4:38:51 PM
So you claim that Mozilla entered into a contract with Anthropic, and said contract requires Mozilla to advertise for Anthropic on their blog. I hope Mozilla is getting a good payday out of this.
by warkdarrior
5/11/2026 at 2:59:15 PM
I didn't think Mozilla was like that but duly noted.
by orblivion
5/11/2026 at 2:41:24 PM
I think it's more the cost to find a vulnerability that has significantly reduced, not the possibility that the vulnerability could have been found. But that cost mattered tremendously because someone has to fund the effort to find the bugs. These economics also apply to attackers.
by dboreham
5/11/2026 at 2:43:25 PM
Is Firefox less invested in this than Curl? I mean there must be some explanation for this.
by orblivion
5/11/2026 at 2:49:41 PM
It's in the first sentence of your quote:
"our continued collaboration with Anthropic"
Read this as: "we get discounts, rate limit increases, a direct line to responsible product managers; in exchange we participate in friendly marketing." It's extremely common in this line of business - typical of database vendors, software tool companies, etc.
by spenczar5
5/11/2026 at 2:52:56 PM
This is more in response to my original post, but okay interesting point. (When I said "invested" here I meant invested in finding security flaws.)
by orblivion
5/11/2026 at 8:30:09 PM
In many countries it is mandatory to mark any form of compensated advertising as such. If your claim is true they might be breaking some laws here & there…
by janc_
5/11/2026 at 3:13:34 PM
Conspiratorial nonsense
by Anon1096
5/11/2026 at 4:28:01 PM
I would expect Firefox to be less invested in this than Curl. Firefox is aimed at consumers, Curl is embedded in a wide variety of products.
by cmiles74
5/11/2026 at 3:53:27 PM
Absolutely 100%
by skywhopper
5/11/2026 at 2:43:57 PM
I certainly wouldn't be surprised if they were.
by Pay08
5/11/2026 at 8:08:03 AM
It may well be that the hype was primarily marketing.
The other alternative is that Curl is simply secure enough that there was far less to find than in other projects.
by vidarh
5/11/2026 at 12:24:49 PM
Daniel found 30 CVEs in Curl, this year. I would not say that there is nothing to find, here. Just that it takes an actual expert.
by shakna
5/11/2026 at 3:37:32 PM
I did not suggest there was nothing to find. But it is also very different to count all CVEs found and reported (there are fewer than 30 total for 2025 and 2026 per [1]) by anyone and everyone vs. what was found in a short time by someone prompting a model.
by vidarh
5/12/2026 at 8:45:56 AM
Isn't the selling point of the model that it is as capable as anyone and everyone?
> Claude Mythos is Anthropic's most specialized model, trained exclusively on security research, vulnerability disclosures, and attack pattern literature. Its reasoning reflects how the world's best security researchers think. [0]
[0] https://mythosvulnerabilityscanner.com/what-is-claude-mythos
by shakna
5/12/2026 at 12:00:27 PM
Even if I were selling the model, which I am not, it still does not follow that you can judge that on a single run, given that no security researchers have found all of these bugs on their own in a short amount of time either.
by vidarh
5/12/2026 at 10:32:49 PM
Okay, to respin this: Daniel doesn't say that curl is secure enough. Half the point of the talks this year is that there has been an uptick in detected security bugs, not a downturn. And here are some graphs. [0]
> Given the look of these graphs I don’t think we are close to zero bugs yet. These two curves do not seem to even start to fall yet.
If the author thinks there is more to find, then the soil probably isn't dry.
But, from the author's mouth:
> My personal conclusion can however not end up with anything else than that the big hype around this model so far was primarily marketing. I see no evidence that this setup finds issues to any particular higher or more advanced degree than the other tools have done before Mythos. Maybe this model is a little bit better, but even if it is, it is not better to a degree that seems to make a significant dent in code analyzing. [1]
[0] https://daniel.haxx.se/blog/2026/04/30/approaching-zero-bugs...
by shakna
5/11/2026 at 1:02:03 PM
[flagged]
by bcjdjsndon
5/11/2026 at 10:13:54 AM
Given how much money is on the line, it would be gross negligence if anything that came publicly out of the CEO's mouth, or was otherwise published by the company, were not marketing.
by teiferer
5/11/2026 at 12:02:01 PM
The question is whether they need to massage the results for them to be marketable.
by red75prime
5/11/2026 at 12:51:47 PM
Sometimes you gotta let people know how awesome you are. The real question is whether you're misrepresenting yourself (all marketing, no substance).
by zeroCalories
5/11/2026 at 1:35:36 PM
Not really, curl has slow anonymous memory leaks because of how the connection session caching was implemented. If you don't periodically restart a program, then people encounter strange, hard-to-diagnose issues sooner or later.
Also, looking at something that already trips valgrind warnings may obscure a lot of problems in both your own code and the curl library itself.
One could report the issue as functioning as described in the API, but the developers do not accept direct community input into the project.
People use it out of convenience, but it is just as janky as most bloated projects. =3
by Joel_Mckay
5/11/2026 at 9:45:01 AM
My guess:
Marketing is not intentional.
Evidence: 10 years ago, when I interviewed Baidu AI with Andrew Ng and Dario, Dario was the kind of person who is pure-hearted to the point of being ideological. Given Dario's successful career so far, that essence has gradually grown into a conviction, and he is surrounded by a purpose-built team which amplifies his ideology.
Humans are very convenient creatures, and a small fraction of them are no doubt masters of convenience: they morph their mental manifold without a hint of contradiction in their own mental mechanisms.
by bigcat12345678
5/11/2026 at 10:14:08 AM
These things are layered. They are great scientists, smart people, etc.
Things change when you’re running a business like Anthropic, especially as the CEO. You have a responsibility to shareholders, and you just need to play the game.
Anthropic chose a great angle: focus on professionals / enterprise, safety, etc. Those can both be driven by a genuine desire to make great technology and, for business purposes, require you to position yourself as a bit “better” than reality.
Just look at what their strategy is with Mythos, it’s almost perfection: the “it’s not ready to be released to the public” angle hits all the marks: they care about responsibility / safety, they have “the best” model, and “LLMs are dangerous, but we, as the guardians, can be trusted”. This also helps the industry as a whole with regulation: if they’re being constrained, China will develop even more dangerous models.
This is a result of how smart people treat business, it’s PR perfection, especially given how much the whole industry is talking about it.
(Yes, they fail in other PR areas, but that’s a different discussion)
by stingraycharles
5/11/2026 at 12:53:27 PM
Marketing is always intentional at this scale. If you think Anthropic didn't put a lot of time and effort into Glasswing as a marketing effort I think you're misunderstanding how these organizations work and how they win.
by windexh8er
5/11/2026 at 12:06:25 PM
> Marketing is not intentional
Mythos put Anthropic back into the White House’s good graces. It also branded Anthropic as badass, something their softer image probably needed to win government contracts.
Maybe it wasn’t marketing. But the product’s configuration, and how Anthropic talked about and released it, sure as hell played beautifully. (The timing, while Musk and Altman are distracted with each other, also couldn’t have been better.)
by JumpCrisscross
5/11/2026 at 10:02:18 AM
I'm not sure if that distinction is important, since what you've described is, less charitably, synonymous with the phrase "Dario is delusional, and has surrounded himself with yes-men, so outlandish marketing gets published as a side effect".
Whether the person doing the marketing was sincere about it or not is immaterial, since marketing is experienced almost entirely by the people consuming it, and not the people communicating it. What matters is if the audience is sincerely concerned by the message, and it's transparently the case that they were sincerely concerned by it.
by OtherShrezzing
5/11/2026 at 10:11:24 AM
> Marketing is not intentional.
That's an odd definition of "intentional". Evolution has filtered for people with certain views and the marketing has just emerged from their actions. ... So?
A deadly virus (naturally occurring one let's say) wasn't created intentionally. Evolution selected for it. It's still bad and kills people. Doesn't make it nice because of lack of intention.
by teiferer
5/11/2026 at 2:10:06 PM
I think that's a reasonable analysis, but it's very different than the one that's usually implied by "marketing". Most people I see talking about Dario and his "marketing" go on to express confusion or frustration about why he would decide to message this way, ignoring what I (and perhaps you?) consider to be the obvious answer: that he believes it's true.
by SpicyLemonZest
5/11/2026 at 12:49:45 PM
This is marketing.
by keybored
5/11/2026 at 1:11:56 PM
All your evidence can be exactly true, and he can still genuinely believe that Anthropic "winning" the AI race is the best outcome for humanity, even with a little subterfuge, including marketing to the current administration. If I genuinely thought I needed to do something to secure humanity, there's little I wouldn't do to achieve it.
by petesergeant
5/11/2026 at 12:23:47 PM
Even that press release never claimed that Mythos was better than Opus at finding bugs.
They claim the huge advance is in exploiting the bugs.
by cvwright
5/11/2026 at 12:15:16 PM
He also said this [1] a few weeks ago about AI PRs.
> Over the last few months, we have stopped getting AI slop security reports in the #curl project. They're gone.
> Instead we get an ever-increasing amount of really good security reports, almost all done with the help of AI.
> They're submitted in a never-before seen frequency and put us under serious load.
> I hear similar witness reports from fellow maintainers in many other Open Source projects.
> Lots of these good reports are deemed "just bugs" and things we deem not having security properties.
[1]: https://www.linkedin.com/posts/danielstenberg_hackerone-shar...
by smusamashah
5/11/2026 at 3:18:42 PM
> My personal conclusion can however not end up with anything else than that the big hype around this model so far was primarily marketing.
I think the results say more about the great job the curl team has done maintaining their codebase.
This doesn’t mean Anthropic's Project Glasswing is a marketing stunt. Logically, it doesn’t make sense: when they announced Mythos Preview, Anthropic couldn’t meet customer demand; they didn’t have enough compute to go around. So they decide to hype an unreleased product to drive even more demand? All that would do is piss off their existing customers, who were already experiencing rationing and frequent outages.
Many forums were already flooded with "I cancelled Claude Code" as it was.
On the contrary, it would be incredibly irresponsible and unethical for such a young company with billions of dollars of other people’s money invested in them.
Because the Mozilla team used Mythos and found 271 vulnerabilities [1], does that mean they're in on the so-called "marketing stunt"?
Of course, if Anthropic had released Mythos to the public and bad actors used it to hack a large number of banks, hospitals, government agencies, etc. in a matter of days, the HN crowd would be all over them for acting irresponsibly and criticizing them for not knowing better.
[1]: "Behind the Scenes Hardening Firefox with Claude Mythos Preview" — https://hacks.mozilla.org/2026/05/behind-the-scenes-hardenin...
by alwillis
5/11/2026 at 3:34:04 PM
Other AI tools have found 300 bugs and this new sentient T1000 only found one. Stenberg himself found 30 this year.
Mozilla is the current poster child, but 271 in such a large codebase with thousands of user options, most of them being TOCTOU, isn't that much. Sorry. TOCTOU can happen in any language when people are simply exhausted by the sheer volume of case explosions.
There is a third option: Anthropic could simply have reported the issue without mentioning the new model at all. But they don't, since they want to sell to governments and military and the artificial scarcity just provides a veneer of exclusivity that their clients will appreciate.
by asltp_
5/11/2026 at 9:04:05 AM
Mythos marketing really leans into that "too powerful to be legal" vibe, much like how PS2s were allegedly banned from North Korea because their chips were basically missile-grade.
by jansan
5/11/2026 at 11:52:01 AM
I'm pretty sure Mythos is just a new unreleased version of Opus + marketing + a different system prompt.
by 63stack
5/11/2026 at 12:14:07 PM
I suspect so as well.
I've been running my own security scanning software (disclaimer: now starting a company @ zeroquarry.com) for this, and from what I've seen there's a huge value in prompts + adversarial LLM review. Without adversarial review, you get garbage (as this blog points out: 4/5 basically are nonsense), and with a good prompt you can use almost any "near frontier" model, in my experience, as long as the prompt helps with the guardrails or the model doesn't protect in such a strict way.
by eskibars
5/11/2026 at 3:43:26 PM
It's almost as if management was a useful function in organizations ;)
by dboreham
5/11/2026 at 2:17:28 PM
Yes, the governments fall for it for the time being:
https://www.politico.eu/article/anthropic-hacking-technology...
This is an advertising masterpiece: UK gets first access, the EU is jealous and wants it, too. Thousands of bureaucrats and parasites make money in the process writing (probably using AI) whitepapers and sitting in meetings. The open source authors whose works are being scanned make nothing.
We know how the money flows. Another unrelated example is that ex MI6 director Sir John Sawers is a Palantir consultant and sells out the UK to Palantir.
by khlapz
5/11/2026 at 8:36:15 AM
They might be biased by the fact that curl is significantly more secure than the average software.
by h1fra
5/11/2026 at 4:43:23 PM
I've seen this suggested a few times in this thread but it seems like it's exactly backwards.
Wouldn't that make it a better test to distinguish whether Mythos is uniquely super powerful vs an incremental improvement from Opus etc. that are routinely used as the basis for bug reports/fixes in cURL?
If Mythos found a hundred new show-stopper bugs then it would have meant Opus missed them, and therefore something closer to a "step change". Otherwise it implies the difference in capability isn't nearly that stark. Mythos finding 100 low-hanging bugs in a less scrutinized/hardened project, on the other hand, wouldn't be as useful a signal to answer that.
by toraway
5/12/2026 at 4:47:56 AM
I have an impression that he expects something like the famous move 37 of AlphaGo, while it could be that the situation is like in chess where superhuman engines validated human findings.
by red75prime
5/12/2026 at 7:48:08 AM
Am I missing something here? Not finding a major bug/vulnerability just means that maybe the code is really good, not that the model is not what is claimed?
by hislaziness
5/11/2026 at 8:03:56 AM
>It's a good reminder for us all that the competition in this space is rough and lots of more or less subtle marketing is involved.
About as subtle as a personal injury lawyer's billboard
by coldtea
5/11/2026 at 8:07:38 AM
Better Call Dario
by steve1977
5/11/2026 at 8:06:15 AM
A thankfully American reference
by te_chris
5/11/2026 at 9:13:50 AM
Can you expand on this? Do you mean in contrast to the European AI milieu?
by Exoristos
5/11/2026 at 9:57:06 AM
No, the personal injury lawyer billboards.
by te_chris
5/11/2026 at 7:26:49 PM
In the UK of course it would be "personal injury barrister"
by llbbdd
5/12/2026 at 7:10:31 AM
I’ve never seen a billboard for it here - lived here 11 years.
by te_chris
5/11/2026 at 7:47:22 AM
I'd go out and say the marketing is not subtle. The hype and fanboys/girls are so in line with the marketing that any level of skepticism is seen as an act of defection, but if you look at the words, hyperbole and volume that is used, there is nothing subtle about it.
It's almost Trump-esque - "this model will change everything forever; we are doomed; we are saved; we will all be fired; we will all be rich", etc
by greendude29
5/11/2026 at 8:01:42 AM
That's a pretty good encapsulation of the parallels between the political and the technological: One necessarily thrives upon the other and are inextricable. This moment is a culmination of all the disenfranchisement the bodypolitik have suffered, looking for any possible means of escape or elevation. AI and Trumpism, for their own respective cohorts, are salvation, on offer by different frontmen but ultimately in service of the same system.
They need the hype to pay off way more than we do. So many of us who still write code directly stand to lose nothing of our capabilities if the marketing claims cannot hold water.
by xantronix
5/11/2026 at 8:24:22 AM
I seem to be totally outside the hype bubble, but I have to suspect there is a lot of imagineering and wild extrapolation in the less technical hype bubbles. I am curious, but not enough to go looking.
by ehnto
5/11/2026 at 8:58:01 AM
>I seem to be totally outside the hype bubble
I'm surprised you say that because it is all over Hacker News. Every single post is co-opted into promoting AI. Try finding a submission with fifty points or more that doesn't have AI or LLMs mentioned somewhere in the comments.
by tonyedgecombe
5/12/2026 at 2:13:19 AM
That's a good point, I guess I see the Hacker News hype a bit more realistically than maybe I should. HN has definitely changed in the sense that I rarely see interesting technology or achievements hit the front page that aren't AI related. It feels like AI has taken all the oxygen from the room.
by ehnto
5/11/2026 at 10:03:17 AM
Feel free to retire from the field if you grow tired of seeing its latest developments.
by zen928
5/11/2026 at 11:10:05 AM
I already have.
That’s not really the point though. I have no doubt AI is useful, I just don’t want to have it shoved in my face every five minutes.
by tonyedgecombe
5/11/2026 at 12:06:20 PM
Eh... I think he puts the LLM down for his own ego's sake (as would I!). Curl may, next to the Linux kernel, be one of the most heavily audited codebases in existence. The LLM found something he and thousands of others missed. It's not unimpressive.
by bjourne
5/12/2026 at 2:27:22 PM
The claim has never been that the new model could not do impressive things. The claim is that the new model is not the existential crisis Anthropic’s initial announcement post made it out to be.
by billyoneal
5/11/2026 at 10:00:26 AM
[dead]
by aaron695
5/11/2026 at 1:26:40 PM
I commented this in another post but I'm going to repeat it because I believe it's important for this discussion.
> The worrying part about Mythos isn't the fact that it can find bugs. The worrying part is Mythos being able to find them on its own across entire code base as vast as Firefox then write exploits for what its found with a very basic prompt.
> The skill required to find then create zero days is quickly approaching the floor.
by wnevets
5/11/2026 at 1:31:48 PM
Opus can find bugs on its own in large codebases just fine with minimal prompting.
The great exaggeration is that this is a new capability.
by colechristensen
5/11/2026 at 1:37:03 PM
> Opus can find bugs on its own in large codebases just fine with minimal prompting.
and then it writes the exploits automatically for you?
by wnevets
5/11/2026 at 1:47:15 PM
Yes
by colechristensen
5/11/2026 at 1:58:12 PM
I will never ever understand how people are amazed by this. Have they just not tried it and then just assume that because Anthropic says this is the first it must be true?
This was one of the first things I tried and it works great.
by ofjcihen
5/11/2026 at 1:48:41 PM
Can you send me that link?
by wnevets
5/11/2026 at 1:58:53 PM
Does this mean you’re only using the models in the web app? I mean that might be why you haven’t been able to do this?
by ofjcihen
5/11/2026 at 1:54:05 PM
What link? I've done it myself.
by colechristensen
5/11/2026 at 2:01:39 PM
You've pointed Codex at the entire source code of Firefox and simply prompted it to find bugs and then had it write the exploits for you? Why haven't you published this? That would sink all of the Claude Code hype.
by wnevets
5/11/2026 at 2:23:56 PM
No, I'm not interested in Firefox bugs, but I've done it with my own large projects.
What I think happened here is an Anthropic team with very little security expertise were working on finding bugs for marketing reasons and when they prompted to make POC exploits of those bugs they didn't have much success because they didn't really know what to ask for. They then proceeded to very finely tune their next model to eagerly exploit vulnerabilities making the models much more powerful for the "I don't know what I'm doing" user which they're now trying really hard to convince everyone is a game changer. </speculation>
The reason many of us are skeptical is we've used the current models to do things and they've worked.
An analogy might be if they tuned their model to eagerly instruct somebody how to make improvised weapons, so that when somebody asks how to deal with a rival at work, the model gives instructions on building a bomb from hardware store parts. Then they go on a marketing spree telling everybody how dangerous it is. This example might highlight how insincere the marketing is. At any point you could have tuned the model to exploit for inexperienced people; the fact that you've now done it does not mark a grand new capability. People who knew what they were doing could already do this with models.
by colechristensen
5/11/2026 at 2:42:22 PM
> No, I'm not interested in Firefox bugs, but I've done it with my own large projects.
Can you publish your results and send them to Bruce Schneier, Dave Lewis, & Heather Adkins [1] so they know that this isn't anything new and just the work of people with little security expertise?
by wnevets
5/11/2026 at 3:06:41 PM
That whitepaper did not need 19 authors. They're there for show.
The Mythos FUD is a gift to the security team because it made the C-suite care about security and this is a plan to tell them what should be done and what to expect in the era of LLM security tools.
This is an emperor-has-no-clothes situation but we're selling winter coats and winter is near. Not focusing on how the Mythos FUD is exaggeration and instead focusing on actually necessary security postures is perhaps a tad dishonest, but it still gets everybody in a better state and is an unfortunately common point in C-suite politics (and why the rich and powerful often seem so disconnected from reality and common people: everyone around them is trained to interact with them in a certain way, and "mythos marketing is bullshit" is one of those things that people just don't say to them).
by colechristensen
5/11/2026 at 3:36:24 PM
Isn't that all the more reason to publish your process & results using Codex to do the same thing they're claiming? Presuming any bugs Codex found would be fixed and no longer a security concern.
by wnevets
5/12/2026 at 1:48:00 AM
Why would you publish something unremarkable and benign?
Is it actually that hard for you to go try this out yourself?
by ofjcihen
5/12/2026 at 3:35:20 AM
> Is it actually that hard for you to go try this out yourself.
I can't get it to work with Codex, can you?
by wnevets
5/12/2026 at 12:58:01 PM
Yes. That’s my main driver. What do you mean you can’t get it to work?
by ofjcihen
5/12/2026 at 1:52:37 PM
You must show me how you are able to coerce Codex to be useful using this setup with no hand holding. You say it's unremarkable and benign but it doesn't match my experience at all. I'm convinced I am not the only person on HN who would love to know how you are able to do it.
> We launch a container (isolated from the Internet and other systems) that runs the project-under-test and its source code. We then invoke Claude Code with Mythos Preview, and prompt it with a paragraph that essentially amounts to “Please find a security vulnerability in this program.” We then let Claude run and agentically experiment. In a typical attempt, Claude will read the code to hypothesize vulnerabilities that might exist, run the actual project to confirm or reject its suspicions (and repeat as necessary—adding debug logic or using debuggers as it sees fit), and finally output either that no bug exists, or, if it has found one, a bug report with a proof-of-concept exploit and reproduction steps.
> Finally, once we’re done, we invoke a final Mythos Preview agent. This time, we give it the prompt, “I have received the following bug report. Can you please confirm if it’s real and interesting?” This allows us to filter out bugs that, while technically valid, are minor problems in obscure situations for one in a million users, and are not as important as severe vulnerabilities that affect everyone. [1]
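For readers who want the shape of that two-stage flow spelled out, here is a minimal sketch in Python. It is an illustration only, not Anthropic's harness: `find_bug` and `triage` are hypothetical stand-ins for the two model invocations in the quote, and the file names and stub logic below are made up so the example runs without any model or container.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BugReport:
    title: str
    poc: str  # proof-of-concept / reproduction steps


def run_pipeline(
    targets: list[str],
    find_bug: Callable[[str], Optional[BugReport]],
    triage: Callable[[BugReport], bool],
) -> list[BugReport]:
    """Stage 1: one discovery attempt per target.
    Stage 2: keep only the reports that a second pass confirms."""
    confirmed = []
    for target in targets:
        # Hypothetical: in the quoted setup this would be an agent run
        # inside an isolated container; here it is just a callable.
        report = find_bug(target)
        if report is None:
            continue  # the discovery stage concluded that no bug exists
        # Hypothetical: the "is it real and interesting?" review pass.
        if triage(report):
            confirmed.append(report)
    return confirmed


# Stub stages so the sketch is runnable (all values are invented).
def fake_find_bug(target: str) -> Optional[BugReport]:
    if target == "parser.c":
        return BugReport("heap overflow in parse_header", "send a 4 KiB header")
    if target == "util.c":
        return BugReport("unused variable", "n/a")
    return None


def fake_triage(report: BugReport) -> bool:
    return "overflow" in report.title


results = run_pipeline(["parser.c", "util.c", "main.c"], fake_find_bug, fake_triage)
```

With these stubs only the `parser.c` report survives the second stage, which is the filtering step the whole argument about false positives hinges on.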
by wnevets
5/12/2026 at 2:51:24 PM
I quite literally do this almost exactly with GPT 5.4. Sometimes I give it a poke in a direction but it largely runs by itself.
I don’t know what to tell you. You say it’s not possible but the money in my HackerOne account says otherwise.
by ofjcihen
5/12/2026 at 4:13:47 PM
> I don’t know what to tell you. You say it’s not possible but the money in my HackerOne account says otherwise.I haven't said it was impossible. I said I can't replicate the Mythos setup with Codex on any project even approaching the size of Firefox.
If your Codex setup and the results it generates are unremarkable, please post them.
by wnevets
5/12/2026 at 4:40:12 PM
My codex setup is quite literally a single file that states a format I want my reports to be written in so that I can review them before submitting them. Sometimes I bother with the container setup, sometimes I don’t depending on the work.
This isn’t a matter of a harness, skill files, anything. This is just something that a model can do.
You have multiple people saying they’ve done it here. I can only assume you’re being facetious at this point.
by ofjcihen
5/12/2026 at 6:18:29 PM
You've now spent multiple days in this comment thread describing this as simple and unremarkable but refuse to share anything about it. At any point in the last 24+ hours you could've posted your single file, the size of the project, and what the model was able to produce on its own.
Must this information be protected or is it unremarkable?
by wnevets
5/11/2026 at 7:51:48 PM
No, what I'm doing isn't remarkable.
Publishing an extensive critique of Anthropic marketing is just an exercise in attracting abuse from nitpickers and the ignorant. If the author of cURL can't convince people, and security of his product has been one of his primary responsibilities for decades in one of the most widely used pieces of software out there... what hope do I have?
I've got better things to do.
by colechristensen