4/8/2026 at 4:33:15 AM
I’m sure the new model is a step above the old one but I can’t be the only person who’s getting tired of hearing about how every new iteration is going to spell doom/be a paradigm shift/change the entire tech industry etc. I would honestly go so far as to say the overhype is detrimental to actual measured adoption.
by ofjcihen
4/8/2026 at 4:50:43 AM
There is plenty of overhyping, no one denies that. But the antidote is not to dismiss everything. Ignore the words and look at the data. In this case, I see a pretty strong case that this will significantly change computer security. They provide plenty of evidence that the models can create exploits autonomously, meaning that the cost of finding valuable security breaches will plummet once they're widely available.
by qnleigh
4/8/2026 at 6:45:53 AM
You seem to see a "pretty strong case" from a bombastic press release. Don't get me wrong, I do know the reality has changed. Even Greg K-H, the Linux stable maintainer, did recently note[1] that it's not funny any more:
"Months ago, we were getting what we called 'AI slop,' AI-generated security reports that were obviously wrong or low quality," he said. "It was kind of funny. It didn't really worry us."
... "Something happened a month ago, and the world switched. Now we have real reports." It's not just Linux, he continued. "All open source projects have real reports that are made with AI, but they're good, and they're real." Security teams across major open source projects talk informally and frequently, he noted, and everyone is seeing the same shift. "All open source security teams are hitting this right now."
---
I agree that an antidote to the obnoxious hype is to pay attention to the actual capabilities and data. But let's not get too carried away.
[1] https://www.theregister.com/2026/03/26/greg_kroahhartman_ai_...
by kashyapc
4/8/2026 at 5:15:32 PM
Hadn’t been to a Kubecon in about a year as I’ve been tending to go to just the European ones. I definitely felt a much stronger “this is real” vibe at this event from people like Greg KH.
by ghaff
4/8/2026 at 5:55:55 AM
Is there any actual independent data though, or verification of any of these claims? As it stands this is just a marketing programme for all involved.
by 4ndrewl
4/8/2026 at 6:07:37 AM
Ffmpeg confirmed on Twitter that they sent the patches.
by H8crilA
4/8/2026 at 7:26:54 AM
Although, they also said, "Because the patches appear to be written by humans".
by cubix
4/8/2026 at 8:47:39 AM
"Mythos writes code like a human" incomingby WithinReason
4/8/2026 at 11:20:02 AM
The patches could have been written by humans; it doesn't matter that much. Or written by a clanker and polished by engineers. The difficult part is usually not in writing the patches that fix such vulnerabilities, but in finding the vulnerabilities. And these days it's even harder to exploit them, since you need to bypass modern hardening features.
by H8crilA
4/8/2026 at 6:34:23 AM
What would be the product they're marketing by this campaign?
by kachnuv_ocasek
4/8/2026 at 6:54:01 AM
You don't market products, you market lifestyles/interests. Sell the sizzle, not the steak, etc. For Anthropic it's "we own the big scary models, the AI security space, but it's ok we're responsible"
For the partners it's "we're the Big Boys here and will look after your enterprise needs"
None of it needs any more than anecdata and some nice, pre-approved, quotes.
Every organisation does it.
by 4ndrewl
4/8/2026 at 6:42:43 AM
The product they launched?
by ozozozd
4/8/2026 at 4:16:29 PM
This product is explicitly not being released for use
by mholm
4/9/2026 at 1:05:28 PM
The product is being provided to some of the most influential companies. That can definitely serve to Anthropic's advantage. (Regardless, I suspect the hype is real.)
by prawn
4/8/2026 at 5:38:22 PM
Just because _we_ don't have access does not mean Anthropic's not getting paid.
by 0123456789ABCDE
4/10/2026 at 1:47:30 AM
Imagine you were making purchasing decisions about which LLM-based coding tool to use. If one of the possible vendors convinces you that they have a next gen model that is so powerful it found 20+ year old bugs in a hardened operating system, that would undoubtedly have an influence on your decision even if you are only buying the current model.
by timv
4/8/2026 at 8:20:17 PM
[dead]
by danudey
4/8/2026 at 8:56:32 AM
That's pretty disingenuous, bordering on ridiculous. Do they have a record of lying to you? No.
Go read the system card. It's a lot more tame than you think; people are taking pieces out of this and hyping it. Doesn't mean it's not valid.
by KoolKat23
4/8/2026 at 4:53:16 AM
Which sounds like a great thing. Fewer undiscovered security vulnerabilities.
by killingtime74
4/8/2026 at 5:46:19 AM
The only people panicking are probably those state level actors who were using these for their own benefit.
by harikb
4/8/2026 at 5:02:54 AM
With the right prompting (mostly creating a narrative that justifies the subject matter as okay to perform) other models have already been doing this for me though. That’s another confusing bit for me about how this is portrayed and I refuse to believe I’m a revolutionary user right? I mean I’m sitting on $10k worth of bug payouts right now partially because that was already a thing.
by ofjcihen
4/8/2026 at 5:25:50 AM
> Non-experts can also leverage Mythos Preview to find and exploit sophisticated vulnerabilities. Engineers at Anthropic with no formal security training have asked Mythos Preview to find remote code execution vulnerabilities overnight, and woken up the following morning to a complete, working exploit. In other cases, we’ve had researchers develop scaffolds that allow Mythos Preview to turn vulnerabilities into exploits without any human intervention.
by dota_fanatic
4/8/2026 at 5:47:26 AM
I mean yeah. I’ve had these successes without scaffolding or really anything past Claude CLI and a small prompt as well?
by ofjcihen
4/8/2026 at 6:32:15 AM
Just saw your edit. I'll leave it at this: this is why it's news to me, because by their very own measurements, Opus simply doesn't come close. I trust their empirical evidence over your hearsay. But feel free to prove me wrong with evidence.
> With one run on each of roughly 7000 entry points into these repositories, Sonnet 4.6 and Opus 4.6 reached tier 1 in between 150 and 175 cases, and tier 2 about 100 times, but each achieved only a single crash at tier 3. In contrast, Mythos Preview achieved 595 crashes at tiers 1 and 2, added a handful of crashes at tiers 3 and 4, and achieved full control flow hijack on ten separate, fully patched targets (tier 5).
by dota_fanatic
4/8/2026 at 6:17:06 AM
You've taken control of a remote server running OpenBSD? Or a similarly expert-level exploit? Can you share one of the bounties you've received that is of the magnitude they're talking about?
Edit: Wait, you wrote "As someone in cybersecurity for 10+ years" elsewhere in this thread. You wrote "a small prompt" using e.g. Opus 4.6 and it found critical vulnerabilities of the magnitude they're describing, presumably without your prompt having anything beyond what a non-expert could write? I feel like you might want to tell Anthropic since clearly they're not comfortable with that level of power being publicly available.
by dota_fanatic
4/8/2026 at 6:30:13 AM
I mean, yes? And my point is that this isn’t exactly a new capability. Sure it’s probably better but we’ve been able to do this. They didn’t just suddenly “turn on the security”. LLMs have excelled at code since they were widely released. I have no idea why that’s news, and the fact that they’re treating it as such makes it seem like hype.
by ofjcihen
4/8/2026 at 7:55:25 AM
[dead]
by heyethan
4/8/2026 at 6:31:39 AM
> how every new iteration is going to spell doom/be a paradigm shift/change the entire tech industry etc.
It's much like the dynamic between parents and a child. The child, with limited hindsight, almost zero insight and no ability to forecast, is annoyed by their parents. Nothing bad ever happens! Why won't parents stop being so worried all the time and make a fuss over nothing?
The parents, as the child somewhat starts to realize but not fully, have no clue what they are doing. There is a lot they don't know and are going to be wrong about, because it's all new to them. But what they do have is a visceral idea of how bad things could be, and that's something they have to talk to their child about too.
In the eyes of the parents, the child is some % dead all the time. Assigning the wrong % makes you look like an idiot, and so does being unable to handle any % at all. In the eyes of the child, actions leading to death are not even a concept. Hitting the right balance is probably hard, but not for the reasons the child thinks.
by jstummbillig
4/8/2026 at 7:39:14 AM
Disagree - we’re being told on one hand that we are 6 months away from AI writing all code, and 3 months into that the tools are unusable for complex engineering [0]. Every time I mention this I’m told “but have you tried the latest model and this particular tool” - yes I have, but if I need to be on the hottest new model for it to be functional, that means the last time you claimed it was solved, it wasn’t solved.
[0] https://www.entrepreneur.com/business-news/ai-ceo-says-softw...
by maccard
4/8/2026 at 8:51:24 AM
> Every time I mention this
I feel like there’s a bunch of factors for why it will never be the same for many folks, from the models and harnesses, to the domains and existing tests/tooling.
I feel bad for the people for whom it doesn’t work, but Claude Opus has written most of my code in 2026 so far. I had to build some tools around linting entire projects, and most of my tokens are probably referencing existing stuff and parallel review iterations and tests, but it’s pretty nice, and even seeing legacy code doesn’t make me want to move to a farm and grow potatoes.
It might be counterproductive to be like "Oh, just do X!" which works for the person suggesting it, and then have to do "But have you tried Y?" when it doesn't for the other person, if it just keeps being a never-ending string of what works for one person not working for another.
by KronisLV
4/8/2026 at 10:38:08 AM
> I feel like there’s a bunch of factors for why it will never be the same for many folks
Yeah, and the problem arises simply because some people are unable to accept the fact. They insist that if LLM-assisted coding doesn't work for one, it's because “you're holding it wrong”.
by laserlight
4/8/2026 at 9:50:00 AM
> I feel like there’s a bunch of factors for why it will never be the same for many folks, from the models and harnesses, to the domains and existing tests/tooling.
If the argument is “you have to use the right model, harness, test and tooling for it to work” then it’s not replacing software engineers any time soon.
The other thing is - where are all the web apps, mobile apps, games, desktop apps, from these 100x productivity multipliers? We’re 1-2 years into these tools being widely mainstream and available, and I’m not seeing applications that took years to ship before appear at 100x the rate, or games being shipped by tiny teams, or new ideas for mobile apps coming out at 100x the rate. What we do see is vibe-coded slop, stability issues with massive companies (Windows, AWS for example), and mass layoffs back to pre-covid levels blamed on AI, but everyone knows it’s a regression to the mean after massive over-hiring when money was cheap.
It’s like the emperor has no clothes on this topic to me.
by maccard
4/8/2026 at 3:03:55 PM
I’m an indie developer and I see the explosion in apps in my niche (creative tools for photography/videography). They wouldn’t have taken years to ship before, but easily a couple months.
Now the moment any app with any value gets popular, the App Store gets flooded with quick vibe coded copycat clones (very recognizable AI generated icon included).
The quality is low, but the impact this flood has on the market is real.
by gyomu
4/8/2026 at 11:07:09 AM
I wouldn't paint the picture in such dark terms. LLMs can be good at finding bugs and potential issues. And if you like, they can be like IntelliSense on steroids. Even agentic workflows can be good, e.g. for an initial assessment of a new large codebase. And potentially millions of other small tasks, like writing one-off helper scripts etc.
by dvfjsdhgfv
4/8/2026 at 12:45:04 PM
So which apps are seeing 10x the bug fixes and improvements in stability and quality? From my side, I see one-shot CRUD apps, and platforms like AWS and Windows actively deteriorating, to the point of causing massive outages and needing to have development processes changed [0]. Who is actually shipping 10x more stuff, or fixing 10x more bugs?
[0] https://arstechnica.com/ai/2026/03/after-outages-amazon-to-m...
by maccard
4/9/2026 at 3:07:11 AM
I "pair" with claude-code and still write 30% by hand, with additional review with gpt-5.4, but I definitely write fewer bugs than before. I'd estimate my speedup to be 2x.by MaybiusStrip
4/8/2026 at 2:00:29 PM
The automation bias issue is something that has been raised by many people like myself but mostly ignored. The better models get, the worse that problem will get, but IMHO the implications of the claims are not on the code generation side. The sandwich story in the model card is the bigger issue.
LLMs have always been good at finding a needle in a haystack, if not a specific needle; it sounds like they are claiming a dramatic increase in that ability.
This will dramatically change how we write and deliver software, which has traditionally been based on the idea of well-behaved, non-malfeasant software with a fix-as-you-go security model.
While I personally find value in the tools as tools, they specifically find a needle and fundamentally cannot find all of the needles that are relevant.
We will either have to move to some form of zero trust model or dramatically reduce connectivity and move to much stronger forms of isolation.
As someone who was trying to document and share a way of improving container isolation that was compatible with current practices, I think I need to readdress that.
VMs are probably a minimum requirement for my use case now, and if verified this new model will dramatically impact developer productivity due to increased constraints.
Due to competing use cases and design choice constraints, none of the namespace based solutions will be safe if even trusted partners start to use this model.
How this lands in the long run is unclear; perhaps we only allow smaller models with less impact on velocity and with less essential complexity, etc…
But the ITS model of sockets, etc., will probably be dead for production instances.
I hope this is marketing or aspirational to be honest. It isn’t AGI but will still be disruptive if even close to reality.
by nyrikki
4/9/2026 at 8:16:38 PM
It depends on the use. I'm not fixated on "productivity" measured by LoC but on code quality. So when using LLMs to challenge my code I'm less productive, but the quality of my code increases.
by dvfjsdhgfv
4/8/2026 at 1:12:01 PM
It actually seems like people are shipping 10x more bugs, not fixing 10x more bugs.
by camdenreslink
4/8/2026 at 11:46:54 AM
Where are all the apps? It's mostly visible in AI tooling itself. Harnesses, vibe coding tools and stuff with "claw" in the name saw a Cambrian explosion.
And maybe using AI to use AI better is just masturbatory. But coders want interesting problems to solve. Pros also need software ideas they can monetize. And what problem is attracting more investment in money, time and neurons than the problem of making AI productive? (I am referring only to problems that can be solved in software....)
So the thing with AI is that right now it is both a tool AND a potentially very valuable problem to solve; that's why most of the AI "productivity" gains go into AI itself. At some point this self-referential phase will have to end, and people are going to see if these new AI tools, harnesses and claw-things are actually applicable to things people are willing to pay the real prices for (not the subsidized ones).
by fpaf
4/8/2026 at 11:33:17 AM
Wasn’t there a news story about App Store reviews being delayed because of an increase in app influx?
by muggesmuds
4/8/2026 at 2:02:31 PM
That doesn’t tell us much about the subjective quality of the apps in said influx.
by RugnirViking
4/9/2026 at 4:25:03 PM
And thus the goalpost was shifted. The first question was "where are all the AI coded apps?" And once this was answered, the subject immediately switched to quality.
by buzzin__
4/9/2026 at 7:22:50 PM
No? The post they responded to said:
> I’m not seeing applications that took years to ship before
> What we do see is vibe coded slop
by RugnirViking
4/8/2026 at 12:53:31 PM
I absolutely feel like there has been an explosion of software since the release of AI tools. This is a subjective assessment anyway…
My company for example has gotten 500% better at creating productivity tools.
by JambalayaJimbo
4/8/2026 at 4:35:49 PM
Even co-pilot writes most of my code in April 2026. Further, I don't trust code anymore that hasn't been reviewed 3x or more by co-pilot.
If you have asked me 6 months ago I wouldn't have expected this change so soon.
by je42
4/8/2026 at 11:03:08 AM
> I had to build some tools around linting entire projects
OK, everybody is doing that. And everybody is doing their best at making LLMs more reliable when working on non-trivial tasks. Yet, it looks like nobody has come up with a universal solution yet. This is particularly true for non-trivial projects.
by dvfjsdhgfv
4/8/2026 at 2:36:34 PM
It’s because the model’s response is conditioned on the prompt. They are as intelligent as the person using them.
In some sense it’s a lot like a Google search. There’s this big box of knowledge and you are choosing tokens to pluck out of it. The quality of the tokens depends on how intelligent you are.
by mlsu
4/8/2026 at 3:19:03 PM
Don’t forget, it also depends on the complexity of the work and the experiences of the operator.The less complex the work and the less experienced the operator means more perceived “wow” factor :)
There’s definitely an aspect of how you use it though. In my work it’s mostly been chaining to reduce non-determinism.
by ofjcihen
4/8/2026 at 3:34:10 PM
The irony here is that even if one is extracting legitimate value from LLMs because they are that much smarter than their peers, the process of using LLMs to perform all of their skilled labor makes them less intelligent.
by GoatInGrey
4/8/2026 at 9:42:28 AM
Check out from this onwards and the following point. You get a nice summary on top right. Mind that Anthropic alone is doing 30B/y annualized already.
Take a snapshot and check again in a few months. It's not perfect but it's much more falsifiable than a lot of the noise.
by ndr
4/8/2026 at 10:35:38 AM
> Mind that Anthropic alone is doing 30B/Y annualised already
How many crypto exchanges were pulling in hundreds of millions in funding and doing billions in trades in 2021/2022?
That blog post is… really something, I’ll give you that. I’m not entirely sure what else to say about it other than that.
by maccard
4/8/2026 at 12:55:40 PM
Trade volume and buying API credits are very dissimilar ways of measuring value. One can be wash traded into oblivion, the other is burning a hole in corporate accounts.by piyh
4/8/2026 at 8:50:05 AM
> “I think… I don’t know… we might be six to twelve months away from when the model is doing most, maybe all of what SWEs (software engineers) do end to end.”
I think it's disingenuous (as disingenuous as you're accusing these marketing teams of being) to paraphrase that as "being told on one hand that we are 6 months away from AI writing all Code". It's merely stating that it's a real possibility. (It's also disingenuous to use a post complaining about a behavioral regression bug as evidence that it's not progressing)
Dismissing it as impossible is silly, considering how close it already is to a junior dev. Keep in mind that 14 months prior to that statement was before we even had any public reasoning models. Things really are moving that fast, it's just, at the moment, unclear how fast.
by LordDragonfang
4/8/2026 at 9:41:49 AM
We’ve been suggesting that programmers are going to be replaced by simpler programming languages, GUI programming tools, no-code tools, low-code tools, and now AI. The real big step was when Claude Code came out and introduced the agentic loop where it could self-validate against tests/linters/tooling, but everything after that has been penned as miraculous when IME it’s a new iteration of the same thing - wild hallucinations, getting stuck in deep loops, ignoring explicit instructions and guard rails, wild tangents, and just generating stuff that doesn’t work or solve the problem.
> I think it's disingenuous (as disingenuous as you're accusing these marketing teams of being) to paraphrase that as "being told on one hand that we are 6 months away from AI writing all Code". It's merely stating that it's a real possibility
No - you don’t get to make wild predictions and say “oh I didn’t actually mean that, look how successful we are though”. These teams aren’t saying “hey we think we’re going to majorly influence programming in 6-12 months”, they’re saying “we’re going to replace programmers”. If you can’t stand over your claims, don’t make them. _That’s_ disingenuous.
by maccard
4/8/2026 at 7:03:42 PM
> We’ve been suggesting that programmers are going to be replaced by simpler programming languages, gui programming tools, no code tools, low code tools, and now AI.
The difference is that it's actually working this time. Non-programmers are writing full apps. Sure, they're simple ones, often just CRUD and UI, but it actually is changing things in a way it never has before. You can't assert something is the same as everything previous when there's already evidence that it's different.
> No - you don’t get to make wild predictions and say “oh I didn’t actually mean that, look how succesful we are though”.
Except that's not what's happening here. I'm criticizing you for misrepresenting what claim was made in the first place. Nowhere in your evidence have you shown anyone "walking the claim back". If anything, TFA is claiming evidence of an LLM doing "most" of what SWEs do "end to end" three months ahead of schedule.
If you want to present evidence Dario (or another CEO -- I'm sure Sama has made much more fantastic claims that you could falsify) made claims that didn't pan out, be my guest, but don't tell falsehoods about the evidence you are presenting.
(And no, I'm not counting breathless tech reporters -- everyone knows how much to trust them when they report a cure for cancer -- they'll say everything is a miracle cure. But the fact that hundreds of "miracle weight loss cures" that never panned out made the news in the past several centuries didn't make GLP1s fake just because they had the same type of hype.)
by LordDragonfang
4/8/2026 at 8:29:01 PM
> The difference is that it's actually working this time. Non-programmers are writing full apps
You can say this about every step along the way. C programmers replaced assembly programmers. Python programmers replaced C programmers. Low-code tools replaced internal tools teams.
> I'm criticizing you for misrepresenting what claim was made in the first place. No where in your evidence have you shown anyone "walking the claim back".
The claim is that SWEs will have their work done by models in 6-12 months. We are _nowhere near_ that, 9 months on. That's all there is to say about it.
> If anything, TFA is claiming evidence of an LLM doing "most" of what SWEs do "end to end" three months ahead of schedule.
TFA is based on a model that is so good that it has to be kept from us? From the company that literally can't keep their app up? From the company that shipped an update that didn't launch?
> be my guess, but don't tell falsehoods about the evidence you are presenting.
I mean, I literally posted a quote from the CEO of one of the two major companies saying that SWEs are 6-12 months away from being replaced. This is fantasy talk from a guy who is incentivised to have you believe this. If the claims are that software is changing, and how we're building/deploying software is adapting to that new world, then yeah, that's fair enough. But the current models, harnesses and tooling are not replacing an SWE unless there's a paradigm shift in the next 3 months. And my point is that we appear to be going backwards, not forwards.
> didn't make GLP1s fake just because they had the same type of hype.
No, GLP1s work and that's the difference.
by maccard
4/9/2026 at 8:30:33 PM
> I mean, I literally posted a quote from the CEO of one of the two major companies saying that SWEs are 6-12 months away from being replaced.
Even ignoring the other ways you're misrepresenting the quote, there's a huge difference between "might be" and "are going to be".
I'm sorry if English isn't your first language, but we're going to have to agree on basic grammar or else it's not going to be productive for me to continue responding to the flaws in your argument.
by LordDragonfang
4/8/2026 at 6:36:53 AM
That feels like a very complex way of looking at it. Another way would be to say “potentially profit seeking companies have an incentive to oversell products even if they’re good”.
by ofjcihen
4/8/2026 at 8:46:31 AM
Is Anthropic lying about model capabilities? If not, where is the overselling?
by WithinReason
4/8/2026 at 3:51:10 PM
In March 2025, Anthropic was claiming that 90% of code would be written by LLMs in three to six months, and "essentially all" code within twelve months. This was one week after closing a Series E round for $3.5 billion, as they began working on their Series F round for $13 billion. You shouldn't need more than that to understand what's going on here.
The Claude Code leak revealed that Anthropic runs Claude-operated bots on the internet. One should be very cautious about getting swept up in the fund-raising process if they are not seeing first-hand the fruition of all of the flattering claims being presented by strangers on the internet.
by GoatInGrey
4/8/2026 at 6:23:42 PM
> March 2025, Anthropic was claiming that 90% of code would be written by LLMs in three to six months, and "essentially all" code within twelve months.
There's a pretty big difference between "We predict in X time frame our model will be capable of Y" and "Our model did Y."
This is like watching someone measure the size of an object and saying "I don't believe you because you guessed it was X before you pulled out your tape measure."
by supern0va
4/8/2026 at 5:46:02 PM
You're talking about marketing predictions and I'm talking about data presented in a whitepaper. They are not the same thing.by WithinReason
4/8/2026 at 6:43:03 AM
[flagged]
by ACCount37
4/8/2026 at 6:50:11 AM
Homie chill. I use Opus every day and I love it. I’m not saying it’s all hype, just that these companies are here to make money and that every advertisement should be taken with salt, yeah?
Also maybe consider what this kind of visceral reaction indicates on a personal level :/
by ofjcihen
4/8/2026 at 6:56:19 AM
[flagged]
by ACCount37
4/8/2026 at 7:09:05 AM
I mean if it helps, I support the move to not release Mythos right off the bat, yeah? That makes sense: treat new models like new vulnerabilities and give companies time to scan with them etc.
But you have to admit it does serve a savvy business purpose of creating a moat where one wasn’t by getting these tech companies on board, and the threat does make for good marketing, yeah?
by ofjcihen
4/8/2026 at 7:21:10 AM
[flagged]
by ACCount37
4/8/2026 at 7:48:58 AM
"There is NO shadow wizard marketing gang pulling the strings on every single thing those companies do"Yes there is they are called marketing departments and it is their exact purpose.
by UltraSane
4/8/2026 at 7:40:28 AM
Normalcy bias is human nature. If human nature bothers you then you're going to be annoyed all the time.
by sandspar
4/8/2026 at 7:30:10 AM
We have been hearing this since GPT-2. They’ve been crying wolf for too long, that’s on them (the model providers). That and the fact they never publish anything interesting around their claims. It’s the ffmpeg thing all over again (very old bug in a decoder for a format used by one game from the early 90’s sold as some major breakthrough).by techpression
4/8/2026 at 8:15:12 AM
The parents in this case are profiteering corporations on a mission to exploit the child for everything they can get away with, almost by definition.
It's a slightly different dynamic.
by avaer
4/8/2026 at 9:12:58 AM
I feel like you’re muddying 2 different arguments here. Or rather, 2 different positions.
You’re asserting that people who are tired of this line being wheeled out hold a position analogous to “what’s the big deal, nothing bad happens, just relax”. In reality, that’s only 1 position. The other position is “I understand fully the consequences, but the relentless doomer language is tiring in the face of continuing-to-not-eventuate”.
by FridgeSeal
4/9/2026 at 3:10:40 AM
What do you think of people that say that about climate change? It seems you don't understand fully. This is not the time to get tired, right before this actually starts impacting jobs and people in other ways.
by MaybiusStrip
4/8/2026 at 7:03:59 AM
It’s more like the abusive parents telling the child that they’ll sell him to the scary man at the bus stop every time they want to coerce the child into doing what they want.Eventually the child develops disrespect for authority.
by kubb
4/8/2026 at 6:51:28 AM
This is just a really bad analogy. It doesn't address that there are multiple sources, the incentives to be telling us about it, and the spectrum between disaster-mitigation heroes and snake-oil salesmen.
by athrowaway3z
4/8/2026 at 7:54:19 AM
Did you compare AI companies to parents and engineers actually delivering value to toddlers? AI companies cannot, in any capacity, be regarded as caretakers.
by materialpoint
4/8/2026 at 12:21:18 PM
Sure, if the parent's stock price soared when the child died.
by haritha-j
4/8/2026 at 11:35:29 AM
Don’t take it personally, but this amount of fear and paranoia about death on every corner sounds like a mental illness to me. Generalised Anxiety Disorder, to be precise. Maybe I am just not a parent.
In any case, there are substances and reliable methods that fix whatever paralyzing existential dread anyone struggles with daily.
Probably best to use the conventional route, but I personally use special low-THC, high-CBG weed once a week with a medical grade vaporizer, and once a year (early autumn) a moderate dose of golden teacher mushrooms. Although I understand that most people perhaps couldn’t, due to not managing their own business but being on a strict employment contract with urine tests.
by juleiie
4/8/2026 at 12:49:54 PM
Are you suggesting these researchers somehow have wisdom and aren’t just guessing, and that everyone else is a child too naive to understand the technology? It certainly sounds that way from the description you are attempting to apply.
This is two parents disagreeing on whether their child will automatically grow up to be a psychopath, with one parent constantly remarking “if you teach that child how to cut bread, they will stab everyone later. If you teach that child to drive, they will run over everyone later” - not the “parents know better” situation you describe.
by therealpygon
4/8/2026 at 1:51:54 PM
An analogy that’s, quite literally, an appeal to paternalism to trust the motivations and pernicious incentive structures of the big AI labs.by toraway
4/8/2026 at 7:40:20 AM
I'll have some of what you're having
by shafyy
4/8/2026 at 8:35:05 AM
This is literally one of the most infantilizing and simultaneously insulting analogies I've ever come across on this site. Do you really think consumers of the latest AI tools have no ability to forecast? The parents in this analogy have every incentive to lie.
by bottom999mottob
4/8/2026 at 7:53:06 AM
[dead]
by lpln3452
4/8/2026 at 6:18:04 AM
There are step changes that actually merit this, though. And a zero-day machine IS one of those. It went from a 4% zero-day success rate to 85% on Firefox.
Can you not see the significance of that?
by nbardy
4/8/2026 at 6:21:22 AM
I mean, I work in this world and overhype is constant.
Additionally, those numbers are somewhat meaningless without more context.
by ofjcihen
4/8/2026 at 6:38:36 PM
Can you explain why they are meaningless without more context?
by jstummbillig
4/8/2026 at 8:09:10 PM
A 0-day is just a vulnerability that wasn’t known before now.
What’s the criticality of these? Are they realistically exploitable? En masse? Through a complex and highly contextual set of actions? What’s the impact? Etc., etc.
Yes, those numbers are a big change, but they’re also not spelling doom for us in the security world until we actually know what they mean.
The demonstrated ones that they have on the red team blog are neat; the kernel chain is impressive and fun. But nothing I’m seeing here is as world-ending as the presser implies.
by ofjcihen
4/9/2026 at 7:24:21 AM
> The demonstrated ones that they have on the red team blog are neat, the kernel chain is impressive and fun
So by your estimation, rogue actors being able to uncover hundreds of vulnerabilities of this class in each major software product, roughly for free, would not be a big issue?
by jstummbillig
4/9/2026 at 1:26:21 PM
We must have read two different red team blogs from Anthropic if that’s what you think is happening. But let’s go ahead and take what you’re asking at face value.
It would not be a doomsday issue as implied, no. Org security has gone far beyond static detections and “just exclude some IPs that fail to log in too much and we’re good”. SOAR exists. Behavioral analysis and monitoring exist. Layered defenses exist.
Believe it or not, those of us doing security at large, highly targeted companies have been dealing with the potential for multiple chained 0-days for years, and the processes, monitoring, and (yes, automated) response architecture are already there.
I get that this is absolutely frightening for some, and that causes panic, but for us this is Tuesday.
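To make the contrast concrete: a toy sketch of the difference between the static rule dismissed above ("exclude IPs that fail to log in too much") and the kind of per-entity behavioral baselining layered SOC tooling builds. This is an illustration only, not any vendor's product; the class and parameter names are invented for the example.

```python
# Toy behavioral detector: flag a source only when its failed-login count
# deviates sharply from that source's OWN rolling history, instead of
# comparing every source against one global threshold.

from collections import defaultdict, deque
from statistics import mean, stdev


class BaselineDetector:
    """Rolling per-source baseline of failed-login counts."""

    def __init__(self, window=24, min_history=6, k=3.0):
        self.min_history = min_history        # intervals needed before alerting
        self.k = k                            # how many deviations count as "sharp"
        self.history = defaultdict(lambda: deque(maxlen=window))

    def observe(self, source, failed_logins):
        """Record one interval's count; return True if it's anomalous."""
        hist = self.history[source]
        alert = False
        if len(hist) >= self.min_history:
            mu, sigma = mean(hist), stdev(hist)
            # floor sigma at 1.0 so a perfectly flat history still tolerates noise
            alert = failed_logins > mu + self.k * max(sigma, 1.0)
        hist.append(failed_logins)
        return alert


det = BaselineDetector()
# A noisy service account that always fails ~20 logins per interval never
# trips its own baseline (a global "20 failures = block" rule would).
for _ in range(10):
    assert not det.observe("svc-batch", 20)
# A normally quiet host suddenly failing 50 logins does trip it.
for _ in range(10):
    det.observe("laptop-7", 1)
assert det.observe("laptop-7", 50)
```

The point of the sketch is only that "too many failures" is relative to each entity's history, which is why naive IP-exclusion rules are a poor mental model for how modern detection stacks respond to automated attacks.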
by ofjcihen
4/8/2026 at 7:56:53 AM
I side with you, but on the other hand: this is how it works to get attention from those who aren't affiliated with computer science and AI.
I am totally annoyed as well and put any buzzwords in my personal BS filter. Java was revolutionary, the Apple I, etc. ;)
On the other hand I see progress! AI enriched press releases balance buzzwords and information way better than marketing of large companies did before AI.
I remember throwing away the instructions for an electronic toothbrush because (I won't mention the name, but have a look at the upper tier) instead of putting something like "Turn toothbrush on, choose mode by pressing..." it read "Take your super awesome premium masterpiece using patented technology, for the first time in human life now available to you by us. Move your finger over to the innovative sensory surface, which uses material from rocket scientists and world-leading designers".
No joke. These were text blocks, and they repeated: 30 pages instead of one compact one.
The toothbrush is top notch, except for the instructions.
by _the_inflator
4/8/2026 at 3:27:57 PM
Hahaha I think we might have the same toothbrush.
That makes sense and I like the analogy.
by ofjcihen
4/8/2026 at 7:21:51 AM
I think Claude Code with Sonnet 4.6 is already at the level of a paradigm shift and can change the entire tech industry.
If you're paranoid, it doesn't mean you're not being followed. If something is overhyped, it doesn't mean it's not game-changing.
by alexey-salmin
4/8/2026 at 3:26:16 PM
Oh, I agree with you on that. But that’s partially why the language in the presser falls flat for me.
I mean, as an example: while web app pen testing, I’ve been running and proxying all my traffic through it with instructions to find vulnerabilities, telling it it’s a senior web app security expert looking over my shoulder. It’s already great at that.
I’ve even told it to do recon and run pen tests on lists of subdomains before (please, for the love of god, have the right harnesses and guardrails before you do this) and woken up to paid findings.
So I’m in a weird place where this was already happening, and Mythos is being sold like it wasn’t good before?
End ramble :/
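For the curious, the proxy-to-model review workflow described above can be sketched roughly like this. Everything here is illustrative, not the commenter's actual setup: `Exchange`, `build_review_prompt`, and the example URL are invented names, and the model call itself is deliberately omitted.

```python
# Sketch: batch captured HTTP traffic into a prompt for an LLM playing the
# "senior web app security expert looking over my shoulder" role.

import json
from dataclasses import dataclass, field


@dataclass
class Exchange:
    """One captured request/response pair from the intercepting proxy."""
    method: str
    url: str
    req_headers: dict = field(default_factory=dict)
    req_body: str = ""
    status: int = 200
    resp_body: str = ""


def build_review_prompt(exchanges, max_body=2000):
    """Render captured traffic into a review prompt. Bodies are truncated
    so one noisy response can't blow out the context window."""
    parts = [
        "You are a senior web application security expert reviewing "
        "proxied traffic from an authorized penetration test. Flag likely "
        "vulnerabilities (IDOR, SSRF, injection, auth flaws) and cite the "
        "evidence for each."
    ]
    for i, ex in enumerate(exchanges, 1):
        parts.append(
            f"--- Exchange {i} ---\n"
            f"{ex.method} {ex.url}\n"
            f"Request headers: {json.dumps(ex.req_headers)}\n"
            f"Request body: {ex.req_body[:max_body]}\n"
            f"Status: {ex.status}\n"
            f"Response body: {ex.resp_body[:max_body]}"
        )
    return "\n".join(parts)


ex = Exchange("GET", "https://app.example/api/users/1337",
              req_headers={"Cookie": "session=..."},
              resp_body='{"id": 1337, "email": "a@example.com"}')
prompt = build_review_prompt([ex])
# `prompt` would then be sent to the model; the API call is omitted here.
```

The interesting design choice is the batching and truncation, not the model call: the value comes from giving the model enough correlated request/response context to spot flaws like IDOR that no single exchange reveals. And, as the comment above stresses, only run anything like this with authorization and proper guardrails.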
by ofjcihen
4/8/2026 at 2:10:41 PM
I came across this article just this morning saying AMD researchers, who hitherto have relied on Claude Code heavily, have noticed degraded performance in the recent update: https://www.theregister.com/2026/04/06/anthropic_claude_code...
Claude Code and Glasswing are not the same, but presumably they have a lot of overlap under the hood. I feel like while AI is certainly advancing in major ways, there will always be the up and down of new software releases.
by davenporten
4/8/2026 at 12:32:24 PM
At launch, a technology is considered dangerous for being too powerful.
3 months later, you are an absolute idiot to still be using that useless model. Are you not using Glasswing 2-01 high? Oh, yeah, Glasswing from 3 months ago is absolutely worthless, every viber knows; it's your fault for holding it wrong.
For once, you should not get too excited about a new model release and the words and adjectives promising things. Honestly, it's your fault humanity lost its humanity and we just have words, words, words and mass schizophrenia.
by heliumtera
4/8/2026 at 1:01:48 PM
To me it makes absolutely zero sense that they would decide not to release the model to the public because of the effects its exploitation capabilities would have. Previous models were also capable of providing harmful information, yet that wasn't a problem, because models can actually be effectively censored using RLHF. So what is preventing Anthropic from simply forbidding the model from letting people vibe-code exploits?
by FiberBundle
4/9/2026 at 5:22:54 AM
Fully agree here. I think it’s more that Mythos does not deliver step-change results and is something of a disappointment... so let’s hype it up by further scaring the masses with its ‘mythic’ abilities.
by Jesus_piece
4/8/2026 at 11:10:04 AM
This looks more like another lobby group (quite a bad one) than something primarily focused on security.
The "urgency" is very likely mostly intended to drive policy.
by raxxorraxor
4/8/2026 at 12:04:03 PM
I’ve lost trust in anything they say.
The fear marketing is clearly intentional at this point.
by adam_patarino
4/8/2026 at 12:27:53 PM
And the complicit, click-thirsty tech media falls for it every time.
by baggachipz
4/8/2026 at 12:06:54 PM
Everybody remembers the fable of the boy who cried wolf and how he died at the end. Left out of the story are the other villagers who starved because their flocks of sheep were eaten, all because they didn't want to feel like suckers. Tuning out completely because of the existence of false positives is not a good choice.
by DonsDiscountGas
4/8/2026 at 1:57:15 PM
Remember OpenAI decided GPT-2 was far too dangerous to unleash upon the world when they first trained it!
by gchadwick
4/8/2026 at 3:43:17 PM
That's an editorialized headline. What they actually wrote was that it could be used to "generate misleading news articles, impersonate others online, automate the production of abusive or faked content to post on social media, [or] automate the production of spam/phishing content" and that they are aware other researchers have the ability to reproduce and open source their results, but this would give the community some time to decide how to proceed.
They were correct.
by nearbuy
4/8/2026 at 4:07:26 PM
Hasn't almost every model created a paradigm shift lately? Maybe it's you who has moved the needle on what a paradigm shift means?
by akmiller
4/8/2026 at 2:16:33 PM
> I can’t be the only person who’s getting tired of hearing about how every new iteration is going to spell doom/be a paradigm shift/change the entire tech industry etc.
There's a little bit of a grading-your-own-homework aspect to companies being able to declare their new models revolutionary.
It doesn't mean they're wrong, but there is a clear conflict of interest.
by throwawayq3423
4/8/2026 at 7:50:47 AM
Well, Opus 4.5/4.6 kinda was right?
I mean, software development has changed more since then than it has in my 30-year software development career.
by nl
4/8/2026 at 4:11:26 PM
https://news.ycombinator.com/item?id=47682262
by aagha
4/9/2026 at 12:45:08 AM
A lot of times people cry wolf a couple of times before the wolf actually comes. I feel like there's a good chance that this is the actual wolf coming here, because I was using Opus for a lot and it's really good.
by mik09
4/8/2026 at 4:27:07 PM
It feels to me like it's full of marketing in the guise of trying to save the world from their own making: "we have a model so strong we can't release it; here are all the details of why it's so good, but don't ask for access, you can't get it, it's too risky for your own good."
Something smells really, really weird:
1. Per the blog post[0]: "This was the most critical vulnerability we discovered in OpenBSD with Mythos Preview after a thousand runs through our scaffold. Across a thousand runs through our scaffold, the total cost was under $20,000 and found several dozen more findings"
Since they said it was patched, I tried to find the CVE. It looks like Mythos indeed found a 27-year-old OpenBSD bug (fantastic), but it didn’t get a CVE, and OpenBSD patched it and marked it as a reliability fix. Am I missing something? [1]
2. From the same post, Anthropic’s red team decided to do a preview of their future responsible disclosure (is this a common practice?): "As we discuss below, we’re limited in what we can report here. Over 99% of the vulnerabilities we’ve found have not yet been patched" [0]. So this is great; I can't wait to see the actual CVEs, exploitability, likelihood, peer review, and reproducibility: the kind of things the appsec community has been doing for at least the last 27 years, since the CVE concept was introduced [2].
3. On the same day there was an actual responsible disclosure: actual RCEs, actual CVEs, in Claude Code, discovered mostly because of the source code leak. I don't see anyone talking about it (you should probably upgrade your Claude Code, though).
CVE-2026-35020 [3] CVE-2026-35021 [4] CVE-2026-35022 [5]
Do with this information as you may...
[0] https://red.anthropic.com/2026/mythos-preview/
[1] https://www.openbsd.org/errata78.html (look for 025)
[2] https://www.cve.org/Resources/General/Towards-a-Common-Enume...
[3] https://www.cve.org/CVERecord?id=CVE-2026-35020
by eranation
4/8/2026 at 7:04:18 AM
I agree. I can’t open any social media anymore.
by jwpapi
4/9/2026 at 5:23:22 PM
Spell doom... frfr
by jonesn11
4/8/2026 at 7:30:03 AM
It’s great marketing to lead with how the n+1 model is so amazing that you can’t have it yet.
by corranh
4/8/2026 at 7:36:18 AM
yeah, they gotta find a way to build hype on every new model release
by Th3Alt3r
4/8/2026 at 4:28:11 PM
Agreed. Do we have any information on what these "vulnerabilities" actually are? Every vulnerability is typically immediately reported to CVE or NIST... are these "so destructive" they have to be kept behind closed doors? Give me a break...
by fullstackchris
4/8/2026 at 12:45:45 PM
And every single time, what they release is underwhelming.
Remember how Sam spent like a year talking about how scary close GPT-5 was to AGI, and then when it did finally come out... it was kinda meh.
by dkersten
4/8/2026 at 5:31:02 AM
> I would honestly go so far as to say the overhype is detrimental to actual measured adoption.
I think you are a bit dishonest about how objectively you are measuring. From where I'm sitting, I don't know a lot of developers that still artisanally code like they did a few years ago. The question is no longer whether they are using AI for coding but how much they are still coding manually. I myself barely use IDEs at this point. I won't be renewing my IntelliJ license. I haven't touched it in weeks. It doesn't do anything I need anymore.
As for security, I think enough serious people have confirmed that AI reported issues by the likes of Anthropic and OpenAI are real enough despite the massive amounts of AI slop that they also have to deal with in issue trackers. You can ignore that all you like. But I hope people that maintain this software take it a bit more seriously when people point out exploitable issues in their code bases.
The good news of course is that we can now find and fix a lot of these issues at scale and also get rid of whole categories of bugs by accelerating the project of replacing a lot of this software with inherently safer versions not written in C/C++. That was previously going to take decades. But I think we can realistically get a lot of that done in the years ahead.
I think some smart people are probably already plotting a few early moves here. I'd be curious to find out what e.g. Linus Torvalds thinks about this. I would not be surprised to learn he is more open to this than some people might suspect. He has made approving noises about AI before. I don't expect him to jump on the bandwagon. But I do expect he might be open to some AI-assisted code replacements and refactoring, provided there are enough grown-ups involved to supervise the whole thing. We'll see. I expect a level of conservatism but also a level of realism there.
by jillesvangurp
4/8/2026 at 5:36:41 AM
> From where I'm sitting, I don't know a lot of developers that still artisanally code like they did a few years ago.You don't know a lot of developers then.
by junon
4/8/2026 at 5:47:55 AM
I do. The good ones use AI.
by literalAardvark
4/8/2026 at 5:01:18 PM
You are in a bubble. Some segments use essentially no AI, while others have gone all in. Just because the type of engineers you're surrounded by do engineering that is obsolete doesn't mean that's the case across the board. All the best game engineers I know still write at least 90% of the code (probably closer to 99%). The bad ones use AI nearly exclusively - just like yourself. They can't create very complex or performant game systems, and they struggle even with highly unique or interactive game UI systems. I've looked over their code; almost every choice is bad, and it's clear why their projects completely collapse after a certain point. They simply can't build super complex, performant, or novel systems.
I'm going to assume you do the type of engineering where all the hard problems are solved for you already, and you are merely connecting inputs/outputs and hooking up APIs. Because, frankly, the value in "software plumbing" is gone; anyone with a Claude license can do that now.
by jenniferhooley
4/10/2026 at 8:05:19 AM
You're condescending for no valid reason, and I will tell you that what you say is not correct. Models superseded "plumbing" tasks and went well into engineering grounds a generation or two ago already. Evidence is plentiful. We see models perfectly capable of reasoning about kernel code, yet you're convinced that game engines are somehow more special. Why? There are plenty of examples where AI is successfully applied to hard engineering tasks (database kernels), and where it became obvious that the models are almost perfectly capable of reasoning about that, tbh, quite difficult code. I think you should reevaluate your stance and become more humble.
by menaerus
4/10/2026 at 2:19:57 PM
Link me the research on the hard engineering tasks they've done on database kernels; I'd love to see it, sounds interesting.
As long as people comment, "Only bad/stupid engineers hand-write code because LLMs are better in every way," and that's objectively not true in various engineering circles, I'll keep trolling them and being just as hyperbolic in the inverse because it amuses me. Don't take things too seriously on the internet; you'll have a bad time ;)
by jenniferhooley
4/9/2026 at 9:15:30 AM
> They simply can't build super complex, performant, or novel systems.
Neither can single humans.
If you introduce some reasonable constraints AI will come out ahead most of the time, especially for optimization cases where AI will run circles around your average programmer and is perfectly happy to inline some ASM for you.
You still have bespoke cordwainers/cobblers 100 years after that process has been well and truly automated. But they're rare and almost nobody cares.
by literalAardvark
4/9/2026 at 10:35:30 AM
Inking ASM isn't generally a good thing. This line of commenting reeks of confidently incorrect energy.
by junon
4/9/2026 at 12:38:19 PM
Inlining*
by junon
4/8/2026 at 5:50:44 AM
> I think you are a bit dishonest about how objectively you are measuring
As someone who has made a sizable amount of money in security research while using Claude, you might be right, but not in the way you think.
by ofjcihen
4/8/2026 at 7:35:04 AM
Do you think they're lying about the vulnerabilities they claim Mythos has found? Seems like a very short-term play, if so.
by AlexCoventry
4/8/2026 at 7:51:11 AM
[dead]
by blairharper