Evaluating and mitigating the growing risk of LLM-discovered 0-days

2/7/2026 at 2:19:58 AM

The post is light on details, and I agree with the sentiment that it reads like marketing. That said, Opus 4.6 is actually a legitimate step up in capability for security research, and the red team at Anthropic – who wrote this post – are sincere in their efforts to demonstrate frontier risks.

Opus 4.6 is a very eager model that doesn't give up easily. Yesterday, Opus 4.6 took the initiative to aggressively fuzz a public API of a frontier lab I was investigating, and it found a real vulnerability after 100+ uninterrupted tool calls. That would have required lots of prodding with previous models.

If you want to experience this directly, I'd recommend recording network traffic while using a web app, and then pointing Claude Code at the results (in Chrome, this is Dev Tools > Network > Export HAR). It makes for hours of fun, but it's also a bit scary.
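For anyone who wants to skim an exported HAR before handing it to a model, here is a minimal Python sketch that lists the distinct endpoints in the capture. It assumes the standard HAR layout (a top-level "log" object with an "entries" list, each entry carrying "request.method" and "request.url"). The function names summarize_har and load_har are made up for illustration, and this is not part of any Claude Code workflow.

```python
import json
import sys
from collections import Counter
from urllib.parse import urlsplit


def summarize_har(har: dict) -> Counter:
    """Count requests per (method, host, path), ignoring query strings."""
    counts = Counter()
    # Assumption: standard HAR structure, log -> entries -> request.
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        parts = urlsplit(request.get("url", ""))
        counts[(request.get("method", "?"), parts.netloc, parts.path)] += 1
    return counts


def load_har(path: str) -> dict:
    """Read a HAR file (plain JSON) from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print the most frequently hit endpoints first.
    summary = summarize_har(load_har(sys.argv[1]))
    for (method, host, path), n in summary.most_common():
        print(f"{n:4d}  {method:6s} {host}{path}")
```

Run it as "python har_summary.py capture.har" (both file names are placeholders). Keep in mind that HAR exports can include cookies and auth headers, so treat the file as sensitive.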

by lebovic

2/7/2026 at 6:57:49 AM

Wondering how many of these memory errors would be caught by running the Clang Static Analyzer (or similar) on them.

https://clang-analyzer.llvm.org

Alternatively, testing these projects with ASan enabled:

https://clang.llvm.org/docs/AddressSanitizer.html

by nielsbot

2/6/2026 at 3:19:23 PM

Glad to see that they brought in humans to validate and patch vulnerabilities. I really wish they had linked to the actual patches, though. Here's what I could find:

https://cgit.ghostscript.com/cgi-bin/cgit.cgi/ghostpdl.git/c...

https://github.com/OpenSC/OpenSC/pull/3554

https://github.com/dloebl/cgif/pull/84

by samfundev

2/7/2026 at 12:16:46 AM

Yeah, having a layer of human experts to sanity check and weed out hallucinated false positive issues seems like an important part of this process:

> To ensure that Claude hadn’t hallucinated bugs (i.e., invented problems that don’t exist, a problem that increasingly is placing an undue burden on open source developers), we validated every bug extensively before reporting it. [...] for our initial round of findings, our own security researchers validated each vulnerability and wrote patches by hand. As the volume of findings grew, we brought in external (human) security researchers to help with validation and patch development.

Based on the experiences shared by curl's maintainers over the last couple of years, resulting in them ending their bug bounty program [1] [2] [3], I'd suggest the "growing risk of LLM-discovered [security issues]" is primarily maintainers being buried under a deluge of low-effort zero-value LLM-hallucinated false positive security issue reports, where the reporter copy-pastes LLM output without validation.

[1] https://daniel.haxx.se/blog/2026/02/03/open-source-security-...

[2] https://daniel.haxx.se/blog/2026/01/26/the-end-of-the-curl-b...

[3] https://daniel.haxx.se/blog/2025/07/14/death-by-a-thousand-s...

by shoo

2/7/2026 at 6:24:31 AM

Ending a bug bounty program seems like a mistake.

Why not just change the incentives? Don't pay for patches. Move the money over to human review of the infinite cesspool with an emphasis on how the findings are presented. Maintainers rank and filter by how concise the reviews are and how critical the bugs are. Stop allowing wide open pull requests for bugs and make that its own new workflow.

Bugs rarely happen in isolation and many are regressions. Many are related to features added or refactors. Fixing bugs should be more about understanding the nature of the project than just playing whack-a-mole. LLMs don't have as good of a memory as humans and much of the meta discussion would be out-of-band for the LLMs. We shouldn't be paying for monkey work. We should be paying the humans that deeply understand "the lore" of the project and can apply it in a meaningful way.

In the first place, it's a long time coming that some maintainers feel the pressure to take the direction of the projects more seriously, and in some cases let others step up. So many open source projects need to stop being the stereotype of lone genius pet projects or cultish power grabs. When people whine about open source not getting paid, this is the real reason why. It's not that the money or value isn't there, but a lack of confidence in the maintainers.

by sublinear

2/6/2026 at 11:06:02 PM

Grepping for strcat() is at the "forefront of cybersecurity"? The other one that applied a GitHub comment to a different location does not look too difficult either.

Everything that comes out of Anthropic is just noise but their marketing team is unparalleled.

by tznoer

2/7/2026 at 12:23:00 AM

Did they discover a vulnerability or not?

by blackqueeriroh

2/7/2026 at 12:31:04 AM

Not

by dmbche

2/7/2026 at 5:24:33 AM

> Our view is this is a moment to move quickly—to empower defenders and secure as much code as possible while the window exists.

Yawn.

by catlifeonmars

2/7/2026 at 2:33:32 AM

"Evaluating and mitigating the growing risk of LLM-developed 0-days" would be much more interesting and useful. Try harder, guys.

by username223

2/7/2026 at 12:13:33 AM

Is there a Polymarket on the first billion-dollar AI company to go to $0 because of its own insecure model deployment?

by cyanydeez

2/7/2026 at 12:04:04 AM

This reads like an advertisement for Anthropic, not a technical article.

by octoberfranklin

2/7/2026 at 12:23:24 AM

Okay, so if that’s the case, what do you have that’s constructive to say about it?

by blackqueeriroh

2/7/2026 at 1:03:34 AM

Their comment was constructive for me, now I’m not going to read the article.

by irishcoffee