3/24/2026 at 8:02:17 AM
I am kind of amazed at how many commenters respond to this result by confidently asserting that LLMs will never generate 'truly novel' ideas or problem solutions.
> AI is a remixer; it remixes all known ideas together. It won't come up with new ideas
> it's not because the model is figuring out something new
> LLMs will NEVER be able to do that, because it doesn't exist
It's not enough to say 'it will never be able to do X because it's not in the training data,' because we have countless counterexamples to this statement (e.g. 167,383 * 426,397 = 71,371,609,051, or the above announcement). You need to say why it can do some novel tasks but could never do others. And it should be clear why this post or others like it don't contradict your argument.
If you have been making these kinds of arguments against LLMs and acknowledge that novelty lies on a continuum, I am really curious why you draw the line where you do. And most importantly, what evidence would change your mind?
by qnleigh
3/24/2026 at 9:59:30 AM
LLMs can generate anything by design. LLMs can't understand what they are generating, so it may be true, it may be wrong, it may be novel or it may be a known thing. The model doesn't discern between them; it just looks for the best statistical fit.
The core of the issue lies in our human language and our human assumptions. We humans have implicitly assigned the phrases "truly novel" and "solving an unsolved math problem" a certain meaning in our heads. Some of us, at least, think that truly novel means something truly novel and important, something significant. Like, I don't know, finding a high-temperature superconductor formula or creating a new drug, etc. Something which involves real intelligent thinking and not randomizing possible solutions until one lands. But formally there can be a truly novel way to pack the most computer cables in a drawer, or a truly novel way to tie shoelaces, or indeed a truly novel way to solve some arbitrary math equation with enormous numbers. These are formally novel things, but we never really needed any of them, and so relegated these "issues" to the deepest backlog possible. Utilizing LLMs we can scour for solutions to many such problems, but they are not that impressive in the first place.
by Yizahi
3/24/2026 at 10:41:35 AM
If LLMs can come up with formerly truly novel solutions to things, and you have a verification loop to ensure that they are actual proper solutions, I don't understand why you think they could never come up with solutions to impressive problems, especially considering the thread we are literally on right now. At this point it seems like a pure assertion that they will always be limited to coming up with truly novel solutions to uninteresting problems.
by logicprog
3/24/2026 at 11:20:57 AM
"Truly novel" is fast becoming a No True Scotsman.
by eru
3/24/2026 at 8:53:38 AM
I've been working on a utility that lets me "see through" app windows on macOS [1] (I was a dev on Apple's Xcode team and have a strong understanding of how to do this efficiently using private APIs).
I wondered how Claude Code would approach the problem. I fully expected it to do something most human engineers would do: brute-force with ScreenCaptureKit.
It almost instantly figured out that it didn't have to "see through" anything and (correctly) dismissed ScreenCaptureKit due to the performance overhead.
This obviously isn't a "frontier" type problem, but I was impressed that it came up with a novel solution.
by LatencyKills
3/24/2026 at 9:18:38 AM
That's actually pretty cool. What made you think of doing this in the first place?
by skc
3/24/2026 at 9:33:13 AM
Thanks! I've been doing a lot of work on a laptop screen (I normally work on an ultrawide) and got tired of constantly switching between windows to find the information I need.
I've also added the ability to create a picture-in-picture section of any application window, so you can move a window to the background while still seeing its important content.
I'll probably do a Show HN at some point.
by LatencyKills
3/24/2026 at 10:21:25 AM
Why is ScreenCaptureKit a bad choice for performance?
by saagarjha
3/24/2026 at 10:34:21 AM
Because you can't control what the content server is doing. SCK doesn't care if you only need a small section of a window: it performs multiple full-window memory copies, which aren't a problem for normal screen recorders... but for a utility like mine, the user needs to see the updated content in milliseconds.
Also, as I mentioned above, when using SCK, the user cannot minimize or maximize any "watched" window, which is, in most cases, a deal-breaker.
My solution runs at under 2% cpu utilization because I don't have to first receive the full window content. SCK was not designed for this use case at all.
by LatencyKills
3/24/2026 at 9:12:00 AM
What was the solution?
by stavros
3/24/2026 at 9:28:33 AM
Well, I'm not going to share either solution, as this is actually a pretty useful utility that I plan on releasing, but the short answer is: 1) don't use ScreenCaptureKit, and 2) take advantage of what CGWindowListCreateImage() offers through the content server. This is a simple IPC mechanism that does not trigger all the SCK limitations (i.e., no multi-space or multi-desktop support). In fact, when using SCK, the user cannot even minimize the "watched" window.
Claude realized those issues right from the start.
One of the trickiest parts is tracking the window content while the window is moving - the content server doesn't natively provide that information.
by LatencyKills
3/24/2026 at 9:49:24 AM
Huh, Claude one-shotted it out of a single message from me. Man, LLMs have gotten good.
by stavros
3/24/2026 at 9:57:09 AM
No it didn't. Like I said... it may have gotten something that worked, but there is no way Claude got it to work while supporting multi-spaces, multi-desktops, and using under 2% cpu utilization. My solution can display app window content even when those windows are minimized, which is not something the content server supports.
My point was that Claude realized all the SCK problems and came up with a solution that 99% of macOS devs wouldn't even know existed.
by LatencyKills
3/24/2026 at 10:19:09 AM
> it may have gotten something that worked but there is no way Claude got it to work while supporting multi-spaces, multi-desktops, and using under 2% cpu utilization.
Maybe, but that's the magic of LLMs - they can now one-shot or few-shot (N<10) you something good enough for a specific user. Like, not supporting multi-desktops is fine if one doesn't use them (and if that changes, a few more prompts about this particular issue - now the user actually knows specifically what they need - should close the gap).
by TeMPOraL
3/24/2026 at 9:07:40 AM
Most inventions are an interpolation of three existing ideas. These systems are very good at that.
by SequoiaHope
3/24/2026 at 10:18:41 AM
My take as well. Furthermore, most innovations come relatively shortly after their technological prerequisites have been met, so that suggests the "novelty space" that humans generally explore is a relatively narrow band around the current frontier. Just as humans can search through this space, so too should machines be capable of it. It's not an infinitely unbounded search which humans are guided through by some manner of mystic soul or other supernatural forces.
by mikkupikku
3/24/2026 at 11:00:22 AM
I can't even find a good example of an invention that is not an interpolation.
by fsflover
3/24/2026 at 8:52:25 AM
It's like not trusting someone who attained the highest score on some exam by memorizing the whole textbook to do the corresponding job.
Not very hard to understand.
by qsera
3/24/2026 at 10:22:42 AM
Yet we do that all the time by hiring based on GPA/degree.
by TeMPOraL
3/24/2026 at 10:29:05 AM
Do we? I've never been asked for my grades unless it was for filling in a visa, and my degree is in a marginally related field.
by ori_b
3/24/2026 at 8:23:16 AM
> e.g. 167,383 * 426,397 = 71,371,609,051
They may be wrong, but so are you.
by jacquesm
3/24/2026 at 9:07:05 AM
No, it's correct:
by KellyCriterion
3/24/2026 at 11:13:16 AM
You missed the point.
by jacquesm
3/24/2026 at 9:56:14 AM
You could have just checked the math yourself, you know.
by swingboy
3/24/2026 at 11:14:43 AM
My pocket calculator says the same thing and it doesn't even have training data.
by jacquesm
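For the record, the product debated in this sub-thread is easy to check programmatically; a one-line sketch:

```python
# Verify the multiplication cited earlier in the thread.
print(167_383 * 426_397 == 71_371_609_051)  # True
```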
3/24/2026 at 8:23:49 AM
Beliefs are not rooted in facts. Beliefs are a part of you, and people aren't all that happy to say "this LLM is better than me".
by tornikeo
3/24/2026 at 8:41:45 AM
I'm very happy to say calculators are far better than me at calculations (to a given precision). I'm happy to admit computers are so much better than me in so many aspects. And I have no problem saying LLMs are very helpful tools, able to generate output so much better than mine in almost every field of knowledge.
Yet, whenever I ask one to do something novel or creative, it falls very short. But humans are ingenious beasts and I'm sure sooner or later they will design an architecture able to be creative - I just doubt it will be Transformer-based, given the results so far.
by benterix
3/24/2026 at 9:07:33 AM
But the question isn't whether you can get LLMs to do something novel, it's whether anyone can get them to do something novel. Apparently someone can, and the fact that you can't doesn't mean LLMs aren't good for that.
by stavros
3/24/2026 at 11:05:38 AM
When it comes to LLMs doing novel things, is it just the infinite monkey theorem[0] playing out at an accelerated rate, helped along by the key presses not being truly random?
Surely if we tell the LLM to do enough stuff, something will look novel, but how much confirmation bias is at play? Tens of millions of people are using AI and the biggest complaint is hallucinations. From the LLM's perspective, is there any difference between a novel solution and a hallucination, other than the dumb luck of the hallucination being right?
by al_borland
3/24/2026 at 11:07:24 AM
This argument doesn't go the way you want it to go. Billions of people exist, but maybe a few tens of thousands produce novel knowledge. That's a much worse rate than LLMs.
by stavros
3/24/2026 at 9:58:51 AM
To have a proper discussion we would have to define the word "novel", and that's a challenge in itself. In any case, millions of people have tried to ask LLMs to do something creative and the results were bland. Hence my conclusion that LLMs aren't good for that. But I'm also open to the idea that they can be an element of a longer chain that could demonstrate some creativity - we'll see.
by benterix
3/24/2026 at 9:48:10 AM
Novel is a tricky word. In this case, the LLM produced a Python program that was similar to other programs in its corpus, and this Python program generated examples of hypergraphs that hadn't been seen before.
That's a new result, but I don't know about novel. The technique was the same as earlier work in this vein. And it seems like not much computational power was needed at all. (The article mentions that an undergrad left a laptop running overnight to produce one of the previous results; that's absolute peanuts compared to most computational research.)
by tovej
3/24/2026 at 9:50:00 AM
I have never seen a human produce a Python program that wasn't similar to other programs they'd seen.
by stavros
3/24/2026 at 8:27:24 AM
It's not possible to know something without believing it to be true. https://en.wikipedia.org/wiki/Belief#/media/File:Classical_d...
by ChrisGreenHeur
3/24/2026 at 9:01:23 AM
This is objectively wrong. If that were the case, every scientist performing a test would always have had their expectations and beliefs proven true. If you're trying to disprove something because you believe it to be wrong, you could never be proven wrong.
by bilekas
3/24/2026 at 9:12:51 AM
Do we know for a fact that LLMs aren't now configured to pass simple arithmetic like this to a simpler calculator, to add an illusion of actual insight?
by veltas
3/24/2026 at 9:40:04 AM
You can train an LLM on just multiplication and test it on ones it has never seen before; it's nothing particularly magical.
by GaggiX
3/24/2026 at 9:49:36 AM
It's not 'magic', but previously LLMs have performed very badly on longer multiplication. 'Insight' is the wrong word, but I'm saying maybe they're not wildly better at this calculation... maybe they are just optimizing these well-known jagged edges.
by veltas
3/24/2026 at 8:35:22 AM
The hardest part about any creativity is hiding your influences.
by PUSH_AX
3/24/2026 at 10:48:19 AM
When I read through what they're doing, it sure doesn't sound like it's generating something new as people typically think of it. In the link, they provide a very well-defined problem and they just loop through it.
I think you're arguing semantics.
by cyanydeez
3/24/2026 at 8:50:46 AM
> AI is a remixer; it remixes all known ideas together. It won't come up with new ideas
I always found this argument very weak. There isn't that much truly new anyway. Creativity is often about mixing old ideas. Computers can do that faster than humans if they have a good framework. Especially with something as simple as math - a limited set of formal rules and easy-to-verify results - I find the belief that computers won't beat humans at it very naive.
by bluecalm
3/24/2026 at 8:39:11 AM
Yes! I call these the "it's just a stochastic parrot" crowd.
Ironically, they are the stochastic parrots, because they're confidently repeating something that they read somewhere and haven't examined critically.
by ekjhgkejhgk
3/24/2026 at 8:28:31 AM
I guess when it can't be tripped up by simple things like multiplying numbers, counting to 100 sequentially, or counting letters in a string without writing a Python program, then I might believe it.
Also, no matter how many math problems it solves, it still gets lost in a codebase.
by bdbdbdb
3/24/2026 at 9:22:30 AM
LLMs are bad at arithmetic and counting by design. It's an intentional tradeoff that makes them better at language and reasoning tasks.
If anybody really wanted a model that could multiply and count letters in words, they could just train one with a tokenizer and training data suited to those tasks. And the model would then be able to count letters, but it would be bad at things like translation and programming - the stuff people actually use LLMs for. So, people train with a tokenizer and training data suited to those tasks; hence LLMs are good at language and bad at arithmetic.
by fenomas
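As a toy illustration of the tokenizer point above (the vocabulary and greedy splitter here are invented for the example; no real model's tokenizer works exactly like this), subword tokenization means the model receives token IDs rather than characters, so letter-level questions aren't directly visible to it:

```python
# Hypothetical two-entry subword vocabulary for illustration.
vocab = {"straw": 101, "berry": 102}

def tokenize(word):
    """Greedy longest-match split of a word into subword token IDs."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(vocab[word[i:j]])
                i = j
                break
        else:
            raise ValueError("no matching subword piece")
    return tokens

# The model sees [101, 102]; the letter 'r' appears nowhere in that input.
print(tokenize("strawberry"))  # [101, 102]
```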
3/24/2026 at 8:40:52 AM
Arguments like "but AI cannot reliably multiply numbers" fundamentally misunderstand how AI works. AI cannot do basic math not because AI is stupid, but because basic math is an inherently difficult task for otherwise smart AI. Lots of human adults can do complex abstract thinking but when you ask them to count it's "one... two... three... five... wait I got lost".by anal_reactor
3/24/2026 at 8:55:46 AM
> fundamentally misunderstand how AI works
Who does fundamentally understand how LLMs work? Many claims are flying around these days, all backed by some of the largest investments ever collectively made by humans. Lots of money to be lost because of fundamental misunderstandings.
Personally, I find that AI influencers conveniently brush away any evidence (like inability to perform basic arithmetic) about how LLMs fundamentally work as something that should be ignored in favor of results like TFA.
Do LLMs have utility? Undoubtedly. But it’s a giant red flag for me that their fundamental limitations, of which there are many, are verboten to be spoken about.
by datsci_est_2015
3/24/2026 at 9:10:59 AM
You're not doing yourself a favor when you point out "but they can't do arithmetic!" as if anyone says otherwise. Yes, we all know they can't do arithmetic, and that's just how they work.
I feel like I'm saying "this hammer is so cool, it's made driving nails a breeze" and people go "but it can't screw screws in! Why won't anyone talk about that! Hammers really aren't all they're cracked up to be".
by stavros
3/24/2026 at 9:34:06 AM
Maybe because society has invested $trillions into this hammer and influencers are trying to convince CEOs to fire everyone and buy a bunch of hammers instead.
My comment even said “LLMs have utility”. I gave an inch, and now the mile must be taken.
by datsci_est_2015
3/24/2026 at 9:35:55 AM
Saying that the fundamental limitations are things like counting the number of r's in strawberry is boring, though. That's how tokens work and it's trivial to work around.
Talking about how they find it hard to say they aren't sure of something is a much more interesting limitation to talk about, for example.
by stavros
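As an aside, the "trivial to work around" part really is a one-liner once the counting happens at the character level instead of the token level:

```python
# Character-level counting is easy outside the tokenizer's view.
word = "strawberry"
print(word.count("r"))  # 3
```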
3/24/2026 at 10:01:52 AM
> Talking about how they find it hard to say they aren't sure of something is a much more interesting limitation to talk about, for example.
Sure, thank you for steelmanning my argument. I didn’t think I needed to actually spell out all of the fundamental limitations of LLMs in this specific thread. They are spoken of at length across the web, but are often met with pushback, which was my entire point.
Here’s another one: LLMs do not have a memory property. Shut off the power and turn it back on and you lose all context. Any “memory” feature implemented by companies that sell LLM wrappers is a hack on top of how LLMs work, like seeding a context window before letting the user interact with the LLM.
by datsci_est_2015
3/24/2026 at 10:06:03 AM
But that's also like saying "humans don't have a memory property, any 'memory' is in the hippocampus". It's not useful to say that "an LLM you don't bother to keep training has no memory". Of course it doesn't; you removed its ability to form new memories!
by stavros
3/24/2026 at 10:58:55 AM
So why then do we stop training LLMs and keep them stored at a specific state? Is it perhaps because the results become terrible and LLMs have a delicate optimal state for general use? This sounds like an even worse case for a model of intelligence.
by datsci_est_2015
3/24/2026 at 11:04:02 AM
Nope, it's not that, but it's nice of you to offer a straw man. Makes the argument flow better.
by stavros
3/24/2026 at 9:31:04 AM
Because no one owns a $300 billion hammer that literally runs on fancy calculators.
by TheSpiceIsLife