5/18/2026 at 1:31:17 PM
You can find the 4 versions of Benedict's deck here: https://www.ben-evans.com/presentations
I appreciate the temporal view into this thinking. My interpretation:
Nov 2024: Don’t dismiss this; it may be the next platform shift. But the actual questions are still unsettled: scaling, usefulness, deployment, and business model.
May 2025: The model layer is already showing signs of commoditization, so the important question shifts toward deployment: products, use cases, UX, errors, and enterprise adoption.
Nov 2025: The capital cycle has become the story: everyone is spending because missing the platform shift is worse than overbuilding, but there is still no clarity on product shape, moats, or value capture. That creates bubble-like dynamics.
May 2026: Provisional thesis: models look likely to become infrastructure, while value probably moves up-stack into apps, workflows, product, proprietary data/context, GTM, and new questions made possible by cheap automation. But he is still explicitly calling this provisional.
by btucker
5/18/2026 at 3:56:21 PM
Thanks for the summary. I do love Benedict’s work; I find he’s one of the few commentators who consistently strikes a balance between taking the transformative potential of AI seriously and not tipping over into hype.
Some things that stand out:
* He’s really good with his historical analogies, especially looking at previous transformations like the early Internet and mobile; no surprise given that he has a history degree.
* He emphasizes over and over that we still have no idea how all of this is going to work when the dust settles. I think that’s kind of a historian’s move as well. When you look at what people were saying during the early days of the web, for example, almost all of their predictions weren’t just wrong… in hindsight, given how the future played out, they were asking the wrong questions. The implication is that we are probably asking the wrong questions about AI too.
* Nonetheless, his thesis about the commoditization of models is actually a fairly strong concrete prediction. I’m not sure if I agree with it entirely, but I do keep it in mind every time I look at the valuations of leading AI labs.
* He continually makes the point that a chatbot is barely a product and that AI labs have so far had very little success in delivering products above that layer… with the exception of coding agents, of course.
by libraryofbabel
5/19/2026 at 10:11:23 AM
I just got a bit triggered by the "hype" word. What if the hype was real? It is easy to say that nobody knows how all of this is going to work, and I would say it is a prudent thing to say, but there is value in making a bold prediction from the start instead of just updating your view to respond to change. In one case you are predicting stuff, in the other, just reacting.
But I absolutely agree that in hindsight we are often asking the wrong questions about each new technology.
I keep seeing on HN that AI is hype, and many here are anti-AI (which I get: as a programmer, AI has made my job less interesting, and I'm even worried about losing it), but where has AI underdelivered?
by pingou
5/20/2026 at 1:25:09 PM
I agree, especially the juxtaposition of "we still have no idea how all of this is going to work when the dust settles" and "hype". If we don't know, then there is a chance it isn't hype.
For example, now it may seem that the models are becoming mere infrastructure, and the value moves up to apps and data. But if the models of tomorrow become able to write the apps themselves, then the value moves back. I won't need to pay someone to write me a wrapper for the LLM if the LLM is able to write the same wrapper, maybe even better because it will be customized for my needs. The app providers are currently profiting from the gap between "what a software company can do using the AI" and "what the AI can do unaided", but that gap is going to shrink, possibly to zero.
by Viliam1234
5/19/2026 at 11:45:21 AM
The hype is in what AI delivers (at least so far). I would never create a PR without an AI review. I will ask an AI to write code for me from time to time.
But it still has huge gaps in quality. And from time to time, it shows me that it doesn’t really understand things. You might ask how that is any different from a mediocre engineer. But most sufficiently skilled people can easily tell when someone doesn’t really know something.
With AI, you discover this after reading several pages being dumped on you by people being “more productive” with AI.
by elvis10ten
5/19/2026 at 12:20:36 PM
Ok so the hype would be people saying AI can currently do something well and autonomously when it cannot (or not consistently enough), and it is easy to prove them wrong.
But I feel like people are more hyped about what the AI will be able to do soon rather than what it can do now.
I think AI does understand things (depending on your definition), how else could we communicate and ask it a question if it didn't? I mean we're quite far from Eliza here.
And yes, often its answers are so wrong that we think it is impossible that AI understands anything, but this jagged intelligence doesn't prove, at least to me, that there isn't some understanding. At what point do we say that AI understands things? What if we could eliminate 99% of those dumb failures, would we then say that AI understands?
by pingou
5/19/2026 at 1:12:40 PM
>I think AI does understand things (depending on your definition), how else could we communicate and ask it a question if it didn't?
by GolfPopper
5/19/2026 at 1:52:35 PM
That doesn't really respond to the question though - there is a quite reasonable argument that the Chinese room as a system 'understands' things.
The issue that is hit immediately is we don't have a definition or test of understanding that AI doesn't clear easily. Then on top of that we can't even really be sure that we ourselves understand things, given all the tricks that our minds play with memory and perception. There is precious little evidence that the people around us understand things; they seem to be guessing. It is completely unclear whether a Chinese room has or doesn't have a property if we rule out all the tests that check for it as not really counting. But all the tests we can do suggest it does understand, because engineers can implement Chinese rooms now and they even turn out to be more reliably artistic/capable of novel thinking/creative than humans. Anything that tests understanding they can do.
by roenxi
5/19/2026 at 10:05:33 PM
> and it is easy to prove them wrong
No, they just say you are using the wrong model or something.
If it's a coworker dumping reviews of crap code on you at work, the incentive is to blanket approve everything because otherwise you're just the grumpy old man who is resisting innovation. No matter that the code makes no sense at all and the tests aren't actually testing what they should test.
by LtWorf
5/19/2026 at 1:09:05 PM
>where has AI underdelivered?
Other than the stock market (which seems decoupled from reality at the moment), where has AI delivered?
The only use case where I see anything resembling AI delivering on its promises is software, and my personal experience with that is that everything that comes out of the teams using AI is destructively broken. (Where they used to be able to deliver software that worked, even if it wasn't ideal, now they reliably make things worse and their stuff doesn't work when used.)
by GolfPopper
5/18/2026 at 4:26:24 PM
> for example, almost all of their predictions weren’t just wrong… in hindsight, given how the future played out, they were asking the wrong question
Do you have an example of this? My (poor) memory remembers "it's going to change how people buy things" was the big deal at the time, and it seems like it was a great prediction.
by zetsurin
5/18/2026 at 5:38:39 PM
Well, yes, but as the other commenter says, that’s a very broad general statement akin to something like “AI will change knowledge work”. That’s certainly true, but how? What are the details? What kind of companies are going to be the winners and what kind will be losers, or end up with commodity margins, like the telcos did after the mobile revolution? What is the pricing structure going to look like?
I suppose a concrete example in 1997 would be that a lot of companies thought the future of e-commerce was setting up a store on AOL that people would use while sitting down at a desktop PC. Obviously it didn’t turn out quite that way. Furthermore, the Internet enabled new ways to buy things that weren’t even envisioned in the pre-Internet, pre-smartphone world: think Airbnb and Uber.
Predictions are hard, especially about the future. Most predictions reflect the worldview and biases of the time in which they are made: think about all the vintage sci-fi from the 60s, 70s, and 80s that actually reads or looks kind of retro now. Similarly, our predictions of the future will look kind of retro and strange to someone living in the 2030s or 2040s. If studying history has any lesson to teach us, it’s really just this: that the past is an alien world with alien modes of thinking, and that our moment in time will look similarly alien to people in the future who choose to look back and analyze it closely.
This isn’t an argument that we should stop trying to make predictions. We need to, but it is an argument for humility, and also for questioning all your assumptions that you might be importing.
by libraryofbabel
5/18/2026 at 4:47:08 PM
That's a very vague prediction that took decades to bear fruit. The concrete predictions behind the investments into companies like Pets.com and Webvan failed. It took the survivors like Ebay/Paypal and Amazon to build the digital payment and shipping infrastructure over decades until cultural acceptance hit critical mass.
by akiselev
5/18/2026 at 8:43:57 PM
Agreed, I appreciate his historical perspective, but I think one critical mistake his posts make is implying, largely because the parallels to history have been similar so far, that history will repeat.
Like, yes, the telecom bubble was a clear case of overbuilding and the AI data center "bubble" looks a lot like that... but this overlooks that the fiber capacity being laid back then far outstripped the demand, whereas all the compute providers today have been desperately crunched for capacity, despite investing almost a trillion in CapEx -- to the tune of almost a trillion dollars more of backlog -- for multiple quarters now.
Or yes, historically new technology has always created new jobs... but all those new jobs required a higher skill level along dimensions that current AI models are already good at, meaning we've never had a technological revolution quite like this.
Or yes, prior technological revolutions consigned incumbents to irrelevancy, primarily due to shifts in technical platforms... but then today's business leaders are 1) very well educated about what happened to their predecessors, 2) very paranoid about the same thing happening to them, and hence 3) are actively making moves to capitalize on the next platform shift.
I also think his dismissal of chatbots is a bit premature. It is precisely because chatbots operate via an extremely simple, flexible and natural modality, i.e. a conversation -- entirely unconstrained by the form factor necessitated by any app -- that their infinite use-cases have become unleashed.
My take is that the AI labs are actively exploiting this extreme flexibility to surface valuable use-cases -- one of the hardest parts of innovation -- at which point they can simply slap an agent on top of them. Which is, yet again, simply a chatbot, except one that can actually do useful things for you and hence can be charged for a lot more money.
by keeda
5/18/2026 at 8:49:57 PM
I didn’t make any comparison at all with the fibre bubble, for precisely that reason. The comparison is with mobile data, which was and is always behind capacity.
I think one of the things that the usage data shows us is that chatbots absolutely do not have infinite use cases - most users only use them a day or two a week or less.
by benedictevans
5/18/2026 at 10:54:39 PM
That's fair, I may be conflating your takes on mobile data with others who've made the comparison to the telecom bubble, and if so, mea culpa!
But I also do disagree with the take that usage patterns indicate a fundamental shortage of use-cases. Yes, everyone reports WAU instead of DAU because WAU numbers look much more impressive, but I think the extreme shortage of compute plays a major role in this. I suspect all the AI labs are deliberately holding back from pushing AI adoption too much because of this. (Google execs have even made comments internally to this effect.) Note that even at such low frequency of usage all the model providers are desperately strapped for compute, which means there is insanely high demand from some quarters.
One way capacity limitations could impact adoption is that the free-tier models are not as good as the frontier ones, so free users come away less impressed with AI capabilities, leading to lower regular usage. This problem is larger than it appears, because it can take a long time to figure out how to get AI to work for your use-case, and people simply have not experimented nearly enough, partially due to first impressions. On the other hand, most companies seem to be OK with huge tokenmaxxing bills!
It seems to me the AI players are all playing a delicate balancing game across three fundamental dimensions: adoption, monetization, capacity. That is, they are simultaneously 1) pushing free / cheap AI usage as much as possible to hook users, capture market share and suss out new use-cases, while 2) carefully allocating token quotas for the most lucrative use-cases to satisfy investors, and 3) balancing available compute between those two competing priorities. I suspect as the compute bottleneck is alleviated and frontier models become more accessible cheaply, we'll see way higher DAU numbers.
by keeda
5/19/2026 at 11:39:50 AM
> new technology has always created new jobs... but all those new jobs required a higher skill level
The industrial revolution didn't seem to require any particular special skill at all. Just anyone who was willing to tend to a machine all day. (Maybe that's a parallel...)
by StilesCrisis
5/20/2026 at 1:29:28 AM
You're right actually, what I really meant to say is a "higher-level skill" rather than a "higher skill-level." Higher-level skills don't necessarily mean a more difficult skill, they're usually just at a higher level of abstraction.
Specifically, the 3 dimensions along which new jobs required new skills were: a) cognitive, b) technical or c) social skills. I guess tending to a machine was a mix of a) and b), because even if the controls were straightforward, it probably required some understanding of the underlying mechanisms.
by keeda
5/18/2026 at 1:35:11 PM
I think that DeepSeek may be important to that. They have a really good model that's open source, raising the bar for all other players: how good your model needs to be so you can make meaningful money on it (better than DeepSeek).
The same thing happened in other places where the open source offering became popular.
by flossly
5/18/2026 at 3:11:46 PM
I think the original DeepSeek moment seemed important. And yes, the more recent model is good, but there are multiple. This commodification trend spans many different companies, including Kimi 2.5/2.6 and GLM5.1, and even Google itself with its Gemma models. There are a dozen models that exist at roughly the frontier from 6 months ago at 1/10th the cost.
by mchusma
5/18/2026 at 4:13:02 PM
> that exist at roughly the frontier
no disagree, specifics matter.. There are a dozen well-defined LLM application subject areas that are regularly tested.. one overall grade IMO lacks important detail.. To go a bit abstract, it is ironic that "oversimplification" in the discussion of these complex machines mirrors the effects on information of the automations themselves.. constantly simplifying, substituting and diluting real meaning
by mistrial9
5/18/2026 at 1:48:43 PM
What good is an open-weights DeepSeek model if you have nowhere to run it?
OpenAI / Google / Anthropic / XAI also have a ton of compute. That is the real moat.
by dist-epoch
5/18/2026 at 2:24:41 PM
It's quite expensive to self-host but you have many places to run it. OpenRouter alone lists a dozen different providers for DeepSeek 4 Pro. https://openrouter.ai/deepseek/deepseek-v4-pro/providers.
So long as there is demand, there are always going to be providers competing to offer it at a low cost. My understanding is that the median price on there is in the ballpark of what it costs to run the inference. This is very different from e.g. Opus, which you can basically only buy from Anthropic at the price they set.
by eli
5/18/2026 at 2:03:38 PM
antirez running (quantized) DeepSeek V4 Pro on a Mac Studio M3 Ultra with 512GB of RAM:
https://bsky.app/profile/antirez.bsky.social/post/3mlzwmvlov...
It's much closer than you think. We're going to see specialized hardware in the next 24 months capable of running 2025-era frontier models. That's big.
by nmfisher
5/19/2026 at 9:01:26 AM
2-bit quantization? That's a lot of signal being removed. Considering how quickly the AI models are progressing in their capabilities (still an exponential curve), I will not want to use the 2025 model in two years' time. Similarly, I don't want to use llama-3 or an old Anthropic model from 2023 or 2024 today. Newer models are so much better that it is very difficult to ignore them.
If and when the advancements in AI models slow down, only then, IMHO, will it become feasible to design specialized HW for general-purpose consumption and general-purpose workloads.
by menaerus
5/19/2026 at 10:05:24 AM
Opus 4.6 was a 2025 model and many people (myself included) feel that if that's where models peaked, we won't be disappointed.
Even at 2-bit quantization, DS4 is probably on par with a 2024 frontier model. You can run that today on local hardware, and at a minimum, local models are going to keep pace over the next 12-24 months. Even if they don't close the gap with frontier models, they'll still play an important role in the overall pipeline for cost, speed and privacy reasons.
That's without even mentioning the additional capability that something like a Taalas chip churning out 17k tokens/sec could unlock.
by nmfisher
5/18/2026 at 3:50:50 PM
It's big because it may take a big swath of people who will actually pay for LLMs out of the market. But average consumers are going to primarily use their phone/tablet, and we're far away from that being possible.
Even if it were possible, LLMs are such a gold mine of user data that it's really hard to see that opportunity being passed up.
by treis
5/18/2026 at 2:16:30 PM
That specialized hardware will be scooped up by AI data-centers, just like RAM is today.
by dist-epoch
5/18/2026 at 2:38:21 PM
No more than Mac Studios. Datacenters need different hardware.
by nine_k
5/18/2026 at 3:01:06 PM
The 512 GB RAM Studio can't even be purchased anymore. It's been delisted
https://www.apple.com/shop/buy-mac/mac-studio
Same with the Mac mini. Entirely removed from all store references
by ffsm8
5/18/2026 at 2:06:11 PM
I just got into self hosting Deepseek v4 Flash on a single DGX Spark via antirez’s DwarfStar 4 project
It feels great to finally have access to something local.
by wolttam
5/18/2026 at 2:03:10 PM
That seems pretty temporary if people can just build more compute.
by amanaplanacanal
5/20/2026 at 5:59:18 AM
There are myriad compute providers. I suspect the inference market is hard to monopolize. But given our anti-trust track record the past 40 years I suppose it’s possible.
by danny_codes
5/18/2026 at 1:35:26 PM
Well, yes. Anyone who tells you they know how this is going to work is an idiot.
by benedictevans
5/18/2026 at 2:51:12 PM
I didn’t know there was a sequence of these decks; thanks, it’s helpful to think of them as updating snapshots in time.
The main thing that stands out to me on these graphs is just . how . early we still are - looking at industries like legal, which in my mind are certainly going to be massively disrupted, and seeing the very low usage rates vs. tech (which still shows less than a quarter of tech people using AI daily), we are in for a lot more change than we’ve seen so far.
by vessenes
5/18/2026 at 9:05:24 PM
Legal has lots of institutional inertia behind it though. I think AI will be very very useful for lawyers..... at their desk in private. But I don't see it replacing them. The legal system is heavily personal and relies a lot on reputation and tradition. I think you'll see courts, bar organizations, etc frowning on using AI too heavily, and certainly not using it to automate "official" processes.
by cman1444
5/18/2026 at 2:50:00 PM
I appreciate Evans’ work and wrote an “antithesis” to the Nov 2024 iteration of this. Given the pivot to “models look likely to become infrastructure” I might want to update my take.
by 7777777phil
5/18/2026 at 3:10:28 PM
Didn't you mean Claude's take? It's AI-written after all...
by ffsm8