alt.hn

4/1/2025 at 6:22:10 AM

LLM providers on the cusp of an 'extinction' phase as capex realities bite

https://www.theregister.com/2025/03/31/llm_providers_extinction/

by abawany

4/1/2025 at 8:26:31 AM

Just out of curiosity, I wish online LLMs would show real-time power usage and actual dollar costs as you interact with them. It would be so insightful to understand to what degree the technology is subsidized and what the actual value/cost ratio is.

I've read somewhere that generating a single AI image draws as much power as a full smartphone charge.

If the suspicion is true that costs are too high to be monetized, then the current scale-up phase is going to be interesting. Right now people only occasionally have a chat with AI. That's quite a different scenario from having it integrated across every stack and constantly used in the background, by billions of people.

Late as they may be, for the consumer space I think Apple is clever to push as much as possible to the local device.

by iteratethis

4/1/2025 at 8:57:33 AM

Also out of curiosity, I did some quick math regarding that claim you read somewhere.

Cellphone battery charge: I have a 5000mAh cellphone battery. If we ignore charging losses (pretty low normally, but not sure at 67W fast charging)... That battery stores about 18.5 watt-hours of energy, or about 67 kilojoules.

Generating a single image at 1024x1024 resolution with Stable Diffusion on my PC takes somewhere under a minute at a maximum power draw under 500W. Let's cap that at 500 W * 60 s = 30 kilojoules.

So it seems plausible that for cellphones with smaller batteries, and/or with intense image generation settings, there could be overlap! For typical cases, I think you could get a few (low single digit) AI-generated images for the power cost of a cellphone charge, maybe a bit better at scale.

So in other words, maybe "technically incorrect" but not a bad approximation to communicate power use in terms most people would understand. I've heard worse!
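The back-of-the-envelope math above can be sketched in a few lines (figures are taken from the comment itself; the 3.7 V nominal cell voltage is an assumption, since only the mAh rating was stated):

```python
# Battery energy: capacity (Ah) * nominal voltage (V) gives watt-hours.
BATTERY_MAH = 5000
NOMINAL_V = 3.7                                  # typical Li-ion nominal voltage (assumed)
battery_wh = BATTERY_MAH / 1000 * NOMINAL_V      # 18.5 Wh
battery_kj = battery_wh * 3.6                    # Wh -> kJ, ~66.6 kJ

# Image generation energy: upper-bound power draw times upper-bound time.
GEN_POWER_W = 500                                # from the comment
GEN_TIME_S = 60                                  # from the comment
image_kj = GEN_POWER_W * GEN_TIME_S / 1000       # 30 kJ

images_per_charge = battery_kj / image_kj        # ~2.2
print(f"battery: {battery_kj:.1f} kJ, image: {image_kj:.0f} kJ, "
      f"images per charge: {images_per_charge:.1f}")
```

With these worst-case assumptions, one phone charge buys roughly two images, which is where the "low single digit" estimate comes from.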

by Saigonautica

4/1/2025 at 9:59:26 AM

Your home setup is much less efficient than production inference in a data center. Open source implementation of SDXL-Lightning runs at 12 images a second on TPU v5e-8, which uses ~2kW at full load. That’s 170J or about 1/400th the phone charge.

https://cloud.google.com/blog/products/compute/accelerating-...

https://arxiv.org/pdf/2502.01671
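The 1/400th figure quoted above works out as follows (throughput and power numbers are the ones stated in the comment, not independently measured here):

```python
# Data-center inference: joules per image = power draw / throughput.
TPU_POWER_W = 2000                               # ~2 kW for a TPU v5e-8 at full load
IMAGES_PER_S = 12                                # SDXL-Lightning throughput, per the comment
joules_per_image = TPU_POWER_W / IMAGES_PER_S    # ~167 J

PHONE_CHARGE_J = 67_000                          # 67 kJ, from the comment upthread
ratio = PHONE_CHARGE_J / joules_per_image        # ~400
print(f"{joules_per_image:.0f} J per image, ~1/{ratio:.0f} of a phone charge")
```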

by grandmczeb

4/1/2025 at 8:46:18 PM

These models do not appear from thin air. Add in the training cost in terms of power. Yes it's capex and not opex, but it's not free by any means.

Plus, not all of these models run on optimized TPUs; most run on Nvidia cards, which are nowhere near as efficient.

Otherwise I could argue that running these models is essentially free, since my camera can do face recognition and tracking at 30fps without noticeable power draw because it uses a dedicated, purpose-built DSP for that stuff.

by bayindirh

4/1/2025 at 9:02:50 PM

GPU efficiency numbers in a real production environment are similar.

by grandmczeb

4/1/2025 at 9:08:47 PM

I doubt it, but I can check the numbers when I return to the office ;)

by bayindirh

4/2/2025 at 11:58:45 AM

Oh, that's way better! I guess the comparison only holds as approximately true with home setups -- thanks for the references.

by Saigonautica

4/1/2025 at 9:31:10 AM

My PC with a 3060 draws 200 W when generating an image and it takes under 30 seconds at that resolution, in some configurations (LCM) way under 10 seconds. That's a low end GPU. High end GPUs can generate at interactive frame rates.

You can generate a lot of images with the energy you would use to play a game instead for two hours; generating an image for 30 seconds uses the same amount of energy as playing a game on the same GPU for 30 seconds.
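The equivalence in the comment above can be made concrete (the 200 W draw, 30 s per image, and two-hour session length are all figures from the comment):

```python
# Energy is power * time, so generating and gaming at the same draw
# compare one-to-one by duration.
GPU_POWER_W = 200
SECONDS_PER_IMAGE = 30
GAMING_SECONDS = 2 * 3600                                  # two-hour session

energy_per_image_j = GPU_POWER_W * SECONDS_PER_IMAGE       # 6,000 J = 6 kJ
gaming_energy_j = GPU_POWER_W * GAMING_SECONDS             # 1,440,000 J
images_instead = gaming_energy_j / energy_per_image_j      # 240 images
print(f"{images_instead:.0f} images for the energy of a two-hour session")
```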

by elpocko

4/1/2025 at 11:17:15 AM

One point missing from this comparison is that cell phones just don’t take all that much electricity to begin with. A very rough calculation is that it takes around 0.2 cents to fully charge a cell phone. You spend maybe around $1 PER YEAR on cell phone charging per phone. Cell phones are just confusingly not energy intensive.

by gdhkgdhkvff

4/1/2025 at 11:55:06 AM

And for reference, it takes around $10/year to run a single efficient indoor LED lightbulb. So a year's worth of charging a cell phone costs less than 1/10th of running an efficient LED bulb for the full year.

Again, cell phones are just confusingly not energy intensive.
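A rough reconstruction of those figures (the 18.5 Wh charge, one charge per day, and a $0.12/kWh electricity rate are all assumptions made here for illustration):

```python
# Cost per charge = energy per charge (kWh) * electricity rate ($/kWh).
CHARGE_KWH = 0.0185                              # ~18.5 Wh per full charge (assumed)
RATE_USD_PER_KWH = 0.12                          # assumed electricity rate
cost_per_charge = CHARGE_KWH * RATE_USD_PER_KWH  # ~$0.0022, i.e. ~0.2 cents
cost_per_year = cost_per_charge * 365            # ~$0.81, assuming one charge a day

LED_COST_PER_YEAR = 10.0                         # figure quoted in the comment
print(f"per charge: {cost_per_charge * 100:.2f} cents, "
      f"per year: ${cost_per_year:.2f}, "
      f"ratio vs LED bulb: {cost_per_year / LED_COST_PER_YEAR:.2f}")
```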

by gdhkgdhkvff

4/1/2025 at 10:49:15 AM

How about if you cap the power of the GPU? Modern semiconductors have non-linear performance:efficiency curves. It's often possible to get big energy savings with only small loss in performance.
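The shape of that argument can be sketched with invented numbers (the power/performance points below are purely illustrative, not measured for any real GPU):

```python
# Hypothetical points on a non-linear power/performance curve:
# fraction of stock power limit -> fraction of stock throughput (assumed).
points = {
    1.00: 1.00,
    0.80: 0.95,
    0.60: 0.85,
}
for power_frac, perf_frac in points.items():
    # Energy per image scales as power / throughput.
    energy_frac = power_frac / perf_frac
    print(f"cap at {power_frac:.0%}: {perf_frac:.0%} speed, "
          f"{energy_frac:.0%} energy per image")
```

If the curve really looked like this, a 40% power cap would cost only 15% of throughput while cutting energy per image by nearly 30%. On Nvidia cards the cap itself is set with something like `nvidia-smi -pl 150` (watts), within the range the card allows.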

by mrob

4/1/2025 at 8:23:58 PM

> Generating a single image at 1024x1024 resolution

That's not a very big image, though. Maybe if this were 25 years ago

You should at least be generating 1920x1080, pretend you're making desktop backgrounds from 10 years ago

by bluefirebrand

4/1/2025 at 9:14:14 AM

> Generating a single image at 1024x1024 resolution with Stable Diffusion on my PC takes somewhere under a minute at a maximum power draw under 500W

That's insane, holy shit. That's not even a very large image.

Apparently I was off on my estimates about how power hungry gpus are these days by an order of magnitude.

by facile3232

4/1/2025 at 11:57:42 AM

Why is that "insane?" Drawing the same image in Photoshop, or modeling and rendering it, in the same quality and resolution on the same computer, would require much more time and energy.

by elpocko

4/1/2025 at 12:14:27 PM

> Drawing the same image in Photoshop, or modeling and rendering it, in the same quality and resolution on the same computer, would require much more time and energy.

Right, but there was a point at which we could stop people from doing stupid shit because it's useless and they're bad with money. Now it seems we've embraced irrational and misanthropic spending as a core service.

We honestly just need to take money away from people who obviously have no clue what to do with it. Using AI seems like a perfect signal for people who have lost touch with an understanding of value.

by facile3232

4/1/2025 at 9:17:35 AM

A 1024x1024 image seems like an unrealistically small image size in this day and age. That’s closer to an icon than a useful image size for display purposes.

by nkrisc

4/1/2025 at 9:19:52 AM

I think you're being hyperbolic. On a 1080p screen that's almost the entire vertical real estate. You'd upscale it if you're going to actually use this thing for "useful purposes" like marketing material, but that's not an icon.

by aqme28

4/1/2025 at 10:49:38 AM

A bit, I do admit. But given the ubiquity of 2k+ screens I don’t think it’s entirely hyperbolic. Closer to an icon in size, I meant, not necessarily usage.

by nkrisc

4/1/2025 at 11:46:05 AM

They're not nearly as common as you think.

1920x1080 is still, by far, the dominant desktop and laptop resolution in 2025.

by esseph

4/1/2025 at 1:00:46 PM

> I've read somewhere that generating a single AI image draws as much power as a full smartphone charge.

To put that in perspective, using the 67 kJ of energy for a smartphone charge given in Saigonautica's comment you can charge a smartphone 336 times for $1 if you are paying the average US residential electricity rate of just under $0.16/kWh.

You could charge a smartphone 128 times for $1 if you were in the state with the most expensive electricity (Hawaii) and paying the average rate there of around $0.42.

Saigonautica's battery is on the large side. It's a little bigger than the battery of an iPhone 16 Pro Max. A plain iPhone 16 could be charged 470 times for $1 at average US residential electricity prices.

For most people energy used to charge a smartphone is in the "this is too small to ever care about" category.

We can do a similar calculation for AA rechargeable batteries, and the results might be surprising.

$1 of electricity at the US average residential rate is enough to recharge an AA Eneloop nearly 2300 times. Of course there are inefficiencies in the charger and charging, but if we can get even 75% efficiency that's good enough for more than 1700 charges.

That really surprised me when I first learned it. I knew it wasn't going to be a lot...but 1700 charges is I think more than the number of times I'll swap out an AA battery over my entire lifetime. I hadn't expected that all my AA battery use for my whole life would be less than $1 worth of electricity.
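The charges-per-dollar figures above follow directly from the numbers quoted earlier in the thread:

```python
# Charges per dollar = 1 / (energy per charge in kWh * rate in $/kWh).
PHONE_CHARGE_KWH = 67 / 3600          # 67 kJ -> ~0.0186 kWh
US_RATE = 0.16                        # $/kWh, ~US residential average as quoted
HI_RATE = 0.42                        # $/kWh, Hawaii average as quoted

us_charges = 1 / (PHONE_CHARGE_KWH * US_RATE)   # ~336
hi_charges = 1 / (PHONE_CHARGE_KWH * HI_RATE)   # ~128
print(f"US: {us_charges:.0f} charges per $1, Hawaii: {hi_charges:.0f}")
```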

by tzs

4/1/2025 at 9:41:41 AM

> It would be so insightful to understand to which degree the technology is subsidized and what the actual value/cost ratio is.

It would be insightful for competitors too, because they could use this as part of their analysis and pricing strategies against you.

Therefore, no company would possibly allow such data to be revealed.

And in any case, if these LLM providers burn cash to provide a service to you, then you ought to take maximal advantage of it. Just like how uber subsidized rides.

by chii

4/1/2025 at 8:44:26 AM

feel like if they did this the whole AI bubble would pop

by polytely

4/1/2025 at 9:23:06 AM

It's not just Apple integrating AI into the hardware. Microsoft has been part of a big push to "AI PCs" with certain minimum capabilities (and I'm sure their partners don't mind selling new gear) and the Copilot button on keyboards, and certain Android models have the processors and memory capacity specifically for running AI.

by keyringlight

4/1/2025 at 12:45:00 PM

> It would be so insightful to understand to which degree the technology is subsidized and what the actual value/cost ratio is.

For whom would this be beneficial? The design goals of these products are to get as many users as fast as possible, using it for as long as possible. "Don't make me think" is the #1 UX principle at work here. You wouldn't expect a gas pump terminal to tut-tut about your carbon emissions.

by rchaud

4/1/2025 at 10:18:23 AM

How much energy does it cost for a human to generate an image?

by kosh2

4/1/2025 at 10:35:29 AM

You mean, how much extra energy, compared to what the human was going to do instead? It might be a negative amount. But that might be a bad thing, an artist could get fat.

by card_zero

4/1/2025 at 10:33:40 AM

Shortly after ChatGPT hit the scene, everyone said "Google invented this technology, how could they fall so far behind in commercializing it, haha they're IBM now".

Maybe they didn't fall behind in anything, maybe they just did an analysis of what it would cost to train transformer models with hundreds of billions of parameters, to run inferencing on them, and then decided that there was no way to actually be profitable doing this.

by DebtDeflation

4/1/2025 at 11:02:28 AM

Not with an ad-based monetization model. But what if consumers opened their wallets?

by amelius

4/1/2025 at 11:25:29 AM

There is no what if. The consumers haven't opened their wallets (in time for these companies to survive).

by discreteevent

4/1/2025 at 12:47:56 PM

Ad revenue dwarfs any number that consumer dollars could put up. ChatGPT has been a household name for years, Copilot bloatware is shoved into every Office subscription on earth, and it still runs deep in the red. Ads are the only way.

by rchaud

4/1/2025 at 1:15:57 PM

What if they considered that and determined that at typical consumer SaaS pricing of low tens of dollars per user per month it's "still impossible to make money"? What if they went a step further and looked at typical Enterprise SaaS pricing (low hundreds of dollars per seat per month) and determined "still can't make money"?

by DebtDeflation

4/1/2025 at 10:16:21 AM

I'm using the (more expensive) Gemini 2.5 Pro and it's like talking to an adult again after Claude went all RFK Jr. Brain Worm on me.

People have mentioned on Hacker News that there seem to be kind of "weather patterns" in how hard the various LLMs think, like during business hours they get stupid. But of course there is some disagreement about what "business hours" are. It's one of those "vibes".

Imagine scheduling your life around the moods of AIs.

That's the business model. If you don't want a surly and moody AI with a hangover and bad attitude, you gotta pay more!

Like isitdownrightnow.com for crowd sourcing web site availability, there should be a isitdumbrightnow.ai site!

by DonHopkins

4/1/2025 at 7:25:53 AM

> As the global tech research company forecasts worldwide generative AI (GenAI) spending will reach $644 billion in 2025, up around 76 percent from 2024

I’m having a hard time squaring the number $644 billion and the phrase “extinction phase.”

I don’t believe their actual estimate of GenAI spending but if it’s even in the same ballpark as the real value, that’s not an extinction.

by throwup238

4/1/2025 at 7:36:46 AM

That's the entire point. Bubbles happen when expectations of future valuations drive spending far beyond any reasonable valuation of the thing itself. Then at some point reality hits and it turns into a game of hot potato as people don't want to be left holding the bag.

Pushing towards a trillion bucks a year for what LLMs are mostly currently used for does not seem like a sustainable system.

by somenameforme

4/1/2025 at 7:54:37 AM

Where are they going to get the trillions in revenue to pay any of that back? That's 10% of 2023 US total wages and salaries. Do people really believe it'll replace that much labour?

by pjc50

4/1/2025 at 12:50:46 PM

VC demands their valuation multipliers.

by esseph

4/1/2025 at 9:14:20 AM

Reading through the source [1] they basically get to that huuuuge number by including AI-enabled devices such as phones that have some AI functionality even if not core to their value proposition. That's basically reclassifying a big chunk of smartphones, TVs, and other consumer tech as GenAI spending.

Of the "real" categories, they expect: Services $27bn (+162% y/y), Software $37bn (+93% y/y), Servers $180bn (+33% y/y), for a total of $245bn (+58% y/y).

That's not shabby numbers, but way more reasonable. Hyperscaler total capex [2] is expected to be around $330bn in 2025 (up +32% y/y) so that'll most likely include a good chunk of the server spend.

[1] https://www.gartner.com/en/newsroom/press-releases/2025-03-3...

[2] https://www.marvin-labs.com/blog/deepseek-impact-of-high-qua...

by alexdoesstuff

4/1/2025 at 12:53:53 PM

The $644b number comes from Gartner, who are a 'trends' consultancy, not an accounting firm. It likely includes spending 'pledges', and doesn't account for things like a looming recession and self-inflicted trade war.

by rchaud

4/1/2025 at 9:50:32 AM

I'm an AI-bro, but I think the value of equity in OpenAI or Anthropic is likely zero. They've achieved incredible things with their models, but big-tech only ever seem to be a few months behind and have the economies of scale to make inference profitable. I think both will be acquired with valuations significantly below what was invested in them.

by petesergeant

4/1/2025 at 10:47:22 AM

Ads will eventually make their way into the responses or side bars. It will be interesting (and depressing) to see who does it first and who holds out hoping to squeeze out the ad-supported LLM providers.

by ConSeannery

4/1/2025 at 8:05:35 AM

In the light of this article, it makes sense that OpenAI are taking a "lmao we dont even pretend to care" approach to safety and intellectual property right now.

Altman loudly hyping "look you can ghibli-fy yourself", stating inflammatory things like "we are the death of the graphic designer"; a desperate ploy to rapidly consume the market before the bubble bursts.

by isoprophlex

4/1/2025 at 7:36:46 AM

It took Amazon around six to seven years to see its first profitable quarter, and they still went into the red sometimes when doing major investments thereafter.

by PeterStuer

4/1/2025 at 8:37:53 AM

>It took Amazon around six to seven years to see its first profitable quarter,

A key difference from OpenAI is that Amazon was cash flow positive from very early on, before that first profitable quarter. They only needed one funding round, a Series A of $8 million, instead of repeatedly raising extra rounds from new VC investors.

The Amazon startup already had enough free cash from operations to internally fund their warehouse expansions. The "Amazon had no profits" was an accounting side-effect because of re-investment. Anybody seriously studying Amazon's financial statements in the late 1990s would have paid more attention to their cash flow rather than "accounting profits".

On the other hand, OpenAI doesn't have the same positive cash flow situation as early Amazon. They are truly burning more money than they take in. They have to get billions from new investors to buy GPUs and pay salaries. ($40 billion raised in latest investment round.) They are cash flow negative. The cash flow from ChatGPT subscription fees is not enough to internally fund their growth.

[1] https://en.wikipedia.org/wiki/Free_cash_flow

by jasode

4/1/2025 at 10:43:43 AM

Amazon was founded in 1995 and became cash flow positive in 2002[1], seven years later.

ChatGPT was released in Nov 2022. They are expecting $12B in revenue this year[2].

[1] https://adainsights.com/blog/when-did-amazon-start-making-mo...

[2] https://www.cnbc.com/2025/03/26/openai-expects-revenue-will-...

by nl

4/1/2025 at 11:02:36 AM

>Amazon was founded in 1995 and became cash flow positive in 2002[1], seven years later.

Thank you for the correction. My memory was faulty and Jeff Bezos actually said, "we always had positive gross margins". Deep link: https://www.youtube.com/watch?v=zN1PyNwjHpc&t=36m11s

The positive gross margins allowed enough discretionary use of cash to take out loans and service that debt. I just looked at the 1998 10k filing and the page on "Consolidated Statements of Cash Flows" has "Net cash provided by (used in) operating activities" of positive $31 million compared to negative -$6 million in 1996.

by jasode

4/1/2025 at 11:49:53 PM

The use of debt vs capital is interesting, but one isn't clearly better than the other. You can look at OpenAI's revenue as allowing them to raise investment capital.

To invest in OpenAI is to bet that they won't need to keep investing more in hardware than they bring in. To me that doesn't seem a sure thing, but it isn't obviously wrong either. They are growing revenue very quickly and already have significant cashflow, and it isn't clear to me that they'll need to sustain the CapEx forever.

by nl

4/1/2025 at 1:01:54 PM

This is the textbook definition of survivorship bias.

Amazon out-priced everybody when it arrived, because it didn't charge any sales tax for years, until the laws had to be re-written to close the loophole. It didn't have the eye-watering sums poured into it that AI has had, nor did it have any significant competition internationally. Things couldn't be more different for OpenAI.

by rchaud

4/1/2025 at 8:12:17 AM

I could be wrong, but I believe Amazon's business model was very simple: do everything as cheaply as possible, run at a loss until all competition is dead, and then raise prices once we dominate the market.

I don't think OpenAI has that option.

by paulluuk

4/1/2025 at 11:03:06 AM

This wasn't the business model.

The business model was that you could sell books over the internet much more cheaply than Barnes & Noble or Borders, because Amazon wasn't paying for physical locations and there was no sales tax on internet transactions.

by rongrobert

4/1/2025 at 8:20:06 AM

Also, I think Amazon invested everything into growth, for which there was a lot of potential. Seems different for the AI companies.

by karmakurtisaani

4/1/2025 at 8:15:09 AM

I don't think that's true, even in rural areas without choices Amazon is pretty cheap.

by dukeyukey

4/1/2025 at 10:30:21 AM

Eh, it's more that per-transaction they were profitable, so they needed to scale enough to make up for their fixed costs. OpenAI's fixed costs are enormous, and they're only bringing in pennies from subscriptions.

by delecti

4/1/2025 at 11:03:13 AM

A slow pruning here seems healthy.

The more interesting question to me is how GPU vs TPU plays out. Plus the other NPU-like approaches: SambaNova, Cerebras, Groq, etc.

by Havoc

4/1/2025 at 2:00:35 PM

If only. I think the more likely path is enshittification (ads etc. inside the LLMs).

by internet_points

4/1/2025 at 11:12:08 AM

It sounds like you probably mean OPEX, unless you're explicitly talking about loan payments.

by meltyness

4/1/2025 at 7:17:32 AM

I find it funny that, for something that is pretty much a commodity at this point, adoption seems to be the most important metric.

Yes, there are differences between the models, and yes some may work better.

But picking the model at this point is just picking the cheapest option. For most use cases any model will do.

by siscia

4/1/2025 at 7:30:23 AM

That is not my experience at all.

Models are still leapfrogging each other every month in e.g. coding or research capability, or even in more mundane tasks such as summarizing long multi-topic texts.

Depending on which side of the issue you fall, you're hoping this will go on for a long time to come, or praying that it will end asap.

I'm not using the cheapest in either my own work or my production systems.

by PeterStuer

4/1/2025 at 8:07:38 AM

from what I've seen the "leapfrogging" is very very incremental.

They all seem to be racing to the plateau... It doesn't look like there will ever be a "stand out" leader; the product each company presents to the market is essentially the same as everyone else's, maybe with some slight twist that is easily replicated or exceeded within a few months.

This is the issue really. At some point the investors are all going to realize that none of their investments are going to be market leaders. When they get to that stage the bubble will well and truly pop.

by senectus1

4/1/2025 at 9:41:16 AM

> They all seem to be racing to the plateau.

To me it feels there is no plateau and the models are already very useful and impactful.

I believe there is no plateau because there is nothing objectively special or magical about the human mind and it all can and will be eventually solved, one hack at a time.

by raducu

4/1/2025 at 11:14:13 AM

Some of the LLMs' capability seems to be getting lost to benchmark hacking.

Claude 3.7 is a great example: a model clearly beating 3.5 in all benchmarks, but slowly destroying my code base by adding lots of extra lines or hacking around my instructions (adding "if" statements when I want it to change the code to actually handle a case, instead of understanding what change is really needed).

I still prefer o1 pro; a lot of this leapfrogging in benchmarks doesn't translate to being smarter anymore.

by xiphias2

4/1/2025 at 7:51:10 AM

There are a ton of users that just want help generating emails, or adding some stock photo-like images to a blog post.

If the choice is between something that costs $10 a month or $20 a month, and both solve those use cases, it's rational to pick the cheap one.

by nitwit005

4/1/2025 at 8:07:32 AM

And there will probably also be the choice of something free or a trial... which means even less money spent.

by Ekaros

4/1/2025 at 9:27:39 AM

Adoption is critical for these LLM corporations because, unlike in other industries, free tier users incur almost the same costs as paid tier users. They really can't degrade the free tier experience too much, or their customers will flee to the competitors. I've seen someone calculate these companies' expenses, and they are truly insane by now and constantly rising.

by Yizahi

4/1/2025 at 7:22:33 AM

> and yes some may work better.

Isn't that where the cost lies? Data, annotation, and model generation all have mostly linear responses to changes in spending.

> For most use cases any model will do.

They'll operate. They will not produce reliable results. Adoption is one metric, but intentional avoidance should be another.

by timewizard

4/1/2025 at 9:19:11 AM

Fully agree!

Being close to the edge of AI usage, it's important to realize that most AI use cases are not "fully autonomous AI software engineer" or "deep research into a niche topic" but way more innocuous: Improve my blog post, what's the capital of France, what are some nice tourist sites to see around my next vacation destination.

For those non-edge use cases, costs are an issue, but so are inertia and switching costs. A big reason OpenAI and ChatGPT are so huge is that it's still their go-to model for all of these non-edge use cases as it's well known, well adopted, and quite frankly very efficiently priced.

by alexdoesstuff

4/1/2025 at 7:59:18 AM

How do you compose the cheapest models to create a software engineer?

by acchow

4/1/2025 at 3:28:38 PM

You don't have to create a real software engineer, you just have to create one that looks close enough to get some executive his bonus and won't fall over before he's moved on to another company...

by dickersnoodle

4/1/2025 at 10:36:24 AM

and how much value would be lost?

by seydor