5/30/2026 at 1:49:18 PM
The abrupt swing in many non-technology company IT departments from "hey developer, you aren't using enough tokens" to this is just too funny.
And I'm seeing almost no self-awareness from leaders. They are making decisions about things that they just don't understand. And are completely unworried about it. Just blindly following whatever the news cycle is about AI.
by tyingq
5/30/2026 at 1:52:54 PM
The closer people live to the consequences of their decisions the more rational they become. Until leaders (and I use that term loosely) are held accountable, the insanity will continue.
by datakan
5/31/2026 at 12:44:06 AM
In addition to being true, this observation is profound. When designing any multi-step system that relies on humans making decisions, whether in governance, organizations or economies, placing root causes as close to end effects as possible is almost always better.
by mrandish
5/30/2026 at 2:03:03 PM
Their only accountability is to the stock price. The insanity will continue.
by greesil
5/30/2026 at 3:01:42 PM
As long as our stock price continues to... Continues to rise... Which... Hmm... I'm just now reading our balance sheet. Is this number right? Great, thanks.
As I was saying, you're all fired.
by dfedbeef
5/30/2026 at 8:23:15 PM
I’m willing to bet that most of us here are capable of acquiring pitchforks and torches.
I predict that will be their comeuppance; it will begin a new era in history.
by Henchman21
5/30/2026 at 9:08:15 PM
[dead]
by cindyllm
5/30/2026 at 2:28:59 PM
I’m sorry you are used to working with out-of-touch leadership. Not all companies are like that. Even big ones can have smart, empathetic leaders. Although very often money gets in the way of empathy.
by oofbey
5/30/2026 at 3:21:04 PM
Money also has the problematic tendency to warp the people around you; it's its own kind of gravity. The more powerful you are, the more you attract yesmannerism and the more you lose touch with what's going on.
by rf15
5/30/2026 at 7:30:10 PM
Also notably these attributes don’t make one infallible. I see a lot of engineers judging from the sidelines without any sense of how to run large orgs and how you have to make tough calls with imperfect info all the time.
by therealdrag0
5/30/2026 at 5:55:52 PM
You hiring?
by pdimitar
5/31/2026 at 3:08:51 PM
Being out of touch is the default state for leadership. They mostly just parrot the news with a multi-month lag.
by OutOfHere
5/30/2026 at 4:17:58 PM
I've been enjoying journalist Ed Zitron's recent diatribes about how impossible it is to find a business leader who had a plan for measuring their ROI from adopting AI coding.
What he says he's consistently hearing from them mirrors what I saw at my own employer: they thought they had ROI metrics, but they actually only had usage metrics such as "lines of code committed" or "number of pull requests". The only way those could possibly work as an ROI measure is if your business charges customers by the line of code.
by bunderbunder
5/30/2026 at 10:20:59 PM
What this really means is that they had no valid metric for measuring developer productivity before, either. AI or not.
by conception
5/31/2026 at 12:54:19 PM
Measuring productivity of developers isn’t really in line with what needs to happen, either. A team can be incredibly productive and still generate negative 100% ROI if what they are building so industriously is stuff that nobody wants to buy.
Which reflects another thing I’ve seen at work. A lot of what AI coding has enabled is diving headfirst into quagmires. Our costs have spiked - not just because of the token spend, also because we gotta pay the cloud platform to run all these new services, operators to operate them, marketers to market them, etc. - but revenue hasn’t budged.
by bunderbunder
5/31/2026 at 2:58:52 AM
But at least pre-AI, most managers presumably subjectively measured devs on relevant performance. Using systems where employees who burn the most tokens ($) per week ‘win’ is crazy - just ask the AI to spin up subagents to implement every conceivable approach to a task, then spin up an agent judge to pick the winner, and repeat. You've immediately got 50x or whatever your previous usage from that alone.
by no-name-here
5/31/2026 at 3:19:40 AM
I had cynically done this sort of tokenmaxxing for a while as a burnt offering to the token-hungry non-leadership.
Eventually I got tired of it and got back to work.
by canyp
5/30/2026 at 2:07:29 PM
Groups resist change - the bigger the group, the more resistance there is.
As a leader, pushing for rapid change cannot really be nuanced, lest the push dissipate into the organization's entropy.
by sdeframond
5/30/2026 at 2:33:01 PM
Perhaps, but the change you get (if any) is most likely to be what you push for and reward/punish.
It's irrational to push for tokenmaxxing (literally "please increase our AI spending") and not expect that this is the result you are going to get. You won't get productivity increase, since that is not what you are pushing for - you will get token usage maximization (engineers running inane agentic tasks against your code base to increase usage, using company paid AI for their side projects, etc, etc).
by HarHarVeryFunny
5/30/2026 at 3:37:08 PM
The evidence suggests that many tech leaders do not realize that an immediate result of heavy-handed, uninformed, top-down decision making is transforming the “work together, succeed together, giving quality” ethos into a cynical game-theory minimax effort to game whatever stupid arbitrary metrics are used to implement the top-down fad of the quarter. Do it consistently and you get a workforce that can be given a metric and immediately, instinctively, tell you how the workflow will be adjusted for the new metric, and where the difficult problems will be shunted to.
by lanstin
5/30/2026 at 2:41:43 PM
I'm not sure the leaders would disagree with what you're saying. They tokenmaxxed to understand what it looks like when AI gets into every corner of the business; now they feel they've gotten enough info (or at least that more info wouldn't be worth the cost), so they're adding in cost controls. As the article says, this is not great for AI model providers trying to predict what their future revenue is going to be, but it's not obvious that there's any mistake here for AI users.
by SpicyLemonZest
5/30/2026 at 3:01:17 PM
> They tokenmaxxed to understand what it looks like when AI gets into every corner of the business
Perhaps that is what they were trying to do, but the reality is that all they will have got is a large token bill. The decision makers may have hoped that tokens would be used in the most productive fashion possible so they could evaluate if the cost was worth it, but what they will have actually got is what they asked for and measured: high token usage (applied to whatever people needed to do to get their usage stats up, regardless of productivity).
The other business-as-usual factor is that there will be false reporting up the chain, so if the company understands the CEO wants to see high AI usage and productivity gains, then s/he will see high AI usage (a large token bill) and will be fed success reports of corresponding productivity gains.
In a typical corporate environment, if all your peers are reporting success, achieving what the CEO wants, do you want to be the only one reporting failure? So - everyone reports high AI usage (easy for the employees to make happen), and most everyone also reports productivity gains if they understand this is the expectation.
by HarHarVeryFunny
5/30/2026 at 3:44:40 PM
I’m imagining a lot of programmers suddenly being given the impossible task of reporting what worked and what didn’t, and middle management making up some retrospective evaluation with fat PowerPoint decks and meaningless graphs in an effort to present to C-levels some measures of success other than token use.
by cratermoon
5/30/2026 at 3:52:07 PM
As the saying goes "figures can't lie, but liars can figure".
If you want to report productivity gains or cost savings from some initiative (increased AI usage or whatever) and need some stats to point to, then you just point to whatever is working, for whatever reason, and attribute the success to the new initiative.
In a company I used to work for, one manager, when pushed to increase machine learning usage (a few years back, before ML became AI), just renamed his product from foo to foo-ML (with ZERO ML usage), and reported how well it is working. He has since been promoted twice.
by HarHarVeryFunny
5/30/2026 at 3:40:17 PM
It’s not clear companies were measuring anything but token usage. What information could leadership have collected to determine what worked, what didn’t, and what needs more data? Other than the balance sheet and revenue, do companies actually have sufficient information to understand the results?
by cratermoon
5/30/2026 at 3:59:56 PM
Were they trying to measure other things? Definitely. The COO at Uber, one of the examples in the source article, has talked publicly about how they've searched for (and so far failed to find) a link between micro-level metrics driven by AI and concrete improvements in high level project velocity.
Do these measurements have sufficient information? As much as any, I'd guess. It sounds like you already know that it's pretty hard in general to measure the productive output of software development organizations.
by SpicyLemonZest
5/30/2026 at 9:09:23 PM
I have no doubt a few companies, like Uber, were measuring other things and had applicable metrics in place before adopting Clod or CoPilot or whatever automation. I'm speaking in the general sense of companies adopting the latest hype without reflection.
by cratermoon
5/30/2026 at 1:55:02 PM
I feel like most successful businesses have such a moat of required capital to compete with them that, even though in theory poor decisions like this are supposed to give entrepreneurs opportunities to strike when the big dogs make a wrong move, it doesn't end up happening.
by qoez
5/30/2026 at 5:27:03 PM
> leaders
Don’t play their game and call them leaders. They are management, bosses, executives.
> They are making decisions about things that they just don't understand. And are completely unworried about it.
Clowns, even.
> Just blindly following whatever the news cycle is about AI.
But followers might be most apt.
——
This is such a huge pet peeve of mine. Describing management goofs using their language that makes them sound all-so-brilliant. We constantly watch these people do the dumbest shit and then they go around describing themselves as “thought leaders” and “servant leaders”. When, really, most are just clowns with fragile egos.
And, while I’m rambling, they’ve tried to take away the fact we are workers by calling us individual contributors. Using language to attempt to hide the hierarchy and power dynamic at play. It just…bothers me so much.
by morgan814
5/30/2026 at 5:59:23 PM
I don't hear them refer to themselves as "job creators" much these days.
And many of them still claim they are "risk takers", but have effectively insulated themselves from risk by socializing losses.
by joquarky
5/31/2026 at 2:49:19 PM
> Don’t play their game and call them leaders. They are management, bosses, executives.
You're falling into a common trap here: the ambiguity of the English language.
"Leader" means multiple different things. Yes, it means someone who has leadership qualities—who genuinely inspires those around them to do better, or who boldly marches into the unknown and gets people to follow them.
But it also means "someone in charge of a thing."
Now it's certainly true that many people in charge of things who are also really bad at actually inspiring or getting people to follow them (aside from with threats of destitution) also play on that ambiguity to try to convince people that because they're in charge of things, they must also be Good Leaders, and that's crappy...but yelling at others for using the term casually is very much an "old man yells at cloud" situation.
by danaris
5/30/2026 at 2:43:34 PM
During ZIRP they discovered that the way to lead companies nowadays is to become a maxxer of whatever the current fad is, and the more you maxx the better. And then when things change and you're wrong, you'll be a strong leader and, in ZIRP's case, fire everyone you over-hired; with AI it will be similar.
Why be a normal guy that waits to see what happens and is measured and pragmatic when you can get attention basically through the whole cycle by being the earliest adopter, adopt it to the maxx, then also be the loudest big brain when the tide changes and be praised for "taking hard decisions" when you revert everything you said so far?
The fakemaxxing economy.
by vasco
5/30/2026 at 3:31:53 PM
A special case of the more general cringe economy we're in. The dumbest, most outrageous ideas win, amplified by social media. Say stupid sh*t loudly, be wrong, profit.
by janussunaj
5/30/2026 at 2:41:33 PM
That's nothing new though. It's just very obvious this time.
by steve1977
5/30/2026 at 2:54:30 PM
I've never seen self-awareness from leaders. They always lead on vibes.
Understanding this was one of the most important things in my career.
by surgical_fire
5/30/2026 at 3:51:35 PM
Having studied control theory, I think it makes perfect sense. When trying to make a system target a new level, it's quite natural for there to be overshoot that needs to be reined in. It's also natural for the correction to go too far and need to be corrected in turn. This is not indicative of stupidity; it's completely normal.
It would only be laughable if they waited way too long to reverse course, but I don't think that's the case.
by im3w1l
5/30/2026 at 8:59:09 PM
Suppose I'm driving at 20 kph, and I set my cruise control to 40 kph. My car then goes WOT, overshoots my target speed and hits 120 kph, at which point it slams on the brakes[0], dropping my speed to 15 kph. It repeats until it finally settles at my target speed. (Rhetorical question) would that be considered "completely normal"?
Over/undershoots and corrections are of course unavoidable and normal; the absurdity is in the magnitude and rate of change. Furthermore, this is giving it the benefit of the doubt, that measuring AI spend is a good indicator; that's arguably also in dispute. To stretch my car analogy a bit more: it would be like the cruise control system has to hit the target speed, but it only has data from the O2 sensors.
[0] I know that the "classic" cruise control system cannot apply the brakes, but hey no analogy's perfect.
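As a rough illustration of the overshoot being argued about here, the sketch below simulates a textbook second-order system stepping from 20 to 40 and compares a lightly damped response with a critically damped one. The model, the function name `peak_speed`, and every parameter value are assumptions made for illustration; it is not a model of a real cruise controller or of any company's AI spending.

```python
def peak_speed(start, target, damping, omega=1.0, dt=0.001, t_end=60.0):
    """Peak value reached by a toy second-order system after a step.

    Hypothetical model: v'' = -2*damping*omega*v' - omega**2 * (v - target),
    integrated with semi-implicit Euler steps of size dt.
    """
    v, dv = start, 0.0
    peak = v
    for _ in range(int(t_end / dt)):
        acc = -2.0 * damping * omega * dv - omega ** 2 * (v - target)
        dv += acc * dt   # update the rate of change first
        v += dv * dt     # then the speed, using the updated rate
        peak = max(peak, v)
    return peak


# Lightly damped: the textbook overshoot fraction is
# exp(-pi*z / sqrt(1 - z**2)), roughly 73% of the step for z = 0.1.
underdamped_peak = peak_speed(20.0, 40.0, damping=0.1)

# Critically damped: approaches the target without passing it.
critical_peak = peak_speed(20.0, 40.0, damping=1.0)

print(f"damping 0.1 -> peak {underdamped_peak:.1f} kph")
print(f"damping 1.0 -> peak {critical_peak:.1f} kph")
```

Some overshoot is a property of the damping, not of changing the target as such, which is roughly the distinction both sides of this exchange are drawing: the disagreement is over how much overshoot is reasonable.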
by RJIb8RBYxzAMX9u
5/30/2026 at 9:05:46 PM
It's not like they accidentally overshot; they were telling people to tokenmax. They didn't even know you could overshoot; they thought it was exponential gains all the way. Subtle ideas like balance were not on their minds.
by adammarples
5/31/2026 at 3:41:30 AM
Intentionally overshooting can be a legitimate strategy.
by im3w1l
5/30/2026 at 1:52:14 PM
The actual cost is going to drop 99% in ~4 years.
How much of that makes it into enterprise pricing is TBD, since none of the hyperscalers are making money yet off selling AI inference.
Almost all businesses are ahead of the gun. For most of their use cases, AI is either not yet good enough on its own, or good enough but too expensive.
No one wants to get left behind, so everyone's trying to get onto it now, even though it's not ready for what most enterprises want to do with it.
It's easy for them to look at a small startup without billions of lines of legacy business logic debt and see them having success and wonder why they can't have just as much - or more - why they're bigger so they should have better and more success, right???
Wrong...
But when it gets ~99% cheaper for local inference over the next 4 years, at the same time the price per watt improves 4x -> a lot of those cases will start to pencil out.
by onlyrealcuzzo
5/30/2026 at 2:27:46 PM
Going from Opus 4.5 to 4.7 secretly required 6x more compute to run. 4.8 is apparently 30% more on top. I haven't seen any optimizations lately aside from distillation. Nobody's optimizing, they're just scaling up.
by BearOso
5/30/2026 at 2:43:59 PM
> Nobody's optimizing
The Chinese, since they lack computing hardware due to US export controls, are.
by rescbr
5/30/2026 at 2:51:46 PM
And our export controls are going to turn China into a winner in the AI arms race if we're not careful.
by trollbridge
5/30/2026 at 7:02:48 PM
I retired a few years ago, but I still write a fair bit of code. I was using Copilot's code completion before I retired, but coding agents hadn't come around yet. I've been wanting to try them, but I kept putting it off, and now the price increases make it hard to justify.
So I just started trying CodeWhale (https://github.com/Hmbown/CodeWhale) with DeepSeek V4. I expected to be impressed by the abilities (which still require plenty of oversight). I didn't expect to be completely shocked by how cheap it is. After most of a week of using it 4-8 hours a day, which would amount to a full week of coding in many jobs after you account for non-coding activities, I'm about to hit $3 in total usage. So we're talking $10-20 per month for single-agent use by a full time software developer? And I'm sure some of my usage is waste as I'm still getting my head around things like compaction. If I take a break for a few weeks, I pay nothing because there is no subscription.
If DeepSeek and Xiaomi MiMo stay within a few months of the US-based models in terms of capabilities and US companies don't figure out how to drastically cut prices, I can't see how China hasn't already won. Protectionism would be one reason, but that might be ceding 50-90% of the total addressable market, and bring us closer to moving knowledge work out of the US the same way we did with manufacturing because it's too expensive in the US.
by rented_mule
5/30/2026 at 9:51:05 PM
Holy F.. $3 .. once I'm done with my base cursor allocation, each nontrivial question costs $5. And yes, I'm now switching to a mix of codex and ds4pro
by zzleeper
5/31/2026 at 12:21:59 AM
How are you using it? More to complete specific functions or scripts, or for larger architectural design and longer implementation runs?
by sgc
5/31/2026 at 8:30:57 AM
My initial use was in a repo where I create models for 3d printing using a library called build123d. There are a handful of parametric models and then many instances of those models with parameters (one that's 24 mm in diameter with a cutout, another that's 42 mm in diameter but no cutout, etc.). I tend to be in a hurry when I want a new parametric model, so I've ended up just copying the one that's the most similar and changing what I want to be different.
The first big task was to find the common bits and abstract them out. It did a great job of creating a plan, summarized in a table, that gave a name to shared chunks, the line numbers in various files where they appeared, line counts of new functions vs. removed bits, and some pros/cons about splitting out each chunk. It was very well "thought out", so I told it to go ahead. It did a nice job other than straying from my coding conventions. That gave me a chance to build out my AGENTS.md file (it helped with that, too).
Once that was done, I had it create automated tests for the newly abstracted parts. I think this is probably a bad practice... I believe humans should at least define what the tests are testing so that there is a deeper understanding of what oversight is in place. But I was just trying things. It surprised me how well it did. The biggest surprise was that the tests seemed quite inspired by vision. It would try different parameters and then have comments about making sure the shape protruded in a certain way, then code that did that. I expected it to refactor a bunch of the code to make it more testable. It found a way to not touch the code while testing everything I asked it to with just two simple mocks - I hadn't foreseen that, but it felt quite practical. It was passing around several opaque tuples in the tests and accessing items in them by index. I prompted it to replace the first one with a frozen, kw-only dataclass. Then a second. On the second request, it saw the pattern and did the rest without me asking. It created 44 tests across a handful of files.
The next part is where I was the least happy. I use ruff and ty to check my code with almost all checks enabled. It was mostly good about the ruff issues. But for the type checking, it just wanted to disable 6-8 rules for the entire repo in pyproject.toml, or at least for all the tests. I had to repeatedly tell it not to and it kept telling me it wasn't recommended. When it finally gave in, it fixed most of the type issues (build123d has lots of types specified, but many operations result in type conflicts because things are so deeply overloaded). For the things it didn't fix, it just left a comment to ignore type checking altogether on that line. After I did a little more browbeating, it finally changed the comments to only disable specific rules. To be fair, and unlike most of my other repos, I've had to spend way too much time getting types right in this repo myself.
My last task involved a small library management system for our little town library (tracking library cards, books, DVDs, check-outs/check-ins, etc.). I inherited it from someone who had built the entire web app out of bash/awk/troff scripts with the data in text files burdened by a lot of schema changes that he didn't really know how to deal with. I'm halfway through moving it to Python/FastAPI/SQLite. I asked it to do a security audit of the entire code base, both the newer parts and the old parts that are still in bash/awk/troff. It found everything I knew about and a few things I didn't know about. It made a decent assessment of the risks/impact of each issue. It also called out design decisions that were good security practices. One of the next big tasks will be to see how it does at continuing the migration - it has enough examples of how I've done it that I suspect it can do something fairly consistent with my thinking. I'll probably have it do one or two web pages. When I feel like it understands what I'm after, I'll tell it to use sub-agents to do the rest. I'll be very happy if I don't have to tease apart any more troff scripts that are generating PDF files!
by rented_mule
5/30/2026 at 2:51:31 PM
DeepSeek and Alibaba would like to have a word.
by trollbridge
5/31/2026 at 2:02:47 PM
Hasn't everything DeepSeek and Alibaba created thus far been distilled from the results of many, many accounts logging into Claude and ChatGPT? And that's why there's so much bot detection now at US frontier labs? Doesn't that make the Chinese labs dependent until some unknown point in the future on advancements of US frontier labs? While what they currently provide is cheap, it seems like it's artificially cheap and somewhat static because they took others' intellectual property (no comment needed about US frontier labs stealing the world's knowledge... that's a separate topic).
by whatthesmack
5/31/2026 at 3:44:56 PM
> Hasn't everything DeepSeek and Alibaba created thus far been distilled from the results of many, many accounts logging into Claude and ChatGPT?
I doubt it is really any different to what the US labs do [1]. I never really bought the "they were basically all just distilling from us" shtick from Anthropic, I just assumed they were either comparing or also creating training data as basically any lab is doing.
[1]: https://www.reddit.com/r/ClaudeCode/comments/1tqaist/opus_48...
by NekkoDroid
5/30/2026 at 3:26:52 PM
[dead]
by new_account_102
5/30/2026 at 1:56:33 PM
> The actual cost is going to drop 99%
Do you mean the marginal cost by the producer, or the cost on the consumer? I can't see the price of electricity falling much, and the demand curve is apparently exponential if the hype is to be believed.
by krona
5/30/2026 at 2:51:21 PM
DeepSeek V4 Pro is 99% cheaper than similarly performing models were 2 years ago (if such a model even existed).
Computing has always been about how to wring out more efficiency. The ENIAC was 150,000 watts, with 3 phase 240 volt power, and cost about $500,000.
My day to day laptop (a year old) is 35 watts, with 1 phase 20 volt power, and cost $1,000, so that's 99.98% less power consumption, 99.8% cheaper, and it has about 10 orders of magnitude more computing power, all over a time span of 80 years.
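For anyone who wants to check the percentages, here is a minimal sketch using only the figures quoted above. The helper name `percent_reduction` is made up for illustration, and the dollar amounts are taken at face value as given in the comment, with no inflation adjustment.

```python
def percent_reduction(old, new):
    """Percentage by which `new` is smaller than `old`."""
    return (1.0 - new / old) * 100.0


# Figures as quoted in the comment (nominal dollars, not inflation-adjusted).
eniac_watts, laptop_watts = 150_000, 35
eniac_cost, laptop_cost = 500_000, 1_000

power_cut = percent_reduction(eniac_watts, laptop_watts)
cost_cut = percent_reduction(eniac_cost, laptop_cost)

print(f"power: {power_cut:.2f}% less")  # 99.98% less
print(f"cost:  {cost_cut:.1f}% less")   # 99.8% less
```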
by trollbridge
5/30/2026 at 3:46:00 PM
Moore’s law is dead.
by cratermoon
5/30/2026 at 11:29:00 PM
It died before AI came around and today's coding agents are somewhere upwards of twice as competent as whatever the state of the art of automatic coding was in 2020. 8I
by HappMacDonald
5/31/2026 at 1:20:13 AM
A good chunk of that was one-time gains from shifting GPU and memory architectures to better match what LLMs need at scale as well as some algorithmic improvements. Most of the low-hanging architecture optimization has already been harvested. We'll certainly have more algorithmic gains but the consensus is they'll generally be smaller and less frequent.
There's always a chance we'll have some dramatic gains far larger than DeepSeek's optimizations a year ago, but it hasn't happened again yet at even that scale. It would be nice but I certainly wouldn't count on it.
by mrandish
5/30/2026 at 1:56:48 PM
I don't see how this is even remotely true. Unless there's some super breakthrough into a fundamentally different architecture, there's not really a path to a 50% reduction in price, much less a 99% reduction.
by packetlost
5/30/2026 at 3:52:13 PM
In fairness, I think _current_ capabilities will be cheaper. So the models of today will be run drastically cheaper in 4 years.
by kilroy123
5/30/2026 at 2:34:58 PM
And yet 90% drops for the same level of quality every 18 months have happened like clockwork...
And the technology already exists on the algorithmic front TODAY to lock in another 10x gain -> when, typically, algorithmic gains only account for ~30% of that drop and the other ~70% comes from better data (often synthetic) and knowledge distillation from frontier models.
Just look at DeepSeek's pricing...
by onlyrealcuzzo
5/30/2026 at 1:53:45 PM
What makes you think prices will drop? Everyone I’ve spoken to believes they will only skyrocket. Genuinely curious
by datakan
5/30/2026 at 1:59:24 PM
The technology already exists now on the algorithmic front for the next 10x drop between everyone adopting DeepSeek's MLA, MoE (mostly already done), Medusa (a better version of Google's speculative decoding), Kimi's Attn Residuals, and Mimo's Sliding Window Attn, and (possibly) Microsoft's 1.58b (this may be a nothing burger).
Historically, every 18 months, the price for the same level of quality has gone down 90%.
See: https://www.reddit.com/r/LocalLLaMA/comments/1gpr2p4/llms_co...
And Chart 13 here: https://www.rdworldonline.com/ais-great-compression-20-chart...
And here: https://epoch.ai/data-insights/llm-inference-price-trends
Historically, algorithmic gains are only ~30% of the pie, but there's enough out there to get to 10x, with just what's available already. The other ~70% of the pie is better training data (often synthetic) and distilling frontier knowledge. There's no sign we are tapped out on that front.
Additionally, GRAM (from ~10 days ago) is likely to be a 5-10x on its own (if not substantially more for smaller models). It's unlikely within 4 years LeCun's JEPA ideas and similar ideas like GRAM applied to LLMs have ZERO impact. The preliminary results are absolutely astounding (5000x better reasoning - this is not peanuts).
Further, that's not even counting that cost per watt is still dropping ~2x every 2 years on its own on the hardware front.
If you look at the "cost" of inference. People think it's electricity - but it's currently almost ~80% hardware amortization. The memory shortage is not going to last, nor are Nvidia's ~80-90% margins.
The human brain is still 8-10 orders of magnitude more efficient than the best LLMs of today. With ~1/10th of global capex riding on AI, if you don't think they're going to knock off 2 orders of magnitude more, when it's this obvious and easy... I don't know what to tell you...
Sure, it might take 6 years instead of 4. My crystal ball isn't perfect.
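To show how those rates compound, here is a small sketch that simply extrapolates the trends claimed above (90% cheaper every 18 months, hardware cost per watt halving every 2 years). It assumes the trends continue unchanged, which is exactly the point under dispute in this thread, and the function names are made up for illustration.

```python
import math


def remaining_fraction(drop_per_period, period_months, horizon_months):
    """Fraction of the original price left if the same drop keeps repeating."""
    return (1.0 - drop_per_period) ** (horizon_months / period_months)


def months_to_reach(total_drop, drop_per_period, period_months):
    """Months of a steady trend needed to reach a given cumulative drop."""
    periods = math.log(1.0 - total_drop) / math.log(1.0 - drop_per_period)
    return periods * period_months


price_left_4y = remaining_fraction(0.90, 18, 48)   # claimed price trend
price_left_6y = remaining_fraction(0.90, 18, 72)   # the "6 years instead of 4" case
hw_left_4y = remaining_fraction(0.50, 24, 48)      # claimed cost-per-watt trend

print(f"4 years: {100 * (1 - price_left_4y):.2f}% cheaper")
print(f"6 years: {100 * (1 - price_left_6y):.2f}% cheaper")
print(f"hardware alone over 4 years: {1 / hw_left_4y:.0f}x cheaper per watt")
print(f"months to reach a 99% drop: {months_to_reach(0.99, 0.90, 18):.0f}")
```

Under those assumptions a 99% drop is two consecutive 90% drops, so the 4-year figure already includes some slack; whether the 18-month trend actually holds is a separate question the arithmetic cannot answer.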
by onlyrealcuzzo
5/30/2026 at 2:41:04 PM
Sure, the price will come down a lot, even if we can argue about the timeline.
I think what will also happen, once we get past this current CEO AI FOMO mania, is that companies will start to look at AI spending more rationally like any other company expense, and will revert to more rational decision making.
Even if the cost comes down considerably over the next few years, that's plenty of time for companies to look at their financial results and question why AI expenditure isn't resulting in increase in revenue and/or profitability.
by HarHarVeryFunny
5/30/2026 at 2:25:48 PM
This is great food for thought, thank you
by datakan
5/30/2026 at 2:40:13 PM
Additionally, on the context front -> all the labs are aware that for many tasks you can get 10x+ increases in output quality by feeding better context.
See https://arxiv.org/abs/2604.04364.
This won't really show up in benchmarks, but it will impact real world usage on the most common use cases.
I'm doing a study right now on the impacts of better context for small models to fix bugs.
A very dumb algorithm can make small models perform at 10x+ model sizes. I'll be surprised if it can't get to 20x+
by onlyrealcuzzo
5/30/2026 at 2:39:04 PM
I didn't take you seriously initially but after reading this, I think you are the real deal.
Thank you for sharing this and for having the intellectual courage to hold to a sound reasoning that may be unpopular initially.
by rednb
5/30/2026 at 2:44:36 PM
This is mostly slop. But you may be directionally correct
by Nimitz14
5/30/2026 at 9:48:21 PM
> The actual cost is going to drop 99% in ~4 years.
And fusion power is just 2 decades into the future!
by AllegedAlec
5/30/2026 at 10:06:50 PM
Full self driving guaranteed here before the end of the year (every year).
by jjav
5/31/2026 at 1:03:20 AM
> The actual cost is going to drop 99% in ~4 years.
We have little visibility into current frontier model costs at mass scale. As a broad historical trend, tech costs tend to fall over longer time periods but your claim far exceeds Moore's Law rates in its heyday - and that heyday is long gone.
In 2021 TSMC announced it was increasing its price per gate for new nodes for the first time in its history. In the past five years cutting edge nodes have delivered ~8-15% real-world performance gains on average at costs at least 10-20% more than the last node. If you're positing a string of unprecedented efficiency breakthroughs in LLM algorithms - such extraordinary claims require extraordinary evidence.
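As a back-of-the-envelope check on what those node figures imply, the sketch below combines the quoted ranges into a change in performance per dollar. Treating the performance gain and the cost increase as directly comparable ratios is a simplifying assumption, and `perf_per_dollar_change` is a made-up helper name.

```python
def perf_per_dollar_change(perf_gain, cost_increase):
    """Relative change in performance per dollar from one node to the next."""
    return (1.0 + perf_gain) / (1.0 + cost_increase) - 1.0


# Ranges quoted above: ~8-15% more performance at 10-20% more cost.
worst = perf_per_dollar_change(0.08, 0.20)  # low gain, high cost increase
best = perf_per_dollar_change(0.15, 0.10)   # high gain, low cost increase

print(f"worst case: {worst:+.1%} performance per dollar")
print(f"best case:  {best:+.1%} performance per dollar")
```

On these numbers a new node lands somewhere between about 10% worse and about 5% better per dollar, which is why the hardware trend alone does not get anywhere near a 99% drop.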
by mrandish
5/30/2026 at 1:56:47 PM
Prices have been very obviously trending up, not down. Even open weights models are becoming more expensive with every release. Computer hardware is ballooning in price.
by bakugo
5/30/2026 at 2:44:10 PM
Prices are going up for BETTER quality -> not for the SAME level of quality.
People are willing to pay more for BETTER quality.
You obviously haven't seen DeepSeek v4 Pro's pricing if you think pricing only goes up...
by onlyrealcuzzo
5/31/2026 at 10:58:32 AM
Maybe so, but that becomes irrelevant when you consider that the new, better quality instantly becomes the expected baseline. So the price of the "baseline" quality is going up regardless.
Let's look at GPU prices as an example. Around 12 years ago, I bought a GTX 970 for around $350. That was considered a very good GPU at the time. Today, the "equivalent" GPU model (RTX 5070) now costs almost double. Of course, the newer GPU is much more powerful (more than double, in fact), but all the things you'd use a GPU for have also advanced and now expect an entirely new level of performance as a baseline, such that the older GPU is fairly worthless today. So most people agree that GPUs in general have become more expensive.
Regarding DeepSeek's price: it's obviously subsidized, and unlikely to match the actual inference cost right now.
by bakugo
5/30/2026 at 2:24:40 PM
Just wait for the next model and the next model architecture. Just wait for it, bro.
by abalashov
5/30/2026 at 5:31:41 PM
Gemini 3.5 flash is 25% cheaper than 3.1 pro, and outperforms it on almost every benchmark, most by a pretty wide margin...
by onlyrealcuzzo
5/31/2026 at 4:15:56 AM
It's still 5x more expensive than 2.5 flash
by Rebelgecko
5/30/2026 at 7:46:40 PM
Cool.
by abalashov
5/31/2026 at 2:23:56 AM
There has never yet been a new model which actually improved over the previous ones. They suck just as much, and in the same ways, as the models of 3 years ago.
by bigstrat2003
5/30/2026 at 2:55:03 PM
Grab a 5090 and run Qwen 3.6 35b on it (6 parameter seems to work best for me).
Then buy $10 (or $2, if you're cheap, and they take PayPal) of DeepSeek credits.
Whilst you're at it spring for a Claude subscription too and GPT.
Switch models between Qwen, DeepSeek Flash, DeepSeek Pro, and you can meet 99% of your code generation needs.
Hop over to Opus 4.7 (or 4.8, but I haven't really used it yet) and GPT-5.5 when doing very complex architecture/design or troubleshooting something where DeepSeek Pro is getting stuck.
It is ridiculous how cheap this stuff is now. It's affordable at third world prices.
by trollbridge
5/31/2026 at 3:30:23 AM
None of that is cheap.
> spring for a Claude subscription too and GPT.
You started with some random pricing then veered off into impractical hand waving. Far above third world prices...unless you count the USA as third world, I guess.
by Supermancho