5/22/2026 at 3:37:42 PM
This is where open source models are important.
The latest deepseek v4 pro model is 2-5x cheaper than Claude Sonnet 4.6. Cursor's Compose 2.5 that was just recently released is 6x cheaper than Sonnet.
The state of the art models are going to get better and more expensive and smaller models are going to get cheaper.
There will be a point where the intelligence of the cheap models and the state of the art models is indistinguishable to humans, just as I can't tell the difference between Terence Tao and my university math professor.
I don't always need the smartest and most expensive models. I will need it every once in a while and will gladly pay that price if I have to. What I do need is the model that will solve the current problem I have in a reasonable amount of time.
by PiRho3141
5/22/2026 at 4:02:59 PM
I know it comes off as pedantic to point this out but: Those are open weight models not open source models.
Closed weight models are the equivalent of SaaS. Open weight models are the equivalent of binary driver blobs or Windows software. We don't really have actual open source LLMs, which would need to publicly release their training data and technique so you could train a similar model yourself, or use their work as a baseline for your own model.
This distinction matters because an actual open source LLM would be extremely important from an ecosystem point of view, if someone ever actually released one.
by clhodapp
5/22/2026 at 4:44:32 PM
There are absolutely fully open source models. These are not frontier models, but they very much do exist. OLMo is one of the models explicitly mentioned as having passed the OSI's validation phase. Pythia was also validated by the OSI as meeting its requirements for an open-source AI system. Lucie-7B, a multilingual model, is one of the first LLMs compliant with the OSI AI definition. Its creators explicitly state that the training dataset, data preparation code, and model weights are all publicly available under open licenses.
by yogthos
5/22/2026 at 4:52:44 PM
I know this is highly contested, but I'll try explaining it anyway, because I keep seeing this and it's ... wrong.
Your comment is wrong both theoretically and practically.
First, the theory. The idea that model weights are "binary driver blobs" is technically wrong. I don't know why this is so common on a technical site, but anyway. An LLM model consists of 3 main parts: The architecture, the inference code, and some values. All of these, combined, make an LLM.
Another important aspect, which is widely misunderstood and will become apparent later, is that a model is created by deciding the architecture and then initialising it with some values. Those values can be all 0s, all 1s, or random (in practice it's random, but that's irrelevant). Technically, once a model is initialised, that's it. That is a model. If released, that would be, even for the most pedantic absolutists, undoubtedly open source.
Then, that model is being adapted. The most important thing to understand here is that this is the preferred way of modifying a model. Actually, the only way. You can't (yet) come later and decide to change something in the architecture. You can only change the values. That process is called training (pre, mid, post, etc). The technical process itself is the same for the model creators as it is for you; the means, know-how, etc. are different.
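To make the "fixed architecture, changing values" point concrete, here is a minimal sketch in plain Python. It is a toy, not how any real LLM is trained: the one-line linear "architecture", the function names (`init_model`, `forward`, `train`), the learning rate and the data are all hypothetical choices made for illustration.

```python
import random


def init_model(seed=0):
    """Architecture: y = w * x + b. Initialisation only picks starting values."""
    rng = random.Random(seed)
    return {"w": rng.uniform(-1, 1), "b": rng.uniform(-1, 1)}


def forward(model, x):
    """Inference code: determined by the architecture, not by the values."""
    return model["w"] * x + model["b"]


def train(model, data, lr=0.05, epochs=500):
    """Training: plain per-sample gradient descent. Only the values change."""
    w, b = model["w"], model["b"]
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return {"w": w, "b": b}


# Toy data generated from y = 2x + 1 (an assumption for the example).
data = [(x, 2 * x + 1) for x in (-1.0, 0.0, 1.0, 2.0)]
fresh = init_model()
trained = train(fresh, data)
```

Both `fresh` and `trained` have the same keys and run through the same `forward`; the only difference between them is the numbers stored in `w` and `b`.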
Now, what licensing does, and the only thing that licensing can do is to give you rights to inspect, modify and release that model. That's it. A license will never give you (it cannot) the right to have the internal IP, knowledge, know-how or the "why's" on how the model was edited. That's on you. You have the right to modify, but you can't get the right to know how others have modified it, from a license file. Never had, never will.
(A simplified version of this is to think about an algorithm to control a drone. Usually that'd be a PID controller. Imagine someone releases an algorithm under an open source license. That algorithm consists of architecture, loop code, and some values. Even if those values are all set to 0.5 (in which case your drone might crash) or any other values, the values themselves do not change the status of the code. It's still open source, even if the values are fixed, or random, or dreamt up by the original coder, or received from the aliens themselves.)
I mentioned above that editing the values of a model is the preferred way of modifying the model, and that's exactly what Apache 2.0 defines as "source code".
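The drone analogy can be sketched as a textbook PID loop. This is a minimal illustration, not a flight-ready controller: the class name `PID`, the "tuned" gains and the setpoint/measurement numbers are all hypothetical; only the all-0.5 gains come from the analogy itself.

```python
from dataclasses import dataclass


@dataclass
class PID:
    # The "values": three gains. Changing them changes behaviour, not the code.
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, setpoint, measured, dt):
        """One iteration of the control loop (the 'loop code')."""
        error = setpoint - measured
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


untuned = PID(kp=0.5, ki=0.5, kd=0.5)   # the "all set to 0.5" case
tuned = PID(kp=1.2, ki=0.1, kd=0.05)    # hypothetical hand-tuned gains

out_untuned = untuned.step(setpoint=10.0, measured=8.0, dt=0.1)
out_tuned = tuned.step(setpoint=10.0, measured=8.0, dt=0.1)
```

The two controllers share exactly the same `step` code and differ only in the three numbers they were constructed with, which is the sense in which the analogy treats gains like model weights.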
> "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files.
----
Now, the practice. In practice, we do have fully open (open data, open training code, open source models) models. Apertus, from Switzerland and Olmo from the US. Don't get me wrong, it's absolutely great that we have these models, they are very important for the community, and they do help inform everyone about what works, what doesn't, and so on. But ... no-one uses them. Because they are not at the top, compared to other models.
And, on a technical note, the idea that "dataset" + training code = bit-for-bit recreation is also not true. Anyone that has done any large scale training can tell you that. Between the randomness inherent in the process, the occasional training run re-starts and so on, you will never get the same model twice (at reasonable scales), even if you'd have the available compute. Which, let's be serious, no-one at home has. So... yeah. It's a pointless aspect to care for anyway.
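One small, well-known ingredient of this non-reproducibility can be shown in a few lines. It is offered only as an illustration and is not something the comment itself names: floating-point addition is not associative, so when parallel hardware sums the same numbers in a different order, the result can differ in the last bits, and over a long training run such differences can compound.

```python
# Same three numbers, two summation orders.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)

# The results are not bit-identical, although they agree to ~16 digits.
same = a == b
gap = abs(a - b)
```

Here `same` is `False` and `gap` is tiny but nonzero. A single mismatch like this is harmless; the point is only that bit-for-bit equality is a much stronger requirement than "the same data and the same code".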
by NitpickLawyer
5/22/2026 at 10:36:18 PM
I don’t see how models can be licensed at all. There is no creative element in them.
As you say, you start with a random array and start mutating it until you get something that magically does interesting things.
Sure, you can hold copyright over all the software used to train the thing. And trade secrets or patents around your data selection, training methods, and infrastructure and such.
But unlike typical software compilation, the model isn’t a rote translation of something that has a creative element. Ordinary software has creative source code as input, mechanically processed into an output.
Models start with a bunch of inputs that are not the creative property of the model maker. Those non-creative inputs are not imbued with novel creativity, no matter how advanced the intermediate machinery may be.
By analogy, you may hold a copyright on the layout and creative elements of a phone book, but you have no rights over the actual data of phone numbers. Nor will any amount of ingenious layout engines or ad placement algorithms or complex printing press methods turn those numbers into something that can be licensed.
IANAL. This is truly baffling to me and it seems like everyone is going along with it because some corporate lawyer probably said “Iunno, let’s just say we are licensing this thing before release. Worst case, a court throws out the license”.
by abtinf
5/22/2026 at 7:08:46 PM
| Technically, once a model is initialised, that's it. That is a model. If released, that would be, even for the most pedantic absolutists, undoubtedly open source.
That is true. But it is not the same model as the LLM created by combining the released weights with the released architecture. The thing that is the "binary blob" is the weights. It is pretty much exactly akin to a Linux driver that depends on linux-firmware. It is wonderful that it exists! But it is only partly open.
| Now, what licensing does, and the only thing that licensing can do is to give you rights to inspect, modify and release that model. That's it. A license will never give you (it cannot) the right to have the internal IP, knowledge, know-how or the "why's" on how the model was edited. That's on you. You have the right to modify, but you can't get the right to know how others have modified it, from a license file. Never had, never will.
| In practice, we do have fully open (open data, open training code, open source models) models. Apertus, from Switzerland and Olmo from the US. Don't get me wrong, it's absolutely great that we have these models, they are very important for the community, and they do help inform everyone about what works, what doesn't, and so on.
You seem to contradict yourself here. That said: I appreciate the correction of my perception that there aren't truly open large language models.
by clhodapp
5/22/2026 at 7:47:46 PM
> It is pretty much exactly akin to a Linux driver that depends on linux-firmware.
The key distinction between the two is "is that the preferred form of modifying that linux driver"? And then "does the license allow you to inspect, modify and re-release that linux driver"? If the answer to any of those questions is "no", then it's not "exactly the same".
> You seem to contradict yourself here.
I don't think so. There are open source models (released under Apache2.0, MIT, like qwens, some mistrals, deepseek, etc.), weights-available models (released under restrictive licenses i.e. llamas, some mistrals, some from cohere, etc) and there are open data models (Apertus, Olmo, etc). The license dictates if a model is open source or weights available. The difference is what you are allowed to do with the model.
by NitpickLawyer
5/22/2026 at 5:29:11 PM
There are still things you can't do with an open-weight model without the training data, like modifying the architecture and training from scratch. That's different from true open-source code, where you can do anything the authors could do.
by tlb
5/22/2026 at 6:28:19 PM
The inference code is not part of an LLM and there can be multiple different implementations of it. The model, code to train the model, and code to run the model are different things.
by charcircuit
5/22/2026 at 6:53:33 PM
> The inference code is not part of an LLM
While that might be true in a majority of cases, it's not necessarily universal. Recently model providers have worked with inference libraries to support their models at launch, but say in transformers you can include code for a new architecture, and if you load it with "trust_remote_code=True" it will still work. You can modify the forward pass or whatever you want to do. In that sense, code can be part of a model.
by NitpickLawyer
5/22/2026 at 4:58:31 PM
Good read, thanks
by dmbche
5/22/2026 at 3:48:23 PM
> The state of the art models are going to get better and more expensive and smaller models are going to get cheaper.
Why do you think this will be true?
Right now I see the major US labs betting on gaining an advantage from having way more compute, and I see Chinese labs competing with one another in a resource-scarce environment, so they place much more emphasis on compute-efficiency.
But the supply chains that feed into the massive data center growth in the US are strained; there are energy, memory, and logistical bottlenecks to name a few.
In the medium-long run, compute capacity will not grow exponentially forever. Somehow it has for decades, but there can be no infinite exponential growth, and that point may be when the planet really starts to cook itself.
Maybe the US labs will become more compute-constrained, and then have to compete on efficiency.
Or maybe things change fundamentally in some other way I'm not thinking of.
by greenmilk
5/22/2026 at 3:55:01 PM
The labs have a perverse incentive to make things as expensive compute wise as possible. The only thing keeping this somewhat in check is competition, but it's intentionally being gatekept by locking up the supply of computing infrastructure. With 3 players it's pretty easy to collude even if indirectly. They can't burn trillions forever. Nvidia's 75% profit margins are not sustainable forever.
Things will normalize, but it will take time.
by nightski
5/22/2026 at 4:17:03 PM
>The labs have a perverse incentive to make things as expensive compute wise as possible. The only thing keeping this somewhat in check is competition, but it's intentionally being gatekept by locking up the supply of computing infrastructure. With 3 players it's pretty easy to collude even if indirectly.
By all accounts the AI capex boom is justified by actual usage, rather than some nefarious plan for "locking up the supply of computing infrastructure". Just look at people complaining about Claude availability and Anthropic adding various load-shedding measures a few months ago.
by gruez
5/22/2026 at 5:17:44 PM
Right but that could be more evenly distributed. There is a circular trade right now giving these few players near infinite resources that is blocking that from happening.
by nightski
5/22/2026 at 5:01:54 PM
[dead]
by kxkdkdisjsn
5/22/2026 at 4:48:52 PM
Commoditize your complement - I expect to see this most in consumer AI (after that starts actually working...)
It will be important for Apple to have good enough, cheap local LLM models that run on-device.
If the barrier to performance shifts from fundamental model capability to context collection and management I would expect to see folks focused on that problem continuing to drive open-weight LLM model development in some shape or form.
by theodorewiles
5/22/2026 at 4:15:15 PM
>so they place much more emphasis on compute-efficiency.
Maybe on training, but on inference they use more tokens than comparable western models.
https://artificialanalysis.ai/?output-tokens=intelligence-vs...
by gruez
5/22/2026 at 4:14:16 PM
>The latest deepseek v4 pro model is 2-5x cheaper than Claude Sonnet 4.6. Cursor's Compose 2.5 that was just recently released is 6x cheaper than Sonnet.
It's ironic that, in a thread about "AI subsidies", people don't count free model releases from AI labs as subsidies. Whatever AI winter would cause AI companies to stop subsidizing tokens would probably also cause other AI labs to stop doing free model releases. They might not be able to un-release the current crop of open models, but assuming proprietary model development still happens, those models will quickly go obsolete.
by gruez
5/22/2026 at 4:19:50 PM
The currently-released models don't really go away. Even if they collectively only release a new model every few years for the sake of influence and public image, that's plenty enough to keep the competitive aspect going.
by zozbot234
5/22/2026 at 4:26:31 PM
>Even if they collectively only release a new model every few years for the sake of influence and public image, that's plenty enough to keep the competitive aspect going.
This is unpersuasive. Why would AI companies (American or Chinese) stop subsidizing tokens, but keep doing open model releases? At least for the former you can argue it's a lead generation tool for enterprise contracts (eg. hobbyist uses claude code personal plan, then asks the company to buy claude code enterprise, which are billed at API rates), but what's the business case for doing open model releases? You might get some mindshare, but are also arming your competitors in the process. Moreover what makes you think the model releases will be at all competitive to frontier models? Google released gemma 4 a few weeks ago to acclaim, but it's in no way competitive to even GPT-5.4 or Opus 4.6.
by gruez
5/22/2026 at 4:53:59 PM
Are Chinese companies subsidizing tokens?
M2.7 is 230B and was designed to run inference on two (2!) shitty Ascend GPUs (Huawei's first GPU manufactured in China). That's why they can offer a plan at 1/2 of the price of Anthropic and probably still turn a profit.
by throwa356262
5/22/2026 at 3:46:41 PM
Deepseek V4 Flash is far cheaper still, and a better model to compare to Sonnet 4.6. I'm finding it a reliable workhorse.
by squidbeak
5/22/2026 at 3:59:55 PM
Yep, people who never used it say it is not good.
by anonzzzies
5/22/2026 at 4:07:01 PM
sorry to nitpick (I totally agree with what ur saying btw, I run Ministral-3b on my hardware as my go-to bc I don't usually need the "smartest and most expensive models")
> This is where open source models are important
open-weights, the training data isn't public
by sometimelurker
5/22/2026 at 4:59:56 PM
oss models don't directly matter when multiple at-scale frontier API providers have to compete on price: they are limited in defensible margin
They do matter in that oss researchers enable faster cross-pollination of good inferencing efficiency improvements to help the big boys adapt ideas from the community
Long-term local ai may matter more, but imo not there until models + hw get way better (1-2 years?). Reasoning grade quality at speed is still $$$: we need fast opus, not slow sonnet.
by lmeyerov
5/22/2026 at 4:14:14 PM
>The latest deepseek v4 pro model is 2-5x cheaper than Claude Sonnet 4.6. Cursor's Compose 2.5 that was just recently released is 6x cheaper than Sonnet.
The only way you're running Deepseek V4 with comparable quality/performance is through OpenRouter, at which point you're still susceptible to being price gouged in the future, or by spending >$20k on hardware.
by jplusequalt
5/22/2026 at 5:10:03 PM
There is still a difference though. If some company decides to raise prices on OR, you can just switch to any other provider of the same model since there is no moat.
by driese
5/22/2026 at 4:34:00 PM
[dead]
by throwaway613746