5/18/2026 at 5:32:20 PM
I don't think I can handle another small model release from Qwen. I'm still trying to find the limits of 3.6 27B, and they are already threatening us with a new one?
But jokes aside, I love the fast iteration. These are most probably finetunes on the 3.5 architecture again that did better in internal testing, which is still very nice to see. Putting more and more pressure on the bigger labs to perform better is always a good thing.
by sleepyeldrazi
5/18/2026 at 6:01:02 PM
How good must their training pipelines be? Releasing publicly and at this rate has made them very efficient.
by genxy
5/18/2026 at 6:10:23 PM
Finetuning takes few resources; base model training is the slow and expensive part. Architecturally, the 3.5 models are identical to their 3.6 counterparts, which is why the consensus is that these are probably finetunes rather than models retrained from scratch, much like the finetunes you will see many people publish on Hugging Face.
by sleepyeldrazi
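The "identical architecture" argument above can be checked mechanically by diffing the architecture fields of two model configs. A minimal sketch, assuming Hugging Face-style `config.json` dictionaries; the helper `arch_diff`, the choice of `ARCH_KEYS`, and every value below are hypothetical placeholders, not real Qwen numbers:

```python
from typing import Any

# Assumed set of fields that define the architecture in a
# Hugging Face-style config.json (illustrative, not exhaustive).
ARCH_KEYS = ("num_hidden_layers", "hidden_size", "num_attention_heads", "vocab_size")


def arch_diff(a: dict[str, Any], b: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return the architecture keys whose values differ between two configs."""
    return {k: (a.get(k), b.get(k)) for k in ARCH_KEYS if a.get(k) != b.get(k)}


# Placeholder configs: made-up values, only the non-architecture field differs.
old_cfg = {
    "num_hidden_layers": 4,
    "hidden_size": 64,
    "num_attention_heads": 8,
    "vocab_size": 1000,
    "model_version": "a",
}
new_cfg = {**old_cfg, "model_version": "b"}

if __name__ == "__main__":
    # An empty dict means the two configs share the same architecture fields.
    print(arch_diff(old_cfg, new_cfg))
```

An empty result is consistent with a finetune, but it does not prove one: a model retrained from scratch on the same architecture would produce the same empty diff.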
5/18/2026 at 6:49:17 PM
Understood, but look at their larger cadence over the years and the breadth of models. They are clearly not all finetunes. Meta for all its billions, doesn't have anything comparable.
by genxy
5/18/2026 at 10:15:22 PM
In the Chinese AI scene, there seem to be two separate types of companies.
Companies or labs like DeepSeek produce fewer but larger and more innovative models, and so seem to be more research oriented.
Then there are companies like Z.ai (GLM), MiniMax, and Qwen, which focus more on commercializing the AI and so produce far more versions, but with far fewer improvements between them (usually finetunes).
Commercial providers like Anthropic probably do the same thing, maybe even without labeling it as a different version if the model is similar enough.
by fgonzag
5/18/2026 at 9:07:54 PM
> Meta for all its billions, doesn't have anything comparable.
Maybe nothing released to the public. I don't know that all of their models are public. I think all they really care about is that they aren't relying on one or two cloud providers for a critical piece of their infrastructure.
by bachmeier
5/18/2026 at 8:56:28 PM
Competent leadership goes a long way.
by Computer0
5/19/2026 at 5:57:25 PM
That was true up to Qwen 3.5. Everything after that is made by the same people who made Gemini 3.1 suck.
by throwa356262
5/18/2026 at 11:17:08 PM
Still waiting for an update to Qwen3-Coder-Next.
by cyanydeez
5/18/2026 at 5:44:39 PM
[dead]
by plutokras