12/10/2025 at 5:37:33 PM
This is a 30B-parameter MoE with 3B active parameters and is the successor to their previous 7B omni model. [1] You can expect this model to have similar performance to the non-omni version. [2]
There aren't many open-weights omni models so I consider this a big deal. I would use this model to replace the keyboard and monitor in an application while doing the heavy lifting with other tech behind the scenes. There is also a reasoning version, which might be a bit amusing in an interactive voice chat if it pronounces the thinking tokens while working through to a final answer.
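For intuition on the 30B-total / 3B-active arithmetic, here is a back-of-envelope sketch. The 128-expert / 8-active routing matches the Qwen3-30B-A3B family; the split between expert and shared parameters below is a rough assumption for illustration, not a published breakdown.

    # Back-of-envelope MoE math: a ~30B-parameter model where each token only
    # touches ~3B parameters, because most weights sit in expert FFNs and the
    # router activates only a few experts per token.
    total_experts = 128    # experts per MoE layer (Qwen3-30B-A3B config)
    active_experts = 8     # experts routed per token
    expert_params = 28e9   # params inside expert FFNs (assumed split)
    shared_params = 2e9    # attention/embeddings/router, always active (assumed)

    active = shared_params + expert_params * active_experts / total_experts
    print(f"~{active / 1e9:.2f}B params active per token")  # ~3.75B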
1. https://huggingface.co/Qwen/Qwen2.5-Omni-7B
2. https://artificialanalysis.ai/models/qwen3-30b-a3b-instruct
by gardnr
12/10/2025 at 7:35:51 PM
This is a stack of models:
- 650M Audio Encoder
- 540M Vision Encoder
- 30B-A3B LLM
- 3B-A0.3B Audio LLM
- 80M Transformer / 200M ConvNet audio-token-to-waveform decoder
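A toy sketch of how those five pieces chain together; every function below is a stand-in to show the dataflow, not a real API:

    # Stand-in stages for the five components listed above. Only the dataflow
    # is meaningful; the bodies are placeholders.
    def audio_encoder(audio): return f"<audio x{len(audio)}>"    # 650M audio encoder
    def vision_encoder(img):  return f"<image x{len(img)}>"      # 540M vision encoder
    def thinker(*streams):    return "text reply"                # 30B-A3B LLM
    def talker(text):         return [17, 42, 7]                 # 3B-A0.3B audio LLM
    def vocoder(tokens):      return [0.0] * 240 * len(tokens)   # 80M/200M token -> waveform

    def omni_turn(audio, image, text):
        reply = thinker(audio_encoder(audio), vision_encoder(image), text)
        return reply, vocoder(talker(reply))  # text out, speech out

    text, wave = omni_turn(b"\x00" * 16000, b"\x00" * 1024, "hello")
    print(text, len(wave))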
This is a closed-weight update to their Qwen3-Omni model. They had a previous open-weight release, Qwen/Qwen3-Omni-30B-A3B-Instruct, and a closed version, Qwen3-Omni-Flash.
You basically can't use this model right now since none of the open-source inference frameworks has it fully implemented. It works in Transformers, but it's extremely slow.
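For reference, a minimal sketch of that slow-but-working Transformers path for the open release. The class names follow the Qwen3-Omni-30B-A3B-Instruct model card; treat the exact API (including the generate() return convention) as an assumption and check the card first.

    # Sketch: running the open Qwen3-Omni release via Hugging Face Transformers.
    # Class names are from the Qwen3-Omni-30B-A3B-Instruct model card and may
    # differ across Transformers versions; verify before running.
    from transformers import (
        Qwen3OmniMoeForConditionalGeneration,
        Qwen3OmniMoeProcessor,
    )

    model_id = "Qwen/Qwen3-Omni-30B-A3B-Instruct"
    model = Qwen3OmniMoeForConditionalGeneration.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    processor = Qwen3OmniMoeProcessor.from_pretrained(model_id)

    messages = [
        {"role": "user", "content": [{"type": "text", "text": "Say hi in one sentence."}]}
    ]
    inputs = processor.apply_chat_template(
        messages, add_generation_prompt=True, tokenize=True,
        return_dict=True, return_tensors="pt",
    ).to(model.device)

    # Per the model card, omni generate() returns text ids plus an audio
    # waveform; if your version returns only ids, drop the second element.
    text_ids, audio = model.generate(**inputs, max_new_tokens=64)
    print(processor.batch_decode(
        text_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )[0])

This is functional but unoptimized; nothing is fused or batched the way vLLM or SGLang kernels would be once support lands.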
by red2awn
12/10/2025 at 6:05:06 PM
Looks like it's not open source: https://www.alibabacloud.com/help/en/model-studio/qwen-omni#...
by olafura
12/10/2025 at 6:11:30 PM
No... that website is not helpful. If you take it at face value, it is claiming that the previous Qwen3-Omni-Flash wasn't open either, but that seems wrong? It is very common for these blog posts to get published before the model weights are uploaded.
by coder543
12/10/2025 at 7:37:25 PM
The previous -Flash weights are closed source. They do have weights for the original model, which is slightly behind in performance: https://huggingface.co/Qwen/Qwen3-Omni-30B-A3B-Instruct
by red2awn
12/10/2025 at 8:00:19 PM
Based on things I had read over the past several months, Qwen3-Flash seemed to just be a weird marketing term for the Qwen3-Omni-30B-A3B series, not a different model. If they are not the same, then that is interesting/confusing.
by coder543
12/10/2025 at 8:15:34 PM
It is an in-house closed-weight model for their own chat platform, mentioned in Section 5 of the original paper: https://arxiv.org/pdf/2509.17765
I've seen it in their online materials too, but can't seem to find it now.
by red2awn
12/10/2025 at 5:56:49 PM
I can't find the weights for this new version anywhere; I checked ModelScope and Hugging Face. It looks like they may have extended the context window to 200K+ tokens, but I can't find the actual weights.
by gardnr
12/10/2025 at 6:01:12 PM
They link to https://huggingface.co/collections/Qwen/qwen3-omni-68d100a86... from the blog post, but it does seem like this redirects to their main space on HF, so maybe they didn't make the model public yet?
by pythux
12/10/2025 at 6:22:11 PM
> There is also a reasoning version, which might be a bit amusing in an interactive voice chat if it pronounces the thinking tokens while working through to a final answer.
Last I checked (months ago), Claude used to do this.
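A toy sketch of what a voice client could do instead of reading the raw stream: split <think>…</think> spans out before TTS, so the reasoning can be muted or routed through a separate effect. The tag format is an assumption; reasoning models differ.

    # Split a reasoning model's output into "thinking" and "answer" channels
    # so a voice UI can drop the former or send it to a different audio bus.
    # The <think>...</think> tag format is an assumption; models vary.
    import re

    THINK = re.compile(r"<think>(.*?)</think>", re.S)

    def split_channels(response: str) -> tuple[str, str]:
        thinking = " ".join(m.strip() for m in THINK.findall(response))
        answer = THINK.sub("", response).strip()
        return thinking, answer

    thinking, answer = split_channels(
        "<think>User wants a short reply.</think>Sure, here you go."
    )
    print("thinking:", thinking)  # -> mute, or reverb bus
    print("answer:  ", answer)    # -> normal TTS voice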
by tensegrist
12/10/2025 at 7:40:06 PM
I don't think the Flash model discussed in the article is 30B. Their benchmark table shows it beating Qwen3-235B-A22B.
Does "Flash" in the name of a Qwen model indicate a model-as-a-service and not open weights?
by plipt
12/10/2025 at 7:49:36 PM
Flash is a closed-weight version of https://huggingface.co/Qwen/Qwen3-Omni-30B-A3B-Instruct (it is 30B, but with additional training on top of the open-weight release). They deploy the Flash version on Qwen's own chat.
by red2awn
12/10/2025 at 7:59:48 PM
Thanks. Was it being closed weight obvious to you from the article? Trying to understand why I was confused; I had not seen the "Flash" designation before.
Also, 30B models can beat a semi-recent 235B with just some additional training?
by plipt
12/10/2025 at 8:19:56 PM
They had a Flash variant released alongside the original open-weight release. It is also mentioned in Section 5 of the paper: https://arxiv.org/pdf/2509.17765
For the evals, it's probably just trained on a lot of benchmark-adjacent datasets compared to the 235B model. A similar thing happened with another model today: https://x.com/NousResearch/status/1998536543565127968 (a 30B model trained specifically to do well in maths gets near-SOTA scores).
by red2awn
12/11/2025 at 9:54:44 AM
Haha, you could hear how its mind thinks, maybe by putting a lot of reverb on the thinking tokens or some other effect…
by andy_ppp
12/10/2025 at 6:37:01 PM
> This is a 30B parameter MoE with 3B active parameters
Where are you finding that info? Not saying you're wrong; just saying that I didn't see that specified anywhere in the linked page, or on their HF.
by andy_xor_andrew
12/13/2025 at 5:20:33 PM
I was wrong. I confused this with their open model. Looking at it more closely, it is likely an omni version of Qwen3-235B-A22B. I wonder why they benchmarked it against Qwen2.5-Omni-7B instead of Qwen3-Omni-30B-A3B.
I wish I could delete the comment.
by gardnr
12/10/2025 at 7:36:17 PM
The link[1] at the top of their article to HuggingFace goes to some models named Qwen3-Omni-30B-A3B that were last updated in September. None of them have "Flash" in the name.
The benchmark table shows this Flash model beating their Qwen3-235B-A22B. I don't see how that is possible if it is a 30B-A3B model.
I don't see a mention of a parameter count anywhere in the article. Do you? This may not be an open weights model.
This article feels a bit deceptive
by plipt