Qwen-Robot Suite: A Foundation Model Suite for Physical World Intelligence

6/16/2026 at 7:39:10 PM

Was it expected that Qwen is working on this? What are the current alternatives?

The TAM for robots is much, much larger than for coding or services, and much more strategic when you think about manufacturing and war-making.

The Qwen "suite" is a workmanlike breakdown with demonstrated tasks that seems to me as an outsider to suggest that one could start building integrated systems this year, and have simple products next year. I'd be very interested in an assessment from engineers from the robotics companies (cars, biomedical robots, manufacturing...).

Elsewhere on HN I see hundreds of comments on SpaceX's long-telegraphed merger with Cursor but no serious evaluation of this.

by w10-1

6/16/2026 at 8:11:09 PM

I come from a regular SWE background, but I've spent the last few months getting into robotics and trying to build a snow-clearing robot, so here are my noob notes:

First, very much expected. Both Google and Qwen have been building explicit spatial reasoning and spatial output capabilities into their models since last fall; Gemini 3 was released with support for outputting trajectories, for example. I only took a look at RoboNav (more relevant for my needs), and its architecture and capabilities are in line with other similar models (e.g. NVIDIA's Alpamayo).

Second, the overall architecture they describe mirrors what I've been working on: you have a general-purpose LLM that takes a look at the world and the task in front of it and reasons to break it down into subtasks and tool calls, and you can think of RoboNav and RoboManip as tool calls here. The harness keeps a memory, manages the context of the LLM and tools, and keeps looping until the objective is complete.

Consider the task of clearing snow off a driveway using this suite: An LLM (Qwen 3.7 plus) takes a look at the driveway and decides which areas to clear. The harness then tells RoboNav to go to a certain location, and RoboNav takes over and runs in a loop until the robot is at that location. Then the harness tells RoboManip to use the plow to clear a strip of snow. The harness then calls the planner LLM to plan and execute the next clearing, and repeats until the driveway is clear.
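The loop described above can be sketched roughly as follows. This is a toy under stated assumptions, not the suite's actual API: the `Harness` class and its methods are hypothetical, the driveway is reduced to a list of numbered strips, and the planner, navigation, and manipulation models are replaced by trivial stand-in functions.

```python
from dataclasses import dataclass, field


@dataclass
class Harness:
    """Toy harness: a planner picks strips, tools execute them (all names hypothetical)."""
    uncleared: list                 # strip indices still covered in snow
    position: int = 0               # strip index the robot is currently at
    memory: list = field(default_factory=list)  # log of tool calls kept by the harness

    def plan_next(self):
        # Stand-in for the planner LLM: choose the nearest uncleared strip.
        if not self.uncleared:
            return None
        return min(self.uncleared, key=lambda s: abs(s - self.position))

    def navigate_to(self, target):
        # Stand-in for the navigation model: loop until the robot is at the target.
        while self.position != target:
            self.position += 1 if target > self.position else -1
        self.memory.append(("nav", target))

    def clear_strip(self, strip):
        # Stand-in for the manipulation model: plow one strip.
        self.uncleared.remove(strip)
        self.memory.append(("manip", strip))

    def run(self):
        # Outer loop: plan, navigate, manipulate, repeat until nothing is left.
        while (target := self.plan_next()) is not None:
            self.navigate_to(target)
            self.clear_strip(target)
        return self.memory


harness = Harness(uncleared=[3, 1, 2])
for call in harness.run():
    print(call)
```

In a real system each stand-in would be a model call with its own failure modes, which is where the memory log and re-planning step start to matter.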

So what are the issues? Well, they didn't release the weights or the training scripts, so you can't actually use it. But also, it's all still very research-y: the models are "small" but still huge/expensive for current edge hardware. You'd still need lots of data collection, HITL, and fine-tuning and evals to make it work for your task. You'd also need a secondary safety system to make sure the models don't wreck something. But overall, I do expect robots to use an agent/model combo like this in prod in a few years.

by martythemaniak

6/16/2026 at 11:43:00 PM

This is bananas to me. There have been successful entries to snow plow competitions for ages. What a world that people now expect networks to handhold through it. Irresistible to all parties I suppose.

Well I guess I'll have to have a look!

by jvanderbot

6/17/2026 at 2:44:38 PM

Yeah, there are commercially available snow plow robots; you can buy a Yarbo for your house today. As far as I can tell, they all operate on a classical robotics stack: for the Yarbo you install an RTK antenna to give the robot cm-level precision, define a map and a routine, then the Yarbo can execute that routine by itself.
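For a sense of what a "map and a routine" can amount to in a classical stack, here is a minimal sketch of a precomputed back-and-forth (boustrophedon) coverage path over a rectangular area. It is not a description of how Yarbo works; the function name, the rectangle assumption, and the parameters are all illustrative, and it assumes the area is at least one swath wide.

```python
import math


def coverage_path(width_m, length_m, swath_m):
    """Back-and-forth waypoints covering a width_m x length_m rectangle.

    Hypothetical helper: assumes width_m >= swath_m and an obstacle-free area.
    """
    passes = math.ceil(width_m / swath_m)
    waypoints = []
    for i in range(passes):
        # Center of this pass, clamped so the last pass stays inside the area.
        x = min((i + 0.5) * swath_m, width_m - swath_m / 2)
        # Alternate direction on each pass so the robot never drives back empty.
        start_y, end_y = (0.0, length_m) if i % 2 == 0 else (length_m, 0.0)
        waypoints.append((x, start_y))
        waypoints.append((x, end_y))
    return waypoints


for point in coverage_path(2.0, 10.0, 1.0):
    print(point)
```

A fixed path like this works well when positioning is precise and the map is known in advance, which is the situation the RTK setup is meant to create.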

But can it deal with arbitrary lots without extensive premapping, manage piles, handle obstacles intelligently, correct itself (i.e. notice that a spot needs a second clearing), tackle windrows, etc.? It can't, and my hunch is that LLMs are the first tech we have that can plausibly handle all the various cases that a proper robot would need to handle.

by martythemaniak

6/17/2026 at 6:47:10 PM

My hunch is that some kind of planning stack with environmental awareness at a network level is a good solution to this. My hunch is that LLMs aren't really it. Maybe VLA but I'd bet lower.

Robotics will probably absorb a lot of RL/diffusion-based tech, with an LLM as a high-level interface at best.

by jvanderbot

6/17/2026 at 8:13:50 PM

Yeah, afaik the approach people take today is always some form of bi- or tri-level hierarchical control, with a slow LLM doing planning and subtask management and diffusion or VLA doing the motor control at higher frequencies. The major differences seem to be where and how you draw the boundaries. For my project I'm personally trying to use ROS2 as a low-level tool call (instead of diffusion), with an agent/LLM making the main decisions.
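A minimal sketch of that two-level split, under heavy simplification: the world is a single 1-D position, the "planner" is a stand-in that moves a waypoint at most one unit toward the goal every `plan_every` ticks, and the "controller" is a plain proportional step run on every tick. The function name and all numbers are illustrative assumptions, not taken from any of the systems mentioned.

```python
def run_hierarchy(goal, steps=200, plan_every=50, gain=0.2):
    """Two-level control on a 1-D position; returns (final position, planner calls)."""
    position = 0.0
    setpoint = 0.0
    plans = 0
    for tick in range(steps):
        if tick % plan_every == 0:
            # Slow layer (stand-in for the LLM planner): advance the waypoint
            # by at most 1.0 unit toward the goal.
            remaining = goal - setpoint
            setpoint += max(-1.0, min(1.0, remaining))
            plans += 1
        # Fast layer (stand-in for a diffusion/VLA policy or a ROS 2 controller):
        # proportional step toward the current waypoint on every tick.
        position += gain * (setpoint - position)
    return position, plans


final_position, planner_calls = run_hierarchy(goal=3.0)
print(final_position, planner_calls)
```

The point of the split is visible in the counts: the fast layer runs 200 times here while the slow layer runs only 4 times, so the expensive model stays out of the tight loop.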

Having said that, this scheme seems like it might just be a reaction to current hardware limitations. When I saw Taalas demonstrate an 8B model running on a custom chip at 17k tok/sec, the first thing I thought was "wow, you can just run an LLM in a control loop".
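As a back-of-the-envelope check on that idea: if decoding is the bottleneck, the achievable loop rate is roughly tokens per second divided by tokens emitted per action. The tokens-per-action figures below are assumptions chosen for illustration, and the estimate ignores prompt processing, sensing, and actuation latency.

```python
def control_rate_hz(tokens_per_second, tokens_per_action):
    """Upper bound on control-loop frequency when decoding dominates latency."""
    return tokens_per_second / tokens_per_action


# 17k tok/sec is the figure quoted above; the per-action token counts are hypothetical.
for tokens_per_action in (20, 170, 1700):
    rate = control_rate_hz(17_000, tokens_per_action)
    print(f"{tokens_per_action} tokens/action -> {rate:.0f} Hz")
```

Under these assumptions a short action encoding would put the loop in the hundreds of hertz, while anything involving long reasoning traces per step drops it to around 10 Hz.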

by martythemaniak

6/17/2026 at 6:59:52 AM

> This is bananas to me. There have been successful entries to snow plow competitions for ages.

Why do you hate subscriptions? What if you get a summertime snow storm?

by officialchicken

6/17/2026 at 9:38:57 PM

I wonder if there is hope for clearing ice off asphalt and concrete. It's a real problem in Scando, where temps can hover around freezing for a long time, with repeated thaw/freeze cycles.

by euroderf

6/17/2026 at 5:01:38 AM

> The TAM for robots is much, much larger than for coding or services

how do you figure?

by rsalus

6/17/2026 at 6:00:44 AM

The physical work in the world far outstrips the information work. Most information work is simply organizing physical work, attempting to make physical work more efficient.

by Schiendelman

6/17/2026 at 6:18:36 AM

The promise of intelligence might be larger still. By scaling and using superintelligent LLMs to write code for itself, it's possible that the whole field of robotics is just another problem you can point LLM agents at and expect to be solved by afternoon, just like one of those math puzzles. "Traditional" robotics R&D (or any R&D really) would be worthless due to abundance.

by crazylogger

6/17/2026 at 9:22:45 AM

You're just confirming GP's point. If AI agents make those software problems trivial, the physical tasks are all that's left.

Regard it as market segments. It's not hard to envision e.g. agriculture & food processing robotized to the point where no human ever touches your food. A few generations in, people would see potatoes as a "nutrient-containing object that comes from a factory" and forget how to grow potatoes.

I'm rooting for the 'market segment' where AGI (or ASI) finds solutions to long-standing science questions, that are hard to obtain but easy to verify. Or makes new discoveries. Stuff like cancer research, protein folding, synthetic biology, new materials, battery tech, number theory, particle physics, etc etc.

by RetroTechie

6/17/2026 at 10:17:25 AM

I'm with you. As these models grow in ability and commoditize across the TAM of basically every business in the world, it's going to get cheaper and cheaper to solve everything.

by Schiendelman

6/17/2026 at 6:01:43 AM

How many people have homes that require chores to be done? Laundry, cleaning, setting/clearing tables, yard work, some consider cooking a chore.

If I could get an affordable robot to do a subset of them, I'm in the market for one.

by ragebol

6/17/2026 at 1:06:58 PM

Makes complete sense! In fact, it seems to me that the most value sits in those messy, unstructured environments like cluttered homes.

I wonder... How can these foundation models actually learn to deal with that without being deployed in those scenarios? It feels like a chicken-and-egg problem: you need a ton of real-world data from chaotic homes to train robust models, but you also need robust models before you can safely deploy robots into those exact homes at scale.

by aamdias

6/16/2026 at 8:07:56 PM

This sounds incredible. Have these models effectively solved the problem of trying to use a fast-processing network to predict the world's state ahead? For example, to catch a ball?

by aliljet

6/17/2026 at 12:58:59 AM

In my opinion, training through embodiment and constructing an internal world model makes it possible to do genuine reasoning about how objects behave in the physical world. We have a continuous feedback loop where we take an action and see the result, which gives a basis for the shared context we lean on when communicating with each other. Having context is key for being able to explain why you made a particular decision, and allows for error correction and guidance towards better decisions through conversation. This is largely what we mean by having understanding in a human sense. So, having a world model in the context of robotics is the most likely path towards creating a genuine artificial intelligence.

by yogthos

6/16/2026 at 7:52:01 PM

I can't view the videos on my phone. How much existential terror should I feel?

by idiotsecant

6/16/2026 at 10:26:54 PM

https://qianwen-res.oss-accelerate.aliyuncs.com/qwenrobot/ro...

by xdennis

6/17/2026 at 11:38:04 AM

That looks like Black Mirror

by theplumber

6/16/2026 at 8:29:47 PM

Moderate

by ypeterholmes

6/16/2026 at 5:13:53 PM

qwen just keeps delivering, it's too good

by lukewarm707

6/16/2026 at 6:14:34 PM

The qwen.ai webpage should learn to deliver plain HTML with its content instead of overcomplicated JavaScript and CSS to display obnoxious pulsating rectangles where the content ought to be but hasn’t managed to load.

by amluto

6/16/2026 at 6:40:53 PM

Bikeshedding, exhibit A:

by halJordan

6/16/2026 at 7:55:34 PM

I don’t think it’s bike shedding if I literally cannot load the content. This is a recurring problem on these qwen.ai announcements.

by amluto

6/17/2026 at 7:31:29 AM

i would read this stuff if it arrived by pigeon.

by lukewarm707

6/16/2026 at 8:23:07 PM

This is brilliant: the very future of humanity, and a huge market for the next 30 years. Putting the cards on the table this early can be a mistake, though. With Qwen's backing this could reach mass production, something like 1 million units/year within the next 3 years. Think of excavators, but miniaturized for human use. OMG, Europe, look at this and take note: every industry's dream, the robot suit. It will overtake the car market tenfold in the next decade. Please, Europe, get on this fast.

by trilogic

6/17/2026 at 11:07:04 AM

I wonder how one gets data for training foundational models like this. Any idea?

by ltononro

6/17/2026 at 11:18:00 AM

Control the means of production?

by andy_ppp

6/17/2026 at 10:11:41 AM

Really cool but I can only imagine this will be used for military purposes. Then again I am not working in this space, so perhaps this is a bit short-sighted of me.

by ramon156

6/17/2026 at 11:35:09 AM

I am not that pessimistic. We have stuff like Roborock, so why not a super Roborock?

by theplumber

6/17/2026 at 12:57:38 AM

With Chinese robotics capabilities and silicone doll capabilities, they can be at the forefront of humanoid robotics if everything converges.

by karunamurti

6/17/2026 at 1:08:45 AM

I certainly would not bet on anyone else to win this game. It will be a decisive tipping point, revealing US tech hegemony has fallen.

by voakbasda

6/17/2026 at 2:24:15 PM

I think they should get into the biological skin market. Artificially grown and maintained human skin over a robot frame.

by dyauspitr

6/16/2026 at 6:19:34 PM

Nice! What are some hardware platforms that leverage these models?

by wiremine

6/16/2026 at 7:01:35 PM

The site says they're running them on an NVIDIA Jetson Thor, whose dev kits start at $3,000.

by weberer

6/17/2026 at 11:45:21 AM

Hmmm, Jetson Thor looks like a cheaper version of DGX Spark, minus the network interfaces.

by spwa4

6/16/2026 at 6:28:42 PM

I'm pretty sure we're going to see HASS and Valetudo using it soon, fingers crossed

by agilob

6/16/2026 at 9:53:31 PM

Is it open source?

by toephu2

6/17/2026 at 12:59:53 AM

there's a github link at the bottom of the page https://github.com/QwenLM

by yogthos

6/17/2026 at 1:40:23 AM

They've released quite a few models... Based on that organization page, it seems like the answer is no, Qwen-Robot Suite isn't open source.

by embedding-shape

6/17/2026 at 2:56:07 AM

[flagged]

by pcell

6/16/2026 at 9:13:36 PM

Think, move, sense, at a breakneck pace. Awareness is just an iteration away.

by Kuyawa