5/11/2026 at 11:24:01 PM
These videos are worth a watch. There are tons of impressive moments, but they had me at the very first one, where a woman says, "I'm going to tell you a story," and then pauses for a long, luxurious sip from a cup of coffee, and the model ... does nothing, just waits. Take my money.

Speaking of taking my money, what's the economic model for a company like this? They've published a fair amount about their architecture - enough that I imagine frontier labs could implement it. Patents? Trade secrets? It's hard for me to understand how you'd hold off the training compute and know-how at Anthropic/GOOG/oAI/Meta without some sort of legal protection.
I can't wait to see what these model architectures do with, like, 30-40% lower latency and more model intelligence. Very appealing. For reference, these look to be roughly 1/10 the size of the Opus 4.7 / GPT 5.x series -- 275B total, 12B active. So there's lots of room to add intelligence, and lots of hope that we could see lower latency.
by vessenes
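Some back-of-envelope arithmetic on the sizes quoted above. The 275B / 12B figures are the commenter's estimates, not confirmed specs; this sketch just shows what that split implies for a mixture-of-experts model.

```python
# MoE sizing arithmetic for the figures quoted in the comment above.
# 275B total parameters with 12B active per token means only a small
# fraction of the weights participate in each forward pass.
total_b = 275.0   # total parameters, in billions (commenter's estimate)
active_b = 12.0   # active parameters per token, in billions

active_fraction = active_b / total_b
print(f"active fraction per token: {active_fraction:.1%}")  # ~4.4%
```

If per-token compute scales roughly with the active parameter count, the active set is what drives latency, which is why a large-total-parameter MoE can still be comparatively fast to serve.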
5/12/2026 at 1:28:24 AM
> They've published a fair amount about their architecture - enough that I imagine frontier labs could implement.

i think the real ones know this is the tip of the iceberg: hparam tuning, data recipes, data collection, custom kernels, rl/eval infra - all immensely deep topics that would condense multiple decades of phd lifetimes to produce SOTA performance (in both senses of the word) like this.

i would also calibrate what you are impressed by. simply waiting is a posttrain thing - the fact that gemini and oai have not prioritized it is not something you should overindex on. what they showed with full duplex is technically far, far harder to achieve.
by swyx
5/12/2026 at 11:54:11 AM
I agree that full duplex is the amazing bit. For instance, the three engineers shouting trivia questions while a timer is running — that's extremely novel as far as I can tell.

I'd like to believe from the demos that this ability to wait kind of falls out of the model as an emergent property — perhaps coming out of a small RL loop — rather than a specifically trained behavior, à la a VAD component in a stack. Either way, I would guess that VAD absolutely cannot do this right now — interruptions are highly annoying in all voice interaction experiences, and if it were a simple matter of better post-training, SOMEONE would have done it by now, e.g. ElevenLabs.

But I disagree with your idea that this is too expensive/too hard to replicate. For me, yes. But there's an existence proof — a small team at a new company just did this without a real roadmap, certainly for less than $1B and probably in less than two years. They are almost certainly less skilled at your list of replication needs than teams at the frontier labs, who have now been given a roadmap. So I don't think it's as difficult as you propose, from an organizational-skills perspective.
by vessenes
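A minimal sketch of why a naive VAD-based turn-taking stack interrupts during pauses like the coffee-sip moment: an energy-threshold detector with a fixed hangover window declares end-of-turn on any sufficiently long silence, whether or not the speaker is actually done. The thresholds and frame counts below are illustrative assumptions, not parameters from any real system.

```python
# Naive energy-based VAD end-of-turn heuristic (illustrative sketch).
# Declares end-of-turn after `hangover_frames` consecutive frames whose
# energy falls below `silence_threshold` -- which means any long pause
# mid-utterance (a sip of coffee) falsely triggers it.

def vad_end_of_turn(frame_energies, silence_threshold=0.01,
                    hangover_frames=20):
    """Return the index of the first frame of a silent run long enough
    to be treated as end-of-turn, or None if no such run occurs."""
    silent_run = 0
    for i, energy in enumerate(frame_energies):
        if energy < silence_threshold:
            silent_run += 1
            if silent_run >= hangover_frames:
                # End-of-turn declared at the start of the silent run.
                return i - hangover_frames + 1
        else:
            silent_run = 0
    return None

# A 30-frame pause in the middle of speech triggers the detector even
# though the speaker has not finished their turn:
frames = [0.5] * 10 + [0.0] * 30 + [0.5] * 10
print(vad_end_of_turn(frames))  # 10 -- interrupts mid-utterance
```

A model that instead learns when to wait end-to-end has no fixed hangover window to trip over, which is one reading of the demo behavior the comment describes.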
5/12/2026 at 5:40:55 PM
SOTA is very much about both training on a well-curated corpus (having it) and hundreds of iterations, which eventually make you into… several PhDs, really.

This is ML/AI, not calling third-party APIs. If you want SOTA in any AI area, you need to design your own strategy and models. Drilling down to get there is super painful and perhaps not something a paid-for course can teach you.

Randomness is everywhere, and so are unexpected engineering challenges. Mastering linear algebra alongside some geometry, while still knowing the classic algorithms, is just the starting point.
by larodi
5/12/2026 at 4:36:13 AM
In China it's become well known that promising new companies will get an offer from either Alibaba or Tencent. In the US, it's probably similar. Everything that's out in the open can get acquired or simply copied. Maybe that is what Thinking Machines is hoping for as well?
by edg5000
5/12/2026 at 11:58:40 AM
Publish a demo -> acquihire for Anthropic/oAI/GOOG/META stock and cash is an understandable economic model. In this case, I feel like they built more than would be needed, though — and I hope they deploy something useful, I'd love to play with it.
by vessenes
5/12/2026 at 2:54:09 PM
Purely out of curiosity, I see you are using an em dash. Did you use voice transcription or something? It looks hand-typed, though. I'm confused.
by edg5000
5/12/2026 at 4:39:08 PM
On the presumption that this isn't a joke: em dashes appear in LLM outputs because LLMs were trained on human text which included them organically. They're not as unusual as the memes suggest.
by niam
5/12/2026 at 3:47:10 PM
I just typed two single hyphens from my iOS device. One: - two: —

Edit: when I edit this comment they have been merged in the form, so I speculate this is an iOS keyboard feature.
by vessenes
5/12/2026 at 9:37:29 PM
Mira Murati, the founder of Thinking Machines, was CTO at OpenAI during the birth of ChatGPT. It's very unlikely their goal is to just cash out.
by ricardobeat
5/12/2026 at 2:46:15 PM
hasn't the economic model always been enterprise llms?

tinker - for fine-tuning a custom enterprise model

interaction models - for working as a digital paired employee (as opposed to a company having to reinvent their entire process around ai agents)
by htrp
5/12/2026 at 2:55:06 AM
they hire leading researchers, and leading researchers won't work for you unless they're able to publish
by babelfish
5/12/2026 at 11:46:46 AM
That was true 10 years ago. It's most definitely not true now. The arms race is very real.
by vessenes
5/12/2026 at 4:15:54 AM
> leading researchers won't work for you unless they're able to publish

oh, honey.
by swyx
5/12/2026 at 8:03:56 AM
Do we want the whole of humanity to get richer, or a few individuals (company owners)?
by leonidasrup
5/12/2026 at 3:19:18 AM
Which seems bizarre. Companies can't afford to just give things away, right?
by SilverElfin
5/12/2026 at 3:16:19 PM
> Companies can't afford to just give things away, right?

Let's say a cutting-edge young researcher is making a name for themselves in their field and earning $300k/yr at a company where they're encouraged to publish and speak. You're trying to headhunt them for a company where they'll be forbidden from sharing their work, which will likely stall their career and reputation outside of that company. How much do you think you'd have to offer? $600k? $1M? $1.5M?

When faced with the choice between paying significantly higher salaries, hiring lower-tier researchers, or just letting their people publish, many companies conclude that giving away some of their work is the best option. (And that doesn't even include the benefit of boosting the company's profile, which makes it easier to attract other cutting-edge researchers.)
by angiolillo
5/12/2026 at 3:39:31 AM
Yes they can. Your research papers are not the whole story. It's like how Google could open source their entire monorepo and very little would change: no one else could operate it.
by rokob