3/21/2026 at 12:03:41 AM
Something not unlike this happened to me when moving some batch processing code from C++ to Python 1.4 (this was 1997). The batch started finishing about 10x faster. We refused to believe it at first and started looking to make sure the work was actually being done. It was.
The port had been done in a weekend just to see if we could use Python in production. The C++ code had taken a few months to write. The port was pretty direct, function for function. It was even line for line where language and library differences didn't offer an easier way.
A couple of us worked together for a day to find the reason for the speedup. Just looking at the code didn't give us any clues, so we started profiling both versions. We found out that the port had accidentally fixed a previously unknown bug in some code that built and compared cache keys. After identifying the small misbehaving function, we had to study the C++ code pretty hard to even understand what the problem was. I don't remember the exact nature of the bug, but I do remember thinking that particular type of bug would be hard to express in Python, and that's exactly why it was accidentally fixed.
We immediately started moving the rest of our back end to Python. Most things were slower, but not by much because most of our back end was I/O bound. We soon found out that we could make algorithmic improvements so much more quickly, so a lot of the slowest things got a lot faster than they had ever been. And, most importantly, we (the software developers) got quite a bit faster.
by rented_mule
3/21/2026 at 4:16:20 AM
My experience is the exact opposite.
This was particularly true for one of the projects I worked on in the past, where Python was chosen as the main language for a monitoring service.
In short, it proved itself to be a disaster: just the Python process collecting and parsing the metrics of all programs consumed 30-40% of the processing power of the lower end boxes.
In the end, the project went ahead for a while more, and we had to do all sorts of mitigations to get the performance impact to be less of an issue.
We did consider replacing it all with a few open source tools written in C plus some glue code; the initial prototype used a few MB instead of dozens (or even hundreds) of MB of memory while barely registering any CPU load. But in the end it was deemed a waste of time when the whole project was terminated.
by ameixaseca
3/21/2026 at 10:34:59 AM
Ditto for me. I had gotten so used to building web backends in Ruby and running at 700MB minimum. When I finally got around to writing a Rust backend, it registered in the metrics as 0MB, so I thought for sure the application had crashed.
Turns out the metrics just rounded to the nearest 5MB.
by czhu12
3/21/2026 at 8:05:43 AM
> but in the end it was deemed a waste of time when the whole project was terminated.
The main lesson of the story. Just pick Python and move fast, kids. It doesn't matter how fast your software is if nobody uses it.
by wiseowise
3/21/2026 at 8:27:01 AM
This is it. Getting something on the table for stakeholders to look at trumps anything else.
by stephantul
3/21/2026 at 3:41:41 PM
It would have taken the same time, if not less, given the extra time for mitigations, trying different optimization techniques, runtimes, etc.
One of the reasons the project was killed was that we couldn't port it to our line of low-powered devices without a full rewrite in C.
Please note this was more than a decade ago, way before Rust was the language it is today. I wouldn't choose anything else besides Rust today, since it gives the best of both worlds: a truly high-level language with low-level resource control.
by ameixaseca
3/21/2026 at 8:32:18 AM
[flagged]
by shevy-java
3/21/2026 at 11:09:59 AM
You can use Go and get the best of both worlds.
by Aeolun
3/21/2026 at 1:10:30 PM
One of the slowest, most inefficient code bases I've ever worked on was in Go.
The mentality was "the language is fast, so as long as it compiles we're good"... Yeah, that worked out about as well as you'd expect.
by nickserv
3/21/2026 at 3:40:16 PM
But that has nothing to do with the language.
by zeroc8
3/21/2026 at 7:53:12 PM
Absolutely, and it's a good language when used properly. This was more of a problem with the hype surrounding it.
by nickserv
3/21/2026 at 6:47:48 PM
> Just pick Python and move fast, kids. It doesn't matter how fast your software is if nobody uses it.
The reason nobody uses your software could be that it is too slow. As an example, if you write a video encoder or decoder, using pure Python might work for postage-stamp sized video because today's hardware is insanely fast, but even then, it likely will be easier to get the same speed in a language that's better suited to the task.
by Someone
3/21/2026 at 6:59:29 PM
Learning that it's too slow takes users.
by wiseowise
3/22/2026 at 7:18:04 AM
In some cases, common sense of developers can do that, too.
by Someone
3/22/2026 at 12:53:43 AM
They were the users, and it was too slow for them, so they switched to Python. Not because of C++ itself, of course; what they meant was "the libraries we wrote in C++ were so buggy and slow that using them was slower than if we just used Python."
by casey2
3/22/2026 at 7:58:27 AM
Terrible advice. Really.
Most of the business I do is rewriting old working Python prototypes in C++. Python sucks, is slow, and leaks. The new C++ code does not leak, meets our performance requirements, processes items in 8 hours instead of 36, and so on.
We are also rewriting all the old Python UI in TypeScript. That hasn't gone as smoothly yet.
And when there are still old simple Python helpers, I rewrite them in Perl, because that will keep running in the coming years, unlike Python.
by rurban
3/21/2026 at 8:37:24 PM
    if input() == "dynamic scope?":
        defined = "happyhappy"
    print(defined)
I'd rather not use Python. The ick gets me every time.
by bjoli
3/22/2026 at 6:29:58 AM
It killed my formatting.
    if input() == "dynamic scope?":
        defined = "happyhappy"
    print(defined)
by bjoli
3/21/2026 at 3:43:17 PM
I would agree except for the Python part. Sure, you gotta move fast, but if you survive a year you still gotta move fast, and I've never seen a Python code base that was still coherent after a year. Expert Pythonistas will claim, truthfully, that they have such a code base, but the same can be said of expert Rustaceans. I would stick to TypeScript or even Java. It will still be a shitshow after a year, but not quite as fucked as Python.
by lowbloodsugar
3/21/2026 at 4:03:37 PM
https://github.com/polarsource/polar/tree/main/server
If you're writing FastAPI (and you should be if you're doing a greenfield REST API project in Python in 2026), just s/copy/steal/ what those guys are doing and you'll be fine.
by miki123211
3/21/2026 at 10:03:43 AM
And this is why pretty much all commercial software is terrible and runs slower than the equivalent 20 years ago, despite incredible advances in hardware.
by littlestymaar
3/21/2026 at 4:56:25 PM
For lots of software there wasn't an equivalent 20 years ago, because there wasn't a language that would let developers explore semi-specified domains fast enough to create something useful. Unless it was Visual Basic, but we can't use that, because what would all the UX people be for?
by philipallstar
3/21/2026 at 6:32:48 AM
Another anecdote: the team couldn't improve concurrency reliably in Python, so they rewrote the service in about a month (ten years ago) in Go, and everything ran about 20x faster.
by serial_dev
3/21/2026 at 2:24:04 PM
He struggled with the algorithms, you struggled with the runtime.
You are not the same.
by Traubenfuchs
3/21/2026 at 12:42:42 PM
> just the Python process collecting and parsing the metrics of all programs consumed 30-40% of the processing power of the lower end boxes.
Just write the parsing loop in something faster like C or Rust, instead of the whole thing.
by naasking
3/21/2026 at 1:12:05 AM
> After identifying the small misbehaving function, we had to study the C++ code pretty hard to even understand what the problem was. I don't remember the exact nature of the bug, but I do remember thinking that particular type of bug would be hard to express in Python, and that's exactly why it was accidentally fixed.
Pure speculation, but I would guess this has something to do with a copy constructor getting invoked in a place you wouldn't guess, that ends up in a critical path.
by asveikau
3/21/2026 at 2:01:52 AM
Given the context, I'm thinking bad cache keys resulting in spurious cache misses, where the keys are built in some low-level way. Cache misses almost certainly have a bigger asymptotic impact than extra copies, unless that copy constructor is really heavy.
by andrewflnr
3/21/2026 at 2:07:09 AM
I'm just remembering a performance issue I heard of eons ago where a sorting function comparison callback inadvertently allocated memory. It made sorting very slow. Someone said in a meeting that sorting was slow, and we all had a laugh about "shouldn't have used the bubble sort!" But it was the key comparison doing something stupid.
by asveikau
3/21/2026 at 11:54:17 AM
My guess would be bad hashing, resulting in too many collisions.
by branko_d
3/21/2026 at 1:15:48 AM
good ol' shallow-vs-deep copy
by NooneAtAll3
3/21/2026 at 6:17:05 AM
One advantage of Python is that it is so slow that if you choose the wrong algorithm or data structure, that soon becomes obvious. And for complicated stuff this is exactly where I find the LLMs struggle. So I make a first version in Python, and only when I am happy with the results and the speed feels reasonable compared to the problem complexity, I ask Claude Code to port the critical parts to Rust.
by tda
3/21/2026 at 6:51:11 AM
The last part is really interesting. It feels like the whole world will soon become Python/JS because that's what LLMs are good at. Very few people will then take the pain of optimizing it.
by rabisg
3/21/2026 at 9:11:20 AM
The LLMs are pretty good at optimising.
Not because they are brilliant, but because they are pretty good at throwing pretty much all known techniques at a problem. And they also don't tire of profiling and running experiments.
by eru
3/21/2026 at 4:09:23 PM
If there's one thing LLMs are really, really good at, it's having a target and then hitting / improving upon that target.
If you have a comprehensive test suite or a realistic benchmark, saying "make tests pass" or "make benchmark go up" works wonders.
LLMs are really good at knowing patterns, we still need programmers to know which pattern to apply when. We'll soon reach a point where you'll be able to say "X is slow, do autoresearch on X" and X will just magically get faster.
The reason we can't yet isn't because LLMs are stupid, it's because autoresearch is a relatively new (last month or so) concept and hasn't yet entered into LLM pretraining corpora. LLMs can already do this, you just need to be a little bit more explicit in explaining exactly what you need them to do.
by miki123211
3/22/2026 at 4:41:24 AM
> The reason we can't yet isn't because LLMs are stupid, it's because autoresearch is a relatively new (last month or so) concept [...]
I'm not so sure. People have been doing stuff like (hyper)parameter search for ages. And profiling and trying out lots of things systematically has been the go-to approach for performance optimisation since forever; making an LLM instead of a human do that is the obvious thing to try?
The concept of 'autoresearch' might bring with it some interesting and useful new wrinkles, but on a fundamental level it's not rocket science.
by eru
3/21/2026 at 5:47:50 PM
I've not tried this yet, but doesn't it use up loads of tokens? How do you do it efficiently?
by philipallstar
3/22/2026 at 4:40:11 AM
It uses a lot of minutes on your computer(s), since you need to run lots and lots of experiments.
I'm not sure if it's particularly token hungry.
by eru
3/21/2026 at 10:00:47 AM
Not just profiling, but decoding protocols too.
Recently I tried Codex/GPT5 with updating a Bluetooth library for batteries, and it was able to start capturing Bluetooth packets and comparing them with the library's other models. It was indefatigable. I didn't even know it was so easy to capture BLE packets.
by elcritch
3/22/2026 at 8:30:50 AM
Could you ask the LLM to do a write-up on the process and post it? (Or you can write a blog post by hand. Like a caveman. ;)
by eru
3/21/2026 at 11:57:51 AM
Wireshark would do that. But you need to understand low-level tools, because in case of some BGP attack all you LLM developers will be fired on the spot.
Flaky internet connection: most of the current 'soy devs' would be useless. Even more so with boosted-up chatbots.
by anthk
3/22/2026 at 4:37:16 AM
> Flaky internet connection: most of the current 'soy devs' would be useless.
We used to make the same jokes about Googling Stackoverflow since before many users on this site were born.
by eru
3/22/2026 at 9:49:57 AM
And it's partially true. Offline documentation should be mandatory everywhere. Networks can be degraded tomorrow in the current second Cold War we are living through. And, yes, states and governments have private backbones for the military/academia/healthcare and so on, but the rest is screwed.
When the blackout happened, the only protocols which worked fine were IRC, Gopher and Gemini. I could resort to using IRC->Bitlbee to chat with different people around the world, read news, and proxy web sites over Gemini (the protocol, not the shitty AI). But for the rest, the average folk? Half an hour to fetch a non-working page.
That was with a newspaper; go figure with the rest. And today tons of projects use sites with tons of JS and unnecessary trackers and data. In case of a small BGP attack, most projects done with LLMs will be damned, because their authors won't even have experience coding without LLMs. Without docs, it's game over.
Also, tons of languages pull dependencies. Linux distros spanning tons of DVDs can survive offline with Python, but good luck deploying NPM, Python, and the rest of the projects to different OSes. If you are lucky you can resort to the bundled Go dependencies in Debian and cross-compile, and the same with MinGW cross-compiling against Windows with some Win32, SDL, DX support, but that's it.
With Qt Creator and MinGW, well, yes, you could build something reliable enough, being cross-platform, and with Lazarus/Free Pascal too, but forget about current projects downloading 20,000 dependencies.
by anthk
3/21/2026 at 1:40:34 PM
Not in my experience. They're pretty good at getting average performance, which is often better than most programmers seem to be willing to aim for.
by mirsadm
3/22/2026 at 5:59:18 AM
What kind of 'average' is this, if it's better than what seems to be typical?
by eru
3/21/2026 at 1:35:30 PM
> JS because that's what LLMs are good at.
That has not been my experience. JS/TS requires the most hand-holding, by far. LLMs are no doubt assumed to be good at JS due to the sheer amount of training data, but a lot of those inputs are of really poor quality, and even among the high quality inputs there isn't a whole lot of consistency in how they are written. That seems to trip up the LLMs. If anything, LLMs might finally be what breaks the JS camel's back. Although browser dominance still makes that unlikely.
> Very few people will then take the pain of optimizing it
Today's LLMs rarely take the initiative to write benchmarks, but if you ask it will and then will iterate on optimizing using the benchmark results as feedback. It works fairly well. There is a conceivable near future where LLMs or LLM tools will start doing this automatically.
by 9rx
3/21/2026 at 3:16:56 PM
My experience is from trying to get the React Native example to work with OpenUI. I felt Sonnet/Opus was much better at figuring out what's wrong with the current React implementation and fixing it than it was with React Native.
But yes, I see what you mean, and I think people are trying to solve it with skills and harnesses at the application layer, but it's not there yet.
by rabisg
3/21/2026 at 12:30:41 AM
Fun story! Performance is often highly unintuitive, and even counterintuitive (e.g. going from C++ to Python). Very much an art as well as a science.
Crazy how many stories like this I've heard of how doing performance work helped people uncover bugs and/or hidden assumptions about their systems.
by asa400
3/21/2026 at 5:53:45 AM
It doesn't come off as unintuitive by my read. They had a bug that led to a massive performance regression. Rewriting the code didn't have that bug, so it led to a performance improvement.
They found that they had fewer bugs in Python, so they continued with it.
by staticassertion
3/21/2026 at 7:52:41 AM
I think a lot of people (especially those who are only peripherally involved in development, like management) don't really consider performance regressions at all when thinking about how to get software to go faster.Meanwhile my experience has been that whenever there has been a performance issue severe enough to actually matter, it's often been the result of some kind of performance bug, not so much language, runtime, or even algorithm choices for that matter.
Hence whenever the topic of how to improve performance comes up, I always, always insist that we profile first.
by harpiaharpyja
3/21/2026 at 7:56:02 AM
My experience has been that performance bugs show up in lots of places and I'm very lucky when it's just a bug. The far more painful performance issues are language and runtime limitations.But, of course, profiling is always step one.
by staticassertion
3/21/2026 at 3:50:16 PM
> We soon found out that we could make algorithmic improvements so much more quickly
It's true that writing code in C doesn't automatically make it faster.
For example, string manipulation. 0-terminated strings (the default in C) are, frankly, an abomination. String processing code is a tangle of strlen, strcpy, strncpy, strcat, all of which require repeated passes over the string looking for the 0. (Even worse, reloading the string into the cache just to find its length makes things even slower.)
Worse is the problem that, in order to slice a string, you have to malloc some memory and copy the string. And then carefully manage the lifetime of that slice.
The fix is simple - use length-delimited strings. D relies on them to great effect. You can do them in C, but you get no succor from the language. I've proposed a simple enhancement for C to make them work https://www.digitalmars.com/articles/C-biggest-mistake.html but nobody in the C world has any interest in it (which baffles me, it is so simple!).
Another source of slowdown is that, as I've discovered over the years, C is not a plastic language, it is a brittle one. The first algorithm you select for a C project gets so welded into it that it cannot be changed without great difficulty. (And we all know that algorithms are the key to speed, not coding details.) Why isn't C plastic?
It's because one cannot switch back and forth between a reference type and a value type without extensively rewriting every use of it. For example:
    struct S { int a; };
    int foo(struct S s) { return s.a; }
    int bar(struct S *s) { return s->a; }
If you want to switch between reference and value, you've got to go through all your code swapping . and ->. It's just too tedious and never happens. In D:
    struct S { int a; }
    int foo(S s) { return s.a; }
    int bar(S *s) { return s.a; }
I discovered while working on D that there is no reason for the C and C++ -> operator to even exist; the . operator covers both bases!
by WalterBright
3/22/2026 at 2:18:26 AM
Well, clearly there is use for these: how do you distinguish what you are accessing in smart-pointer-like types?
by garaetjjte
3/22/2026 at 4:16:11 AM
You'd still use the "." operator. Value, reference, or smart pointer use the same syntax. This means you can refactor them easily.
by WalterBright
3/21/2026 at 4:44:27 PM
I ported Python to C++ one time and it ran 10x faster with 10x less memory usage, with no architectural changes.
by zeroonetwothree
3/22/2026 at 12:48:02 AM
This is the difference between scripting and programming. If you use C++ as a scripting language you're gonna have a bad time.
Of course a scripting language is faster for scripting! That doesn't mean you go full Graham and throw away real programming languages, it just means you aren't writing systems software.
The usual strategy is to write a script, then, if it's slow, see how you could design a program that would do it properly.
The usual strategy in the real world is to copy paste thousands of lines of C++ code until someone comes along and writes a proper direct solution to the problem.
Of course there are ideas on how to fix this: writing your own scripting libraries (stb), packages (Go/Rust/TS), or metaprogramming (Lisp/Jai). As for bugs, those are a function of how you choose to write code; the standard way of writing shell is bug-prone, the standard way of writing Python is less so, and not using overloading and going wider in C++ generally helps.
by casey2
3/21/2026 at 12:29:58 AM
[flagged]by DaleBiagio
3/21/2026 at 3:37:08 AM
This comment comes from a bot account. One of the more clever ones I've seen that avoids some of the usual tells, but the comment history taken together exposes it.
I hit the flag button on the comment and suggest others do too.
by Aurornis
3/21/2026 at 3:18:18 AM
Thanks, Programming History Facts Bot
I was not actually sure this one was a bot, despite LLM-isms and, sadly, being new. But you can look at the comment history and see.
by furyofantares
3/21/2026 at 3:09:07 AM
Until at some point in a language like Python all the things that allowed you to write software faster start to slow you down: the lack of static typing, typing errors, spending time figuring out whether the foo method works with ducks or quacks or foovars, or whether the latest refactoring actually silently broke it because now you need bazzes instead of ducks. Yeah.
by samiv
3/21/2026 at 2:06:26 AM
I don't think the better software part is playing out.
by apitman
3/21/2026 at 2:33:53 AM
There's a lot of really great software out there right now, and a lot that's terrible, and I think powerful abstractions enable both.
by ch4s3
3/21/2026 at 2:36:19 AM
You're thinking of the programs in low-level langs that survived their higher-level-lang competitors; if you plot the programs on your machine by age, how does the low quartile compare on reliability between programs written in each group?
3/21/2026 at 12:28:58 AM
[flagged]by envguard
3/21/2026 at 9:07:04 AM
AI account
by sincerely
3/21/2026 at 8:31:05 AM
> We immediately started moving the rest of our back end to Python. Most things were slower, but not by much because most of our back end was i/o bound.
Would be kind of cool if e.g. Python or Ruby could be as fast as C or C++.
I wonder if this could be possible, assuming we could modify both to achieve that as an outcome, but without turning them into a language that is just like C or C++. Right now there is a strange divide between "scripting" languages and compiled ones.
by shevy-java
3/21/2026 at 12:02:37 PM
@dang this is an AI slop account, check his other comments
3/21/2026 at 8:42:48 AM
I suspect that you used highly optimized algorithms written for Python, like the vector algorithms in numpy? You will struggle to write better code; at least I would.
by peter_retief
3/21/2026 at 8:59:22 AM
Python 1.4 would be mid-to-late '90s, long before numpy and vector algorithms would have been available.
I suspect it's more likely to be something like passing std::string by value, not realising that would copy the string every time, especially given the statement that the mistake would be hard to express in Python.
by masklinn
3/21/2026 at 9:57:05 AM
Everything is new to the uninitiated. :P
by johnisgood