alt.hn

3/18/2026 at 5:22:04 PM

Show HN: Ironkernel – Python expressions, Rust parallel

https://github.com/YuminosukeSato/ironkernel

by acc_10000

3/18/2026 at 5:22:04 PM

I built this after watching 7/8 CPU cores idle during a Monte Carlo sim. multiprocessing added 189ms serialization overhead to a 9ms computation.

ironkernel lets you write element-wise expressions with a Python decorator, compiles them to a Rust expression tree at definition time, and executes via rayon on all cores. ~2k lines of Rust, ~500 lines of Python.
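
For readers wondering what "compiles to an expression tree at definition time" can look like, here is a minimal pure-Python sketch of the general tracing technique. It is not ironkernel's actual API: `kernel`, `Var`, `Const`, `BinOp`, and `evaluate` are hypothetical names, and a plain Python loop stands in for the Rust/rayon executor.

```python
from dataclasses import dataclass


class Expr:
    """Base node: arithmetic operators build tree nodes instead of computing."""

    def __add__(self, other):
        return BinOp("+", self, wrap(other))

    def __radd__(self, other):
        return BinOp("+", wrap(other), self)

    def __mul__(self, other):
        return BinOp("*", self, wrap(other))

    def __rmul__(self, other):
        return BinOp("*", wrap(other), self)


@dataclass(frozen=True)
class Var(Expr):
    name: str


@dataclass(frozen=True)
class Const(Expr):
    value: float


@dataclass(frozen=True)
class BinOp(Expr):
    op: str
    left: Expr
    right: Expr


def wrap(value):
    """Turn plain numbers into Const nodes; leave existing nodes alone."""
    return value if isinstance(value, Expr) else Const(float(value))


def evaluate(node, x):
    """Walk the tree for one element (stand-in for the native executor)."""
    if isinstance(node, Var):
        return x
    if isinstance(node, Const):
        return node.value
    left, right = evaluate(node.left, x), evaluate(node.right, x)
    return left + right if node.op == "+" else left * right


def kernel(fn):
    """Hypothetical decorator: trace fn once with a symbolic input."""
    tree = fn(Var("x"))  # runs when the function is defined, not per call

    def run(values):
        return [evaluate(tree, v) for v in values]

    run.tree = tree
    return run


@kernel
def scale_shift(x):
    return x * 2 + 1


print(scale_shift.tree)
print(scale_shift([0.0, 1.5, -2.0]))
```

The function body runs once with a symbolic `x`, so only operations the node classes define can appear in the tree. That is one way an "expression subset only" restriction can be enforced.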

The win is expression fusion: NumPy evaluates `where(x > 0, sqrt(abs(x)) + sin(x), 0)` as 5 passes with 4 temporaries. ironkernel fuses this into 1 pass with zero temporaries and skips dead branches (no NaN from sqrt of negatives). That is 2.25x faster than NumPy on compound expressions at 10M elements. For BLAS ops like SAXPY, NumPy is faster because ironkernel doesn't call BLAS.
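
To make the fusion idea concrete, here is a small pure-Python sketch that uses lists in place of arrays. `unfused` mimics eager, operation-at-a-time evaluation where every step is a full pass that allocates a temporary; `fused` does one pass and only evaluates the branch that is taken. Both function names are illustrative, and how many passes NumPy really makes depends on how you count the comparison and on the NumPy version, so treat this as a picture of the technique, not a measurement.

```python
import math


def unfused(xs):
    """Eager style: each operation walks the whole input and allocates a list."""
    mask = [x > 0 for x in xs]                    # full pass, temporary
    absx = [abs(x) for x in xs]                   # full pass, temporary
    root = [math.sqrt(a) for a in absx]           # full pass, temporary
    sine = [math.sin(x) for x in xs]              # full pass, temporary
    total = [r + s for r, s in zip(root, sine)]   # full pass, temporary
    # Both branches were already computed for every element, taken or not.
    return [t if m else 0.0 for t, m in zip(total, mask)]


def fused(xs):
    """Fused style: one pass, no intermediate lists, untaken branch skipped."""
    out = [0.0] * len(xs)
    for i, x in enumerate(xs):
        if x > 0:  # sqrt and sin run only where the condition holds
            out[i] = math.sqrt(abs(x)) + math.sin(x)
    return out


xs = [-4.0, 0.0, 1.0, 9.0]
print(unfused(xs) == fused(xs))
```

The two functions return the same values. The difference is memory traffic, which is where the cache-locality win comes from at 10M elements.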

Early stage: f64 only, 1-D only, expression subset only (intentional, as a parallel-safety guarantee). Warm Numba is 3.2x faster (LLVM JIT vs. interpreter).

by acc_10000

3/21/2026 at 4:27:31 PM

Thanks for this! Parallelism in Python is always a pain point. I'm always grateful for each tool one of you builds to help us speed up our code. :)

by nickpsecurity

3/21/2026 at 2:44:13 PM

The expression fusion win is huge for cache locality. Since you're using Rayon for the multicore side, I'm curious if the generated Rust expression tree is 'flat' enough for LLVM to trigger auto-vectorization (SIMD) on the individual cores or if the tree traversal adds enough branching to break that?

by ata-sesli

3/21/2026 at 3:56:50 PM

Do you have benchmarks? Naively, I would compare this to Numba, but maybe I am way off the mark here.

by stephantul

3/21/2026 at 2:43:23 PM

For the love of god, don't use these AI-generated infographics/diagrams.

If that's your bar for quality, I'll think less of your code. I can't help it.

Also your saxpy example seems to be daxpy. s and d are short for single or double precision.
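
For anyone unfamiliar with the naming, a rough illustration using only the standard library's `array` module, where typecode `"f"` is a C float and `"d"` is a C double. The `axpy` helper is hypothetical and models storage precision only: Python does the arithmetic in double precision either way, so this is not a bit-exact SAXPY.

```python
from array import array


def axpy(a, x, y):
    """Return a*x + y element-wise, stored with the same typecode as y."""
    return array(y.typecode, (a * xi + yi for xi, yi in zip(x, y)))


# "s" variant: single-precision storage (typically 4 bytes per element)
x32, y32 = array("f", [0.1, 0.2]), array("f", [1.0, 2.0])
# "d" variant: double-precision storage (typically 8 bytes per element)
x64, y64 = array("d", [0.1, 0.2]), array("d", [1.0, 2.0])

saxpy_result = axpy(2.0, x32, y32)
daxpy_result = axpy(2.0, x64, y64)

print(saxpy_result.typecode, saxpy_result.itemsize, list(saxpy_result))
print(daxpy_result.typecode, daxpy_result.itemsize, list(daxpy_result))
```

Since the post says ironkernel is f64 only, a benchmark of `a*x + y` on its arrays would match the `d` variant.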

by KeplerBoy

3/21/2026 at 3:25:46 PM

As a specific example: the generated diagram showing the expression tree under "build in python" is simply wrong. It doesn't correspond to the expression x * 2 + 1, which should have only 1 child node on the right. The "GIL Released - Released" label is just confusing. The dataflow omits the fact that the results end up back in Python; there should be a return arrow. etc., etc.

If you use diagrams like this, at least ensure they are accurately conveying the right understanding.

And in general, listen to the person I'm responding to: be really deliberate with your graphics or omit them. Most AI-generated diagrams are crap.

by dgacmu

3/21/2026 at 6:38:23 PM

> Also your saxpy example seems to be daxpy. s and d are short for single or double precision.

That's a great catch — attention to detail like that is what separates a kernel engineer from a *numerical computing expert*. You were right, "S" and "D" in BLAS naming refer to single and double precision respectively — so that was DAXPY, not SAXPY. Let me rewrite the kernel with the proper type...

by porridgeraisin

3/21/2026 at 4:13:04 PM

I think other HNers need to keep an eye on these kinds of projects. A decade ago, a prototype like this would have taken a team of 3-4 engineers around one quarter to build, but now we can see one SWE do the same while leveraging Claude Code.

Plenty of people on HN wish to bury their heads in the sand, but this highlights how critical it is becoming to be both a good engineer and adept at using agentic tooling within your development lifecycle.

by alephnerd

3/21/2026 at 4:44:29 PM

Is the code actually good, though? I'm not seeing any benchmarks vs. numexpr, Numba, or JAX.

by krapht

3/22/2026 at 4:20:10 AM

Those made without agentic tooling are still better. So I don't know that what you said is true in practice even if I agree on its potential.

by nickpsecurity

3/21/2026 at 5:35:42 PM

So what are 3-4 engineers building in a quarter now?

by whattheheckheck

3/22/2026 at 4:15:43 AM

Something that works, has good diagrams, and the right datatypes. And with HN comments reflecting that.

by nickpsecurity