alt.hn

4/15/2026 at 12:00:24 PM

Retrofitting JIT Compilers into C Interpreters

https://tratt.net/laurie/blog/2026/retrofitting_jit_compilers_into_c_interpreters.html

by ltratt

4/16/2026 at 5:02:13 AM

Those interested in this type of work can also visit https://cfallin.org/blog/2024/08/28/weval/. The difference is that they use this technique to derive an AOT compiler.

by 9fwfj9r

4/16/2026 at 10:12:03 PM

Yeah, the strategy is literally the same

by syrusakbary

4/15/2026 at 11:35:59 PM

Took me a while to figure out whether it's interpreters for C programs or whether there's a particular class of interpreters called "C". Turns out it's about interpreters implemented in C, and that they use a modified LLVM to do the retrofitting. But couldn't it be applicable to other languages that compile to LLVM IR, or to other switch-in-a-loop patterns in C?

by fuhsnn

4/16/2026 at 7:00:33 AM

You're quite right that since we're working with LLVM IR, adapting to other languages is probably not _that_ difficult, though these things always end up taking more time than I expect! Since the majority of real-world problems in this area depend on C interpreters, we put our limited resources into that problem. You're also right that "interpreters" is a pretty vague category, and there are other parts of C (and other) programs that could be yk-ified, though I suspect it would be a fairly specialised subset of programs.

by ltratt

4/16/2026 at 12:10:42 AM

I've been a low-level C and C++ programmer for 30 years. Even with your explanation and having read the webpage twice, I have no idea what this technology does or how it works. So it takes normal interpreted code and jits it somehow? But you have to modify the source code of your program in some way?

by itriednfaild

4/16/2026 at 12:37:03 AM

I think the website does an amazing job explaining it, but it basically takes an interpreter written in C and turns it into a JIT with minimal changes to the code of the interpreter (i.e. not to the code of the program you're running in the interpreter). For example they took the Lua interpreter and with minimal changes were able to turn it into a JIT, which runs Lua programs about 2x faster.

by hencq

4/16/2026 at 4:13:38 AM

Tracing JITs are slightly harder to grasp than usual ones. The technique comes from real CPUs, so the mindset of the people behind the original idea is very different from the software world's.

Metatracing ones are kind of an interesting twist on the original idea.

> So it takes normal interpreted code and jits it somehow?

Anyway, they use a patched LLVM to JIT-compile not just interpreted code but the main loop of the bytecode interpreter. Like, the C implementation itself.

> But you have to modify the source code of your program in some way?

Generally speaking, this is not normally the goal: all JITs try to support as much of the target language as possible, though some do limit the subset of features supported.
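To make "the main loop of the bytecode interpreter" concrete, here's a minimal sketch of such a loop (hypothetical opcode names and VM, not code from the article). This while/switch dispatch loop in C is what gets traced and JIT-compiled, not the bytecode it runs:

    #include <stdio.h>

    enum { OP_PUSH, OP_ADD, OP_HALT };

    /* A tiny stack-based VM: the for/switch below is "the main loop"
     * that a retrofitted tracing JIT observes and compiles. */
    static int run(const int *code) {
        int stack[16], sp = 0, pc = 0;
        for (;;) {
            switch (code[pc]) {
            case OP_PUSH: stack[sp++] = code[pc + 1]; pc += 2; break;
            case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; pc++; break;
            case OP_HALT: return stack[sp - 1];
            }
        }
    }

    int main(void) {
        int prog[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_HALT };
        printf("%d\n", run(prog)); /* prints 5 */
        return 0;
    }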

by vkazanov

4/16/2026 at 12:27:43 AM

I don't fully grasp it either. The most appropriate analogy I can think of is how OpenMP turns #pragma-annotated loops into multi-threaded code; this work turns bytecode-interpreting loops into a JIT VM.
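The OpenMP half of that analogy looks like this (a standalone illustration, not code from the article): one annotation turns a plain serial loop into a multi-threaded one without rewriting the loop body, much as yk's annotations mark the interpreter loop for tracing without rewriting it.

    #include <stdio.h>

    /* Compile with -fopenmp; without it the pragma is simply ignored
     * and the loop runs serially, with the same result. */
    static long parallel_sum(int n) {
        long sum = 0;
        #pragma omp parallel for reduction(+:sum)
        for (int i = 1; i <= n; i++)
            sum += i;
        return sum;
    }

    int main(void) {
        printf("%ld\n", parallel_sum(1000)); /* prints 500500 */
        return 0;
    }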

by fuhsnn

4/16/2026 at 2:34:52 PM

It's a promising technology, but it's still in the research domain. It's not an automated procedure. You need to use the yk fork of LLVM to compile and link your code, and you have to manually annotate and alter a fair amount of your interpreter loop with yk macros in non-trivial ways:

    while (true) {
      __yk_tracebasicblock(0);
      Instruction i = code[pc];
      switch (GET_OPCODE(i)) {
        case OP_LOOKUP:
          __yk_tracebasicblock(1);
          push(lookup(GET_OPVAL(i)));
          pc++; break;
        ...

    case OP_INT: push(yk_promote(constant_pool[GET_OPVAL(i)])); pc++; break;

Knowledge of tracing compilers, LLVM and SSA is needed by the user.
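For a sense of what something like `yk_promote` conceptually buys you, here is a hypothetical sketch (my own illustration, not yk's actual output): the trace compiler can bake a promoted value into the compiled trace as a constant, protected by a guard that falls back to the interpreter when the assumption no longer holds.

    /* Generic interpreted path: re-computed every time. */
    static int interp_add(int x, int k) { return x + k; }

    /* Compiled trace: suppose k was promoted while tracing with k == 7,
     * so 7 is baked in; a guard checks the assumption still holds,
     * otherwise we deoptimise back to the interpreter. */
    static int traced_add(int x, int k, int *deopt) {
        if (k != 7) { *deopt = 1; return interp_add(x, k); } /* guard */
        *deopt = 0;
        return x + 7; /* specialised straight-line code */
    }

    int main(void) {
        int deopt;
        int a = traced_add(10, 7, &deopt); /* fast path, deopt == 0 */
        int b = traced_add(10, 9, &deopt); /* guard fails, deopt == 1 */
        return (a == 17 && b == 19) ? 0 : 1;
    }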

> added about 400LoC to PUC Lua, and changed under 50LoC

Lua 5.5.0 has 32106 lines of code including comments and empty lines, so the changes amount to 1.4% of the code base. And then there are the code changes in the yk LLVM fork that you'd have to maintain, which I'm guessing would be a few orders of magnitude larger.

If this project were able to detect the interpreter hotspots itself and completely automate the procedure, that would be great.

by moardiggin

4/16/2026 at 2:42:25 PM

> If this project would be able to detect the interpreter hotspots itself and completely automate the procedure, it would be great.

I don't think that's realistic; or, at least, not if you want good performance. You need to use quite a bit of knowledge about your context to know when best to add optimisation hints. That said, it's not impossible to imagine an LLM working this out, if not today, then perhaps in the not-too-distant future! But that's above my pay grade.

by ltratt

4/16/2026 at 2:52:48 PM

Thanks for sharing this technology. I hope it gets upstreamed into LLVM.

by moardiggin

4/16/2026 at 7:34:12 AM

There have been a couple of C interpreters since the 1990s, including some with REPL support, but they apparently never took off. Most likely a community culture issue: there doesn't seem to be much value in using them beyond a debug session.

by pjmlp

4/16/2026 at 10:03:41 AM

I used to work on LabWindows/CVI, an integrated C development environment. It included an "Interactive Execution Window" where you could build programs piecemeal: you added pieces of code, ran them, then appended more code, ran the new pieces, etc. It was a text window, so you had more freedom than in a simple REPL.

It integrated with "function panels", our attempt at documenting our library functions (see the second link below). You could enter values, declare variables, etc., and then run the function panel. Behind the scenes, the code was inserted into the interactive window and run, and the results were added back to the function panel.

These also worked while suspended on a breakpoint in your project, so they were available while debugging.

My understanding was that these features were quite popular with customers. They also came in handy internally when we wrote examples and did manual testing.

https://www.ni.com/docs/de-DE/bundle/labwindows-cvi/page/cvi...

https://www.ni.com/docs/de-DE/bundle/labwindows-cvi/page/cvi...

https://irkr.fei.tuke.sk/PPpET/_materialy/CVI/Quick_manual.p...

by i_don_t_know

4/16/2026 at 10:56:07 AM

Thanks for sharing.

Yeah, I find this valuable regardless of the programming language; ideally the toolchain should be a mix of interpreter/JIT/AOT, so you can cherry-pick depending on the deployment use case.

Naturally, for dynamic languages pure AOT is not really worth it, although a JIT cache is helpful as an alternative.

by pjmlp

4/17/2026 at 1:02:13 AM

There were so many things that NI did that were great. Debugging in LabVIEW was also very easy, with probes, conditional breakpoints, etc.

It's really too bad that it's more or less dead now.

by fluorinerocket

4/16/2026 at 12:09:32 AM

It's quite impressive they're able to take nearly arbitrary C and do this! Very similar to what PyPy is doing, but for C, and not a Python subset.

However, it's not without downsides. It sounds like average code is only 2x faster than stock Lua, vs. LuaJIT, which is often 5-10x faster.

by djwatson24

4/18/2026 at 2:37:19 AM

LuaJIT uses magic assembly.

by linzhangrun

4/16/2026 at 1:02:06 AM

Hmm, I'm wondering how hard it would be to redo the old-timey Microsoft JVM from the '90s for modern days... Java > .NET assembly runtime.

by hypercube33

4/16/2026 at 7:31:36 AM

I find the complaint about compatibility across JIT implementations rather strange; exactly the same problem exists across any programming language with multiple implementations: interpreters, compilers, JITs, whatever.

by pjmlp

4/16/2026 at 2:30:18 AM

It's truly a good thing to see a project like this taking flight in the era of Vibe Coding :)

by linzhangrun

4/16/2026 at 7:44:22 AM

Sounds very promising. Although right now I’m working on a project together with MLIR.

by edmondx

4/16/2026 at 4:18:20 AM

Why do they need to change LLVM? Why can't they make this another LLVM IR pass?

by measurablefunc

4/16/2026 at 6:56:06 AM

Our fork of LLVM does add a pass, amongst other changes, but we also have to do things like change stackmaps in a way that breaks compatibility. Whether stackmaps in their current incarnation are worth retaining compatibility for is above my pay grade! So some of our changes are probably upstreamable, but some might be considered too niche for wider integration.

by ltratt

4/15/2026 at 12:36:39 PM

i tend to think of myself as a computing nerd, but posts like this one make me realize that i don't even rate on the computing nerd scale.

by sgbeal

4/16/2026 at 12:23:26 AM

Do you always make things about yourself? Have you written a parser or interpreter? You should; it's an interesting exercise. The idea is to add meta-tracing to the interpreter (the C code), which allows hot paths to be compiled to machine code and then executed instead of being interpreted.

by throwaway1492

4/16/2026 at 5:52:59 AM

> Do you always make things about yourself?

That's an abrasive question but i dare say that we all do. It's our only constant point of reference.

> Have you written a parser or interpreter?

i have written many parsers, several parser generators, and a handful of programming languages. This article, however, covers a whole other level, way over my head (or well beyond any of my ambitions, in any case).

Pics or it didn't happen: fossil.wanderinghorse.net/r/cwal

by sgbeal

4/15/2026 at 11:35:16 PM

TL;DR compile with a fork of LLVM that enables runtime IR tracing. Very clever!

by mwkaufma

4/16/2026 at 4:20:10 AM

That's not what they're doing. They're directly modifying the IR to convert it into a tracing JIT; the final artifact is a binary with no IR. The problem is, of course, not introducing any subtle bugs in the process, because they'd have to prove the modifications they're making do not change the actual runtime semantics of the final binary artifact.

by measurablefunc