alt.hn

2/16/2026 at 6:19:16 PM

The long tail of LLM-assisted decompilation

https://blog.chrislewis.au/the-long-tail-of-llm-assisted-decompilation/

by knackers

2/16/2026 at 9:47:39 PM

"Claude struggles with large functions and more or less gives up immediately on those exceeding 1,000 instructions." Well, yeah, that's the thing: an N64 game is C targeting an architecture where compiler optimizations are typically lacking, the idiomatic style is lots of small, tightly scoped functions, and the system architecture itself is a lot simpler than, say, a modern amd64 PC... These days I often just feel like, why is this person telling me how easy my job is now when they seemingly don't know much about it? I just find it arrogant and insulting... Perpetually demo season.

by decidu0us9034

2/16/2026 at 10:31:06 PM

Claude is doing the decompilation here, right? Has this been compared against using a traditional decompiler with Claude in the loop to improve decompilation and ensure matched results? I would think that Claude’s training data would include a lot more pseudo-C <-> C knowledge than MIPS assembler from GCC 2.7 and C pairs, and even if the traditional decompiler was kind of bad at N64 it would be more efficient to fix bad decompiler C than assembler.
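
To make the "decompiler first, LLM in the loop" idea concrete, here is a rough sketch of such a loop, not the article's actual pipeline. The names `refine_until_match`, `compile_c`, and `revise` are hypothetical: `compile_c` stands in for the project's real compiler invocation, and `revise` stands in for whatever model call rewrites the draft.

```python
from typing import Callable, Optional


def refine_until_match(
    draft_c: str,
    target: bytes,
    compile_c: Callable[[str], Optional[bytes]],  # hypothetical: returns None on a compile error
    revise: Callable[[str, str], str],            # hypothetical: (source, feedback) -> new source
    max_attempts: int = 5,
) -> Optional[str]:
    """Return C source whose compiled bytes equal `target`, or None if attempts run out."""
    source = draft_c  # e.g. pseudo-C from a traditional decompiler
    for _ in range(max_attempts):
        built = compile_c(source)
        if built == target:
            return source  # byte-for-byte match, so stop here
        if built is None:
            feedback = "compile error"
        else:
            feedback = f"built {len(built)} bytes, expected {len(target)}, contents differ"
        source = revise(source, feedback)
    return None
```

The only thing that changes between the two approaches is what goes in as `draft_c`: raw assembly dumped into a prompt, or decompiler output that already looks like C.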

by bri3d

2/16/2026 at 10:44:49 PM

It's wild to me that they wouldn't try this first. Feeding the asm directly into the model seems like intentionally ignoring a huge amount of work that has gone into traditional decompilation. What LLMs excel at (names, context, searching in high-dimensional space, making shit up) is very different from, e.g., coming up with an actual AST with infix expressions that represents asm code.

by titzer

2/17/2026 at 1:31:50 AM

Not Claude, but there are open-weight LLMs trained specifically on Ghidra decomp and tested on their ability to help reverse engineers make sense of it:

https://huggingface.co/LLM4Binary/llm4decompile-22b-v2

There's also a dataset floating around HF which is... I think a popular N64 decomp to pseudo-C? Maybe the Mario one?

by suprjami

2/17/2026 at 1:32:57 AM

I wonder how effective LLMs are going to be for decompiling, e.g., games written in C++ targeting the PC platform. I’m not surprised one can get reasonably good results for N64 games, which have always been the easiest to reverse for a number of reasons.

by foxtacles

2/16/2026 at 9:45:54 PM

I'm really excited about this, especially for games whose source code was lost, like Red Alert 2.

by OptionOfT

2/17/2026 at 1:41:57 AM

Me too. I'm going to be reverse-engineering Elite PC (original version) and I can't help but think the source is lost. The developer seems to have totally dropped off the face of the Earth. I've contacted others who might know and nobody knows where they are.

Even the game I was a developer on, which was published by Eidos in ~1998, has probably lost its source. I can't think that anyone has the Visual SourceSafe database backup CDs lying around, but I could be wrong.

by qingcharles

2/16/2026 at 10:09:21 PM

Does this technique limit the LLM to correctness-preserving transforms?

by amelius

2/16/2026 at 10:44:39 PM

Like all things related to LLMs, semantic correctness is left as an exercise for the reader.

by measurablefunc

2/16/2026 at 10:41:04 PM

IMO this is one of the best use cases for AI today. Each function is like a separate mini problem with an explicit, easy-to-verify solution, and the goal is (essentially) to output text that resembles what humans write -- specifically, C code, which the models have obviously seen a lot of. And no one is harmed by this use of AI; no one's job is being taken. It's just automating an enormous amount of grunt work that was previously impossible to automate.

I'm part of the effort to decompile Super Smash Bros. Melee, and a fellow contributor recently wrote about how we're doing agent-based decompilation: https://stephenjayakar.com/posts/magic-decomp/
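
For anyone wondering what "easy-to-verify" means in practice, here is a minimal sketch of the check, assuming the target and the recompiled candidate are already available as raw bytes of MIPS code. The helper name `first_mismatch` is made up for illustration, and real projects typically use dedicated diffing tools for this.

```python
from typing import Optional

WORD = 4  # MIPS instructions are fixed-width 32-bit words


def first_mismatch(target: bytes, candidate: bytes) -> Optional[int]:
    """Return the index of the first differing instruction word, or None on a full match."""
    shortest = min(len(target), len(candidate))
    for offset in range(0, shortest, WORD):
        if target[offset:offset + WORD] != candidate[offset:offset + WORD]:
            return offset // WORD
    if len(target) != len(candidate):
        # One side ran out first: the mismatch is where the shorter one ends.
        return shortest // WORD
    return None


# Two instruction words that differ only in the last byte of the second word.
original = bytes.fromhex("27bdffe8" "afbf0014")
attempt = bytes.fromhex("27bdffe8" "afbf0010")
print(first_mismatch(original, attempt))   # 1
print(first_mismatch(original, original))  # None
```

A result of `None` is the whole success criterion, which is why each function can be treated as its own small problem.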

by nemo1618

2/17/2026 at 1:42:53 AM

And the renaming of all the variables from the auto-gen ones into something human readable was always a thankless task which LLMs are really good for.

by qingcharles

2/16/2026 at 10:55:09 PM

> And no one is harmed by this use of AI; no one's job is being taken

what about: see cool app, decompile it, launch competing app.

(repeat)

by m463

2/16/2026 at 11:47:21 PM

Decompiling seems like the hard way to go here. Lots of clones pop up for popular games and apps all the time. I don't think you need to go down the decompile route to achieve that.

by _aavaa_

2/16/2026 at 10:28:24 PM

[dead]

by roelljr