3/8/2026 at 10:30:16 AM
Anyone here using semantic diffing tools in their daily work? How good are they? I use some for e.g. YAML [0] and JSON [1], and they're nice [2], but these are comparatively simple languages.
I'm particularly curious because just plain diffing ASTs is more on the "syntax-aware diffing" side rather than the "semantic diffing" side, yet most semantic tooling descriptions stop at saying they use ASTs.
As far as I know, ASTs are not in a minimal or canonical form by construction, so I'm pretty sure there will be situations where a "semantic" differ reports a difference even though a compiler would still compile the given translation unit to the same machine code after all the later optimization passes. Not even necessarily for target-platform-dependent reasons.
But maybe this doesn't matter much or would be more confusing than helpful?
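To make the AST point concrete, here is a small sketch using Python's standard `ast` module. The two snippets (`SRC_A`, `SRC_B`) and the helper names are made up for illustration. The ASTs differ, yet the observable result is the same; recent CPython versions also fold the constant, so the compiled bytecode is expected to match, but that depends on the interpreter version, so the script only prints that comparison rather than relying on it.

```python
import ast

# Two hypothetical snippets: different syntax trees, same meaning.
SRC_A = "x = 1 + 2\n"
SRC_B = "x = 3\n"


def ast_text(src: str) -> str:
    """Dump the parsed AST (line/column attributes are omitted by default)."""
    return ast.dump(ast.parse(src))


def bytecode(src: str) -> bytes:
    """Raw bytecode of the module-level code object."""
    return compile(src, "<example>", "exec").co_code


def run(src: str) -> dict:
    """Execute the snippet and return the names it defined."""
    namespace: dict = {}
    exec(compile(src, "<example>", "exec"), namespace)
    namespace.pop("__builtins__", None)
    return namespace


asts_differ = ast_text(SRC_A) != ast_text(SRC_B)        # BinOp vs Constant
same_bytecode = bytecode(SRC_A) == bytecode(SRC_B)      # interpreter-dependent
same_result = run(SRC_A) == run(SRC_B)                  # both bind x to 3

print("ASTs differ:", asts_differ)
print("bytecode identical:", same_bytecode)
print("same observable result:", same_result)
```

A plain AST differ would flag this change, while anything that compares behavior would not.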
[0] dyff: https://github.com/homeport/dyff
[1] jd: https://github.com/josephburnett/jd
[2] they allow me to ignore ordering differences within arrays (arrays are ordered in YAML and JSON as per the standard), which I found to be a surprisingly rare and useful capability; the programs that consume the YAMLs and JSONs I use these on are not sensitive to these ordering differences
by perching_aix
3/9/2026 at 7:08:13 AM
Fair point on AST vs semantic. sem sits somewhere in between. It doesn't go as far as checking compiled output equivalence, but it does normalize the AST before hashing (we call it structural_hash), so purely cosmetic changes like reformatting or renaming a local variable won't show as a diff. The goal isn't "would the compiler produce the same binary" but "did the developer change the behavior of this entity." For most practical cases that's the useful boundary. The YAML/JSON ordering point is interesting, we handle JSON keys as entities so reordering doesn't conflict during merges. By the way, creator here.
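A minimal sketch of the normalize-then-hash idea, for illustration only. This is not sem's actual implementation; the function below merely borrows the `structural_hash` name, and the placeholder scheme (`_local0`, `_local1`, ...) is an assumption. It uses Python's standard `ast` and `hashlib` modules, renames names bound by assignment inside one function, leaves parameters alone (renaming those can affect keyword callers), and hashes the resulting tree.

```python
import ast
import hashlib


def structural_hash(func_src: str) -> str:
    """Hash one function's AST after normalizing local variable names.

    Illustrative only: nested scopes, global/nonlocal declarations and
    similar cases are not handled.
    """
    func = ast.parse(func_src).body[0]
    if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
        raise ValueError("expected source containing one function definition")

    # Parameters keep their names; they are part of the callable interface.
    params = {n.arg for n in ast.walk(func) if isinstance(n, ast.arg)}

    # Pass 1: give every assigned local a placeholder, in traversal order.
    placeholders: dict[str, str] = {}
    for node in ast.walk(func):
        if (isinstance(node, ast.Name)
                and isinstance(node.ctx, ast.Store)
                and node.id not in params):
            placeholders.setdefault(node.id, f"_local{len(placeholders)}")

    # Pass 2: rewrite every use of those locals.
    for node in ast.walk(func):
        if isinstance(node, ast.Name) and node.id in placeholders:
            node.id = placeholders[node.id]

    # ast.dump omits positions, so whitespace and comments do not matter.
    canonical = ast.dump(func)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


ORIGINAL = """
def total(prices):
    acc = 0
    for p in prices:
        acc += p
    return acc
"""

# Same behavior: locals renamed, a comment and a blank line added.
RENAMED = """
def total(prices):
    running = 0  # accumulator
    for item in prices:
        running += item

    return running
"""

# Different behavior: the starting value changed.
CHANGED = """
def total(prices):
    acc = 1
    for p in prices:
        acc += p
    return acc
"""

print(structural_hash(ORIGINAL) == structural_hash(RENAMED))
print(structural_hash(ORIGINAL) == structural_hash(CHANGED))
```

The first comparison treats the cosmetic edit as no change, while the second still detects the edited constant.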
by rs545837
3/9/2026 at 11:03:56 AM
Hey there, thanks for checking in. Regarding the custom normalization step, that makes sense, and I don't really have much more to add either. I've looked into it a bit further since, and it seems that specifically with programming languages the topic gets pretty gnarly pretty quick for various language theory reasons. So the solution you settled on is understandable. I might spend some time comparing various semantic tools; I'd imagine they probably aim for something similar.
> The YAML/JSON ordering point is interesting, we handle JSON keys as entities so reordering doesn't conflict during merges.
Just to clarify, I specifically meant the ordering of elements within arrays, not the ordering of keys within an object. The order of keys in an object is relaxed as per the spec, so normalizing across that is correct behavior. What I'm doing with these other tools is technically a spec violation, but since I know that downstream tooling is explicitly order invariant, it all still works out and helps a ton. It's pretty ironic too, I usually hammer on about not liking there being options, but in this case an option is exactly the right way to go about this; you would not want this as a default.
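A rough sketch of what an opt-in "ignore array order" comparison could look like, using only Python's standard `json` module. The function and parameter names (`canonical`, `equal_json`, `ignore_array_order`) are hypothetical, and this is not how dyff or jd are implemented. Arrays are compared as unordered collections only when the option is switched on; the default stays spec-faithful.

```python
import json


def canonical(value):
    """Recursively rewrite parsed JSON so that array order no longer matters."""
    if isinstance(value, dict):
        return {key: canonical(item) for key, item in value.items()}
    if isinstance(value, list):
        items = [canonical(item) for item in value]
        # Sort by a stable textual form so mixed types can be ordered.
        return sorted(items, key=lambda item: json.dumps(item, sort_keys=True))
    return value


def equal_json(a_text: str, b_text: str, ignore_array_order: bool = False) -> bool:
    """Compare two JSON documents; object key order never matters."""
    a, b = json.loads(a_text), json.loads(b_text)
    if ignore_array_order:
        a, b = canonical(a), canonical(b)
    return a == b


DOC_A = '{"hosts": ["a", "b"], "meta": {"tags": [1, 2]}}'
DOC_B = '{"meta": {"tags": [2, 1]}, "hosts": ["b", "a"]}'

print(equal_json(DOC_A, DOC_B))                           # arrays are ordered
print(equal_json(DOC_A, DOC_B, ignore_array_order=True))  # opt-in relaxation
```

Keeping the relaxation behind a flag matches the point above: it is only safe when the consumer is known to be order-invariant.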
by perching_aix
3/9/2026 at 11:08:56 PM
Ah right, array ordering not key ordering. That's a different beast. You're making a deliberate semantic choice because you know your consumers are order-invariant. We can't really do that at our level since function ordering in code is usually meaningful to the language. Your use case needs domain knowledge about the consumer, which is exactly why an option makes sense there. If you do end up comparing semantic toolings I'd love to hear what you find. The space is weirdly fragmented between syntax-aware, normalized-AST, and domain-specific (dyff/jd). Everyone calls it "semantic" but they're solving pretty different problems.
by rs545837
3/8/2026 at 1:47:23 PM
I guess I don't understand the difference between semantic and syntax-aware, but I've been trying out difftastic, which is a bit of an odd beast but does a great job at narrowing down diffs to the actual meaningful parts.
by henrebotha