4/1/2026 at 8:04:09 PM
Putting domain separators in the IDL is interesting but you can also avoid the problem by putting the domain separators in-band (e.g. in some kind of "type" field that is always present).
Tangentially, depending on what your input and data model look like, canonicalisation takes O(n log n) time (i.e. the cost of sorting your fields).
Here I describe an alternative approach that produces deterministic hashes without a distinct canonicalization step, using multiset hashing: https://www.da.vidbuchanan.co.uk/blog/signing-json.html
by Retr0id
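A minimal sketch of the in-band idea from the comment above, assuming JSON-like messages. The `type` field name, the message contents, and the demo key are hypothetical, and HMAC stands in for a real signature scheme. The key sort inside `canonical_bytes` is the O(n log n) canonicalisation step.

```python
import hashlib
import hmac
import json

KEY = b"demo-key"  # hypothetical key; HMAC is a stand-in for a real signature


def canonical_bytes(msg: dict) -> bytes:
    # Canonicalise by sorting keys, so field order cannot change the bytes.
    return json.dumps(msg, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(msg: dict) -> bytes:
    # The in-band domain separator: refuse to sign anything without a type.
    if "type" not in msg:
        raise ValueError("every signed message must carry a 'type' field")
    return hmac.new(KEY, canonical_bytes(msg), hashlib.sha256).digest()


# Same shape and values, different declared type (hypothetical messages).
tree_root = {"type": "TreeRoot", "id": 7, "value": "abc"}
key_revoke = {"type": "KeyRevoke", "id": 7, "value": "abc"}
reordered = {"value": "abc", "id": 7, "type": "TreeRoot"}
```

Here `sign(tree_root)` and `sign(key_revoke)` differ because the type is part of the signed bytes, while `sign(tree_root)` and `sign(reordered)` agree because of the sort.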
4/1/2026 at 8:18:24 PM
I think a lot of people assume that the "name" of the type, for protos, will be preserved somewhere in the output such that a TreeRoot couldn't be re-used as a KeyRevoke. It makes sense that it isn't - you generally don't want to send that name every time - but it's non-obvious to people with an object-oriented-language background who just think "ah, different types are obviously different types." The serialization cost objection is what I've most often seen raised against in-band type fields and such, as well, so having a unique identifier that gets used just for signature computation is clever.
What's over my head possibly, from skimming it, about your multiset hashing is how it avoids the "these payloads have the same shape, so one could be re-sent as the other" issue? It seems like a solution to a different problem?
by majormajor
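To make the TreeRoot/KeyRevoke point concrete, here is a rough sketch with a hand-rolled, protobuf-style varint encoding (not the real protobuf library). The field names, field numbers, and domain strings are hypothetical, and HMAC again stands in for a signature. Only field numbers and values reach the wire, so two same-shaped messages produce identical bytes unless an identifier is mixed in at signing time.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Dict

KEY = b"demo-key"  # hypothetical key; HMAC is a stand-in for a real signature


def varint(n: int) -> bytes:
    # Base-128 varint for non-negative integers.
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_fields(fields: Dict[int, int]) -> bytes:
    # Each field is tag (number << 3 | wire type 0) followed by the value.
    out = bytearray()
    for number in sorted(fields):
        out += varint(number << 3) + varint(fields[number])
    return bytes(out)


@dataclass
class TreeRoot:
    epoch: int    # field 1 (hypothetical)
    root_id: int  # field 2 (hypothetical)

    def to_wire(self) -> bytes:
        return encode_fields({1: self.epoch, 2: self.root_id})


@dataclass
class KeyRevoke:
    key_id: int  # field 1 (hypothetical)
    reason: int  # field 2 (hypothetical)

    def to_wire(self) -> bytes:
        return encode_fields({1: self.key_id, 2: self.reason})


def sign_plain(wire: bytes) -> bytes:
    return hmac.new(KEY, wire, hashlib.sha256).digest()


def sign_separated(domain: bytes, wire: bytes) -> bytes:
    # The domain is length-prefixed and used only for signing, never sent.
    prefix = len(domain).to_bytes(2, "big") + domain
    return hmac.new(KEY, prefix + wire, hashlib.sha256).digest()
```

With this sketch, `sign_plain(TreeRoot(42, 1000).to_wire())` equals `sign_plain(KeyRevoke(42, 1000).to_wire())`, while `sign_separated` with distinct domain strings yields different tags at no extra cost on the wire.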
4/1/2026 at 9:56:50 PM
This is just a mismatch between nominal typing and structural typing. Protobuf is basically structural typing. You can serialize a message defined with one schema and deserialize the result to a message with a different schema if the two schemata are compatible enough. Almost all normal programming languages use nominal typing. If you have `struct A {int a; int b};` it is distinct from `struct B {int a; int b};`.
by kccqzy
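The `struct A` / `struct B` example can be mirrored in Python as a small illustration. The `pack` helper and the `"<ii"` layout are assumptions made for this sketch: the classes are distinct to the language (nominal), but once packed to bytes only the layout survives (structural).

```python
import struct
from dataclasses import astuple, dataclass


@dataclass
class A:
    a: int
    b: int


@dataclass
class B:
    a: int
    b: int


def pack(obj) -> bytes:
    # Only the field values and their order reach the wire, not the class name.
    return struct.pack("<ii", *astuple(obj))


x = A(1, 2)
wire = pack(x)

# Bytes written from an A decode cleanly as a B: the layouts are compatible.
y = B(*struct.unpack("<ii", wire))
```

In this sketch `isinstance(x, B)` is false and `x == B(1, 2)` is false, yet `pack(x) == pack(B(1, 2))` holds.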
4/1/2026 at 10:18:23 PM
C does too as a language, but it’s fairly easy to slip up at link time or runtime. At some point the types melt away and you sit there with pointers and offsets. Again, it’s not strictly the language’s fault (I think, I’m far from a standards lawyer).
by actionfromafar
4/2/2026 at 8:02:33 AM
I think it's nice to be able to do things like rename nested structs and keep wire compatibility when upgrading two parts of the system at different schedules. Protos are neat. Think like a proto.
(Not saying the signing problem in OP is invalid of course. Just a different problem.)
by cousin_it
4/1/2026 at 9:08:48 PM
Multiset hashing is not related to the domain separation problem, but it is related to the broader "signing data structures" problem.
(I realise my comment reads a bit unclearly, it's basically two separate comments, split after the first paragraph)
by Retr0id
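For readers unfamiliar with the term, here is an illustrative sketch of order-independent (multiset-style) hashing over key/value fields. It is not necessarily the construction from the linked post: the additive combiner modulo 2^256, the element encoding, and the function names are assumptions chosen for brevity, and this combiner has not been vetted for security. The point is only that no sort is needed, so the work is linear in the number of fields.

```python
import hashlib

MODULUS = 2 ** 256  # assumed combining group: integers mod 2^256 under addition


def element_hash(key: str, value: str) -> int:
    k = key.encode("utf-8")
    v = value.encode("utf-8")
    # Length-prefix the key so ("ab", "c") and ("a", "bc") hash differently.
    data = len(k).to_bytes(8, "big") + k + v
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def multiset_hash(fields: dict) -> int:
    # Addition is commutative, so iteration order cannot affect the result.
    acc = 0
    for key, value in fields.items():
        acc = (acc + element_hash(key, value)) % MODULUS
    return acc
```

Under these assumptions, `multiset_hash({"a": "1", "b": "2"})` and `multiset_hash({"b": "2", "a": "1"})` agree without any canonicalisation pass, and changing any value changes the result.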
4/2/2026 at 8:58:43 AM
That's a fun read and a topic I've specifically looked into before, and it's one of those black holes of terms that cannot be searched. You get so much noise.
by Already__Taken