3/19/2026 at 3:42:38 AM
This is really interesting. At first glance, I was tempted to say "why not just use sqlite with JSON fields as the transfer format?" But everything about that would be heavier-weight in every possible way - and if I'm reading things right, this handles nested data that might itself be massive. This is really elegant.
My one eyebrow raise is - is there no binary format specification? https://github.com/creationix/rx/blob/main/rx.ts#L1109 is pretty well commented, but you can't call it a JSON alternative without having some kind of equivalent to https://www.json.org/ in all its flowchart glory!
by btown
3/19/2026 at 1:57:50 PM
Thanks. I had this for older versions, but forgot to write it up again for the latest version.
One old version that is meant to be more human readable/writable is jsonito
https://github.com/creationix/jsonito
I'll add similar diagrams and docs for the format itself here.
by creationix
3/19/2026 at 7:33:11 PM
Initial format docs are now here: https://github.com/creationix/rx/blob/main/docs/rx-format.md
Railroad diagrams will come later when I have more time.
by creationix
3/20/2026 at 2:29:41 PM
Neat! In case you took me too literally: railroad diagrams are fun, but far from the only way to give spec level clarity, so don’t feel you need to overindex on my silly comment!
I am curious why it’s parsed right to left. Is this so that you could add new data to a top-level JSONL-esque list, solely by rewriting the end of the data structure, and not needing to change the beginning (or worst-case shift every single byte of data, if you need a longer count)?
It’s an interesting design tradeoff, because you can’t show a partial parse if you’re streaming the content naively beginning to end, which is a bit odd in a world where streams that begin to render token-by-token are all the rage.
But if you have an ability to do range queries, it’s quite effective, and it does allow for those incremental updates!
by btown
3/20/2026 at 3:19:10 PM
The main reason for the reverse encoding is that it makes things easier on the writer. You simply do a depth-first traversal of the data graph and emit data on the way back up the stack. Zero buffering is needed since this naturally means you write contents before the length prefix.
But it does open up a future direction I want to take with mutable datasets using append-only persistent data structures. The chain primitive is currently only used for strings, but it will be used to do the equivalent of `{...oldObj, ...newObj}` as a single chain `(pointerToOldObj, newObj)`.
With chains and pointers, you can write new versions of a dataset and reuse all the existing values that are unchanged. This, combined with random-access reads and fixed-block caching makes for a fairly complete MVCC database.
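To make the "contents before the length" idea concrete, here is a minimal sketch using a made-up toy text format, not the actual rx encoding. Everything in it is an assumption for illustration: the `Value` type, the `write`/`read`/`encode`/`decode` names, and the trailer layout (`<digits>i` for integers, `<body>,<length>s` for strings, `<children>,<length>a` for arrays, with lengths counted in UTF-16 code units). The writer emits each child as soon as it visits it and only afterwards emits the length, so it never has to hold a subtree in memory; the reader starts from the last character and walks left.

```typescript
// Toy value type: non-negative safe integers, strings, and nested arrays.
type Value = number | string | Value[];

// Writer: depth-first, emits chunks straight to `sink`, returns characters written.
function write(value: Value, sink: (chunk: string) => void): number {
  const emit = (chunk: string): number => {
    sink(chunk);
    return chunk.length;
  };
  if (typeof value === "number") {
    if (!Number.isSafeInteger(value) || value < 0) {
      throw new Error("toy format only supports non-negative safe integers");
    }
    return emit(`${value}i`);
  }
  if (typeof value === "string") {
    // Body first, then the trailer that carries its length.
    return emit(value) + emit(`,${value.length}s`);
  }
  let bodyLength = 0;
  for (const item of value) bodyLength += write(item, sink); // children go out immediately
  return bodyLength + emit(`,${bodyLength}a`); // length is known only on the way back up
}

const isDigit = (ch: string): boolean => ch >= "0" && ch <= "9";

// Reader: parses the value that ends just before index `end`, scanning right to left.
function read(text: string, end: number): { value: Value; start: number } {
  const tagIndex = end - 1;
  const tag = text.charAt(tagIndex);
  let digitsStart = tagIndex;
  while (digitsStart > 0 && isDigit(text.charAt(digitsStart - 1))) digitsStart--;
  if (digitsStart === tagIndex) throw new Error("missing number before tag");
  const n = Number(text.slice(digitsStart, tagIndex));
  if (tag === "i") return { value: n, start: digitsStart };

  const bodyEnd = digitsStart - 1; // index of the ',' separator
  const bodyStart = bodyEnd - n;
  if (tag === "s") return { value: text.slice(bodyStart, bodyEnd), start: bodyStart };
  if (tag === "a") {
    const items: Value[] = [];
    let cursor = bodyEnd;
    while (cursor > bodyStart) {
      const item = read(text, cursor);
      items.push(item.value);
      cursor = item.start;
    }
    items.reverse(); // children were visited last to first
    return { value: items, start: bodyStart };
  }
  throw new Error(`unknown tag: ${tag}`);
}

function encode(value: Value): string {
  const chunks: string[] = [];
  write(value, (chunk) => {
    chunks.push(chunk); // a real sink could be a file or socket instead
  });
  return chunks.join("");
}

function decode(text: string): Value {
  return read(text, text.length).value;
}
```

With this toy layout, `encode(["a1", [7, "b"], 42])` produces `a1,2s7ib,1s,6a42i,17a`, and `decode` recovers the original value by starting at the final `a` tag. A reader that only wants one element can skip over a sibling by jumping to its `start` index without parsing the sibling's contents, which is where the random-access property comes from.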
by creationix
3/20/2026 at 3:20:02 PM
And don't worry about railroad diagrams. I already intended to create them, I've just been extra busy this week with other things.
by creationix