alt.hn

3/18/2026 at 11:58:05 PM

RX – a new random-access JSON alternative

https://github.com/creationix/rx

by creationix

3/19/2026 at 3:42:38 AM

This is really interesting. At first glance, I was tempted to say "why not just use sqlite with JSON fields as the transfer format?" But everything about that would be heavier-weight in every possible way - and if I'm reading things right, this handles nested data that might itself be massive. This is really elegant.

My one eyebrow raise is - is there no binary format specification? https://github.com/creationix/rx/blob/main/rx.ts#L1109 is pretty well commented, but you can't call it a JSON alternative without having some kind of equivalent to https://www.json.org/ in all its flowchart glory!

by btown

3/19/2026 at 1:57:50 PM

Thanks. I had this for older versions, but forgot to write it up again for the latest version.

One old version that is meant to be more human readable/writable is jsonito

https://github.com/creationix/jsonito

I'll add similar diagrams and docs for the format itself here.

by creationix

3/19/2026 at 7:33:11 PM

Initial format docs are now here:

https://github.com/creationix/rx/blob/main/docs/rx-format.md

Railroad diagrams will come later when I have more time.

by creationix

3/20/2026 at 2:29:41 PM

Neat! In case you took me too literally: railroad diagrams are fun, but far from the only way to give spec level clarity, so don’t feel you need to overindex on my silly comment!

I am curious why it’s parsed right to left. Is this so that you could add new data to a top-level JSONL-esque list, solely by rewriting the end of the data structure, and not needing to change the beginning (or worst-case shift every single byte of data, if you need a longer count)?

It’s an interesting design tradeoff, because you can’t show a partial parse if you’re streaming the content naively beginning to end, which is a bit odd in a world where streams that begin to render token-by-token are all the rage.

But if you have an ability to do range queries, it’s quite effective, and it does allow for those incremental updates!

by btown

3/20/2026 at 3:19:10 PM

The main reason for the reverse encoding is that it makes things easier on the writer. You simply do a depth-first traversal of the data graph and emit data on the way back up the stack. Zero buffering is needed, since this naturally means you write contents before the length prefix.
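That "emit on the way back up" trick can be sketched in a few lines. This is a toy suffix-length format for illustration only (not the actual RX wire encoding): every value is written as its contents followed by a tag-and-length suffix, so the writer never buffers child sizes.

```typescript
// Toy suffix-encoded format (illustrative, NOT the real RX encoding):
// a string is emitted as its bytes plus `s<len>`, an array as its
// encoded children plus `a<byteLen>`. Lengths come AFTER contents,
// so a depth-first writer needs zero lookahead or buffering.
type Val = string | Val[];

function encode(v: Val, out: string[] = []): string[] {
  if (typeof v === "string") {
    out.push(v);              // contents first...
    out.push(`s${v.length}`); // ...then the length suffix
  } else {
    const start = out.join("").length;
    for (const child of v) encode(child, out); // depth-first traversal
    out.push(`a${out.join("").length - start}`); // suffix on the way up
  }
  return out;
}

// A reader starts at the END: the last token describes the root value,
// and the length suffixes let it jump backwards to any child.
const doc = encode(["hi", ["a", "b"]]).join(""); // "his2as1bs1a6a12"
```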

But it does open up a future direction I want to pursue: mutable datasets using append-only persistent data structures. The chain primitive is currently only used for strings, but it will be used to do the equivalent of `{...oldObj, ...newObj}` as a single chain `(pointerToOldObj, newObj)`.
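A sketch of how a reader would resolve keys through such a chain (hypothetical in-memory shapes here; the actual chain encoding is pointers in the byte stream):

```typescript
// Persistent-object chain sketch: a new version is (patch, pointer to
// the previous version), the moral equivalent of {...oldObj, ...newObj}
// without copying the unchanged values.
type Version = { patch: Record<string, unknown>; prev?: Version };

function lookupKey(v: Version | undefined, key: string): unknown {
  for (; v; v = v.prev) {
    if (key in v.patch) return v.patch[key]; // newest version wins
  }
  return undefined; // key absent in every version
}

const v1: Version = { patch: { name: "old", size: 2 } };
const v2: Version = { patch: { name: "new" }, prev: v1 }; // tiny delta
// lookupKey(v2, "name") === "new"; lookupKey(v2, "size") === 2
```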

With chains and pointers, you can write new versions of a dataset and reuse all the existing values that are unchanged. This, combined with random-access reads and fixed-block caching makes for a fairly complete MVCC database.

by creationix

3/20/2026 at 3:20:02 PM

And don't worry about railroad diagrams. I already intended to create them, I've just been extra busy this week with other things.

by creationix

3/19/2026 at 2:19:43 AM

JSON is human-readable, why even compare it with this. Is any serialization format now just a "JSON alternative"?

by Levitating

3/19/2026 at 9:36:48 AM

Came to the same conclusion the moment I had to hunt to see the outputs https://github.com/creationix/rx/tree/main/samples

by jy14898

3/19/2026 at 2:38:57 PM

I was instantly suspicious that a “new better format” for serialization didn’t open with the input/output. And this is why (fucking lol, gtfo):

    Q^mSat,3^b:d+s+E,4Fri,3^u:h+k+u,6Thu,3^P:j+
If you are effectively going binary, do it. CBOR or Protobuf or any dozen other binary serializations that would be far more efficient.

The author claims this is because of copy and pasting… cool, remind me what BASE64 is again?

by SV_BubbleTime

3/19/2026 at 5:35:47 PM

It is also a format that can be read as-is without any preprocessing. In some cases base64 can do that, and this format does make heavy use of base64 varints.
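A base64 varint is just an integer written with 64 text-safe digit characters. A sketch (the alphabet and digit ordering below are assumptions for illustration; the real RX spec may differ):

```typescript
// Base64 varint sketch: integers rendered in base 64 using a URL-safe
// alphabet (assumed ordering -- check the RX spec for the real one).
// Such numbers survive clipboards, logs, and chat, unlike raw binary.
const ALPHABET =
  "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-_";

function encodeB64Varint(n: number): string {
  let s = "";
  do {
    s = ALPHABET[n % 64] + s; // least-significant digit, prepended
    n = Math.floor(n / 64);
  } while (n > 0);
  return s;
}

function decodeB64Varint(s: string): number {
  let n = 0;
  for (const c of s) n = n * 64 + ALPHABET.indexOf(c);
  return n;
}
// encodeB64Varint(123456) === "u90": three characters instead of six
```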

Sure, you can encode as JSON, then compress with gzip and then base64 encode. You'll probably end up with something smaller than rx and be extremely safe to copy-paste. But your consumers are going to consume orders of magnitude more CPU reading data from this document.

RX is usable as-is, is compressed, and is copy-pasteable. It's the unique combination of properties that makes it interesting.

by creationix

3/19/2026 at 10:46:19 PM

>It is also a format that can be read as-is without any preprocessing.

>Q^mSat,3^b:d+s+E,4Fri,3^u:h+k+u,6Thu,3^P:j+

My man… no. I have no doubt you could kind of figure out what that sample is fresh off writing this, but likely not in six months. And to think that anyone else would fill their brain with the rules to decipher that? Nah 2.0.

by SV_BubbleTime

3/20/2026 at 2:06:22 AM

I meant computers can read it without any preprocessing. It's random access. You don't need to parse it, you don't need to decompress it. You just start at the end and follow pointers till you get to the desired value.

Even a trivial doc like this is challenging for me to read as a human.

by creationix

3/20/2026 at 5:46:50 AM

But... what sort of storage device does not allow your computers to use all 256 byte values? Why is random access data stored on teletype?

by 112233

3/20/2026 at 3:22:14 PM

> what sort of storage device does not allow your computers to use all 256 byte values

- clipboards

- logs

- terminal output

- alerts

- yaml configs

- JSON configs

- hacker news comments

- markdown documentation

- etc...

I assure you, this is not a solution looking for a problem. I started out with binary encodings first, but then realized it limits so many workflows.

by creationix

3/20/2026 at 4:22:11 AM

Ick, why are you talking to another person like this?

> Nah.com, fam.

by hombre_fatal

3/19/2026 at 3:34:13 AM

- this encodes to ASCII text (unless your strings contain Unicode themselves), which means you can copy-paste it (good luck doing that with compressed JSON or CBOR or SQLite)

- there is a scale where JSON isn't human readable anymore. I've seen files that are 100+MB of minified JSON all on a single very long line. No human is reading that without using some tooling.

by creationix

3/19/2026 at 4:30:34 AM

That kind of feels a bit worst of both worlds. None of the space savings/efficiency of binary but also no human readability.

Being able to copy/paste a serialization format is not really a feature i think i would care about.

by bawolff

3/19/2026 at 1:51:20 PM

It's a gradient. I did design several binary formats first, but for my use cases, this is actually better. There is nuance to various use cases.

> None of the space savings/efficiency of binary

For string heavy datasets, it's nearly the same encoding size as binary. I get 18x smaller sizes compared to JSON for my production datasets. This was originally designed as a binary format years ago (https://github.com/creationix/nibs) and then later after several iterations, converted to text.

> Being able to copy/paste a serialization format is not really a feature i think i would care about

Imagine being paged at 3am because some cache in some remote server got poisoned with a bad value (unrelated to the format itself). You load the value in dashboard, but it's encoded as CBOR or some binary format and so you have to download it in a binary safe way, upload that binary file to some tooling or install a cbor reader to your CLI. But then you realize that you don't have exec access to the k8s pods for security reasons, but do have access to a web-based terminal. Again, to extract a binary value you would need to create a shell, hexdump the file and somehow copy-paste that huge hexdump from the web-based terminal to your local machine, un-hex dump it, and finally load it into some CBOR reader.

A text format, however, is as simple as copy-pasting the value from the dashboard into some online tool like https://rx.run/ to view the contents.

by creationix

3/19/2026 at 11:13:30 AM

if one of the advantages is making it copy-pastable, then I would suggest the REXC viewer should give you the option to copy the REXC output; currently I have no way of knowing this by looking at your github or demo viewer

another thing, I put in a 400KB json and the REXC is 250KB, cool, but ideally the viewer should also tell me the compressed sizes, because that same json is 65kb after zstd, no idea how well your REXC will compress

edit: I think I figured it out: you can right click "copy as REXC" on the top object in the viewer to get an output. I compressed it, and the same document compressed to 110KB as REXC, so this is not great... 2x the size of JSON after compression.

by mpeg

3/19/2026 at 1:53:43 PM

Thanks for testing it out! Yes, the website could use some love to make everything more discoverable.

The primary use case is not compression, it's just a nice side effect of the deduplication. This will never beat something like zstd, brotli, or even gzip.

My production use cases are unique in that I can't afford the CPU to decompress to JSON and then parse to native objects. But with this format, I can use the text as-is with zero preprocessing and as a bonus my datasets are 18x smaller.

by creationix

3/19/2026 at 5:37:29 PM

> 2x the size of json after compression

Right and that makes sense. There is more information in here. The entire thing is length prefixed and even indexed for O(1) array lookups and O(log2 N) object lookups.
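One way to get both properties (preserved key order and O(log2 N) lookup) is to store a key-sorted permutation next to the keys. A sketch of the data structure, illustrative only, not the RX byte layout:

```typescript
// Keys stay in insertion order; `index` is a permutation of positions
// sorted by key, so lookups binary-search the permutation while
// iteration still sees the original order. Assumes distinct keys.
function buildIndex(keys: string[]): number[] {
  return keys.map((_, i) => i).sort((a, b) => (keys[a] < keys[b] ? -1 : 1));
}

function lookup(keys: string[], index: number[], key: string): number {
  let lo = 0;
  let hi = index.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const k = keys[index[mid]];
    if (k === key) return index[mid]; // position in insertion order
    if (k < key) lo = mid + 1;
    else hi = mid - 1;
  }
  return -1; // not found
}
```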

If you don't care about random access and you don't mind the overhead of decompression, don't use RX.

by creationix

3/19/2026 at 5:56:28 PM

I think this makes sense, when you explain it like that, it might be a matter of cleaning up the docs a bit so the "why" of RX is more clear (admittedly, a README is not always the best channel for this!)

by mpeg

3/19/2026 at 8:20:22 AM

Are there any examples? If it's ASCII I'd expect to see some of the actual data in the readme, not just API.

Unless, if I'm reading that correctly, it only has a text encoding as long as you can guarantee you don't have any unicode?

by rendaw

3/19/2026 at 7:42:46 PM

> it only has a text encoding as long as you can guarantee you don't have any unicode?

The format is technically a binary format in that length prefixes are counts of bytes. But in practice it is a textual format since you can almost always copy-paste RX values from logs to chat messages to web forms without breaking it.

Unicode doesn't break anything, since strings are encoded as raw Unicode with UTF-8 byte-length prefixes. It supports Unicode perfectly.

If your data only contains 7-bit ASCII strings, the entire encoding is ASCII. If your data contains unicode, RX won't escape it, so the final encoding will contain unicode as UTF-8.

by creationix

3/19/2026 at 1:56:06 PM

oh, sorry about that. I forgot to include the description of the format with examples.

I did add some small examples to the repo.

https://github.com/creationix/rx/blob/main/samples/quest-log...

The older, slightly outdated, design spec is in the older rex repo (this format was spun out of the rex project when I realized it's actually a good standalone format)

https://github.com/creationix/rex/blob/main/rexc-bytecode.md

by creationix

3/19/2026 at 2:42:29 PM

'fdiscovered,aextreme,7danger,6+1A+16;6level_range,b:QThe Heap ,d'th

Oof.

by SV_BubbleTime

3/20/2026 at 12:34:04 AM

Very similar to BitTorrent's bencode. Bencode has the benefit of a canonical encoding, which this doesn't have (because of the different compression options). I wouldn't be put off by how it looks as text.

by dontdoxxme

3/20/2026 at 2:08:55 AM

Very true. I had forgotten about bencode, I should read up on that again.

It makes sense they need a canonical form because they want same values to have same content hashes.

by creationix

3/19/2026 at 10:28:14 AM

You don't want to copy-paste anything like that as text anyway. Just copy and paste files.

No human is reading much data regardless of the format.

What is the benefit over using for example BSON?

by kukkamario

3/20/2026 at 3:27:38 PM

> Just copy and paste files

If all your workflows allow copying as binary files, more power to you! But there are a lot of workflows where that is not possible. This was inspired by years of hands-on operational incident handling in production systems. Every time we use a binary format, it's extra painful.

This particular format would be slightly more compact as binary, but not enough to justify closing the door on all the use cases that binary would preclude.

I'll probably add a binary variant for people who prefer that (or for people who want to be able to embed binary values in the data without base64 encoding it)

by creationix

3/19/2026 at 11:13:32 AM

I have an idea: why don't we all go back to using XML at this point, since any initial selling point / differentiator has been slowly eroded away?

by soco

3/19/2026 at 7:31:57 PM

Thanks for the feedback. I've improved the framing to make the purpose/value more clear. What do you think about "RX is a read-only embedded store for JSON-shaped data"?

https://www.npmjs.com/package/@creationix/rx

by creationix

3/19/2026 at 12:09:47 PM

It's also quite odd to create a serialization format optimized for random access.

by Gormo

3/19/2026 at 1:59:45 PM

Serialized just means encoded as a stream of bytes so that it can be transferred between systems. There are absolutely cases where you want to be able to query a value directly like a database instead of parsing the entire thing to memory before you can read it. Think of this as no-sql sqlite.

by creationix

3/20/2026 at 3:39:35 AM

> Serialized just means encoded as a stream of bytes so that it can be transferred between systems.

Yes, serially. Which means no random-access across the transfer channel.

by Gormo

3/19/2026 at 1:31:56 PM

many serialization formats are just a memory structure dump.

by j16sdiz

3/19/2026 at 1:16:09 PM

Not at all. What makes you say that?

by IshKebab

3/19/2026 at 2:33:01 AM

cat file.whatever | whatever2json | jq ?

(Or to avoid using cat to read, whatever2json file.whatever | jq)

by dietr1ch

3/19/2026 at 12:16:08 PM

That's not really random access, though. You're effectively just searching through the entire dataset for every targeted read you're after.

What might be interesting is to have a tool that processes full JSON data and creates a b-tree index on specified keys. Then you could run searches against the index that return byte offsets you can use for actual random access on the original JSON.

OTOH, this is basically just recreating a database, just using raw JSON as its storage format.

by Gormo

3/19/2026 at 5:39:51 PM

> What might be interesting is to have a tool that processes full JSON data and creates a b-tree index on specified keys. Then you could run searches against the index that return byte offsets you can use for actual random access on the original JSON.

I did build that once. But keeping track of the index is a pain. Sometimes I was able to generate the index on-demand and cache it in some ephemeral storage, but overall it didn't work out so well.

This system with RX will work better because I get the indexes built-in to the data file and can always convert it back to JSON if needed.

by creationix

3/19/2026 at 3:04:07 PM

Well, JSON had no random access to begin with, so maybe the real problem is needing JSON at all.

Maybe a query over the random-access file then converted into JSON would work?

by dietr1ch

3/19/2026 at 3:39:29 AM

Or in this case, just do `rx file.rx` It has jq like queries built in and supports inputs with either rx or json. Also if you prefer jq, you can do `rx file.rx | jq`

by creationix

3/19/2026 at 3:13:35 PM

wow, in that case using `jq` is just a presentation preference at the very last step, unless jq is more expressive (which might be the case given how long it has been around?).

by dietr1ch

3/19/2026 at 5:41:44 PM

right, the jq query language is much more complex and featureful than the simple selector syntax I added to the rx-cli. But more could be added later as needed or it could just stream JSON output. It would be pretty trivial to hook up a streaming JSON encoder to rx-cli which could then pipe to jq for low-latency lookups. The problem is jq would need to JSON parse all that data which will be expensive.

by creationix

3/19/2026 at 2:28:06 AM

Very cool stuff!

This did catch my eye, however: https://github.com/creationix/rx?tab=readme-ov-file#proxy-be...

While this is a neat feature, it means this is not in fact a drop-in replacement for JSON.parse, as you will be breaking any code that relies on that result being a mutable object.

by garrettjoecox

3/19/2026 at 3:31:44 AM

True, the particular use case where this really shines is large datasets where typical usage is to read a tiny part of it. Also, there is no reason you couldn't write an rx parser that creates normal mutable objects. It could even be a hybrid one that is lazily parsed until you want to make it mutable, and then does a normal parse to normal objects after that point.
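The hybrid idea can be sketched with a plain JS Proxy. This is a hypothetical decoder shape (the real rx reader walks the encoded buffer, not a map of JSON strings), but it shows the laziness: values are decoded only on first access and cached afterwards.

```typescript
// Lazy decode-on-access sketch: `raw` stands in for undecoded payloads
// (plain JSON strings here). Only properties a caller actually touches
// get decoded, each at most once. Assumes accessed keys exist in `raw`.
function lazyDecode(raw: Record<string, string>) {
  const cache = new Map<string, unknown>();
  let decodes = 0; // instrumentation: how many values were ever parsed
  const obj = new Proxy({} as Record<string, unknown>, {
    get(_target, prop) {
      const key = String(prop);
      if (!cache.has(key)) {
        decodes++;
        cache.set(key, JSON.parse(raw[key])); // decode on first access
      }
      return cache.get(key);
    },
  });
  return { obj, decoded: () => decodes };
}

const { obj, decoded } = lazyDecode({ a: '"x"', b: "42", c: "[1,2]" });
obj.a; // decodes only "a"; "b" and "c" are never parsed
```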

by creationix

3/19/2026 at 5:47:51 AM

It's not quite clear to me why you'd use this over something more established such as protobuf, thrift, flatbuffers, cap n proto etc.

by dtech

3/19/2026 at 7:00:43 AM

Those care about quickly sending compact messages over the network, but most of them do not create a sparse in-memory representation that you can read on the fly. Especially in javascript.

This lib keeps the compact representation at runtime and lets you read it without putting all the entities on the heap.

Cool!

by maxmcd

3/19/2026 at 2:00:34 PM

Exactly. Low heap allocations when reading values is one of the main driving factors in this design!

by creationix

3/19/2026 at 1:21:11 PM

Amazon Ion has some support for this - items are length-prefixed so you can skip over them easily.

It falls down if you have e.g. an array of 1 million small items, because you still need to skip over 999999 items to get to the last one. It looks like RX adds some support for indexes to improve that.

I was in this situation where we needed to sparsely read huge JSON files. In the end we just switched to SQLite which handles all that perfectly. I'd probably still use it over RX, even though there's a somewhat awkward impedance mismatch between SQL and structs.

by IshKebab

3/19/2026 at 5:42:45 PM

I did seriously consider SQLite, but my existing datasets don't map easily to relational database tables. This is essentially no-sql for sqlite.

by creationix

3/19/2026 at 10:05:35 AM

What if you are reading from a service which already have an established API?

It's not like you can just tell them to move to protobuf.

by konart

3/19/2026 at 2:45:20 PM

What about CBOR that can retain JSON compatibility?

If you are working with an end you don’t control, this “newer better” format isn’t in your cards either.

by SV_BubbleTime

3/19/2026 at 7:39:21 PM

How does CBOR retain JSON compatibility more than RX?

RX can represent any value JSON can represent. It doesn't even lose key order like some random-access formats do.

In fact, RX is closer to JSON than CBOR.

Take decimals as an example:

JSON numbers are arbitrary precision numbers written in decimal. This means it can technically represent any decimal number to full precision.

CBOR stores numbers as binary floats, which are approximations of decimal numbers. This is why they needed to add Decimal Fractions (Tag 4).

RX already stores numbers as a decimal base and a decimal power of 10. So out of the box, it matches JSON.

by creationix

3/19/2026 at 2:05:18 AM

You shouldn't be using JSON for things that'd have performance implications.

by barishnamazov

3/19/2026 at 3:38:11 AM

As with most things in engineering, it depends. There are real logistical costs to using binary formats. This format is almost as compact as a binary format while still retaining all the nice qualities of being an ASCII-friendly encoding (you can embed it anywhere strings are allowed, including copy-paste workflows).

Think of it as a hybrid between JSON, SQLite, and generic compression. This format really excels for use cases where large read-only build artifacts are queried by worker nodes like an embedded database.

by creationix

3/19/2026 at 8:44:16 AM

The cost of using a textual format is that floats become so slow to parse, that it’s a factor of over 14 times slower than parsing a normal integer. Even with the fastest simd algos we have right now.

by Asmod4n

3/19/2026 at 10:08:57 AM

So it depends. Float parsing performance is only a problem if you parse many floats, and lazy access might reduce work significantly (or add overhead: it depends).

by HelloNurse

3/19/2026 at 1:42:12 PM

Exactly. For my use cases, this format is amazing. I have very few floats, but lots and lots of objects, arrays, and strings with moderate levels of duplication and substring duplication. My data is produced in a build and then read in thousands or millions of tiny queries that look up a single value deep inside the structure.

rx works very well as a kind of embedded database like sqlite, but completely unstructured like JSON.

Also I'm working on an extension that makes it mutable using append-only persistent data structures with a fixed-block caching level that is actually a pretty good database.

by creationix

3/19/2026 at 1:39:02 PM

if your data is lots and lots of arrays of floats, this is likely not the format for you. Use float arrays.

Also note it stores decimal in a very compact encoding (two varints for base and power of 10)
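Roughly, the idea (an illustrative pair representation; the actual two-varint wire encoding isn't reproduced here) is that a literal like `1.25` is kept as an integer base and a power of ten, 125 * 10^-2, instead of a lossy binary float:

```typescript
// Parse a decimal literal into (base, powerOfTen) with no precision
// loss: "1.25" -> (125n, -2), i.e. 125 * 10^-2. A binary float64 cannot
// represent most decimal fractions exactly.
function toDecimalPair(literal: string): [bigint, number] {
  const dot = literal.indexOf(".");
  if (dot === -1) return [BigInt(literal), 0]; // plain integer
  const digits = literal.slice(0, dot) + literal.slice(dot + 1);
  return [BigInt(digits), -(literal.length - dot - 1)];
}
```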

That said, while this is a text format, it is also technically binary safe and could be extended with a new type tag to contain binary data if desired.

by creationix

3/19/2026 at 8:49:57 AM

and with little data (i.e. <10Mb), this matters much less than accessibility and easy understanding of the data using a simple text editor or jq in the terminal + some filters.

by meehai

3/19/2026 at 9:08:18 AM

what do you mean by little data, most communication protocols are not one off

by xxs

3/20/2026 at 2:12:05 AM

Also good luck parsing 10 MiB of JSON in a loop that can't tolerate blocking the CPU for more than 10ms.

What's expensive is very relative to the use case.

by creationix

3/19/2026 at 10:18:40 AM

That rule sounds clean until the DB dump, API trace, or language boundary lands in your lap. Binary formats are fine for tight inner loops, but once the data leaks into logs, tooling, support, or a second codebase, the bytes you saved tend to come back as time lost decoding some bespoke mess.

by hrmtst93837

3/19/2026 at 2:04:07 PM

Yep. I did try binary formats first. I tried existing ones like CBOR, I tried making my own like Nibs. The text encoding is an operational concern, not a technical one.

This is the same reason I've been advocating for JSONL at work. It's not ideal technically, but it's a good balance of technically good enough while being also human friendly when things go wrong.

- https://vercel.com/blog/how-we-made-global-routing-faster-wi... - https://vercel.com/blog/scaling-redirects-to-infinity-on-ver...

RX is one step towards less human friendly, but more machine friendly. I try to keep things balanced in my designs.

by creationix

3/19/2026 at 4:45:35 AM

I agree in principle. However JSON tooling has also got so good that other formats, when not optimized and held correctly, can be worse than JSON. For example IME stock protocol buffers can be worse than a well optimized JSON library (as much as it pains me to say this).

by squirrellous

3/19/2026 at 6:05:10 AM

Yeah the raw parse speed comparison is almost a red herring at this point. The real cost with JSON is when you have a 200MB manifest or build artifact and you need exactly two fields out of it. You're still loading the whole thing into memory, building the full object graph, and GC gets to clean all of it up after. That's the part where something like RX with selective access actually matters. Parse speed benchmarks don't capture that at all.

by tabwidth

3/19/2026 at 12:23:39 PM

> The real cost with JSON is when you have a 200MB manifest or build artifact and you need exactly two fields out of it.

There are SAX-like JSON libraries out there, and several of them work with a preallocated buffer or similar streaming interface, so you could stream the file and pick out the two fields as they come along.

by magicalhippo

3/19/2026 at 1:22:41 PM

You still have to parse half the entire file on average. Much slower than formats that support skipping to the relevant information directly.

by IshKebab

3/19/2026 at 2:04:46 PM

yep, this is exactly the kind of use case that caused me to design this format.

by creationix

3/19/2026 at 9:10:02 AM

as a parser: keep only indexes into the original file (the input); don't copy strings or parse numbers at all (unless the strings fit in the index width, e.g. 32-bit)

That would make parsing faster, and there would be very little in terms of a tree (JSON can't really contain full-blown graphs), but it's rather complicated, and it would require hashing to allow navigation.

by xxs

3/19/2026 at 5:45:32 PM

yep. I built custom JSON parsers as a first solution. The problem is you can't get away from scanning at least half the document bytes on average.

With RX and other truly random-access formats you could even optimize to the point of not even fetching the whole document. You could grab chunks from a remote server using HTTP range requests and cache locally in fixed-width blocks.
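That remote-read pattern, sketched with the standard `fetch` API (hypothetical URL usage; assumes the server honors `Range` requests):

```typescript
// Standard HTTP suffix-range header: "bytes=-4096" asks for only the
// LAST 4096 bytes of the resource (RFC 9110 range requests).
function tailRange(bytes: number): string {
  return `bytes=-${bytes}`;
}

// Fetch just the tail of a remote document. Because the root value and
// its pointers sit at the end, parsing can start before (or without)
// downloading the whole file. Sketch only: real code would also handle
// servers that ignore Range and reply 200 with the full body.
async function fetchTail(url: string, bytes: number): Promise<Uint8Array> {
  const res = await fetch(url, { headers: { Range: tailRange(bytes) } });
  if (res.status !== 206) throw new Error("server ignored the Range header");
  return new Uint8Array(await res.arrayBuffer());
}
```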

With JSON you must start at the front and read byte-by-byte till you find all the data you're looking for. Smart parsers can help a lot to reduce heap allocations, but you can't skip the state machine scan.

by creationix

3/19/2026 at 2:19:09 AM

Can you imagine if a service as chatty and performance sensitive as Discord used JSON for their entire API surface?

by Spivak

3/19/2026 at 3:11:45 PM

A tiny note on the speed comparison: The 23,000x faster single-key lookup seems a bit misleading to me.

Once you have the computational complexity advantage, you can make it as many times faster as you want. In these cases small instances matter for judging the constants, and for the average (mean?) user, mean instance sizes.

I'm not sure how to sell the advantage succinctly though. Maybe just focus on "real-world" scenarios, but there's no footnote with details on the comparison

by dietr1ch

3/19/2026 at 6:26:38 PM

That benchmark is a fair comparison for a real-world production workload and use case. Sadly I can't share the details, but suffice it to say that the dataset is a huge object with tens of thousands of paths as keys and moderately large objects as values (averaging around 3KB of JSON each), all with slightly different shapes. The use is reading just a few entries by path and then looking up some properties within those entries.

The benchmark measures (or is supposed to measure) end-to-end parse + lookup.

Encoded size:

    JSON: 92 MB
    RX:   5.1 MB

Request-path lookup: ~47,000x faster. Time to decode a manifest and look up one URL path:

    JSON: 69 ms
    REXC: 0.003 ms

Heap allocations: 2.6 million vs. 1:

    JSON: 2,598,384
    REXC: 1 (the returned string)

by creationix

3/19/2026 at 8:29:13 AM

The biggest challenge for formats like this is usually tooling. JSON won largely because every language supports it and every tool understands it.

Even a technically superior format struggles without that ecosystem.

by 50lo

3/19/2026 at 9:53:18 AM

And that in turn affects tool adoption. I have dabbled in Lua for interacting with other software such as mpv, but never got much into the weeds with it because it lacks native JSON support, and I need to interact with JSON all the time.

by latexr

3/19/2026 at 8:28:09 PM

yeah, LuaJIT is one of the use cases I had in mind working on this. JSON is pretty fast in modern JS engines, but in Lua land, JSON kinda sucks and doesn't really match the language without using virtual tables.

JSON objects have string keys and can hold `null` values, but Lua doesn't have `null`. It has `nil`, but you can't have a key with a nil value; setting a key to nil deletes it.

Lua tables are unordered. But JS and JSON are often ordered and order often matters.

RX, however, matches Lua/LuaJIT extremely well and should out-perform the JS Proxy-based decoder using metatables. Since it's using metatables anyway due to the lazy parsing, it's trivial to do things like preserve order when calling `pairs` and `ipairs`, and even include keys with associated null values.

You can round trip safely in Lua, which is not easy with most JSON implementations.

by creationix

3/19/2026 at 8:59:38 AM

So this is two things? A BSON-like encoding + something similar to implementing random access / tree walker using streaming JSON?

Docs are super unclear.

by jbverschoor

3/19/2026 at 7:49:50 AM

It doesn't seem like the actual serialization format is specified? Other than in the code, that is.

Is it versioned? Or does it need to be?

by _flux

3/19/2026 at 3:16:17 PM

The documentation references a "decode" function, and it's imported in the example code, but it's never called. I'm not sure what the API is after reading the examples.

by killbot5000

3/18/2026 at 11:58:05 PM

A new random-access JSON alternative from the creator of nvm.sh, luvit.io, and js-git.

by creationix

3/19/2026 at 9:35:05 AM

Looks similar to https://github.com/7mind/sick

by pshirshov

3/19/2026 at 5:57:24 PM

You're right. Some important differences:

sick is binary, rx is textual (this matters for tooling)

sick has size limits (65534 max keys, for example; I have real-world rx datasets reaching this size already); rx uses arbitrary-precision variable-length b64 integers, so there are no size limits inherent in the format, just in implementations.

sick does not preserve object key order; rx preserves object key order, but still implements O(log2 N) lookups for object keys.

etc.

by creationix

3/19/2026 at 11:09:13 AM

It feels petty to show up with a naming nit, but the name is unfortunately/confusingly similar to the already well-known RxJS.

Why is it called RX?

by bsimpson

3/19/2026 at 5:59:12 PM

I'm happy to hear suggestions. This format was actually the internal .rexc bytecode for Rex (routing expressions), but when I realized it was actually a pretty good standalone format, I renamed it `.rx` for short. I am aware of RxJS, but I think that `rx-format` is different enough and `.rx` file extensions are unique enough, it's not too confusing.

by creationix

3/19/2026 at 5:47:57 AM

Cool project.

The viewer is cool, took me a while to find the link to it though, maybe add a link in the readme next to the screenshot.

by WatchDog

3/19/2026 at 2:23:44 PM

could this be useful for embedding info in server generated web pages that are then picked up by a JavaScript. e.g. a tom-select country picker that gets its data from an embedded RX structure?

by TKAB

3/19/2026 at 5:46:11 PM

yes, this would work very well for any case where you have embedded databases of unstructured data that you want to query in a website or edge server

by creationix

3/19/2026 at 2:21:50 AM

I love these projects, and I hope one of them someday emerges as the winner, because (as it motivates all these libraries' authors) there's so much low-hanging fruit and so many free wins in changing the line format for JSON while keeping the "Good Parts" like the dead-simple generic typing.

XML has EXI (Efficient XML Interchange) for precisely the reason of getting wins over the wire but keeping the nice human readable format at the ends.

by Spivak

3/19/2026 at 6:35:32 AM

TIL.

EXI looks useful. Now I just wish there was a renderer in the pugjs format, as I find that terse format much more readable than verbose XML. I also find indentation-based syntax easier for visually parsing hierarchical structure.

by snthpy

3/19/2026 at 7:36:24 AM

I am a little confused. Is this still JSON? Is it “binary“ JSON?

by transfire

3/19/2026 at 2:47:32 PM

It’s neither!

Sample output:

'fdiscovered,aextreme,7danger,6+1A+16;6level_range,b:QThe Heap ,d'th

Human unreadable, ascii output. Line up and get yours today!

by SV_BubbleTime

3/19/2026 at 5:47:28 PM

it's not really possible to stay human readable and get the compression levels and random access properties I was going for. But it is as human tooling friendly as possible given the constraints.

by creationix

3/19/2026 at 10:49:12 PM

>it's not really possible

I find it obvious that your first attempt failed. Try again, you have not even remotely failed enough if you are making the argument that this is kinda readable. Yes, ascii words are easy to pick out, you didn’t do that, you did the part that makes it all harder.

by SV_BubbleTime

3/19/2026 at 3:45:10 AM

Interesting. I've heard about cursors in reference to a Rust library that was mentioned as being similar to protobuf and cap'n proto.

Does this duplicate the name of keys? Say if you have a thousand plain objects in an array, each with a "version" key, would the string "version" be duplicated a thousand times?

Another project a lot of people aren't aware of even though they've benefitted from it indirectly is the binary format for OpenStreetMap. It allows reading the data without loading a lot of it into memory, and is a lot faster than using sqlite would be.

Edit: the rust library I remember may have been https://rkyv.org/

by benatkin

3/19/2026 at 5:50:57 PM

> Does this duplicate the name of keys?

Yes, the format allows for objects to be stored with a pointer to a shared schema (either an array of keys or another object that has the desired keys)

The current implementation is pretty close to ideal in deciding when to use this encoding.

by creationix

3/19/2026 at 10:47:26 AM

I recently created my own low-overhead binary JSON cause I did not like Mongo's BSON (too hacky, not mergeable). It took me half a day maybe, including the spec, thanks Claude. First, implemented the critical feature I actually need, then made all the other decisions in the least-surprising way.

At this point, probably, we have to think how to classify all the "JSON alternatives" cause it gets difficult to remember them all.

Is RX a subset, a superset or bijective to JSON?

https://github.com/gritzko/librdx/tree/master/json

by gritzko

3/19/2026 at 5:49:42 PM

The current format version is the exact same feature set as JSON. I even encode numbers as arbitrary precision decimals (which JSON also does). This is quite different from CBOR which stores floats in binary as powers of 2.

I could technically add binary to the format, but then it would lose the nice copy-paste property. But with the byte-aware length prefixes, it would just work otherwise.

by creationix

3/19/2026 at 2:50:29 PM

You went from BSON to your own and skipped CBOR and Protobuf? … I wonder if you would have made different decisions without Claude vibing you in a direction?

by SV_BubbleTime

3/19/2026 at 9:42:57 AM

[dead]

by openclaw01

3/19/2026 at 3:24:50 PM

[dead]

by DaleBiagio

3/19/2026 at 10:39:35 AM

[dead]

by derodero24

3/19/2026 at 12:04:00 PM

[dead]

by AliEveryHour16

3/19/2026 at 8:52:16 AM

[flagged]

by StephenZ15ga67

3/19/2026 at 4:41:12 AM

[flagged]

by Shahbazay0719

3/19/2026 at 2:01:50 PM

Why do we need an "alternative" when JSON, itself, is so fantastic?

by NoSalt

3/19/2026 at 6:01:14 PM

the project framing needs some help perhaps. JSON is really good at a lot of use cases that this will never replace. But there are cases where JSON is currently used where this is much better. In particular large unstructured datasets where you only need to read a tiny subset of the data in a single request.

Maybe a better framing would be no-sql sqlite?

by creationix