5/21/2025 at 8:44:12 PM
That is very neat. The algorithm:- Label each text character with a globally unique ID (e.g., a UUID), so that we can refer to it in a consistent way across time - instead of using an array index that changes constantly.
- Clients send the server “insert after” operations that reference an existing ID. The server looks up the target ID and inserts the new characters immediately after it.
- Deletion hides a character for display purposes, but it is still kept for "insert after" position purposes.
This might have potential outside text editing. Game world synchronization, maybe.
by Animats
5/21/2025 at 10:46:30 PM
This is literally a degenerate CRDT. Central server for tie-breaking goes back to Google Wave.by jahewson
5/22/2025 at 12:27:19 AM
Any algorithm that uses a central server as a tie-breaker could easily be replaced by one where client ids are used for the tie-breaker.by bikeshaving
5/22/2025 at 3:04:56 AM
If you used UUIDv7 you get time-ordered UUID and could use that for a tie breaker.by jonstokes
5/22/2025 at 4:03:00 AM
Don’t you have to be confident the clocks are sufficiently synced across the system?by yanowitz
5/22/2025 at 5:43:42 AM
And you know.. simultaneity is not a thing in reality...by boxed
5/22/2025 at 6:07:51 AM
It's even less of a thing in distributed systems, but you have to pick something to show the user and clocks are often good enough :Dby ChadNauseam
5/22/2025 at 12:29:01 AM
I miss Wave a lot, very quirky in a good way imo. We ran a few D&D games over it. RIPby eiginn
5/21/2025 at 8:55:11 PM
What you describe is a CRDT, isn't it ?by sdeframond
5/21/2025 at 9:12:54 PM
No; there is no single consistent final state that the system must converge to if the parts go offline. If you have this document: a{uuid=1}
and two clients send the following operations: b{uuid=2} insert-after{uuid=1}
c{uuid=3} insert-after{uuid=1}
then the following two documents are both valid final states: abc
acb
That's fine as long as you have an authoritative server that observes all events in a single order and a way to unwind misordered local state, but it means that it's not a CRDT.
by ori_b
5/21/2025 at 10:38:35 PM
Like Raft is a "special case" of Paxos, this feels like a "special case" of CRDT.It has all the flavor of CRDT, but adds a leader and a different way for the total ordering (basically using leader's local lamport clock to break tie).
Throw in leader reelection and some ledger syncing and then give everything some other names, I bet you can have "collaborative text editing on one page".
by YoumuChan
5/22/2025 at 6:48:32 AM
Yeah. It’s also quite inefficient by default - because you’re storing deleted items and you have a uuid per character. And you usually want the other parts from a crdt which this discards. Like, a consistent ordering rule for siblings. Doing so barely adds any code (like 10 lines or so) and it makes the system way more useful.I don’t really understand the benefit of doing this when text CRDTs are small, simple and fast.
by josephg
5/22/2025 at 3:49:34 AM
> No; there is no single consistent final state that the system must converge to if the parts go offline.Sounds like a naive implementation of delta state CRDT. I mean, what if the author has this major epiphany that it's nice to sync states to get convergence?
by motorest
5/21/2025 at 9:44:11 PM
What about ordering concurrent operations by id? Then "abc" is the only consistent final state.I get what you mean though, having a central authority greatly relaxes the requirement.
by sdeframond
5/22/2025 at 3:07:36 AM
How do you determine which operations are concurrent?by rendaw
5/22/2025 at 9:45:12 PM
if there are multiple valid final states?by tangjurine
5/21/2025 at 9:58:37 PM
We could generalize having a tree of known authorities upstream with what a CRDT does, resulting in both being two special cases of a more general consistent event processing model. CRDT makes the order of events commutative, hence the "authority" becomes a property of the math itself, rather than a physical service.by 3cats-in-a-coat
5/21/2025 at 9:41:40 PM
If you have the additional semantics that says "operation with lowest uuid gets applied first", don't you essentially get a CRDT?I mean, a uuid is kind of a poor man's Lamport clock, isn't it?
by Yoric
5/21/2025 at 10:29:05 PM
Yes, you can extend the algorithm that was described with additional semantics, and you can turn it into a CRDT.by ori_b
5/22/2025 at 2:59:07 PM
Yeah, so this is an operational transform where the transform operation is just identity.by anon291
5/21/2025 at 8:52:17 PM
Is this really that novel? I mean using a central process for serializing a distributed system is like a no brainer -- didn't we start off from here originally? -- until you have to worry about network partitions, and CAP and all that jazz. You also now have a single point of failure. Also I skimmed the thing but was performance discussed?by yubblegum
5/23/2025 at 5:08:20 AM
> Is this really that novel? I mean using a central process for serializing a distributed system is like a no brainer -- didn't we start off from here originally?Yes. This article reads like the adage "a month in a lab saves you an hour in a library".
by motorest
5/22/2025 at 12:49:29 AM
I really want to believe you were trying to make a reference to cap'n jazz - https://en.m.wikipedia.org/wiki/Cap'n_Jazz:)
by thebeardisred
5/22/2025 at 12:53:46 AM
Never heard of them:(
by yubblegum
5/21/2025 at 10:23:57 PM
Yeah I have the same question. I'm not familiar with the problem space but this seems like my naive first idea so I'm wondering what the catch is.by bongodongobob
5/22/2025 at 12:08:07 AM
As the author, same.My best guess is:
- Central-server collaborative editing work focuses on Operational Transformation (OT), likely due to inertia (studied since 1989) and the perception that storing an ID per character is inefficient. In fairness, it is, absent the optimizations introduced by RGASplit and Yjs (~2015).
- For decentralized editing, OT is very complicated, and CRDTs took over as the solution of interest (studied since 2005). Storing every operation permanently in a log - needed to use the linked approach without a server - feels inefficient, as does server reconciliation's undo/re-apply process. So CRDT research has focused on avoiding those inefficiencies, sacrificing simplicity along the way, instead of just embracing them as the easy way out.
To me, the "inefficiencies" seem quite manageable. Storage is cheap, text is small, and you probably want a complete op log anyway for auditing and document histories (cf. git). Server reconciliation's undo/re-apply process can be batched aggressively, e.g., only do it a few times per second; that just makes remote ops take a little longer to show up.
Granted, I have not built a complete app around server reconciliation or the linked approach, so perhaps there is a hidden catch. But I am encouraged by the success of Replicache (https://doc.replicache.dev/concepts/how-it-works), which is where I learned of server reconciliation.
by mweidner
5/22/2025 at 2:02:09 PM
I wonder if you could sidestep the inefficiencies of storing ID's for every character by using redis streams to store the characters with id's "represented as delta-compressed macro nodes that are linked together by a radix tree" where the ids are "monotonically incrementing and has two parts: <time>-<counter>. The time is in milliseconds and the counter increases for entries generated in the same milliseconds"This would seem to avoid clashes and also compress the identifier storage in an efficient way.
by kamranjon
5/23/2025 at 1:46:06 AM
This sounds similar to the idea behind articulated (though with ids UUID-counter instead of time-counter): https://github.com/mweidner037/articulatedI will check out Antirez.
by mweidner
5/22/2025 at 12:18:10 AM
ctrl+actrl+x
ctrl+v
Good luck
by pcthrowaway
5/22/2025 at 3:31:43 AM
Isn’t this a NOP?I think you could batch these things though. You just need a slightly more advanced protocol. If B is the first id, and L is the last, create range(B, L) as R and insert R after L (assuming Ctrl-v a second time).
by apgwoz
5/23/2025 at 5:24:00 AM
> Isn’t this a NOP?I think that's the point: conceptually it's a co-op, but the implementation forces each character to be individually removed following a ctrl-x, and then individually inserted back (after issuing a new UUID) following a ctrl-v.
Now, the outcome of neither a Ctrl+x nor a Ctrl+v is atomic, which means communicating each character operation through the wire takes time. In turn this means that this is a scenario where conflicts will take place (someone is editing text midway through another user does ctrl-a,ctrl-x) and where the choice of operation will result in important user impacts.
This is a basic test scenario of CRDTs, and why research focuses on higher-level operations instead of atomic character-level operations.
It should be noted that ctrl-a,ctrl-x,ctrl-v is an extreme case of a very frequent operation: copying and pasting text. Editor-level undo/redo can manifest the same way, too.
by motorest