Protocol clarifications
Make sure to read the Yjs protocol specification first. It contains most of the information needed to implement a Yjs server in any language.
VarUint
VarUint is used extensively in the protocol. It is a variable-length encoding for unsigned integers that occupies between 1 and 8 bytes (more are technically possible),
so small numbers can be sent without unnecessary padding. This is achieved by interpreting the most significant bit of each byte as a "continue flag",
indicating whether another byte should be read as part of the number. The remaining 7 bits contain the actual value of the current byte.
The bytes are sent in little endian order from least to most significant, so the first byte contains the 7 lowest bits, the last byte the 7 highest.
Take this payload as an example:
0x8f 0x44
The first byte 0x8f is 0b10001111 in binary. The continue flag is set, so the lowest 7 bits of the parsed number are 0b0001111 and another byte must be read.
The second byte 0x44 is 0b01000100. The continue flag is clear, so the next 7 bits of the parsed number are 0b1000100 and this is the last byte of the number.
Concatenating the two groups gives 0b10001000001111, which is 0x220f in hexadecimal or 8719 in decimal.
An implementation in Java can be found in the server side code of the proof of concept source
in the functions readVarUint and writeVarUint.
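The rules described above can be sketched in a few lines of Java. This is a minimal illustration written for this document, not the code from the proof of concept; the class name VarUintSketch and the choice of ByteBuffer and byte[] as input and output types are assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

final class VarUintSketch {

    /** Reads one VarUint from the buffer, advancing its position. */
    static long readVarUint(ByteBuffer buf) {
        long value = 0;
        // 9 groups of 7 bits are the most that fit into a non-negative long.
        for (int shift = 0; shift <= 56; shift += 7) {
            int b = buf.get() & 0xff;
            // The low 7 bits carry data; earlier bytes are less significant.
            value |= (long) (b & 0x7f) << shift;
            // A clear top bit marks the last byte of the number.
            if ((b & 0x80) == 0) {
                return value;
            }
        }
        throw new IllegalArgumentException("VarUint too long");
    }

    /** Encodes a non-negative value as a VarUint. */
    static byte[] writeVarUint(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (value > 0x7f) {
            // Emit the lowest 7 bits with the continue flag set.
            out.write((int) (value & 0x7f) | 0x80);
            value >>>= 7;
        }
        // Final byte: continue flag clear.
        out.write((int) value);
        return out.toByteArray();
    }
}
```

Decoding the bytes 0x8f 0x44 with this sketch yields 8719, matching the worked example above, and encoding 8719 produces the same two bytes.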
Sync protocol
Another point that might not be evident from the protocol spec: when operating in a server/client model, the server of a Yjs document
should reply to SyncStep1 with SyncStep2, immediately followed by its own SyncStep1, as outlined in the
JavaScript reference implementation. This allows the server to get the up-to-date
state of the document from the clients when it starts or reconnects after being offline.
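The reply order described above could look roughly like the following sketch. It assumes the message layout used by the JavaScript reference implementation (protocol type 0 for sync, then sync message type 0 for SyncStep1 or 1 for SyncStep2, then a length-prefixed payload). The DocState interface and its method names are hypothetical placeholders for whatever document API the server uses.

```java
import java.io.ByteArrayOutputStream;
import java.util.List;

final class SyncReplySketch {
    static final int MESSAGE_SYNC = 0; // protocol type for sync messages
    static final int SYNC_STEP_1 = 0;  // payload: sender's state vector
    static final int SYNC_STEP_2 = 1;  // payload: update the receiver is missing

    /** Hypothetical view of the server-side document. */
    interface DocState {
        byte[] stateVector();
        byte[] stateAsUpdate(byte[] remoteStateVector);
    }

    /** Builds the two frames a server sends after receiving SyncStep1. */
    static List<byte[]> replyToSyncStep1(DocState doc, byte[] clientStateVector) {
        // First: everything the client is missing.
        byte[] step2 = frame(SYNC_STEP_2, doc.stateAsUpdate(clientStateVector));
        // Then: the server's own state vector, so the client sends back
        // whatever the server is missing.
        byte[] step1 = frame(SYNC_STEP_1, doc.stateVector());
        return List.of(step2, step1);
    }

    static byte[] frame(int syncType, byte[] payload) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarUint(out, MESSAGE_SYNC);
        writeVarUint(out, syncType);
        writeVarUint(out, payload.length); // length prefix of the byte array
        out.write(payload, 0, payload.length);
        return out.toByteArray();
    }

    static void writeVarUint(ByteArrayOutputStream out, long value) {
        while (value > 0x7f) {
            out.write((int) (value & 0x7f) | 0x80);
            value >>>= 7;
        }
        out.write((int) value);
    }
}
```

Returning the two frames as a list leaves it to the caller to decide how they are written to the websocket, as long as SyncStep2 goes out before SyncStep1.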
Awareness protocol
The protocol spec does not mention this, but the whole awareness update message is sent
as a varArray. Before parsing it as described in the spec, one has to
unwrap the array first.
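A minimal sketch of the unwrapping step follows. It assumes that "varArray" means a byte array prefixed with its length as a VarUint, and that the message starts with the protocol type 1 for awareness. The class and method names are made up for this illustration.

```java
import java.nio.ByteBuffer;

final class AwarenessUnwrapSketch {
    static final long MESSAGE_AWARENESS = 1;

    /** Strips the protocol type and the array wrapper, returning the inner update bytes. */
    static byte[] unwrapAwarenessUpdate(byte[] message) {
        ByteBuffer buf = ByteBuffer.wrap(message);
        long type = readVarUint(buf);
        if (type != MESSAGE_AWARENESS) {
            throw new IllegalArgumentException("not an awareness message: " + type);
        }
        // Assumption: the wrapper is a VarUint length followed by that many bytes.
        long length = readVarUint(buf);
        if (length > buf.remaining()) {
            throw new IllegalArgumentException("truncated awareness update");
        }
        byte[] inner = new byte[(int) length];
        buf.get(inner);
        // 'inner' is what should then be parsed as described in the spec.
        return inner;
    }

    static long readVarUint(ByteBuffer buf) {
        long value = 0;
        for (int shift = 0; shift <= 56; shift += 7) {
            int b = buf.get() & 0xff;
            value |= (long) (b & 0x7f) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
        }
        throw new IllegalArgumentException("VarUint too long");
    }
}
```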
Custom protocol support
The first value of every message is a VarUint indicating the protocol type (by default 0 for sync and 1 for awareness).
This would enable us to easily implement our own custom protocols and send custom data via the same websocket if needed.
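One way to use the leading protocol type is a small dispatcher that routes each message to a handler registered for its type. The sketch below is only an illustration: the class name, the handler registry, and the custom type number 100 are all hypothetical, and a real deployment would need to pick a number that no other protocol on the same websocket already uses.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

final class MessageDispatcherSketch {
    static final long MESSAGE_SYNC = 0;
    static final long MESSAGE_AWARENESS = 1;
    static final long MESSAGE_CUSTOM = 100; // hypothetical custom protocol type

    private final Map<Long, Consumer<ByteBuffer>> handlers = new HashMap<>();

    /** Registers a handler that receives the message body after the protocol type. */
    void register(long messageType, Consumer<ByteBuffer> handler) {
        handlers.put(messageType, handler);
    }

    /** Returns false if no handler is registered for the message's protocol type. */
    boolean dispatch(byte[] message) {
        ByteBuffer buf = ByteBuffer.wrap(message);
        long type = readVarUint(buf);
        Consumer<ByteBuffer> handler = handlers.get(type);
        if (handler == null) {
            return false;
        }
        // The buffer is now positioned just after the protocol type.
        handler.accept(buf);
        return true;
    }

    static long readVarUint(ByteBuffer buf) {
        long value = 0;
        for (int shift = 0; shift <= 56; shift += 7) {
            int b = buf.get() & 0xff;
            value |= (long) (b & 0x7f) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
        }
        throw new IllegalArgumentException("VarUint too long");
    }
}
```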
Yrs
Comparison with Yjs
Even though Yjs and Yrs are compatible with each other, the function names and usage are slightly different.
- Every action on a document must be done within a transaction, which can be initiated with YDoc.readTransaction or YDoc.writeTransaction and committed with YTransaction.commit. Because the application crashes if multiple transactions are created, there needs to be some synchronization mechanism to avoid that case. In the proof of concept, synchronized methods are used.
- Encoding the state vector in Yjs via Y.encodeStateVector can be done via YTransaction.stateVectorV1.
- Encoding a state as an update in Yjs via Y.encodeStateAsUpdate can be done via YTransaction.stateDiffV1.
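The transaction rule from the list above can be illustrated with synchronized methods. The sketch below does not use the real Yrs4J classes: Doc and Txn are hypothetical stand-in interfaces that only mirror the method names mentioned above, and the actual Yrs4J signatures may differ.

```java
final class SynchronizedDocSketch {

    /** Hypothetical stand-in for a Yrs transaction. */
    interface Txn {
        byte[] stateVectorV1();
        byte[] stateDiffV1(byte[] remoteStateVector);
        void commit();
    }

    /** Hypothetical stand-in for a Yrs document. */
    interface Doc {
        Txn readTransaction();
    }

    private final Doc doc;

    SynchronizedDocSketch(Doc doc) {
        this.doc = doc;
    }

    // 'synchronized' ensures that only one thread at a time can open a
    // transaction through this wrapper.

    /** Counterpart of Y.encodeStateVector in Yjs. */
    synchronized byte[] encodeStateVector() {
        Txn txn = doc.readTransaction();
        try {
            return txn.stateVectorV1();
        } finally {
            txn.commit(); // always close the transaction, even on failure
        }
    }

    /** Counterpart of Y.encodeStateAsUpdate in Yjs. */
    synchronized byte[] encodeStateAsUpdate(byte[] remoteStateVector) {
        Txn txn = doc.readTransaction();
        try {
            return txn.stateDiffV1(remoteStateVector);
        } finally {
            txn.commit();
        }
    }
}
```

This only protects against overlapping transactions if every access to the document goes through the same wrapper instance.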
As the function names suggest, there is a second version of the sync protocol, which is supposed to be more efficient. It is not documented yet and not widely adopted, so the V1 protocol should suffice.
Java bindings with Yrs4J
While Yrs4J is not complete, I found no functions missing that we would need to implement a Yrs server. If the need arises, it should also be fairly easy to fork and extend Yrs4J, or even to create our own JNA bindings, because the yffi module contains C bindings to Yrs.