Y Websocket Server in Java

Table of Contents

Protocol clarifications

VarUint
Sync protocol
Awareness protocol
Custom protocol support

Yrs

Comparison with Yjs
Java bindings with Yrs4J

Resources

Protocol clarifications

Make sure to read the Yjs protocol specification. It contains most of the information needed to implement a Yjs server in any language.

VarUint

VarUint is used extensively in the protocol. It is a variable-length encoding for unsigned integers that takes between 1 and 8 bytes (more are technically possible), so small numbers do not carry unnecessary bytes. The most significant bit of each byte acts as a "continue flag" that indicates whether another byte should be read as part of the number. The remaining 7 bits contain the actual value of the current byte. The bytes are sent in little-endian order, from least to most significant, so the first byte contains the 7 lowest bits and the last byte the 7 highest. Take this payload as an example:

0x8f 0x44

The first byte 0x8f is 0b10001111 in binary, which means that the lowest 7 bits of the parsed number are 0b0001111 and that another byte should be read. The second byte 0x44 is 0b01000100, which means that the next 7 bits of the parsed number are 0b1000100 and that this is the last byte of the number. Placing the second group above the first gives 0b10001000001111, which is 0x220f in hexadecimal or 8719 in decimal.

An implementation in Java can be found in the server-side code of the proof of concept, in the functions readVarUint and writeVarUint.
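For orientation, the encoding described above can be sketched as follows. This is a minimal illustration and not the proof-of-concept code itself: the class name VarUintSketch, the use of ByteBuffer, and the 9-byte cap are assumptions made for this example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

/** Hypothetical helper written for this README, not the proof-of-concept source. */
final class VarUintSketch {

    /** Reads one VarUint starting at the buffer's current position. */
    static long readVarUint(ByteBuffer in) {
        long value = 0;
        int shift = 0;
        while (true) {
            int b = in.get() & 0xff;              // treat the byte as unsigned
            value |= (long) (b & 0x7f) << shift;  // the low 7 bits carry the payload
            if ((b & 0x80) == 0) {                // continue flag cleared: last byte
                return value;
            }
            shift += 7;
            if (shift > 56) {                     // assumption: stop before a signed long overflows
                throw new IllegalArgumentException("VarUint is too long");
            }
        }
    }

    /** Encodes a non-negative number, least significant 7-bit group first. */
    static byte[] writeVarUint(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("VarUint cannot be negative");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (value > 0x7f) {
            out.write((int) ((value & 0x7f) | 0x80)); // set the continue flag
            value >>>= 7;
        }
        out.write((int) value);                       // final byte, continue flag cleared
        return out.toByteArray();
    }
}
```

With the example payload, readVarUint(ByteBuffer.wrap(new byte[] {(byte) 0x8f, 0x44})) yields 8719, and writeVarUint(8719) produces the same two bytes again.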

Sync protocol

Another thing that might not be evident from the protocol spec: when operating in a server/client model, the server of a Yjs document shall reply to SyncStep1 with SyncStep2, immediately followed by a SyncStep1 of its own, as outlined in the JavaScript reference implementation. This lets the server obtain the up-to-date state of the document from the clients when it starts, or when it reconnects after being offline.
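The reply order can be sketched as below. This is an illustration under assumptions, not the proof-of-concept code: DocumentState is a hypothetical stand-in for whatever holds the document, and the framing (protocol type 0 for sync, sub-type 0 for SyncStep1 and 1 for SyncStep2, payload prefixed with its VarUint length) reflects my reading of the protocol spec and the JavaScript reference implementation.

```java
import java.io.ByteArrayOutputStream;
import java.util.List;

/** Hypothetical sketch of how a server could answer SyncStep1. */
final class SyncReplySketch {
    static final int MESSAGE_SYNC = 0; // protocol type for sync
    static final int SYNC_STEP_1 = 0;  // assumption: sub-type values as in the reference implementation
    static final int SYNC_STEP_2 = 1;

    /** Hypothetical stand-in for the server-side document. */
    interface DocumentState {
        byte[] encodeStateVector();
        byte[] encodeDiff(byte[] remoteStateVector);
    }

    /** Returns the two messages to send, in order: SyncStep2 first, then SyncStep1. */
    static List<byte[]> replyToSyncStep1(DocumentState doc, byte[] clientStateVector) {
        // SyncStep2: everything the client is missing, based on its state vector.
        byte[] step2 = frame(SYNC_STEP_2, doc.encodeDiff(clientStateVector));
        // SyncStep1: the server's own state vector, so the client sends back what the server lacks.
        byte[] step1 = frame(SYNC_STEP_1, doc.encodeStateVector());
        return List.of(step2, step1);
    }

    private static byte[] frame(int syncType, byte[] payload) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarUint(out, MESSAGE_SYNC);
        writeVarUint(out, syncType);
        writeVarUint(out, payload.length);     // payload is sent as a length-prefixed byte array
        out.write(payload, 0, payload.length);
        return out.toByteArray();
    }

    private static void writeVarUint(ByteArrayOutputStream out, long value) {
        while (value > 0x7f) {
            out.write((int) ((value & 0x7f) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }
}
```

The caller would send both returned messages over the websocket in list order, without waiting for anything from the client in between.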

Awareness protocol

The protocol spec does not mention this, but an entire awareness update message is sent wrapped in a varArray (a VarUint length followed by that many bytes). Before parsing the update as described in the spec, one has to unwrap the array first.
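A minimal sketch of the unwrapping step, assuming the wrapper is a VarUint length followed by exactly that many payload bytes, and assuming the buffer is already positioned after the protocol type. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: strips the array wrapper from an awareness message. */
final class AwarenessUnwrapSketch {

    /** Reads the length prefix and returns the wrapped awareness update bytes. */
    static byte[] unwrap(ByteBuffer in) {
        long length = readVarUint(in);
        if (length > in.remaining()) {
            throw new IllegalArgumentException("declared length exceeds message size");
        }
        byte[] payload = new byte[(int) length];
        in.get(payload);   // the bytes to parse as described in the spec
        return payload;
    }

    static long readVarUint(ByteBuffer in) {
        long value = 0;
        int shift = 0;
        while (true) {
            int b = in.get() & 0xff;
            value |= (long) (b & 0x7f) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
            shift += 7;
            if (shift > 56) {
                throw new IllegalArgumentException("VarUint is too long");
            }
        }
    }
}
```

The returned bytes are what the spec's awareness format applies to; parsing them is left out here.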

Custom protocol support

The first value of every message is a VarUint indicating the protocol type (by default 0 for sync and 1 for awareness). This would make it easy to implement our own custom protocols and send custom data over the same websocket if needed.
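One way to make use of this is a small dispatcher keyed on the protocol type. The sketch below is illustrative only: MessageRouterSketch and the custom type value 100 are made up for this example, and a real custom protocol would need a value that clients and server agree on and that no other protocol already uses.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical sketch: routes incoming messages by their leading protocol type. */
final class MessageRouterSketch {
    static final long SYNC = 0;
    static final long AWARENESS = 1;
    static final long CUSTOM_TYPE = 100; // made-up value for illustration

    private final Map<Long, Consumer<ByteBuffer>> handlers = new HashMap<>();

    /** Registers a handler that receives the message body after the protocol type. */
    void register(long protocolType, Consumer<ByteBuffer> handler) {
        handlers.put(protocolType, handler);
    }

    /** Returns false if no handler is registered for the message's protocol type. */
    boolean dispatch(byte[] message) {
        ByteBuffer in = ByteBuffer.wrap(message);
        long protocolType = readVarUint(in);
        Consumer<ByteBuffer> handler = handlers.get(protocolType);
        if (handler == null) {
            return false;  // unknown protocol: the caller decides whether to ignore or close
        }
        handler.accept(in); // buffer is positioned right after the protocol type
        return true;
    }

    private static long readVarUint(ByteBuffer in) {
        long value = 0;
        int shift = 0;
        while (true) {
            int b = in.get() & 0xff;
            value |= (long) (b & 0x7f) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
            shift += 7;
            if (shift > 56) {
                throw new IllegalArgumentException("VarUint is too long");
            }
        }
    }
}
```

Sync and awareness handlers would be registered under SYNC and AWARENESS in the same way, so adding a protocol does not touch the existing handling.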

Yrs

Comparison with Yjs

Even though Yjs and Yrs are compatible with each other, the function names and usage are slightly different.

Every action on a document must be done within a transaction, which is started with YDoc.readTransaction or YDoc.writeTransaction and committed with YTransaction.commit. Because the application crashes if multiple transactions are created at the same time, some synchronization mechanism is needed to prevent that. The proof of concept uses synchronized methods.
Y.encodeStateVector in Yjs corresponds to YTransaction.stateVectorV1.
Y.encodeStateAsUpdate in Yjs corresponds to YTransaction.stateDiffV1.
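The synchronized-method approach mentioned above could look roughly like this. Note that Document and Transaction are hypothetical stand-in interfaces that only mirror the method names listed here; they are NOT the real Yrs4J types, and the actual signatures (parameters, return types, whether a read transaction needs a commit) are assumptions that should be checked against Yrs4J.

```java
/** Hypothetical sketch: serializes all transactions on one document. */
final class SerializedDocSketch {

    /** Stand-in for YTransaction; signatures are assumed, not taken from Yrs4J. */
    interface Transaction {
        byte[] stateVectorV1();
        byte[] stateDiffV1(byte[] remoteStateVector);
        void commit();
    }

    /** Stand-in for YDoc; signatures are assumed, not taken from Yrs4J. */
    interface Document {
        Transaction readTransaction();
        Transaction writeTransaction();
    }

    private final Document doc;

    SerializedDocSketch(Document doc) {
        this.doc = doc;
    }

    /** Counterpart of Y.encodeStateVector; synchronized so only one transaction is open. */
    synchronized byte[] encodeStateVector() {
        Transaction txn = doc.readTransaction();
        try {
            return txn.stateVectorV1();
        } finally {
            txn.commit(); // always end the transaction, even if encoding fails
        }
    }

    /** Counterpart of Y.encodeStateAsUpdate; synchronized for the same reason. */
    synchronized byte[] encodeStateAsUpdate(byte[] remoteStateVector) {
        Transaction txn = doc.readTransaction();
        try {
            return txn.stateDiffV1(remoteStateVector);
        } finally {
            txn.commit();
        }
    }
}
```

Because both methods lock on the same wrapper instance, this only helps if every access to the document goes through that one wrapper.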

As the function names suggest, there is a second version of the sync protocol that is supposed to be more efficient. It is not yet documented or widely adopted, so the V1 protocol should suffice.

Java bindings with Yrs4J

While Yrs4J is not complete, I found no functions missing that we would need to implement a Yrs server. If the need should arise, it should be fairly easy to fork and extend Yrs4J, or even to create our own JNA bindings on top of the yffi module, which contains C bindings to Yrs.