Documenting the syncing protocol
Wavesonics committed Nov 11, 2024
1 parent 017ad70 commit 0e6db82
Showing 1 changed file with 142 additions and 43 deletions.
185 changes: 142 additions & 43 deletions docs/SYNCING-PROTOCOL.md
@@ -102,55 +102,29 @@ sequenceDiagram

The goal of this protocol is to synchronize the various `Entities` on the client to the server.

There is no file history as in a true version control system such as `git`. This is instead a simpler synchronization system, yet still smart enough to detect `conflicts`, and prevent edits on various devices from overwriting each other on accident.
There is no file history as in a true version control system such as `git`. This is instead a simpler synchronization system, yet still smart enough to detect `conflicts` and prevent edits on multiple devices from overwriting each other by accident.

As such, there is very little bookkeeping data, and none of it is actually required. When all actors are fully synchronized, they all contain the full set of data. Thus if the server were to die and lose all of its data, it wouldn't matter. Every client would contain everything necessary to
set up a new server.

Furthermore, the protocol is fully fault tolerant. It may fail at any step along the way, and the state of the client and server will remain entirely valid, although not entirely synchronized.

## Network Protocol (overview)

This is largely a client driven synchronization process.

### SyncIDs
The client calls `begin_sync` to get a valid `syncID`. This `syncID` is provided to all subsequent calls, and is terminated with a call to `end_sync`.

There can be only one valid `syncID` per project at any given time. This prevents race conditions with two clients syncing the same project at the same time.

You may however have `syncID`s for multiple different projects simultaneously.

These are Project level `syncID`s. Account syncing uses separate Account level `syncID`s. There may only be one valid Account level `syncID` at a time, and if there is a valid Account `syncID`, then no Project level `syncID`s are allowed to be created. The Account level sync must finish before any Project level syncs may begin.
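
The `syncID` rules above can be sketched as a small in-memory registry. This is only an illustration, not the server's actual implementation: the class `SyncSessionRegistry` and all of its method names are hypothetical, and a real server would also need to persist or expire sessions. The document does not say whether an Account level sync may begin while Project level syncs are still running, so the sketch leaves that case unrestricted.

```python
import uuid


class SyncSessionRegistry:
    """Hypothetical in-memory model of the syncID rules (not the real server code)."""

    def __init__(self):
        self.account_sync_id = None   # at most one Account level syncID
        self.project_sync_ids = {}    # project name -> its single valid syncID

    def begin_account_sync(self):
        # Only one valid Account level syncID may exist at a time.
        if self.account_sync_id is not None:
            return None
        self.account_sync_id = str(uuid.uuid4())
        return self.account_sync_id

    def end_account_sync(self, sync_id):
        if sync_id is None or self.account_sync_id != sync_id:
            return False
        self.account_sync_id = None
        return True

    def begin_project_sync(self, project):
        # Refused while an Account level sync is running, or while this
        # project already has a valid syncID.
        if self.account_sync_id is not None or project in self.project_sync_ids:
            return None
        sync_id = str(uuid.uuid4())
        self.project_sync_ids[project] = sync_id
        return sync_id

    def end_project_sync(self, project, sync_id):
        if sync_id is None or self.project_sync_ids.get(project) != sync_id:
            return False
        del self.project_sync_ids[project]
        return True
```

In this sketch a refused `begin_*` call returns `None`, so a caller would retry later; different projects can hold `syncID`s at the same time, while a second sync of the same project is refused until the first one ends.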

### Network Protocol (overview)
### Entity Update Sequence
The server will inspect the provided ClientState, and then return a sequence of Entity IDs. Those and only those IDs should be synchronized by the client, and in that order.

This is largely a client driven synchronization process.

```mermaid
sequenceDiagram
@@ -180,8 +154,8 @@ sequenceDiagram
end
rect rgb(11, 0, 74)
loop Transfer Entities
Note right of Client: See breakout section for details
Client->>Server:
Server->>Client:
end
@@ -191,18 +165,143 @@ sequenceDiagram
deactivate Client
activate Server
Note right of Client: ProjectID<br/>SyncId
Server -->> Client: 200 OK (Sync Terminated)
deactivate Server
activate Client
```

## Network Protocol (Entity Transfer)
The Client now attempts to sync each ID provided by the server in the `Entity Update Sequence`, in the order provided.

It will now either upload or download each ID depending on what it infers from the combined Client and Server state that has been transferred so far.

### Download
The client has determined that it needs to download the Server's copy of an Entity. This is either because the client is simply missing the Entity, or it has determined that the server has a newer version and it wants to overwrite the local client copy with the server copy.
```mermaid
sequenceDiagram
participant Client
participant Server
Client->>Server: GET /project/$userId/$projectName/download_entity/$entityId
activate Server
Server -->> Client: 200 OK (Sync Began)
deactivate Server
activate Client
Note left of Server: LoadEntityResponse
```

### Network Protocol (Entity Transfer)
TBD
### Upload
The client has determined that it needs to upload the local Client copy of an Entity. This is either because the server is missing the entity, or the client has a dirty copy that needs to be synchronized.

#### No conflict
In the nominal case, the server will accept the incoming entity, and simply overwrite the Server's own copy with it. The server knows this is safe to do because it compares the hash of the Server's copy with the provided `original hash`. If they match, the Server knows that the Client was editing the same copy which the server will now replace.
```mermaid
sequenceDiagram
participant Client
participant Server
Client->>Server: POST /project/$userId/$projectName/upload_entity/$entityId
activate Server
Note right of Client: X-Entity-Hash = {original hash} <br /> ApiProjectEntity
Server -->> Client: 200 OK
deactivate Server
activate Client
Note left of Server: SaveEntityResponse
```

### Client Overview
TBD
#### Conflict detected
In the case where the hash of the Server's copy and the Client's `original hash` do not match, there is a conflict.

The server infers from this that the client was editing a different version of the Entity than what the server now has. This is probably because a different client uploaded an independent edit of the Entity.

The server will respond with its copy of the Entity and require the Client to resolve the conflict by resubmitting the upload with `force=true` set.
```mermaid
sequenceDiagram
participant Client
participant Server
Client->>Server: POST /project/$userId/$projectName/upload_entity/$entityId
activate Server
Note right of Client: X-Entity-Hash = {original hash} <br /> ApiProjectEntity
Server -->> Client: 409 Conflict
deactivate Server
activate Client
Note left of Server: ApiProjectEntity
Note right of Client: {client now helps the user resolve the conflict}
Client->>Server: POST /project/$userId/$projectName/upload_entity/$entityId?force=true
deactivate Client
activate Server
Note right of Client: X-Entity-Hash = {original hash} <br /> ApiProjectEntity {resolved entity}
Server -->> Client: 200 OK
deactivate Server
activate Client
Note left of Server: SaveEntityResponse
```
Note that the resolved `ApiProjectEntity` in the `force` request does not have to be exclusively the Client's or the Server's; it can be a merge of the two that the client helped the user create.
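
The upload decision described in this section can be sketched as a single function. This is a minimal illustration under stated assumptions, not the server's real code: `handle_upload`, `entity_hash`, and the dict-based `store` are hypothetical names, SHA-256 stands in for whatever hash the real system uses, and entity content is reduced to a plain string.

```python
import hashlib


def entity_hash(content: str) -> str:
    # Assumption: SHA-256 is a stand-in for the real hashing scheme.
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def handle_upload(store, entity_id, original_hash, incoming, force=False):
    """Return (status, body) for an upload of `incoming` as `entity_id`.

    `store` is a hypothetical dict mapping entity ID -> content.
    """
    server_copy = store.get(entity_id)

    # Server is missing the entity: nothing can be overwritten, so accept it.
    if server_copy is None:
        store[entity_id] = incoming
        return 200, None

    # Hashes match (client edited the copy the server still holds), or the
    # client is resubmitting a resolved entity with force=true.
    if force or entity_hash(server_copy) == original_hash:
        store[entity_id] = incoming
        return 200, None

    # Conflict: keep the server copy and send it back for resolution.
    return 409, server_copy
```

For example, with `store = {1: "v1"}`, an upload carrying the hash of `"v1"` is accepted, while a later upload that still carries that stale hash gets a 409 along with the server's current copy, until it is resubmitted with `force=True`.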

## Client Operations Sequence
Beyond the network side of the Protocol, the Client is doing a bit of work to ensure data loss is not possible, and to work out what should be done with the minimal bookkeeping data it has.

```mermaid
flowchart TD
A[PrepareForSync] --> B[FetchLocalData]
B --> C[FetchServerData]
C --> D[CollateIds]
D --> E[Backup]
E --> F[IdConflictResolution]
F --> G[EntityDelete]
G --> H[EntityTransfer]
H --> I[FinalizeSync]
```

## Terminology

**Entity**
Any individual block of data. Each entity is given a unique ID. Examples include:

- Scene
- Scene Draft
- Timeline Event
- Encyclopedia Entry
- Note

**Entity ID**
Every Entity is given an Entity ID, which is a unique, monotonically incrementing
integer, with the first valid ID being 1.

**Sync ID**
This is a UUID generated by the server and passed back to the client identifying a
particular syncing session to a particular client. The server will only allow one syncing session per
account at a time to prevent race conditions.

**Entity Update Sequence**
A list of Entity IDs in a particular order determined by the server.
The client will update these IDs in the provided order. The server will leave out IDs of Entities
that do not need synchronization.

**Re-ID**
The process of taking a client side Entity and issuing it a new ID, changing any
references to that ID in the process.

**Conflicts**
When the same Entity has been edited in different ways on different devices, the user must be allowed to resolve the conflict in order to bring the devices back into sync with each other.

**Dirty Entity** When a client edits a local Entity, the client first hashes the existing,
pre-edit content, and saves the **Entity ID** and this pre-edit hash to a "dirty
list". If the client and server are in sync at the time of this edit, then the saved hash in the
dirty list will match the hash of the server's copy of the Entity.
At syncing time this allows us to detect conflicts. If another client edits the same Entity and
syncs with the server first, our local "dirty list" hash will not match the hash of the server side
copy, and we'll know we have a conflict that needs resolving.
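
The dirty list can be sketched on the client side as follows. This is an illustration only: `ClientStore`, `content_hash`, and `has_conflict` are hypothetical names, SHA-256 is an assumed stand-in for the real hash, and the choice to record the hash only on the first edit since the last sync is an assumption made so that the saved hash keeps describing the last synchronized copy.

```python
import hashlib


def content_hash(content: str) -> str:
    # Assumption: SHA-256 is a stand-in for the real hashing scheme.
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


class ClientStore:
    """Hypothetical client-side entity store with a dirty list."""

    def __init__(self):
        self.entities = {}  # entity ID -> current local content
        self.dirty = {}     # entity ID -> hash of the pre-edit content

    def edit(self, entity_id, new_content):
        # Hash the existing content before overwriting it. Assumption: only
        # the first edit since the last sync records a hash, so later edits
        # do not replace it with a hash of unsynced local content.
        if entity_id in self.entities and entity_id not in self.dirty:
            self.dirty[entity_id] = content_hash(self.entities[entity_id])
        self.entities[entity_id] = new_content

    def has_conflict(self, entity_id, server_hash):
        # A dirty entity conflicts when the server's copy no longer matches
        # the copy this client started editing from.
        return entity_id in self.dirty and self.dirty[entity_id] != server_hash
```

If the server's hash still equals the saved pre-edit hash, the local edit can be uploaded safely; if another client synced a different edit first, the hashes differ and the conflict is detected.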


---

### Explained (_old explanation, possibly out of date_)
