Documenting the syncing protocol

Wavesonics · Nov 11, 2024 · 0e6db82
1 parent 017ad70
Showing 1 changed file with 142 additions and 43 deletions.
diff --git a/docs/SYNCING-PROTOCOL.md b/docs/SYNCING-PROTOCOL.md
@@ -102,55 +102,29 @@ sequenceDiagram
 
 The goal of this protocol is to synchronize the various `Entities` on the client to the server.
 
-There is no file history as in a true version control system such as `git`. This is instead a simpler synchronization system, yet still smart enough to detect `conflicts`, and prevent edits on various devices from overwriting each other on accident.
+There is no file history as in a true version control system such as `git`. This is instead a simpler synchronization system, yet still smart enough to detect `conflicts` and prevent edits on multiple devices from overwriting each other by accident.
 
 As such, there is very little bookkeeping data, and none of it is actually required. When all actors are fully synchronized, they all contain the full set of data. Thus if the server were to die and lose all of its data, it wouldn't matter. Every client would contain everything necessary to
 set up a new server.
 
-## Terminology
-
-**Entity**
-any individual block of data. Each entity is given a unique ID. Examples include:
-
-- Scene
-- Scene Draft
-- Timeline Event
-- Encyclopedia Entry
-- Note
+Furthermore, the protocol is fully fault tolerant. It may fail at any step along the way, and the state of the client and server will remain entirely valid, although not necessarily fully synchronized.
 
-**Entity ID**
-Every Entity is given an Entity ID, which is a unique, monotonically incrementing
-integer, with the first valid ID being 1
+## Network Protocol (overview)
 
-**Sync ID**
-This is a UUID generated by the server and passed back to the client identifying a
-particular syncing session to a particular client. The server will only allow one syncing session per
-account at a time to prevent race conditions.
+This is largely a client-driven synchronization process.
 
-**Entity Update Sequence**
-A list of Entity IDs in a particular order determined by the server.
-The client will update these IDs in the provided order. The server will leave out IDs of Entities
-that do not need synchronization.
+### SyncIDs
+The client calls `begin_sync` to get a valid `syncID`. This `syncID` is provided to all subsequent calls, and is terminated with a call to `end_sync`.
 
-**Re-ID**
-The process of taking a client side Entity and issuing it a new ID, changing any
-references to that ID in the process.
+There can be only one valid `syncID` per project at any given time. This prevents race conditions with two clients syncing the same project at the same time.
 
-**Conflicts**
-The same file that has been edited in different ways on different devices, must allow the user to resolve the conflict in order to bring them back into sync with each other.
+You may, however, hold `syncID`s for multiple different projects simultaneously.
 
-**Dirty Entity** When a client edits a local Entity, the client first hashes the existing,
-pre-edited content, and saves off the **Entity ID** and this pre-edit hash of the data to a "dirty
-list". If the client and server are in sync at the time of this edit, then the saved hash in the
-dirty list will match the hash of the server's copy of the Entity.
-At syncing time this allows us to detect conflicts. If another client edits the same entity, and
-syncs with the server first.
-Thus our local "dirty list" hash will not match the hash of the server side copy, and we'll know we
-have a conflict that needs resolving.
+These are Project level `syncID`s. Account syncing uses separate Account level `syncID`s. There may only be one valid Account level `syncID` at a time, and while there is a valid Account `syncID`, no Project level `syncID`s are allowed to be created. The Account level sync must finish before any Project level syncs may begin.
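
The `syncID` rules above can be sketched as a small in-memory registry. This is only an illustration under stated assumptions, not the server's actual implementation: the class `SyncSessions` and all of its method names are hypothetical, and only the rules themselves come from the text. The text does not say whether an Account level sync may begin while Project level syncs are still running; this sketch allows it.

```python
import uuid


class SyncSessions:
    """Hypothetical in-memory registry of active syncIDs (illustration only)."""

    def __init__(self):
        self.account_sync_id = None   # at most one Account level syncID
        self.project_sync_ids = {}    # project name -> its single Project level syncID

    def begin_project_sync(self, project):
        # No Project level sync may start while an Account level sync is active.
        if self.account_sync_id is not None:
            return None
        # Only one valid syncID per project at any given time.
        if project in self.project_sync_ids:
            return None
        sync_id = str(uuid.uuid4())
        self.project_sync_ids[project] = sync_id
        return sync_id

    def end_project_sync(self, project, sync_id):
        # Only the holder of the current syncID can terminate the session.
        if self.project_sync_ids.get(project) != sync_id:
            return False
        del self.project_sync_ids[project]
        return True

    def begin_account_sync(self):
        # Assumption: Account syncs are not blocked by running Project syncs.
        if self.account_sync_id is not None:
            return None
        self.account_sync_id = str(uuid.uuid4())
        return self.account_sync_id

    def end_account_sync(self, sync_id):
        if self.account_sync_id != sync_id:
            return False
        self.account_sync_id = None
        return True
```

A second `begin_project_sync` for the same project returns `None` until the first session is ended, while a different project can obtain its own `syncID` at the same time.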
 
-### Network Protocol (overview)
+### Entity Update Sequence
+The server will inspect the provided ClientState, and then return a sequence of Entity IDs. Those and only those IDs should be synchronized by the client, and in that order.
 
-This is largely a client driven synchronization process.
 
 ```mermaid
 sequenceDiagram
@@ -180,8 +154,8 @@ sequenceDiagram
     end
 
     rect rgb(11, 0, 74)
-        loop Entity Transfer
-        Note right of Client: See breakout sectioin for details
+        loop Transfer Entities
+        Note right of Client: See breakout section for details
             Client->>Server: 
             Server->>Client: 
         end
@@ -191,18 +165,143 @@ sequenceDiagram
 	deactivate Client
 	activate Server
 	Note right of Client: ProjectID<br/>SyncId
+	Server -->> Client: 200 OK (Sync Terminated)
+	deactivate Server
+	activate Client
+```
+
+## Network Protocol (Entity Transfer)
+The Client now attempts to sync each ID provided by the server in the `Entity Update Sequence`, in the order provided.
+
+It will now either upload or download each ID depending on what it infers from the combined Client and Server state that has been transferred so far.
+
+### Download
+The client has determined that it needs to download the Server's copy of an Entity. This is either because the client is simply missing the Entity, or it has determined that the server has a newer version and it wants to overwrite the local client copy with the server copy.
+```mermaid
+sequenceDiagram
+    participant Client
+    participant Server
+
+    Client->>Server: GET /project/$userId/$projectName/download_entity/$entityId
+	activate Server
+
 	Server -->> Client: 200 OK (Sync Began)
 	deactivate Server
 	activate Client
+	Note left of Server: LoadEntityResponse
 ```
 
-### Network Protocol (Entity Transfer)
-TBD
+### Upload
+The client has determined that it needs to upload the local Client copy of an Entity. This is either because the server is missing the entity, or the client has a dirty copy that needs to be synchronized.
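
The reasons given for downloading and uploading can be sketched as a single decision function. This is a hedged illustration, not the client's actual logic: the function `choose_transfer`, its parameters, and the use of content hashes to notice a newer server copy are all assumptions made for the example.

```python
def choose_transfer(entity_id, local_hashes, server_hashes, dirty_ids):
    """Return "upload", "download", or None for one Entity ID.

    local_hashes and server_hashes map Entity ID -> content hash, and
    dirty_ids is the set of locally edited IDs. All three are assumed inputs.
    """
    on_client = entity_id in local_hashes
    on_server = entity_id in server_hashes

    if not on_client and not on_server:
        return None
    if not on_client:
        return "download"   # the client is simply missing the Entity
    if not on_server:
        return "upload"     # the server is missing the Entity
    if entity_id in dirty_ids:
        return "upload"     # a dirty local copy needs to be synchronized
    if local_hashes[entity_id] != server_hashes[entity_id]:
        # Assumption: a clean local copy that differs means the server is newer.
        return "download"
    return None             # both sides already hold the same content
```

Note that a dirty Entity is always uploaded here, even if the server copy has also changed; detecting that case is left to the server's hash check on upload.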
+
+#### No conflict
+In the nominal case, the server will accept the incoming entity, and simply overwrite the Server's own copy with it. The server knows this is safe to do because it compares the hash of the Server's copy with the provided `original hash`. If they match, the Server knows that the Client was editing the same copy that the server will now replace.
 ```mermaid
+sequenceDiagram
+    participant Client
+    participant Server
+
+    Client->>Server: POST /project/$userId/$projectName/upload_entity/$entityId
+	activate Server
+	Note right of Client: X-Entity-Hash = {original hash} <br /> ApiProjectEntity
+
+	Server -->> Client: 200 OK
+	deactivate Server
+	activate Client
+	Note left of Server: SaveEntityResponse
 ```
 
-### Client Overview
-TBD
+#### Conflict detected
+In the case where the hash of the Server's copy and the Client's `original hash` do not match, there is a conflict.
+
+The server infers from this that the client was editing a different version of the Entity than what the server now has. This is probably because a different client uploaded an independent edit of the Entity.
+
+The server will respond with its copy of the Entity and require the Client to resolve the conflict by resubmitting the upload with `force=true` set.
+```mermaid
+sequenceDiagram
+    participant Client
+    participant Server
+
+    Client->>Server: POST /project/$userId/$projectName/upload_entity/$entityId
+	activate Server
+	Note right of Client: X-Entity-Hash = {original hash} <br /> ApiProjectEntity
+
+	Server -->> Client: 409 Conflict
+	deactivate Server
+	activate Client
+	Note left of Server: ApiProjectEntity
+
+	Note right of Client: {client now helps the user resolve the conflict}
+	Client->>Server: POST /project/$userId/$projectName/upload_entity/$entityId?force=true
+	deactivate Client
+	activate Server
+	Note right of Client: X-Entity-Hash = {original hash} <br /> ApiProjectEntity {resolved entity}
+
+	Server -->> Client: 200 OK
+	deactivate Server
+	activate Client
+	Note left of Server: SaveEntityResponse
+```
+Note that the resolved `ApiProjectEntity` in the `force` request does not have to be exclusively the Client's or the Server's; it can be a merge of the two that the client helped the user create.
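
The upload rules (accept on matching hash, `409 Conflict` on mismatch, unconditional accept with `force=true`) can be sketched server side as follows. This is a minimal sketch under assumptions, not the real endpoint: `upload_entity`, `content_hash`, and the dict-based `store` are hypothetical, SHA-256 is an assumed hash function, and the response bodies are simplified stand-ins for `SaveEntityResponse` and `ApiProjectEntity`.

```python
import hashlib


def content_hash(content):
    # SHA-256 is an assumption; the text does not name the hash function.
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def upload_entity(store, entity_id, content, original_hash, force=False):
    """Return (status_code, body) for one upload; store maps Entity ID -> content."""
    server_copy = store.get(entity_id)
    if server_copy is not None and not force:
        if content_hash(server_copy) != original_hash:
            # Conflict: reply with the Server's copy so the Client can resolve it.
            return 409, server_copy
    # No conflict, a missing Entity, or a forced upload: overwrite the server copy.
    store[entity_id] = content
    return 200, None
```

A client that receives `409` would merge its edit with the returned copy (with the user's help) and call again with `force=True`, which skips the hash comparison.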
+
+## Client Operations Sequence 
+Beyond the network side of the Protocol, the Client does a bit of work to ensure data loss is not possible, and to work out what should be done with the minimal bookkeeping data it has.
+
+```mermaid
+flowchart TD
+    A[PrepareForSync] --> B[FetchLocalData]
+    B --> C[FetchServerData]
+    C --> D[CollateIds]
+    D --> E[Backup]
+    E --> F[IdConflictResolution]
+    F --> G[EntityDelete]
+    G --> H[EntityTransfer]
+    H --> I[FinalizeSync]
+```
+
+## Terminology
+
+**Entity**
+Any individual block of data. Each entity is given a unique ID. Examples include:
+
+- Scene
+- Scene Draft
+- Timeline Event
+- Encyclopedia Entry
+- Note
+
+**Entity ID**
+Every Entity is given an Entity ID, which is a unique, monotonically incrementing
+integer, with the first valid ID being 1.
+
+**Sync ID**
+This is a UUID generated by the server and passed back to the client identifying a
+particular syncing session to a particular client. The server will only allow one syncing session per
+account at a time to prevent race conditions.
+
+**Entity Update Sequence**
+A list of Entity IDs in a particular order determined by the server.
+The client will update these IDs in the provided order. The server will leave out IDs of Entities
+that do not need synchronization.
+
+**Re-ID**
+The process of taking a client side Entity and issuing it a new ID, changing any
+references to that ID in the process.
+
+**Conflicts**
+When the same file has been edited in different ways on different devices, the user must be allowed to resolve the conflict in order to bring the devices back into sync with each other.
+
+**Dirty Entity** When a client edits a local Entity, the client first hashes the existing,
+pre-edited content, and saves off the **Entity ID** and this pre-edit hash of the data to a "dirty
+list". If the client and server are in sync at the time of this edit, then the saved hash in the
+dirty list will match the hash of the server's copy of the Entity.
+At syncing time this allows us to detect conflicts. If another client edits the same entity and
+syncs with the server first, then our local "dirty list" hash will not match the hash of the
+server side copy, and we'll know we have a conflict that needs resolving.
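
The dirty list bookkeeping described above can be sketched on the client side as follows. This is an illustration only: `LocalStore`, its methods, and the choice of SHA-256 are hypothetical, and recording the hash only on the first edit since the last sync is an assumption that keeps the saved hash describing the copy the server is expected to hold.

```python
import hashlib


def content_hash(content):
    # SHA-256 is an assumption; the text does not name the hash function.
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


class LocalStore:
    """Hypothetical client side store holding entities and the dirty list."""

    def __init__(self, entities):
        self.entities = dict(entities)  # Entity ID -> content
        self.dirty = {}                 # Entity ID -> hash of the pre-edit content

    def edit(self, entity_id, new_content):
        # Hash the existing, pre-edit content before overwriting it.
        # Assumption: later edits do not replace the first recorded hash.
        if entity_id not in self.dirty:
            self.dirty[entity_id] = content_hash(self.entities[entity_id])
        self.entities[entity_id] = new_content

    def has_conflict(self, entity_id, server_content):
        # A conflict exists when the server copy no longer matches the
        # content this client started editing from.
        original = self.dirty.get(entity_id)
        return original is not None and original != content_hash(server_content)
```

If the server still holds the pre-edit content, `has_conflict` returns `False` and the upload can proceed; if another client synced a different edit first, it returns `True`.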
+
+
+---
 
 ### Explained (_old explanation, possibly out of date_)