Raw/serving community history mvp v2 #164
base: master
## Supported platforms | ||
|
||
Creating, distributing, downloading and unpacking community history archives SHOULD only be supported in desktop clients. Mobile clients SHOULD NOT implement this functionality due to bandwidth and storage limitations. | ||
Creating, distributing, downloading and unpacking community history archives SHOULD only be supported in Status desktop clients. Status mobile clients SHOULD NOT implement this functionality due to bandwidth and storage limitations. |
I would change this to: Status mobile clients are out of scope for the MVP of this service. However, we may consider bringing this service to Status Mobile in the future.
Thinking behind this comment: in many countries mobile bandwidth is plentiful, very fast, and very cheap. For example, in Ukraine unlimited mobile data costs about $5 per month, and the speed is normally between 50 and 100 megabits per second (faster than wired ADSL in London!). In the US mobile data is both expensive and slow, but for folks in countries with fast and cheap mobile data, mobile data usage isn't a worry.
Re. storage: 64 GB seems to be becoming the storage baseline these days, with the latest top-of-the-line phones now offering 1 TB of storage(!!!). So storage will probably not be a limitation on mobile in the future.
Happy to change that!
2. Community owner enables community history archive support | ||
3. A special type of channel for distributing magnet links ([Magnet URI scheme](https://en.wikipedia.org/wiki/Magnet_URI_scheme), [Extensions for Peers to Send Metadata Files](https://www.bittorrent.org/beps/bep_0009.html)) is created | ||
1. Community owner creates a Status community (previously known as [org channels](https://github.com/status-im/specs/pull/151)) | ||
2. Community owner enables community history archive support (possibly on by default but can be turned off as well - see [UI feature spec](https://github.com/status-im/feature-specs/pull/36)) |
Perhaps change to "Community owner doesn't disable community history archive support"
Remove word "possibly", target is on by default
Sounds good!
3. A special type of channel for distributing magnet links ([Magnet URI scheme](https://en.wikipedia.org/wiki/Magnet_URI_scheme), [Extensions for Peers to Send Metadata Files](https://www.bittorrent.org/beps/bep_0009.html)) is created | ||
1. Community owner creates a Status community (previously known as [org channels](https://github.com/status-im/specs/pull/151)) | ||
2. Community owner enables community history archive support (possibly on by default but can be turned off as well - see [UI feature spec](https://github.com/status-im/feature-specs/pull/36)) | ||
3. A special type of channel to exchange metadata about the archival data is created |
Add: Note this channel is not visible in the UI.
Can add this. The reason I didn't mention anything about it being hidden is that this is a spec for the protocol, which should be UI agnostic.
|
||
## Storing live messages | ||
|
||
Community owner nodes MUST store live messages as [14/WAKU2-MESSAGE](https://rfc.vac.dev/spec/14/). This is required to provide confidentiality, authenticity, and integrity of message data distributed via the BitTorrent layer, and later validated by Status nodes when they unpack message history archives. | ||
|
||
Community owner nodes SHOULD remove those messages from their local databases after they have been turned into archives and distributed to the BitTorrent network. | ||
Community owner nodes SHOULD remove those messages from their local databases once they are older than 30 days and after they have been turned into message archives and distributed to the BitTorrent network. |
Are you sure this is correct? If a community owner node removes all messages older than 30 days from its local database, how will the community owner be able to access messages older than 30 days using the Status desktop UI?
Ah, we had this same discussion in the other PR :D
See: #162 (comment)
This is about removing the Waku messages, not the application messages.
ahh yes, thanks for reminding me! Ignore this comment ;-)
|
||
Community owner nodes SHOULD store these files in a dedicated folder that is identifiable via the community id. | ||
|
||
Community owner nodes MAY remove older torrent files that were generated for previous message history archives, after a new torrent was created. |
Shouldn't the removing of the previous (old) torrent file be automatic after the new torrent file is created? Is there any reason not to do this?
That's a good point. The new/latest torrent file will replace the current one, making this paragraph obsolete.
The community owner node MUST send magnet links containing message archives and the message archive index to a special community channel. The topic of that special channel follows the following format:

```
/{application-name}/{version-of-the-application}/{content-topic-name}/{encoding}
```
Still need to update this. What `content-topic-name` should we use here? @staheri14 thoughts?
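To make the topic format above concrete, here is a minimal sketch of assembling such a topic string. The helper `build_content_topic` and the example values (`status`, `1`, `archive-metadata`, `proto`) are hypothetical placeholders for illustration only; the actual `content-topic-name` is still undecided in this thread.

```python
def build_content_topic(application_name, version, content_topic_name, encoding):
    """Join the four segments into /{app}/{version}/{name}/{encoding}.

    Hypothetical helper: rejects empty segments and segments containing '/',
    since either would produce an ambiguous topic string.
    """
    parts = [application_name, str(version), content_topic_name, encoding]
    for part in parts:
        if not part or "/" in part:
            raise ValueError(f"invalid content topic segment: {part!r}")
    return "/" + "/".join(parts)


if __name__ == "__main__":
    # Placeholder values, not taken from the spec.
    print(build_content_topic("status", 1, "archive-metadata", "proto"))
```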
Thank you Pascal for the spec, overall looks good to me!
I have left some comments and suggestions. Did not get to review the second half of the spec, will do it during the week and leave further comments.
|
||
This specification has the following assumptions: | ||
|
||
- Store nodes are available 24/7, ensuring constant live message availability |
You may add another item "The storage time range limit is 30 days."
|
||
### Serving community history archives | ||
|
||
Community owner nodes go through the following (high level) process to provide community members with message histories (assumes community owner node is available 24/7): |
This part, `(assumes community owner node is available 24/7)`, can be included in the list of assumptions above.
Community owner nodes go through the following (high level) process to provide community members with message histories (assumes community owner node is available 24/7): | ||
|
||
1. Community owner creates a Status community (previously known as [org channels](https://github.com/status-im/specs/pull/151)) | ||
2. Community owner doesn't disable community history archive support (on by default but can be turned off as well - see [UI feature spec](https://github.com/status-im/feature-specs/pull/36)) |
Do these two terms mean the same thing? If yes, then I suggest using a consistent term across the spec: "community history archive support" and "message archive capabilities", which is used in the Terminology table.
Community owner nodes go through the following (high level) process to provide community members with message histories (assumes community owner node is available 24/7): | ||
|
||
1. Community owner creates a Status community (previously known as [org channels](https://github.com/status-im/specs/pull/151)) | ||
2. Community owner doesn't disable community history archive support (on by default but can be turned off as well - see [UI feature spec](https://github.com/status-im/feature-specs/pull/36)) |
2. Community owner doesn't disable community history archive support (on by default but can be turned off as well - see [UI feature spec](https://github.com/status-im/feature-specs/pull/36)) | |
2. Community owner enables community history archive support (on by default but can be turned off as well - see [UI feature spec](https://github.com/status-im/feature-specs/pull/36)) |
1. Community owner creates a Status community (previously known as [org channels](https://github.com/status-im/specs/pull/151)) | ||
2. Community owner doesn't disable community history archive support (on by default but can be turned off as well - see [UI feature spec](https://github.com/status-im/feature-specs/pull/36)) | ||
3. A special type of channel to exchange metadata about the archival data is created, this channel should not be visible in the user interface | ||
4. Community owner invites members and creates additional channels |
4. Community owner invites members and creates additional channels | |
4. Community owner invites community members and creates additional channels visible in the user interface |
Of course, a community owner doesn't have to create any additional channels; they could have a community that only contains a single channel. A newly created community always contains a single channel by default.
The `timestamp` is determined by the context in which the community owner node attempts to create a message history archives as described below: | ||
|
||
1. The community owner node attempts to create an archive periodically for the past seven days (including the current day). In this case, the `timestamp` has to lie within those 7 days. | ||
2. The community owner node has been offline (owner node's main process has stopped and needs restart) and attempts to create archives for all the live messages it has missed since it went offline. In this case, the `timestamp` has to lie within the day the latest message was received and the current day. |
Shouldn't the whole missed time range be broken into 7-day intervals? I think item 1 already covers it and there is no need for the second item.
I'd suggest adding the explanation of item 2 (how to measure the missed time range) under the second time of this section https://github.com/status-im/specs/blob/c9184b74b5623fe14aa4bd99ed8b268a65984117/docs/raw/serving-community-history.md#serving-archives-for-missed-messages
Okay, will update this. I thought it would be good to be explicit about the scenario where the node has been offline, in which case there can be up to 30 days' worth of messages from which the node needs to create archives (4, one for each 7 days).
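The idea of breaking a missed time range into 7-day intervals can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_into_archive_ranges` is hypothetical, and the sketch assumes that only complete 7-day intervals are archived, with any remainder left for the next periodic run. With a 30-day gap this yields four ranges, matching the "4, one for each 7 days" mentioned above.

```python
from datetime import datetime, timedelta, timezone

ARCHIVE_INTERVAL = timedelta(days=7)  # archive window used in the spec draft


def split_into_archive_ranges(last_archive_end, now, interval=ARCHIVE_INTERVAL):
    """Return consecutive (from, to) ranges of exactly `interval` length.

    Assumption: a trailing partial interval is not archived yet; it is
    picked up once a full interval has elapsed.
    """
    ranges = []
    start = last_archive_end
    while start + interval <= now:
        ranges.append((start, start + interval))
        start += interval
    return ranges


if __name__ == "__main__":
    went_offline = datetime(2022, 1, 1, tzinfo=timezone.utc)
    back_online = went_offline + timedelta(days=30)
    for range_from, range_to in split_into_archive_ranges(went_offline, back_online):
        print(range_from.date(), "to", range_to.date())
```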
|
||
## Exporting messages for bundling | ||
|
||
Community owner nodes export messages from their local database for creating and bundling history archives using the following criteria: |
Community owner nodes export messages from their local database for creating and bundling history archives using the following criteria: | |
Community owner nodes export Waku messages from their local database for creating and bundling history archives using the following criteria: |
|
||
Community owner nodes export messages from their local database for creating and bundling history archives using the following criteria: | ||
|
||
- Messages to be exported MUST have a `contentTopic` that match any of the topics of the community channels |
- Messages to be exported MUST have a `contentTopic` that match any of the topics of the community channels | |
- Waku messages to be exported MUST have a `contentTopic` that match any of the topics of the community channels |
Community owner nodes export messages from their local database for creating and bundling history archives using the following criteria: | ||
|
||
- Messages to be exported MUST have a `contentTopic` that match any of the topics of the community channels | ||
- Messages to be exported MUST have a `timestamp` that lies within a time range of 7 days |
- Messages to be exported MUST have a `timestamp` that lies within a time range of 7 days | |
- Waku messages to be exported MUST have a `timestamp` that lies within a time range of 7 days |
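The two export criteria can be sketched as a simple filter. The `StoredWakuMessage` class and `select_messages_for_archive` function are hypothetical stand-ins, not part of the spec; the sketch also assumes timestamps are Unix seconds and that the 7-day range is half-open (`from` inclusive, `to` exclusive), which the spec does not state.

```python
from dataclasses import dataclass

SEVEN_DAYS_IN_SECONDS = 7 * 24 * 60 * 60


@dataclass(frozen=True)
class StoredWakuMessage:
    """Hypothetical stand-in for a Waku message row in the local database."""
    content_topic: str
    timestamp: float  # assumed to be Unix time in seconds
    payload: bytes


def select_messages_for_archive(messages, community_topics, range_from):
    """Keep messages on a community channel topic within [range_from, range_from + 7 days)."""
    range_to = range_from + SEVEN_DAYS_IN_SECONDS
    topics = set(community_topics)
    selected = [
        m for m in messages
        if m.content_topic in topics and range_from <= m.timestamp < range_to
    ]
    # Sorting by timestamp keeps the export order deterministic.
    return sorted(selected, key=lambda m: m.timestamp)
```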
|
||
## Storing live messages | ||
|
||
Community owner nodes MUST store live messages as [14/WAKU2-MESSAGE](https://rfc.vac.dev/spec/14/). This is required to provide confidentiality, authenticity, and integrity of message data distributed via the BitTorrent layer, and later validated by Status nodes when they unpack message history archives. |
Community owner nodes MUST store live messages as [14/WAKU2-MESSAGE](https://rfc.vac.dev/spec/14/). This is required to provide confidentiality, authenticity, and integrity of message data distributed via the BitTorrent layer, and later validated by Status nodes when they unpack message history archives. | |
For the archival data serving, community owner nodes MUST store live messages as [14/WAKU2-MESSAGE](https://rfc.vac.dev/spec/14/). This is in addition to their database of application messages. This is required to provide confidentiality, authenticity, and integrity of message data distributed via the BitTorrent layer, and later validated by Status nodes when they unpack message history archives. |
Added more suggestions and questions
### WakuMessageHistoryArchive | ||
|
||
The `from` field SHOULD contain a timestamp of the time range's lower bound. | ||
|
Its type parallels the `timestamp` field of [`WakuMessage`](https://rfc.vac.dev/spec/14/#payloads). | |
Can you elaborate on what it means for it to "parallel" the field?
It has the same type as the `timestamp` field of the Waku message.
|
||
The `messages` field MUST contain all messages that belong into the archive given its `from`, `to` and `contentTopic` fields. | ||
|
||
The `padding` field MUST contain the amount of zero bytes needed so that the overall byte size of the protobuf encoded `WakuMessageArchive` is a multiple of the `pieceLength` used to divide the message archive data into pieces, as explained in [creating message archive torrents](#creating-message-archive-torrents). |
"pieces" is unclear at this point. I'd suggest something like the following:
The `padding` field MUST contain the amount of zero bytes needed so that the overall byte size of the protobuf encoded `WakuMessageArchive` is a multiple of the `pieceLength`. This is needed for seamless encoding and decoding of archival data in interaction with BitTorrent as explained in [creating message archive torrents](#creating-message-archive-torrents).
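To make the padding rule concrete, here is a minimal sketch of the arithmetic. The function name `padding_length` is hypothetical, and the sketch works on raw byte counts only: in an actual protobuf encoding the `padding` field itself adds tag and length-prefix bytes, so an implementation would need to account for that overhead (for example by re-encoding and re-checking the size).

```python
def padding_length(encoded_size, piece_length):
    """Number of zero bytes needed to round encoded_size up to a multiple of piece_length."""
    if piece_length <= 0:
        raise ValueError("piece_length must be positive")
    if encoded_size < 0:
        raise ValueError("encoded_size must not be negative")
    # Python's modulo of a negative number is non-negative, which gives
    # exactly the distance to the next multiple (or 0 if already aligned).
    return (-encoded_size) % piece_length
```

For instance, with a piece length of 10 bytes, an archive of 25 bytes would need 5 bytes of padding, while an archive of 30 bytes would need none.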
|
||
## Message history archive index | ||
|
||
Community owner nodes MUST provide message archives for the entire community history. Each individual archive only contains a subset of the complete history, that is, data for a time range of seven days, and all message history archives are concatenated into a single file as byte string (see [Ensuring reproducible data pieces](#ensuring-reproducible-data-pieces)). |
Community owner nodes MUST provide message archives for the entire community history. Each individual archive only contains a subset of the complete history, that is, data for a time range of seven days, and all message history archives are concatenated into a single file as byte string (see [Ensuring reproducible data pieces](#ensuring-reproducible-data-pieces)).
Community owner nodes MUST provide message archives for the entire community history. The entire history consists of a set of `WakuMessageArchive`s where each archive contains a subset of historical `WakuMessage`s for a time range of seven days. All the `WakuMessageArchive`s are concatenated into a single file as a byte string (see [Ensuring reproducible data pieces](#ensuring-reproducible-data-pieces)).
Community owner nodes MUST provide message archives for the entire community history. Each individual archive only contains a subset of the complete history, that is, data for a time range of seven days, and all message history archives are concatenated into a single file as byte string (see [Ensuring reproducible data pieces](#ensuring-reproducible-data-pieces)). | |
Community owner nodes MUST provide message archives for the entire community history. Each individual archive only contains a subset of the complete history, that is, data for a time range of seven days, and all message history archives are concatenated into a single file as byte string (see [Ensuring reproducible data pieces](#ensuring-reproducible-data-pieces)). |
- `data` - Contains all protobuf encoded message history archives concatenated in ascending order | ||
- `index` - Contains the protobuf encoded message history archive index | ||
|
||
Community owner nodes SHOULD store these files in a dedicated folder that is identifiable via the community id. |
Community id is not defined
You mean it should be added to the terminology section?
Yes, indeed
"dedicated folder that is identifiable via the community id."
Can you please elaborate on this? Should the folder have the same name as the community id?
|
||
A torrent's source folder MUST contain the following two files: | ||
|
||
- `data` - Contains all protobuf encoded message history archives concatenated in ascending order |
Maybe this
- `data` - Contains all protobuf encoded message history archives concatenated in ascending order | |
- `data` - Contains all protobuf encoded `WakuMessageArchive`s (as bit strings) concatenated in ascending order based on their time |
A torrent's source folder MUST contain the following two files: | ||
|
||
- `data` - Contains all protobuf encoded message history archives concatenated in ascending order | ||
- `index` - Contains the protobuf encoded message history archive index |
- `index` - Contains the protobuf encoded message history archive index | |
- `index` - Contains the protobuf encoded `WakuMessageArchiveIndex`. |
|
||
The community owner node MUST ensure that the byte string resulting from the protobuf encoded `data` is equal to the byte string `data` from the previously generated message archive torrent, plus the data of the latest 7 days worth of messages encoded as `WakuMessageArchive`. Therefore, the size of `data` grows every seven days as it's append only. | ||
|
||
The community owner nodes also MUST ensure that the byte size of every individual `WakuMessageArchive` encoded protobuf is a multiple of `pieceLength: ???` (**TODO**) using the `padding` field. If the last piece of a message archive has fewer bytes than `pieceLength`, it MUST be filled with zero bytes until it has the size `pieceLength`. |
As `piece` is not properly introduced yet, I suggest the following edits:
The community owner nodes also MUST ensure that the byte size of every individual `WakuMessageArchive` encoded protobuf is a multiple of `pieceLength: ???` (**TODO**) using the `padding` field. If the last piece of a message archive has fewer bytes than `pieceLength`, it MUST be filled with zero bytes until it has the size `pieceLength`.
The community owner nodes also MUST ensure that the byte size of every individual `WakuMessageArchive` encoded protobuf is a multiple of `pieceLength: ???` (**TODO**) using the `padding` field. If the protobuf encoded `WakuMessageArchive` is not a multiple of `pieceLength`, its `padding` field MUST be filled with zero bytes and the `WakuMessageArchive` MUST be re-encoded until its size becomes a multiple of `pieceLength`.
|
||
The community owner nodes also MUST ensure that the byte size of every individual `WakuMessageArchive` encoded protobuf is a multiple of `pieceLength: ???` (**TODO**) using the `padding` field. If the last piece of a message archive has fewer bytes than `pieceLength`, it MUST be filled with zero bytes until it has the size `pieceLength`. | ||
|
||
This is necessary because message history archive data will be split into pieces of `pieceLength` when the torrent file is created, and the SHA1 hash of every piece is then stored in the torrent file and later used by other nodes to request the data for each individual data piece. |
This is necessary because message history archive data will be split into pieces of `pieceLength` when the torrent file is created, and the SHA1 hash of every piece is then stored in the torrent file and later used by other nodes to request the data for each individual data piece. | |
This is necessary because the content of the `data` file will be split into pieces of `pieceLength` when the torrent file is created, and the SHA1 hash of every piece is then stored in the torrent file and later used by other nodes to request the data for each individual data piece. |
Thanks @staheri14 for all the suggestions!! Will update the PR accordingly. One thing we need to clarify is what Any ideas? |
I imagine a smaller piece size is good for resource-limited members to easily get started and to contribute, but the connection overhead and bandwidth usage would be higher (to keep track of pieces and to announce ownerships). In contrast, a larger piece size contributes to lowering connection and bandwidth overhead, though we should be mindful of the resource limit of members. There might be studies around this topic as well. |
|
||
This is necessary because message history archive data will be split into pieces of `pieceLength` when the torrent file is created, and the SHA1 hash of every piece is then stored in the torrent file and later used by other nodes to request the data for each individual data piece. | ||
|
||
By fitting message archives into a multiple of `pieceLength` and ensuring they fill possible remainding space with zero bytes, community owner nodes prevent the **next** message archive to occupy that remainding space of the last piece, which will result in a different SHA1 hash for that piece. |
By fitting message archives into a multiple of `pieceLength` and ensuring they fill possible remainding space with zero bytes, community owner nodes prevent the **next** message archive to occupy that remainding space of the last piece, which will result in a different SHA1 hash for that piece. | |
By fitting message archives into a multiple of `pieceLength` and ensuring they fill possible remaining space with zero bytes, community owner nodes prevent the **next** message archive to occupy that remaining space of the last piece, which will result in a different SHA1 hash for that piece. |
20 // piece[2] SHA1: 0x789 | ||
``` | ||
|
||
The next `WakuMessageArchive` "A3" will be appended ("#3") to the existing data and occupy the remainding space of the third data piece. The piece at index 2 will now produce a different SHA1 hash: |
The next `WakuMessageArchive` "A3" will be appended ("#3") to the existing data and occupy the remainding space of the third data piece. The piece at index 2 will now produce a different SHA1 hash: | |
The next `WakuMessageArchive` "A3" will be appended ("#3") to the existing `data` and occupy the remaining space of the third data piece. The piece at index 2 will now produce a different SHA1 hash: |
#3 #3 #3 #3 #3 #3 #3 #3 #3 #3 // piece[3] | ||
``` | ||
|
||
By filling up the remainding space of the third piece with A2 using its `padding` field, it is guaranteed that its SHA1 will stay the same: |
By filling up the remainding space of the third piece with A2 using its `padding` field, it is guaranteed that its SHA1 will stay the same: | |
By filling up the remaining space of the third piece with A2 using its `padding` field, it is guaranteed that its SHA1 will stay the same: |
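A minimal sketch of why the padding keeps earlier piece hashes stable. It treats each archive as raw bytes and pads with zero bytes up to the next multiple of the piece length; in the spec the padding lives in the `padding` field of the encoded `WakuMessageArchive`, so the real size calculation has to account for the encoding. The helper names and the tiny piece length of 10 are assumptions for illustration.

```python
import hashlib


def padding_size(archive_size: int, piece_length: int) -> int:
    """Zero bytes needed to reach the next multiple of piece_length."""
    return (-archive_size) % piece_length


def pad(archive: bytes, piece_length: int) -> bytes:
    """Append zero bytes so the archive ends exactly on a piece boundary."""
    return archive + b"\x00" * padding_size(len(archive), piece_length)


def piece_hashes(data: bytes, piece_length: int) -> list[str]:
    """SHA1 of every piece_length-sized chunk of data (the last may be shorter)."""
    return [
        hashlib.sha1(data[i:i + piece_length]).hexdigest()
        for i in range(0, len(data), piece_length)
    ]


PIECE_LENGTH = 10  # illustrative only
a1, a2, a3 = b"\x01" * 14, b"\x02" * 7, b"\x03" * 5

# Without padding, appending A3 changes the hash of the last existing piece.
before = piece_hashes(a1 + a2, PIECE_LENGTH)
after = piece_hashes(a1 + a2 + a3, PIECE_LENGTH)
unpadded_changed = before[2] != after[2]

# With padding, every existing piece keeps its hash when A3 is appended.
padded = pad(a1, PIECE_LENGTH) + pad(a2, PIECE_LENGTH)
before_p = piece_hashes(padded, PIECE_LENGTH)
after_p = piece_hashes(padded + pad(a3, PIECE_LENGTH), PIECE_LENGTH)
padded_stable = after_p[:len(before_p)] == before_p
```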
|
||
## Seeding message history archives | ||
|
||
The community owner node MUST seed the [generated torrent](#creating-message-archive-torrents) until a new message history archive is created. |
The community owner node MUST seed the [generated torrent](#creating-message-archive-torrents) until a new message history archive is created. | |
The community owner node MUST seed the [generated torrent](#creating-message-archive-torrents) until a new message history archive `WakuMessageArchive` is created. |
The community owner node MUST update the `WakuMessageArchiveIndex` every time it creates one or more `WakuMessageArchive`s and bundle it into a new torrent (**TODO: see section**). | ||
For every created `WakuMessageArchive`, there MUST be a `WakuMessageArchiveIndexMetadata` entry in the `archives` field `WakuMessageArchiveIndex`. | ||
|
||
## Creating message archive torrents |
The structure of the torrent file is not explained here, isn't it needed?
/{application-name}/{version-of-the-application}/{content-topic-name}/{encoding} | ||
``` | ||
|
||
All messages sent with this topic MUST be instances of `ApplicationMetadataMessage` ([6/PAYLOADS](/specs/6-payloads)) with a `payload` of `CommunityMessageArchiveIndex`. |
Is `CommunityMessageArchiveIndex` going to be added? I could not find it.
How about the type of message? I think by `CommunityMessageArchiveIndex` you meant it is the type of `ApplicationMetadataMessage`?
What happened to the magnet link? shouldn't it be published in the channel?
|
||
Since the magnet links are created from the community owner node's database (and previously distributed archives), the message history provided by the community owner becomes the canonical message history and single source of truth for the community. | ||
|
||
Community member nodes MUST replace messages in their local databases with the messages extracted from archives within the same time range. Messages that didn't receive the community owner node MUST be removed and are no longer part of the message history of interest, even if it already existed in a community member node's database. |
Community member nodes MUST replace messages in their local databases with the messages extracted from archives within the same time range. Messages that didn't receive the community owner node MUST be removed and are no longer part of the message history of interest, even if it already existed in a community member node's database. | |
Community member nodes MUST replace messages in their local databases with the messages extracted from archives within the same time range. Messages that the community owner node didn't receive MUST be removed and are no longer part of the message history of interest, even if it already existed in a community member node's database. |
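One possible reading of this replacement rule, as a minimal sketch: drop every local message whose timestamp falls inside the archive's time range, then insert the archive's messages. The `Message` type, the `apply_archive` name, and the half-open range `[start, end)` are assumptions for illustration; the spec does not say whether the range bounds are inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    id: str
    timestamp: int


def apply_archive(
    local: list[Message],
    archive_messages: list[Message],
    start: int,
    end: int,
) -> list[Message]:
    """Replace local messages in [start, end) with the archive's messages.

    Local messages inside the range that are not in the archive are dropped,
    because the owner's archive is treated as the canonical history.
    """
    kept = [m for m in local if not (start <= m.timestamp < end)]
    return sorted(kept + list(archive_messages), key=lambda m: m.timestamp)


local_db = [Message("a", 5), Message("b", 12), Message("only-local", 15), Message("c", 25)]
archive = [Message("b", 12), Message("owner-only", 18)]
merged = apply_archive(local_db, archive, start=10, end=20)
```

In this example `only-local` is removed because the owner node never received it, while messages outside the range are left untouched.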
|
||
## Fetching message history archives | ||
|
||
Generally, fetching message history archives is a tree step process: |
Generally, fetching message history archives is a tree step process: | |
Generally, fetching message history archives is a three step process: |
|
||
Generally, fetching message history archives is a tree step process: | ||
|
||
1. Receive message archive index magnet link as described in [Message archive distribution], download `index` file from torrent, then determine which message archives to download |
I'd suggest linking to the "Creating message archive torrents" section for this part: download `index` file from torrent
|
||
Community member nodes subscribe to the special channel that community owner nodes publish magnet links for message history archives to. There are two scenarios in which member nodes can receive such a magnet link message from the special channel: | ||
|
||
1. The member node receives it via live messages, that is, messages that are relayed by store nodes |
Store nodes are known for their storage service as part of which they also relay messages by running waku relay protocol. So, store nodes are a subset of relay nodes but not all of them. Waku relay nodes are the ones that run the relay protocol.
I think it might be better to skip this explanation as it may cause confusion for the readers.
1. The member node receives it via live messages, that is, messages that are relayed by store nodes | |
1. The member node receives it via live messages by listening to the special channel |
This is the same as #162 but without all of the comments (#162 is getting too big to load).