Requirement: Include both a format and a data/publication version number #13

krischer · 2018-01-03T23:34:50Z

Include both a format and a data/publication version number.

chad-earthscope · 2018-01-06T03:38:59Z

I support the addition of both a data format and data publication version.

The motivation for the format version is to make the format self describing and for identification, i.e. as a signature to match. It would also allow future evolution of the fundamental portions of the format.

The data publication version was discussed during the previous evaluation. With miniSEED 2.x there is no versioning built into the format. Some data centers used the "data quality" identifier as a crude form of versioning, but this is extremely limited with only 4 "levels" and a vague implication of "quality". Tracking data versions, which are a reality in modern data management and use, is especially important for scientific data. Including the capability to identify versions directly in the format allows for basic versioning and can be used by systems external to the format for extended, version-specific metadata.
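A minimal sketch of what "a signature to match" plus two version bytes could look like when reading a record. The layout here (a 2-byte signature followed by a one-byte format version and a one-byte data publication version), the `SIGNATURE` value, and the function names are all assumptions made for illustration; nothing in this thread fixes an actual byte layout.

```python
import struct

# Hypothetical record prefix, for illustration only:
#   bytes 0-1  ASCII signature used to recognise the format
#   byte  2    format version (unsigned 8-bit)
#   byte  3    data publication version (unsigned 8-bit)
SIGNATURE = b"XX"                  # placeholder value, not a real signature
PREFIX = struct.Struct("<2sBB")    # 4 bytes in total


def pack_prefix(format_version, publication_version):
    """Build the hypothetical prefix; struct rejects values outside 0-255."""
    return PREFIX.pack(SIGNATURE, format_version, publication_version)


def read_prefix(record):
    """Return (format_version, publication_version) or raise ValueError."""
    if len(record) < PREFIX.size:
        raise ValueError("record too short to contain the prefix")
    signature, format_version, publication_version = PREFIX.unpack_from(record)
    if signature != SIGNATURE:
        raise ValueError("signature mismatch: not a record of this format")
    return format_version, publication_version
```

A reader can reject unknown data early by checking the signature, then branch on the format version before interpreting the rest of the record. A single unsigned byte limits the publication version to 256 distinct values.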

andres-h · 2018-01-06T16:02:01Z

Data/publication version number should be an optional (IRIS) extension. Linear version numbers do not support "forks" where data has been modified in multiple datacentres. I would not hardcode this feature into the standard, because something more clever might be needed in future.

crotwell · 2018-01-08T16:49:17Z

Format version is critical for sure.

Andres has a good point about linear version numbers not working well with forks. If two data centers both receive version 7 of the data, each does something and then has a different version 8.

The alternatives are to either name-space the data version (perhaps within the additional headers) or to declare that the data version has no meaning beyond the context of the datacenter where it was created.
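The fork problem and the name-spacing alternative can be sketched as follows. The `NamespacedVersion` class, the `is_newer` helper, and the data center names are hypothetical and exist only to illustrate the argument; they are not a proposed header design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NamespacedVersion:
    namespace: str   # hypothetical identifier of the data center that set the version
    number: int      # linear version number, meaningful only within that namespace


def is_newer(a: NamespacedVersion, b: NamespacedVersion) -> Optional[bool]:
    """Compare two versions; return None when they come from different namespaces."""
    if a.namespace != b.namespace:
        return None  # no defined ordering across data centers
    return a.number > b.number


# Two data centers both receive version 7 and each publishes its own version 8.
bare_a, bare_b = 8, 8                       # bare numbers: the fork is invisible
fork_a = NamespacedVersion("center-a", 8)   # namespaced: the two versions differ
fork_b = NamespacedVersion("center-b", 8)
```

With bare integers the two modified copies are indistinguishable. With a namespace they compare as different values, and `is_newer` declines to order them, which matches the second alternative: the number carries no meaning outside the data center that assigned it.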

chad-earthscope · 2018-01-08T23:18:48Z

"... or to declare that the data version has no meaning beyond the context of the datacenter where it was created."

That is exactly the conclusion we got to in the previous conversation last July on this topic. In the case of the IRIS DMC, I think we would work with those that contribute data so that the version is done by the owner whenever possible.

A system that identifies relative relationships between versions across forks and data centers would require some sort of central registry or much more complexity.

I suspect a data publication version in a record would be useful for many data centers, justifying the fixed 1 byte it would use, but it would be OK to use an optional header for this if that's where the consensus lands.

krischer · 2018-01-29T19:24:45Z

Summary

(Please let me know if I missed a point or misunderstood something)

Please vote on:

1. Do we want to include the actual data format version to enable self-identification and versioning of the data format? (Yes/No)
2. Do we want a single byte "data publication version" somewhere in each record? This would be a linear version number without a lot of additional semantics, largely useful internally for data centers. (Yes/No)
3. Do we want a more complex "data publication version" which must include things like namespaces? (Yes/No)

crotwell · 2018-01-29T20:58:55Z

1 yes
2 yes
3 no, or at least not as a required header field. No objection to a standardized key that could be used in the optional part of the header as in #14

chad-earthscope · 2018-01-29T22:25:34Z

1. Yes
2. Yes
3. Not as a requirement.

kaestli · 2018-01-30T10:04:54Z

1. Yes
2. No. A data stream that was modified should get a different streamID, not a different version number under a streamID that pretends to be the same stream. What if the "version" tag varies between records of the "same" stream? Using the streamID to point to metadata allows the version/modification to be described further there.
3. (This should be answered in the streamID discussion.)

ozym · 2018-01-30T11:24:46Z

1. Yes
2. Yes, although I could see this used as a mechanism to determine data provenance within the collection systems (e.g. daisy-chained data feeds) rather than as a version per se.
3. No

claudiodsf · 2018-01-31T10:05:21Z

Do we want to include the actual data format version to enable self-identification and versioning of the data format? (Yes/No)

Yes

Do we want a single byte "data publication version" somewhere in each record? This would be a linear version number without a lot of additional semantics largely useful internally for data centers. (Yes/No)

Yes

Do we want a more complex "data publication version" which must include things like namespaces. (Yes/No)

No

ihenson-bsl · 2018-01-31T17:39:20Z

1. Yes
2. Yes
3. No

ValleeMartin · 2018-02-02T14:18:59Z

1. Yes
2. Yes
3. No

JoseAntonioJara · 2018-02-02T17:20:13Z

1. Yes
2. Yes, adding an identifier (namespace or other) that indicates the datacenter where the version was created.
3. Yes

krischer added the additional requirement label Jan 3, 2018

jmsaurel mentioned this issue Jan 11, 2018

Requirement: Identification of non-raw, derived data #10

Open

crotwell mentioned this issue Jan 29, 2018

Requirement: Eliminate time correction field as a required, always-present field and retain as an optional field present when needed #22

Open
