Requirement: New data payload encoding types (general, opaque) #11

krischer · 2018-01-03T23:34:05Z

New data payload encoding types (general, opaque), for example to support other compression techniques (e.g. 32-bit integers, general compressor; 32-bit IEEE floats, general compressor; 64-bit IEEE floats (doubles), general compressor; opaque data, general compressor).
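To make "32-bit integers, general compressor" concrete, here is a minimal sketch. It assumes zlib purely as a stand-in, since no compressor has been selected in this thread, and the function names and the little-endian default are illustrative assumptions, not part of any specification.

```python
import struct
import zlib


def compress_int32(samples, byteorder="<"):
    """Pack samples as 32-bit integers, then apply a general-purpose compressor.

    zlib is only a placeholder for whichever compressor might be chosen.
    """
    raw = struct.pack(f"{byteorder}{len(samples)}i", *samples)
    return zlib.compress(raw)


def decompress_int32(payload, byteorder="<"):
    """Reverse compress_int32: decompress, then unpack 32-bit integers."""
    raw = zlib.decompress(payload)
    count = len(raw) // 4  # 4 bytes per 32-bit sample
    return list(struct.unpack(f"{byteorder}{count}i", raw))


samples = [1000, 1002, 1001, 999, 1003, 1000, 998, 1001]
payload = compress_int32(samples)
restored = decompress_int32(payload)
```

The same pattern would apply to 32-bit and 64-bit IEEE floats by swapping the `i` format character for `f` or `d`.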

chad-earthscope · 2018-01-06T01:39:03Z

As I remember it, the original motivation of this suggestion was to adopt a compression scheme that is in broad use outside of seismology and therefore take advantage of available libraries and hopefully modern compression advancements.

The ubiquitously used Steim 1 and 2 encodings strike a very good balance between compression performance for seismic data, the common pattern of recording continuous data, and complexity as it relates to the resource requirements for encoding/decoding and programming. But these encodings have some drawbacks, minor in most cases: we (as a community) must write and maintain all the encoders/decoders; they support only 32-bit integer data; the rigid 64-byte framing wastes space when a frame cannot be filled; and Steim 2 cannot encode differences larger than can be represented in 30 bits.
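The 30-bit limit can be illustrated with a short check. This is a sketch, assuming the differences are stored as signed two's complement values and looking only at sample-to-sample differences within one series; the function name is hypothetical.

```python
STEIM2_MAX_DIFF_BITS = 30  # widest difference Steim 2 can represent


def fits_steim2(samples):
    """Return True if every first difference fits in a signed 30-bit value."""
    lo = -(1 << (STEIM2_MAX_DIFF_BITS - 1))      # -2**29
    hi = (1 << (STEIM2_MAX_DIFF_BITS - 1)) - 1   # 2**29 - 1
    diffs = (b - a for a, b in zip(samples, samples[1:]))
    return all(lo <= d <= hi for d in diffs)
```

A series with a jump of 2**29 or more between consecutive samples fails this check, even though each sample is a valid 32-bit integer.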

In my opinion, the most likely scenario for FDSN adoption of a new compression encoding would be identification of one that addresses as many of those drawbacks as possible, while having similar compression performance (on miniSEED-sized payloads) and reasonable complexity. Ideally, something well established and supported.

Also, while it would be convenient to introduce a new compression encoding at the same time as the next generation format, we can do this at any time in the future as long as we retain an encoding identification system such as the one used in blockette 1000 of miniSEED 2.x.

Regarding opaque data, this appeared in an early requirement to provide an alternative to blockette 2000. There are cases where inserting an opaque payload into a record is handy to take advantage of a miniSEED data transmission or handling system. As I understand it, that is why blockette 2000 was originally created: after transmission the miniSEED "wrapper" was discarded. Such use cases will certainly exist, and providing a number for such an encoding is much better than someone choosing their own encoding value to get their own data payload included. I would prefer to document the use of an opaque encoding as something to be used transiently and within contained scenarios, i.e. strongly discouraged for use in a long-term FDSN repository.
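As a rough illustration of the transient "wrapper" idea, the sketch below wraps an opaque payload for transport, unwraps it on arrival, and filters opaque records out of anything bound for a long-term repository. The encoding value 100, the dict-based record, and all function names are hypothetical; no number has been assigned in this discussion.

```python
OPAQUE_ENCODING = 100  # hypothetical value, not an assigned encoding number


def wrap_opaque(payload):
    """Place arbitrary bytes in a record-like dict for transport."""
    return {"encoding": OPAQUE_ENCODING, "data": bytes(payload)}


def unwrap_opaque(record):
    """Discard the wrapper after transmission and return the payload."""
    if record["encoding"] != OPAQUE_ENCODING:
        raise ValueError("record does not carry an opaque payload")
    return record["data"]


def exportable(records):
    """Keep only records that are not opaque, e.g. before archiving."""
    return [r for r in records if r["encoding"] != OPAQUE_ENCODING]
```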

crotwell · 2018-01-08T16:28:43Z

I think we agree on the meaning of this, but just to be sure: the phrase "general compressor" is just a placeholder and will not be associated with an encoding type. It just means we have not picked the particular compressor(s) that will be allowed, correct? Otherwise you have the problem of knowing that data is compressed, but not knowing how.

krischer · 2018-01-29T19:24:08Z

Summary

(Please let me know if I missed a point or misunderstood something)

Let's break this down into a couple of separate issues. Please vote on:

1. Retain data encoding specification system as in miniSEED 2.x. (Yes/No)
2. Allow for an easy integration of additional data encodings without changes to the core definition. (Yes/No)
3. Actively investigate alternative encodings. (Urgent/Not Urgent)
4. Explicitly allow an "opaque" data encoding type. (Yes/No)
5. Clearly state that any opaque data should not be exported by data centers and should be considered a transient transport mechanism in contained scenarios. (Yes/No)

crotwell · 2018-01-29T20:50:44Z

On 1, I assume yes means we keep the general idea of mapping from numbers to encoding types, and keeping the currently defined numbers, but NGF may deprecate unused encodings. Basically keep primitive types and the steims (any others?).
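A minimal sketch of what "mapping from numbers to encoding types" with deprecation could look like. The codes listed are a subset of the miniSEED 2.x (blockette 1000) values as I recall them and should be verified against the SEED manual; the `Encoding` class, the helper names, and the choice of which entry is flagged deprecated are illustrative assumptions only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Encoding:
    code: int
    name: str
    deprecated: bool = False


# Subset of existing codes; the deprecated flag here is only an example.
ENCODINGS = {e.code: e for e in (
    Encoding(0, "text"),
    Encoding(1, "16-bit integer"),
    Encoding(3, "32-bit integer"),
    Encoding(4, "IEEE 32-bit float"),
    Encoding(5, "IEEE 64-bit float"),
    Encoding(10, "Steim 1"),
    Encoding(11, "Steim 2"),
    Encoding(32, "DWWSSN gain ranged", deprecated=True),
)}


def register(code, name):
    """Add a new encoding without touching existing code assignments."""
    if code in ENCODINGS:
        raise ValueError(f"encoding code {code} is already assigned")
    ENCODINGS[code] = Encoding(code, name)


def lookup(code):
    """Resolve an encoding code, failing loudly on unknown values."""
    try:
        return ENCODINGS[code]
    except KeyError:
        raise ValueError(f"unknown encoding code {code}") from None
```

Because readers resolve payloads by code, a new encoding can be added later through `register` without changing how existing records are interpreted.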

Maybe 5 should be rephrased as "should not be exported by a data center". Network operators can use this to transmit proprietary data from a station into their datacenter, but it would not be part of the public, general-use archive/request system.

1 yes
2 yes
3 not urgent (assuming 2 is yes)
4 yes (with limitations on public use, i.e. 5)
5 yes

chad-earthscope · 2018-01-29T22:46:30Z

1. Yes, with a number of encodings (e.g. DWWSSN) marked as deprecated.
2. Yes.
3. Yes, not urgent.
4. Yes
5. Yes

krischer · 2018-01-30T07:55:38Z

> On 1, I assume yes means we keep the general idea of mapping from numbers to encoding types, and keeping the currently defined numbers, but NGF may deprecate unused encodings. Basically keep primitive types and the steims (any others?).

Yes. We'll also have to expand this a bit, for example to define the byte order for the integer and IEEE float encodings.
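A short sketch of why the byte order has to be pinned down, using Python's standard `struct` module. The values are arbitrary examples.

```python
import struct

value = 1.5

# The same IEEE 32-bit float, serialized in both byte orders.
big = struct.pack(">f", value)     # bytes 3f c0 00 00
little = struct.pack("<f", value)  # bytes 00 00 c0 3f

# Reading a big-endian 32-bit integer as little-endian silently yields
# a different number instead of raising an error.
misread = struct.unpack("<i", struct.pack(">i", 1))[0]  # 16777216
```

Without a defined (or signalled) byte order, a reader cannot tell which of the two interpretations is intended.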

> Maybe 5 should be rephrased as "should not be exported by a data center". Network operators can use this to transmit proprietary data from a station into their datacenter, but it would not be part of the public, general-use archive/request system.

Done. I assume this does not change @chad-iris's vote.

kaestli · 2018-01-30T10:23:57Z

1. Yes, retain such a system
2. Yes (define a procedure to adopt new codes)
3. Yes (urgency will rather come from non-FDSN communities)
4. Yes
5. Yes

ozym · 2018-01-30T11:07:10Z

1. Yes.
2. Yes.
3. Yes, not urgent, but it would allow taking good advantage of variable-length blocks if they are voted in.
4. Yes.
5. Yes.

claudiodsf · 2018-01-31T09:58:58Z

> Retain data encoding specification system as in miniSEED 2.x. (Yes/No)

Yes, with obsolete encodings marked as deprecated (as per @chad-iris).

> Allow for an easy integration of additional data encodings without changes to the core definition. (Yes/No)

Yes

> Actively investigate alternative encodings. (Urgent/Not Urgent)

Not urgent, but consider IEEE formats as soon as possible.

> Explicitly allow an "opaque" data encoding type. (Yes/No)

Yes

> Clearly state that any opaque data should not be exported by data centers and should be considered a transient transport mechanism in contained scenarios. (Yes/No)

Yes

ihenson-bsl · 2018-01-31T17:36:39Z

1. Yes
2. Yes
3. Yes
4. Yes
5. Yes

ValleeMartin · 2018-02-02T14:16:49Z

1. Yes
2. Yes
3. Not urgent
4. Yes
5. Yes

JoseAntonioJara · 2018-02-02T16:47:30Z

1. Yes
2. Yes
3. Not urgent
4. Yes
5. Yes

krischer added the additional requirement label Jan 3, 2018

krischer mentioned this issue Jan 11, 2018

Misc Discussions #3

Open
