-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify whether fields can be added to groups/dsets with only data_type_inc
#13
Comments
Effectively, this is the question of "monkey-patching", i.e., we have an existing type and add properties without changing the class type. If we ignore backward compatibility, then my vote would be to require that we create a new data_type_def in those cases, if only to make this case easier to handle. However, I assume this would mess with a bunch of existing schema, as such I think it may be best to: 1) Explicitly discourage this in the schema docs, 2) Add a test on the ndx-catalog that checks for this and raises a warning to encourage better schema, 2) continue support as is in the API. |
Yes, removing those cases would mess with existing schema, so I think that plan of action makes sense. And even if we remove those cases from existing schema, I think we would have to continue to support them in order to be able to read and validate files with older schema. To be clear, based on this, in the example where the Units table has a VectorData column for spike_times and we want to add an attribute to that column, we should create a new data type SpikeTimesColumn that inherits from VectorData and adds the attribute, and use that in the Units table instead, right? What about the cases when fields are not added but properties of a data type are modified (restricted), e.g., VectorData allows any dtype and shape, but a new data type includes a VectorData and restricts dtype to int and shape to (None, )? To be clear, these cases should not be discouraged, right? |
Since we need to support this case anyways, I think this is ultimately then more like a best practice. I think the best practice should be:
|
TODO: Add warnings in HDMF extensions API when adding new elements to a group/dataset with only data_type_inc. |
Discussion should be continued in #14 |
Raised in hdmf-dev/hdmf#542 in the context of how extra fields are treated.
It is not clear in the documentation whether the schema language allows new fields for groups/datasets defined with only a
data_type_inc
. We should clarify this in the documentation.It is my understanding that it is allowed (though perhaps not recommended) for fields to be added to groups/datasets defined with only a
data_type_inc
.For example: a new group may contain a dataset with:
A concrete example is in the Units table in NWB: https://github.com/NeurodataWithoutBorders/nwb-schema/blob/dev/core/nwb.misc.yaml#L187-L199
The
resolution
attribute is added to the baseVectorData
definition.Or a new group may be defined as:
A related but not fully matching example is the 'electrodes' table in the NWBFile: https://github.com/NeurodataWithoutBorders/nwb-schema/blob/dev/core/nwb.file.yaml#L269
The 'electrodes' table lacks a data_type_def but specifies many named
VectorData
types. However, this differs from the above becauseDynamicTable
explicitly allowsquantity: *
ofdata_type_inc: VectorData
without a name specified.Note that a group/dataset defined with only a
data_type_inc
may have a different doc, quantity, dims, shape, and sometimes even dtype than in thedata_type_def
definition, as long as these are compatible with the included spec (e.g., a square is a special case of a rectangle). This should also be clarified in the documentation and fully supported in HDMF.If adding fields to types with only
data_type_inc
is not allowed, then the NWB schema should be amended.If the above is allowed, then the HDMF validator should be amended to validate against the spec with any additions to the included type.
The text was updated successfully, but these errors were encountered: