Add best practices section #14

rly · 2021-04-21T23:06:03Z

The schema language supports some flexibility in how data types are defined, and some methods are encouraged over others for clarity and consistency. These best practices should be added to the schema language documentation:

Define new data types at the root of the schema rather than nested within another data type definition. Nested type definitions can lead to errors in HDMF in some cases. See "Loading namespaces from file with nested extensions fails" (hdmf#511), "Cannot extend group with subgroup/dataset that contains type definition" (hdmf#316), and "nested type definitions" (hdmf#73).
Use the quantity key not in the data type definition but in the group/dataset spec where the type is included. When a data type is defined at the root of the schema (as opposed to nested), using it requires a new group/dataset spec that includes the type; the quantity key is set there, or defaults to 1 if omitted. A quantity set in the data type definition is therefore meaningless and confusing. See also "Remove quantity from OptogeneticStimulusSite and ImagingPlane definitions" (NeurodataWithoutBorders/nwb-schema#472).
Use the name key not in the data type definition but in the group/dataset spec where the type is included. A mismatch between the name in the data type definition and the name where the type is included can cause confusion about the expected behavior and may lead to errors in HDMF. See "Mismatched fixed name and included name causes errors on read/validate" (hdmf#582).
Create a new data type when adding attributes/datasets/groups/links to an existing data type. See "Clarify whether fields can be added to groups/dsets with only data_type_inc" (#13). -- Make this a rule (stop allowing new specs that do this)
Modifying the dtype, shape, or quantity of a data type when using data_type_inc should only restrict the values from their original definitions. For example, if type A has dtype: text and type B extends type A (data_type_def: B, data_type_inc: A), then type B should not redefine dtype to be int, which is incompatible with the dtype of type A. The same applies if type A is included without defining a new type (just data_type_inc: A). In other words, all child types should be valid against the parent type. This is not yet checked in HDMF, but see progress in "Pass parent properties to extended dataset/attribute" (hdmf#321).
Non-scalar values for the value and default_value keys are not yet supported by the official APIs, so these are discouraged until support is added.
Don't allow spaces in any names. See "Don't allow creation of NWBGroups or NWBDatasets with spaces in the name" (NeurodataWithoutBorders/pynwb#1421).
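
To make a few of the structural items above concrete, here is a rough sketch of how they could be linted. It assumes a schema file has already been loaded into plain Python dicts with top-level `groups` and `datasets` lists. The helpers `lint_spec` and `lint_schema` and the type names `MyContainer` and `MyData` are hypothetical and are not part of HDMF or PyNWB. The sketch covers only nested type definitions, `quantity`/`name` set on a type definition, and spaces in names.

```python
# Hypothetical lint helpers; these are NOT part of HDMF or PyNWB.
TYPE_DEF_KEYS = ("data_type_def", "neurodata_type_def")
CHILD_KEYS = ("groups", "datasets")


def lint_spec(spec, path, is_root):
    """Return best-practice warnings for one group/dataset spec dict."""
    warnings = []
    defines_type = any(key in spec for key in TYPE_DEF_KEYS)
    if defines_type:
        if not is_root:
            warnings.append(f"{path}: type is defined in a nested spec")
        for key in ("quantity", "name"):
            if key in spec:
                warnings.append(f"{path}: '{key}' is set on a type definition")
    name = spec.get("name")
    if name is not None and " " in name:
        warnings.append(f"{path}: name {name!r} contains a space")
    # Recurse into nested group and dataset specs.
    for key in CHILD_KEYS:
        for i, child in enumerate(spec.get(key, [])):
            warnings.extend(lint_spec(child, f"{path}/{key}[{i}]", is_root=False))
    return warnings


def lint_schema(schema):
    """Lint every root-level group and dataset spec in a schema dict."""
    warnings = []
    for key in CHILD_KEYS:
        for i, spec in enumerate(schema.get(key, [])):
            warnings.extend(lint_spec(spec, f"{key}[{i}]", is_root=True))
    return warnings


# Made-up schema that breaks several of the practices at once.
example = {
    "groups": [
        {
            "data_type_def": "MyContainer",
            "quantity": "*",  # quantity on a type definition
            "datasets": [
                # nested type definition with a fixed name containing a space
                {"data_type_def": "MyData", "name": "my data", "dtype": "text"},
            ],
        }
    ]
}

for warning in lint_schema(example):
    print(warning)
```

A schema that defines `MyData` at the root and includes it in `MyContainer` with `data_type_inc`, `name`, and `quantity` on the including spec would pass this sketch without warnings.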

@bendichter @oruebel @ajtritt Can you think of other best practices to add? Do you agree with the above?


rly · 2021-05-03T23:39:15Z

Question: How should we validate the following case?

# a new type that extends VectorData and sets dtype to text
data_type_def: SpecialVectorData
data_type_inc: VectorData
dtype: text

# DynamicTable includes VectorData with a fixed name, dtype, and shape
data_type_def: DynamicTable
datasets:
- data_type_inc: VectorData  # validate against this as anonymous spec DynamicTable__VectorData
  name: spike_times
  dtype: int
  shape: [null]

# question: should this be allowed?
# and what if SpecialVectorData does not specify a dtype?
# (builder pseudo-notation: the data being validated, not a schema)
MyTableBuilder
data type: DynamicTable
datasets:
- data type: SpecialVectorData
  name: spike_times
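
One possible way to frame the check for this case, under the "only restrict" rule from the list above, is sketched below. This is an interpretation and not a decision about how HDMF should behave. It assumes that VectorData itself leaves dtype unspecified, that an unspecified dtype means "inherit from the parent or accept anything", and that two dtypes are compatible only if they are identical (numeric precision is ignored). The functions `ancestors`, `effective_dtype`, and `slot_accepts` and the type `UntypedVectorData` are hypothetical.

```python
# Hypothetical sketch; this is NOT an existing HDMF check.
# Type specs as plain dicts; a missing dtype means "unspecified".
types = {
    "VectorData": {},  # assumption: dtype left unspecified
    "SpecialVectorData": {"data_type_inc": "VectorData", "dtype": "text"},
    "UntypedVectorData": {"data_type_inc": "VectorData"},  # variant with no dtype
}

# The anonymous spec DynamicTable__VectorData from the example above.
slot = {"data_type_inc": "VectorData", "name": "spike_times", "dtype": "int"}


def ancestors(type_name, types):
    """Yield type_name and every type it extends via data_type_inc."""
    while type_name is not None:
        yield type_name
        type_name = types[type_name].get("data_type_inc")


def effective_dtype(type_name, types):
    """Return the first dtype found walking up the data_type_inc chain."""
    for name in ancestors(type_name, types):
        dtype = types[name].get("dtype")
        if dtype is not None:
            return dtype
    return None


def slot_accepts(slot_spec, builder_type, types):
    """True if builder_type may fill the slot without contradicting its dtype."""
    if slot_spec["data_type_inc"] not in ancestors(builder_type, types):
        return False  # not the included type or one of its subtypes
    slot_dtype = slot_spec.get("dtype")
    type_dtype = effective_dtype(builder_type, types)
    return slot_dtype is None or type_dtype is None or slot_dtype == type_dtype


print(slot_accepts(slot, "SpecialVectorData", types))  # text vs. int
print(slot_accepts(slot, "UntypedVectorData", types))  # dtype inherited from the slot
```

Under these assumptions, SpecialVectorData with dtype: text would be rejected for spike_times, while a subtype that does not specify a dtype would be accepted and its data would then have to satisfy dtype: int from the including spec.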

oruebel · 2021-09-16T22:27:58Z

See also #13 (comment) for additional best practices

oruebel added the "category: enhancement" and "topic: docs" labels Sep 16, 2021

rly added this to the Future milestone Jan 5, 2023

rly added the "priority: low" label Jan 5, 2023

rly mentioned this issue Jun 28, 2023: [Feature]: Create NWBDatasetIncSpec and NWBGroupIncSpec hdmf-dev/hdmf#884 (open, 3 tasks)

alessandratrapani mentioned this issue Nov 29, 2023: Mismatched fixed name and included name causes errors on read/validate hdmf-dev/hdmf#582 (closed, 5 tasks)

rly mentioned this issue Mar 12, 2024: Clarify whether fields can be added to groups/dsets with only data_type_inc #13 (closed)

rly linked a pull request Apr 10, 2024 that will close this issue: Add best practices section, prepare 3.0.0 release #32 (open)
