Procedure for creating the trait list #18
It would be great to have a logic for the numbering, but I suspect any fixed scheme is too constrained and will break at some point as further traits are added to the list. Also, the categories or classifications often depend heavily on the research question and might confuse researchers from a very different background.

If we manage to upload our trait list to a public website, we could provide globally valid URIs as traitID, following the scheme "https://www.bexis.uni-jena.de/arthropodtraits.html#body_size", instead of numeric IDs which are only locally valid. The T-SITA traits could then keep their original URI as traitID, e.g. 'http://t-sita.cesab.org/BETSI_vizInfo.jsp?trait=Wing_surface'. Such a URI-based traitID scheme is reasonably human readable, but too complicated to be entered manually. The users would then rely on the

For the hierarchies, a more flexible scheme could be to add columns which link each trait to the next broader term (parent term) or narrower terms (child terms). That is what T-SITA already provides, and we could just adopt this. Then there is no limit to the hierarchical depth of terms.
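As a rough illustration of the broader-term idea from the comment above, here is a minimal Python sketch. It assumes each trait stores the URI of its parent term; the `#morphology` and `#body_length` fragments and the function name `ancestors` are hypothetical and only serve the example.

```python
from typing import Dict, List, Optional

# Base of the URI scheme proposed in the thread; fragments other than
# "body_size" are made up for this illustration.
BASE = "https://www.bexis.uni-jena.de/arthropodtraits.html#"

# Each trait URI maps to its broader (parent) term, or None for a top-level term.
broader: Dict[str, Optional[str]] = {
    BASE + "morphology": None,
    BASE + "body_size": BASE + "morphology",
    BASE + "body_length": BASE + "body_size",
}


def ancestors(trait_id: str, links: Dict[str, Optional[str]]) -> List[str]:
    """Follow the broader-term links upwards and return all ancestors, nearest first."""
    chain: List[str] = []
    seen = {trait_id}
    parent = links.get(trait_id)
    while parent is not None:
        if parent in seen:
            # Guard against accidental loops in a hand-edited trait list.
            raise ValueError(f"cycle detected at {parent}")
        chain.append(parent)
        seen.add(parent)
        parent = links.get(parent)
    return chain


for uri in ancestors(BASE + "body_length", broader):
    print(uri)
```

Because the depth is derived by walking the links, nothing in the identifier itself limits how many levels the hierarchy can have.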
Concerning the trait list we provide, in the whitepaper I would put it this way:
That said, we are free to publish an 'incomplete' list without using a defined hierarchy, while providing a semantic framework for building such hierarchies (see #1). The field
Regarding the

Regarding the hierarchy: A flexible hierarchy is probably best. The field

Regarding the completeness of the trait list: I also think that we can start with the arthropod trait list for the whitepaper. The list 'traitlist_arthropods.csv' currently includes T-SITA traits and the additional traits from our survey. I think it should be feasible to complete this list for publication with the whitepaper, even if there is no automatic way to update it.
I should probably check back with the GFBio Terminology Server about what would be best in terms of semantic web standards. @aostrow: maybe you can comment on this?

I think we can avoid duplicating the definitions of T-SITA traits. We should not just copy their definitions into our ontology, since they did not publish it as commons. Their ontology is accessible online and we should link to it directly. That is, for our list, the traits that are available in T-SITA should contain just three fields: traitName (which can follow our own naming scheme), traitID (containing the T-SITA URI) and traitUnit (since this is important for the R-package to function and I can't extract it from T-SITA directly). Our reference then directs anyone looking up the trait straight to T-SITA. The definition and the broader and narrower terms must then be extracted from there. If a trait needs to be defined slightly differently than in T-SITA, we should refer to the T-SITA URI as a related term.

For all additional traits used within BExIS and not occurring in T-SITA, we must provide our own definition, URI, traitUnit, and the related, broader and narrower terms (narrower and related terms may contain multiple entries, separated by semicolons). For now the URI would link to a GitHub resource, similar to our current data standard; we should create an extra repo for that. We'll aim to move this to GFBio before publication.
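To make the proposed layout a bit more concrete, here is a hedged Python sketch of how such a list could be read. The fields traitName, traitID and traitUnit come from the comment above; the column names `broaderTerm`, `narrowerTerm` and `relatedTerm`, the `#body_length` and `#body_width` fragments, and the unit values are placeholders I made up for illustration, not agreed names or real data.

```python
import csv
import io

# Illustrative sample only: units and the extra column names are assumptions.
SAMPLE = """traitName,traitID,traitUnit,broaderTerm,narrowerTerm,relatedTerm
wing_surface,http://t-sita.cesab.org/BETSI_vizInfo.jsp?trait=Wing_surface,mm2,,,
body_size,https://www.bexis.uni-jena.de/arthropodtraits.html#body_size,mm,,https://www.bexis.uni-jena.de/arthropodtraits.html#body_length;https://www.bexis.uni-jena.de/arthropodtraits.html#body_width,
"""


def split_terms(cell):
    """Split a semicolon-separated cell into a list, dropping empty entries."""
    return [term.strip() for term in cell.split(";") if term.strip()]


def load_traits(text):
    """Read the trait list and index it by traitName."""
    traits = {}
    for row in csv.DictReader(io.StringIO(text)):
        traits[row["traitName"]] = {
            "traitID": row["traitID"],
            "traitUnit": row["traitUnit"],
            "broader": row["broaderTerm"] or None,
            "narrower": split_terms(row["narrowerTerm"]),
            "related": split_terms(row["relatedTerm"]),
        }
    return traits


traits = load_traits(SAMPLE)
print(traits["body_size"]["narrower"])
```

A T-SITA trait such as wing_surface simply leaves the term columns empty, since its definition and hierarchy are looked up through the T-SITA URI.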
Based on our discussions in the other issues, I suggest the following for the trait list:
I am thinking of making the identifier somewhat human readable. For example, all morphological traits would start with 1, all measurements of body_size would start with 11, followed by body_length as 111 and body_width as 112, etc., and body_length_abdomen as 1121. I am wondering, though, whether this system will be flexible enough and whether there are enough numbers to do this. We would probably have to assign more than one digit to the lowest level, but could then only have nine groups on each higher level. What are your thoughts on that?
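The capacity concern can be checked with a small Python sketch. It assumes one digit (1 to 9) per hierarchy level, as in the example codes above; the helper names `parent_code` and `next_child_code` are hypothetical.

```python
def parent_code(code):
    """Return the code of the next broader group, or None at the top level."""
    return code[:-1] or None


def next_child_code(parent, existing):
    """Return the first unused one-digit child code below `parent`."""
    for digit in "123456789":
        candidate = parent + digit
        if candidate not in existing:
            return candidate
    # With one digit per level, a tenth child cannot be numbered.
    raise ValueError(f"group {parent} already has nine children")


# Codes taken from the example: 1 morphology, 11 body_size,
# 111 body_length, 112 body_width.
used = {"1", "11", "111", "112"}
print(parent_code("112"))
print(next_child_code("11", used))
```

Under this assumption every group fills up after nine entries, which is the limit the question above is worried about; widening a level to two digits would mean renumbering every code below it.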