trait list: categorical traits #8
Comments
If we build on top of the existing trait Thesaurus, we should follow its constraints and definitions of factor levels as well, but we may be more explicit where it does not constrain them. For some traits, especially those that are particular to a certain taxonomic group, a predefined vocabulary may be better. E.g. sociality, where just a few options exist, or feeding_specialisation, where a lack of predefined factor levels would allow too many degrees of information (thus defining ordinal levels such as one, few, or many resources, i.e. monophagous, oligophagous, polyphagous).
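To illustrate the predefined-vocabulary idea, here is a minimal sketch of feeding_specialisation as an ordinal trait with a fixed set of levels. The class name, the numeric codes, and the `parse_level` helper are assumptions made up for this example; they are not taken from any existing thesaurus.

```python
from enum import IntEnum


class FeedingSpecialisation(IntEnum):
    """Ordinal levels by number of resources used (hypothetical coding)."""
    MONOPHAGOUS = 1   # one resource
    OLIGOPHAGOUS = 2  # few resources
    POLYPHAGOUS = 3   # many resources


def parse_level(value: str) -> FeedingSpecialisation:
    """Map a free-text entry onto the controlled vocabulary or raise ValueError."""
    key = value.strip().upper()
    try:
        return FeedingSpecialisation[key]
    except KeyError:
        allowed = ", ".join(level.name.lower() for level in FeedingSpecialisation)
        raise ValueError(f"unknown level {value!r}; allowed: {allowed}") from None
```

Because the levels are ordered, comparisons such as `FeedingSpecialisation.MONOPHAGOUS < FeedingSpecialisation.POLYPHAGOUS` hold, and any entry outside the vocabulary is rejected instead of silently creating a new level.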
I would go for dummy variables. For instance, a common trait in mammals is diel activity (nocturnal, diurnal, crepuscular, ...). A species can be active at day and night, or at dusk and night. This is typically difficult to handle as a single categorical variable. A similar example is food categories (e.g. leaves, seeds, invertebrates, vertebrates), where a single species can eat several categories. This might be true only for some of the categorical traits, though.
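As a rough sketch of the dummy-variable approach, the snippet below turns a set of observed activity levels into one 0/1 column per level. The species names and the `to_dummies` helper are placeholders invented for this example.

```python
DIEL_LEVELS = ("nocturnal", "diurnal", "crepuscular")


def to_dummies(observed, levels):
    """Return one 0/1 entry per level; a species may have several levels set."""
    unknown = set(observed) - set(levels)
    if unknown:
        raise ValueError(f"levels not in vocabulary: {sorted(unknown)}")
    return {level: int(level in observed) for level in levels}


# Hypothetical records: each species maps to the set of levels observed for it.
species_activity = {
    "species_a": {"diurnal"},
    "species_b": {"nocturnal", "crepuscular"},
}

table = {
    name: to_dummies(observed, DIEL_LEVELS)
    for name, observed in species_activity.items()
}
```

Here `table["species_b"]` is `{"nocturnal": 1, "diurnal": 0, "crepuscular": 1}`, which a single categorical column could only express by inventing a combined level.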
As I understood it, the T-SITA thesaurus uses boolean values for all traits. Those are structured within higher-level traits, e.g. Behaviour --> Nutrition_behaviour --> Diet --> Detritivore --> Coprophagous (0/1). Even though this means that we will end up with a huge list of traits, I think it will keep the trait database more flexible, because I would assume that adding a trait should be easier than adding a trait level. On the other hand, adding a new trait would mean that all species that were characterised before would have to be given a value (either 0 or 1) for this new trait, while adding another factor level would not require that.
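The backfilling cost mentioned above can be sketched as follows. This assumes a simple in-memory table of boolean traits; the trait name `necrophagous`, the species names, and both helper functions are hypothetical, and marking unscored species with `None` is a choice made for this example, not something the thesaurus prescribes.

```python
def add_boolean_trait(table, trait):
    """Add a new 0/1 trait; species characterised earlier get None (not yet scored)."""
    for row in table.values():
        row.setdefault(trait, None)


def unscored(table, trait):
    """List the species that still need a 0 or 1 for the given trait."""
    return sorted(name for name, row in table.items() if row.get(trait) is None)


# Hypothetical species already scored for two existing boolean traits.
table = {
    "species_a": {"detritivore": 1, "coprophagous": 0},
    "species_b": {"detritivore": 1, "coprophagous": 1},
}

add_boolean_trait(table, "necrophagous")
todo = unscored(table, "necrophagous")
```

After the call, `todo` lists every previously characterised species, which is the revisiting effort that adding a factor level to an existing trait would avoid.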
Closing, because this is of minor importance for our methods paper. The online documentation of the standard will get a page on how to build trait thesauri (see also #18). This is where we may discuss the problem around factorial traits.
How do we define categorical data? As one trait with a long list of possible levels or as dummy variables where users can assign more than one level to each trait (think of the definition for different types of omnivores)?