The Unified Model ("UM" hereafter) is highly normalized and may seem overwhelming at first. That is understandable. Remember that the UM is meant to be a comprehensive representation that accommodates all use cases. It may not seem the simplest way to represent the data you'll be mapping, but that is because it has to cover other perspectives as well. Please also keep in mind that this exercise is meant to test the capacity of the UM to faithfully represent the data in collection management systems in aggregate, not to determine a least-common-denominator publishing model, as is the case with Darwin Core archives.
Your task is to populate a PostgreSQL database that uses the UM structure we have provided in the creation script schema.sql, with data from your database as the source. This will require "mapping" between your structure and that of the UM.
In this document we will use figures to illustrate the structure of the UM. These figures take the form of Entity-Relationship (ER) diagrams. In these diagrams, concepts (implemented as tables for this exercise) are denoted by boxes with labels in UpperCamelCase. The properties (fields) of each concept are listed within its box and are in lowerCamelCase. The figures do not necessarily show the full set of fields for the tables they represent, nor do they show data types and other constraints. At times we will show snippets of the schema (such as table definitions) for reference. The definitive version of the tables to populate is in schema.sql. The term names in the figures (e.g., eventType) correspond to their equivalents in lower_snake_case in the database (e.g., event_type).
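As a quick illustration of the naming correspondence, the following query sketches one way to derive a lower_snake_case column name from a lowerCamelCase term. The regular expression is an assumption made for this example and is not part of schema.sql; always check the result against the definitive table definitions.

```sql
-- Convert lowerCamelCase figure terms to lower_snake_case column names.
-- The sample terms appear in this document; the regular expression itself
-- is an illustrative assumption, not something defined in schema.sql.
SELECT term,
       lower(regexp_replace(term, '([a-z0-9])([A-Z])', '\1_\2', 'g')) AS column_name
FROM (VALUES ('eventType'),
             ('recordedByID'),
             ('acceptedGeoreferenceID')) AS t(term);
-- eventType              -> event_type
-- recordedByID           -> recorded_by_id
-- acceptedGeoreferenceID -> accepted_georeference_id
```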
You will not be expected to parse the data in your database to make it fit into the UM, but you will be asked in some cases to provide explicit data in the UM that are only implicit in your data. It is likely that the source data won't have all tables needed and some will need to be "invented". For example, a source database may have collecting events and locations merged into a single table. This will require the table to be split to map correctly into Events, Locations, and Georeferences.
As another example, you may have database records based on material in your collection, but no field that identifies the event during which that material was collected. In the UM the material and the event are non-overlapping, required concepts, and each MUST be identified separately.
The use of record identifiers for concepts in the UM is ubiquitous, and identifiers are required whenever you have data that correspond to a given concept. For this exercise, when creating tables in the UM, use resolvable globally unique identifiers for the ID fields if you have them. If you don't, use non-resolvable globally unique identifiers if you have them. If you have neither, generate UUIDs as identifiers in place of the identifiers that are unique only within the scope of your database. In cases where your database does not have identifiers for records that can be inferred for the UM, generate UUIDs for these identifiers. For every identifier you have to create in place of a local one in your database, you CAN also create an Identifier record that translates between your local identifier and the one you created for sharing via the UM. If you do this, set the identifierType to local. Here is the statement from schema.sql that creates the structure of the Identifier table.
CREATE TABLE identifier (
  identifier_target_id TEXT NOT NULL,
  identifier_target_type COMMON_TARGETS NOT NULL,
  identifier_type TEXT NOT NULL,
  identifier_value TEXT NOT NULL,
  PRIMARY KEY (identifier_target_id, identifier_target_type, identifier_type, identifier_value)
);
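The following is a minimal sketch of recording a local identifier next to a generated one. It assumes schema.sql has already been loaded; the UUID and the catalog number HERP-12345 are made-up placeholders. The gen_random_uuid() function is built into PostgreSQL 13 and later (earlier versions need the pgcrypto extension).

```sql
-- Generate a new UUID to stand in for a local-only identifier.
SELECT gen_random_uuid()::text AS new_identifier;

-- Record the translation between the generated identifier and the local one.
-- Assumes schema.sql has been loaded; all values below are placeholders.
INSERT INTO identifier (
    identifier_target_id,
    identifier_target_type,
    identifier_type,
    identifier_value
) VALUES (
    '3f8d1f0e-6c7a-4a55-9d0b-2f1c8e4b7a10',  -- UUID created for sharing via the UM
    'MATERIAL_ENTITY',                        -- must be a COMMON_TARGETS value
    'local',                                  -- marks the value as a local identifier
    'HERP-12345'                              -- hypothetical local catalog number
);
```

Because all four columns form the primary key, the same target can carry several identifiers of different types without conflict.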
The Identifier and other "common model" tables are described in the GBIF Common Models document and will be discussed in context as we proceed through the Suggested steps for data mapping.
Some AgentRoles are currently made explicit in the UM. Most of these are simply fields for the name of the Agent fulfilling the role (e.g., georeferencedBy), while others are fields for an Agent identifier (e.g., recordedByID). The explicit AgentRole fields in the UM are listed below, grouped by the concept in which they are found. Separate AgentRole records for these are not necessary.
Assertion: assertionByAgentName, assertionByAgentID
DigitalEntity: rightsHolder, creator, nameAccordingTo, taxonAuthority, recordedBy, recordedByID
Identification: typeDesignatedBy, identifiedByID
Location: locationAccordingTo, georeferencedBy
MaterialEntity: institutionCode, institutionID, collectionCode, collectionID, ownerCollectionCode, recordedBy, recordedByID, chronometricAgeDeterminedBy
Organism: identifiedBy
Taxon: nameAccordingToID
Most of the tables in the UM have fields that benefit from using controlled vocabularies. Some of these fields MUST use values from a specific controlled vocabulary. In the database creation script these can be found as ENUMs whose values are in UPPER_SNAKE_CASE. Following is a simple example of the strictly controlled vocabulary for the entity_type field in the entity table (no other values are valid):
CREATE TYPE ENTITY_TYPE AS ENUM (
  'DIGITAL_ENTITY',
  'MATERIAL_ENTITY'
);
CREATE TABLE entity (
  entity_id TEXT PRIMARY KEY,
  entity_type ENTITY_TYPE NOT NULL,
  dataset_id TEXT NOT NULL,
  entity_name TEXT,
  entity_remarks TEXT
);
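To make the effect of the enumeration concrete, here is a small sketch of inserting an Entity record. It assumes schema.sql has been loaded; the identifier, dataset_id, and name are placeholders invented for the example.

```sql
-- Assumes schema.sql has been loaded. All values are placeholders.
INSERT INTO entity (entity_id, entity_type, dataset_id, entity_name)
VALUES ('entity-example-1', 'MATERIAL_ENTITY', 'example-dataset', 'Example specimen');

-- Any value outside the enumeration is rejected by PostgreSQL. For example,
-- the following statement would fail, because 'SPECIMEN' is not a value of
-- the ENTITY_TYPE enumeration:
-- INSERT INTO entity (entity_id, entity_type, dataset_id)
-- VALUES ('entity-example-2', 'SPECIMEN', 'example-dataset');
```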
Most "type" fields in the UM are not controlled by an ENUM. For these type fields, and any other fields for which a controlled vocabulary is suggested, any requirements that might exist will be given in the Suggested steps as they are encountered. Apart from those requirements, feel free to use values that make sense for your data.
Part of what this exercise will reveal is the diversity of data that are being stored in collection management systems. The aggregation of vocabulary values that are actually in use locally will be a very interesting outcome of this project that may help to inform future work on vocabularies of values that help us all translate to concepts we have in common with different labels.
The UM provides four special tables (AgentRole, Assertion, Citation, and Identifier) to supplement the core information of other tables (e.g., Organism AgentRole, Event Assertion, GeneticSequence Citation, Agent Identifier). The document GBIF Common Models describes how these concepts fit into the UM.
Each of the "common model" tables can be linked to the set of tables given in the COMMON_TARGETS enumeration, which is defined in schema.sql as shown below. How to use the COMMON_TARGETS enumeration for the various targetType fields of the "common model" tables will be explained in context throughout the Suggested steps.
CREATE TYPE COMMON_TARGETS AS ENUM (
  'ENTITY',
  'MATERIAL_ENTITY',
  'MATERIAL_GROUP',
  'ORGANISM',
  'DIGITAL_ENTITY',
  'GENETIC_SEQUENCE',
  'EVENT',
  'OCCURRENCE',
  'LOCATION',
  'GEOREFERENCE',
  'GEOLOGICAL_CONTEXT',
  'PROTOCOL',
  'AGENT',
  'COLLECTION',
  'ENTITY_RELATIONSHIP',
  'IDENTIFICATION',
  'TAXON',
  'REFERENCE',
  'AGENT_GROUP',
  'ASSERTION',
  'CHRONOMETRIC_AGE'
);
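When preparing targetType values, it can help to check candidates against the enumeration before loading. The queries below are a sketch that assumes schema.sql has been loaded; the candidate value SPECIMEN is a deliberately invalid example.

```sql
-- List every value that a targetType field may take.
SELECT unnest(enum_range(NULL::COMMON_TARGETS)) AS target_type;

-- Check candidate values before loading them.
-- 'SPECIMEN' is a made-up value used here to show a failed check.
SELECT candidate,
       candidate = ANY (enum_range(NULL::COMMON_TARGETS)::text[]) AS is_valid
FROM (VALUES ('OCCURRENCE'), ('SPECIMEN')) AS t(candidate);
```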
Below is a list of the steps we suggest you follow to map your collection management system data to the UM. Each step has a link to a more detailed description of what to do. The steps are ordered so that records for a concept already exist by the time a later step needs to link to them.
3. Assertions, Citations, and Identifiers for Agents
8. AgentRoles, Assertions, Citations, and Identifiers for DigitalEntities
10. Locations, Georeferences, and GeologicalContexts
12. Occurrences and other Events
13. AgentRoles, Assertions, Citations, and Identifiers for Occurrences and other Events
15. AgentRoles, Assertions, Citations, and Identifiers for Taxa
17. AgentRoles, Assertions, Citations, and Identifiers for Identifications
NOTE: Skip this step if your Agents are identified only by name (i.e., not with a separate agent identifier).
We recommend mapping Agents (e.g., people, groups of people, organizations, collections; see Figure 1) first, if you have them, because their identifiers will be used in the construction of many of the other tables in the UM. If you don't track agents separately in your database, don't worry: they can be designated by their names where appropriate in the UM.
Figure 1. Agents and their relationships in the Unified Model
If an Agent is a Collection or an AgentGroup, the agentType MUST be COLLECTION or AGENT_GROUP, respectively. However, the agentType field is not controlled by an ENUM, because there are other possible values that are not subtypes of Agent, such as ORGANIZATION, PERSON, and even ORGANISM. If you need to use an agentType we haven't mentioned here, please create it in UPPER_SNAKE_CASE.
For this exercise, we suggest collectionType values such as MUSEUM, HERBARIUM, BOTANICAL_GARDEN, and ZOO. If you need to use a collectionType we haven't mentioned here, please create it in UPPER_SNAKE_CASE.
An AgentGroup is a way to refer to a single Agent entity that is composed of multiple other Agents. Thus, a group of Collections might be a CONSORTIUM, and a group of university students might be a CLASS. If you need to create an agentGroupType, please use UPPER_SNAKE_CASE.
The range of possible relationships between Agents is vast. Note that the relationship has directionality: the subjectAgentID is related to the objectAgentID in the direction expressed in the agentRelationshipType. It therefore helps to express the directionality in the agentRelationshipType term itself, for example, DOCTORAL_ADVISOR_OF instead of DOCTORAL_ADVISOR, which would be ambiguous. If you need to create an agentRelationshipType, please use UPPER_SNAKE_CASE.
NOTE: Skip this step if you do not have References in your data or if your References are identified only by bibliographic citations.
In the UM, a Reference, like an Agent, has the potential to be related to many different kinds of things (e.g., MaterialEntity, Event, Taxon) through Citations. If you track references with identifiers, create Reference records for them so that they can be connected in later steps when the other tables they are related to are created. If you don't track references separately in your database, don't worry: they can be designated by their bibliographic citations where appropriate in the UM.
Figure 2. Citations of References in the Unified Model
Here are some suggestions for values of referenceType, but feel free to use others if none of these suffices: JOURNAL_ARTICLE, BOOK, BOOK_SECTION, DISSERTATION, FIELD_NOTEBOOK, WEB_PAGE, OTHER.
NOTE: Skip this step if you created no Agent records in Step 1.
It is possible to create Assertions, Citations, and Identifiers for Agents. See GBIF Common Models for general discussions about how to map to these three types of tables and considerations when developing the vocabularies for assertionType and assertionUnit.
The value for the assertionTargetType MUST be one of AGENT, AGENT_GROUP, or COLLECTION and MUST match the table to which the Assertion applies.
NOTE: Skip this step if your Protocols are identified only by simple strings (names or descriptions) or if you do not have Protocols mentioned in your data.
A Protocol can be used by the classes Event, ChronometricAge, and the various Assertions. If you track protocols with identifiers, create Protocol records for them so that they can be connected when the tables they are related to are created.
Figure 3. Entities and their relationships in the Unified Model
A MaterialEntity can be any physical object (same as bco:material entity and dcterms:PhysicalResource). In the UM there can be many types of MaterialEntities, which are distinguished by the value of materialEntityType. These can be as specific as desired, but there are two MaterialEntity subtype classes to distinguish two important concepts, MaterialGroup and Organism. For each MaterialEntity record you create, also create an Entity record using the same identifier for the entityID as for the materialEntityID. The entityType for the Entity MUST be MATERIAL_ENTITY.
A MaterialGroup is any set of MaterialEntities. The utility of this concept is to be able to make Assertions about the group as a whole, distinct from Assertions about its individual members (e.g., the weight of an entire catch as opposed to the weights of selected individuals in the catch). A MaterialGroup record MUST have a corresponding MaterialEntity record, which in turn MUST have MATERIAL_GROUP as its materialEntityType. Potential vocabulary terms for materialGroupType are HAUL and LOT. Feel free to create others as needed.
An Organism (same as dwc:Organism) is modeled in the UM as a MaterialEntity, even if none of the material remains accessible (such as in the case of some observations, or the case of a specimen that was lost or destroyed). Even though an Organism might also act as an Agent, we do not currently model it in this way. In the most basic case, a cataloged item consists of the entire existing accessible material remains of a single Organism. These may be separated into "parts" in a database, which may or may not be tracked separately. When they are tracked separately, the Entity that unites them is the Organism. The derivation of the "parts" from the Organism (or from each other) is expressed through EntityRelationships. An Organism record MUST have a corresponding MaterialEntity record with its materialEntityID the same as the organismID. The materialEntityType of the MaterialEntity record MUST be ORGANISM. The MaterialEntity record for the Organism MUST in turn have a corresponding Entity record with its entityID the same as the organismID and materialEntityID. The entityType of the Entity record MUST be MATERIAL_ENTITY.
6. AgentRoles, Assertions, Citations, Identifiers and ChronometricAges for MaterialEntities and their subtypes
Figure 4 shows the relationships between MaterialEntity and associated tables, including the "common model" tables. The relationships between MaterialEntity and other Entity tables were shown in Figure 3. Each of the Entity tables can be connected to the common model tables. The important thing is to make sure that the connections happen at the appropriate, most specific level in the hierarchy. For example, suppose a blood sample was taken from an Organism and its volume was measured. The blood sample is a MaterialEntity (NOT an Organism). There should be an EntityRelationship showing that the subject MaterialEntity had the relationship extractedFrom the object Organism. The blood sample volume should result in an Assertion for the MaterialEntity, not an Assertion for the corresponding parent Entity record, nor for the related Organism record. Specifically, the assertionTargetID should be the same as the materialEntityID for the blood sample, the assertionTargetType MUST be MATERIAL_ENTITY, the assertionType should be VOLUME, the assertionValue should be left empty, the assertionValueNumeric should have the numerical value of the volume, and the assertionUnit should have an appropriate SI unit (e.g., ml). The same principles apply to relationships to the Citation, AgentRole, and Identifier tables: they should be associated with the correct level of the Entity hierarchy.
A ChronometricAge MUST only be related directly to a MaterialEntity.
The following AgentRoles related to MaterialEntities are currently made explicit in the UM; these roles do not require separate AgentRoles to be made: institutionCode, institutionID, collectionCode, collectionID, ownerCollectionCode, recordedBy, recordedByID, chronometricAgeDeterminedBy.
Figure 4. MaterialEntities and their "common model" tables in the Unified Model
In the UM there can be many types of DigitalEntity. These are distinguished by the digitalEntityType field, which has a strictly controlled vocabulary consisting of the values in the following enumeration:
CREATE TYPE DIGITAL_ENTITY_TYPE AS ENUM (
  'DATASET',
  'INTERACTIVE_RESOURCE',
  'MOVING_IMAGE',
  'SERVICE',
  'SOFTWARE',
  'SOUND',
  'STILL_IMAGE',
  'TEXT',
  'GENETIC_SEQUENCE'
);
One of these, GENETIC_SEQUENCE, corresponds to a formal subtype of DigitalEntity (see Figure 3). This means that when a GeneticSequence record is created, a corresponding DigitalEntity record MUST also be created, and the digitalEntityType for it MUST be GENETIC_SEQUENCE. For each DigitalEntity, also create an Entity record using the same unique identifier for the entityID as for the digitalEntityID. The entityType for the Entity MUST be DIGITAL_ENTITY.
The same kinds of "common model" associations shown in Figure 4 for MaterialEntities can be made for DigitalEntities, except that each targetID MUST be the same as the identifier (digitalEntityID or geneticSequenceID) for the DigitalEntity or GeneticSequence it is directly associated with. The values for the targetType fields MUST be DIGITAL_ENTITY or GENETIC_SEQUENCE, depending on the table they are to be directly related to.
The following AgentRoles related to DigitalEntities are currently made explicit in the UM; these roles do not require separate AgentRoles to be made: rightsHolder, creator, nameAccordingTo, taxonAuthority, recordedBy, recordedByID.
At this stage in the process, all of the Entity records will have been created, providing the prerequisite for being able to create the relationships between them. The supertype/subtype relationships between Entity tables were shown above in Figure 3, and should already have been created at this point. The "common model" associations will also already have been made. Here we will concentrate on other associations, ones that should be captured in the EntityRelationship table, the definition of which is:
CREATE TABLE entity_relationship (
  entity_relationship_id TEXT PRIMARY KEY,
  depends_on_entity_relationship_id TEXT REFERENCES entity_relationship ON DELETE CASCADE,
  subject_entity_id TEXT REFERENCES entity ON DELETE CASCADE,
  entity_relationship_type TEXT NOT NULL,
  object_entity_id TEXT REFERENCES entity ON DELETE CASCADE,
  object_entity_iri TEXT,
  entity_relationship_date TEXT,
  entity_relationship_order SMALLINT NOT NULL DEFAULT 0 CHECK (entity_relationship_order >= 0)
);
The EntityRelationship table is a powerful way to make just about any connection between Entities in the UM. Any Entity can be related to any other one with any relationship you care to create in the entityRelationshipType. There are two things to keep in mind here. The first is that the subtype relationships should be strictly relegated to the correspondence of the values of identifier fields (e.g., entityID and materialEntityID for a MaterialEntity) and should already have been done in previous steps. Recording them again here would be the equivalent of an EntityRelationship stating that a particular Entity isA MaterialEntity, which would be superfluous. The second thing to keep in mind is that the semantics of the relationships depend entirely on a clear understanding of the predicate (the entityRelationshipType) and the correct assignment of Entities to the subject and object roles. The relationships should always be read as subject->predicate->object; that is, the relationship has a direction. Each relationship can have a complementary one where the subject/object roles are reversed and the predicate shows what the relationship looks like from the opposite direction. For example, if Organism 'A' was eaten by another Organism 'B', it follows that Organism 'B' ate Organism 'A'. There are certainly cases in which reverse roles might be necessary. For example, if 'B' was a parasitoid of 'A', it isn't enough to understand this by saying 'A' was a host of 'B'. Of course, there are alternative ways to express the relationship in the predicate to solve this issue, such as 'A' was a parasitoid host of 'B'. We leave it to your discretion which relationships to capture from your original data, but be aware that the semantics are tied up entirely in the predicates, and care should be taken when developing these vocabulary terms.
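The directional reading can be sketched with the eaten-by example from the paragraph above. This assumes schema.sql has been loaded; the identifiers and dataset_id are placeholders, and the corresponding MaterialEntity and Organism records that the UM requires are omitted for brevity.

```sql
-- Assumes schema.sql has been loaded. Identifiers and dataset_id are placeholders;
-- the MaterialEntity and Organism records for these Entities are not shown.
INSERT INTO entity (entity_id, entity_type, dataset_id, entity_name) VALUES
    ('organism-a', 'MATERIAL_ENTITY', 'example-dataset', 'Organism A'),
    ('organism-b', 'MATERIAL_ENTITY', 'example-dataset', 'Organism B');

-- Read as subject -> predicate -> object: A was eaten by B.
INSERT INTO entity_relationship
    (entity_relationship_id, subject_entity_id, entity_relationship_type, object_entity_id)
VALUES
    ('relationship-1', 'organism-a', 'was eaten by', 'organism-b');

-- Optional complementary record with the roles reversed: B ate A.
INSERT INTO entity_relationship
    (entity_relationship_id, subject_entity_id, entity_relationship_type, object_entity_id)
VALUES
    ('relationship-2', 'organism-b', 'ate', 'organism-a');
```

Swapping the subject and object without changing the predicate would silently invert the meaning, which is why the predicate wording deserves care.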
Locations in the UM are used to provide both textual and geospatial context to Events. A Location can be expressed as a denormalized (flattened) construct with the (Darwin Core part of a) geographic classification in the same record, or as a normalized construct with the geography built of parent/child relationships of successive administrative regions. Figure 5 shows the structural relationships between the Location-related tables in the UM.
Georeferences are special assertions of the geospatial interpretation of a Location. Because they are assertions, the model supports zero, one, or multiple georeferences per Location, whether current or historical, accepted or disputed. The UM also supports the designation of zero or one accepted Georeference by populating acceptedGeoreferenceID in the Location table with the georeferenceID of the corresponding accepted Georeference, if any.
GeologicalContext is modeled similarly to a Georeference, but with an acceptedGeologicalContextID in the Location table that MUST match the geologicalContextID of the corresponding GeologicalContext, if any.
Figure 5. Locations, Georeferences and GeologicalContexts in the Unified Model
11. AgentRoles, Assertions, Citations, and Identifiers for Locations, Georeferences, and GeologicalContexts
The "common model" tables associated with the three Location-related tables can be populated at this point. The values for the targetType fields of the "common model" tables MUST be LOCATION, GEOREFERENCE, or GEOLOGICAL_CONTEXT, and their targetIDs MUST correspond to the locationID, georeferenceID, or geologicalContextID, depending on the table they are to be directly related to.
An Event is something that happens within a place during a period of time. The spatial scale and temporal duration of the Event may be as specific or vague as necessary, and may or may not be provided. Events are hierarchical in the UM, with a parent Event containing all of its child Events both spatially and temporally. A project (or any other higher organizational initiative) might be a parent-most Event, the spatial and temporal limits of which encompass all of the Events within it. The next level down might consist of collecting expeditions launched as part of the parent project, for example. Each Event can likewise encompass sub-Events to an arbitrary hierarchical depth, each with the same or distinct Location and temporal bounds as its parent (under the limitation of being contained).
Figure 6. Events in the Unified Model
In the UM, an Occurrence is a subtype of Event in which the activity (observing, collecting, sampling) established the existence of an Organism within a spatiotemporal context, usually with accompanying evidence. The OccurrenceEvidence table serves to connect the Occurrence with digital and/or material evidence, such as images, material samples or whole organisms, and genetic sequences. In collections, an Organism is often effectively the Entity that gets cataloged, with an accompanying list of preparations that represent the "parts" of the Organism that are or were present in the collection. If you do not track "parts" separately with their own characteristics, the Organism record should be the one used for the OccurrenceEvidence. Note that the organismID is not an occurrenceID: the former is an identifier for an Organism (a MaterialEntity), while the latter is an identifier for the Occurrence (an Event), and MaterialEntities are not Events. In the absence of unique (and distinct) identifiers for Organisms and Occurrences, they will have to be generated to populate the UM correctly, as described in the General considerations section.
The Occurrence carries with it the ephemeral characteristics of the Organism at the place and time of the Event. Thus, for example, an Organism that had blood samples taken multiple times over its lifetime may have had a reproductiveCondition of juvenile in an early Occurrence and a reproductiveCondition of adult in a later one.
Each Occurrence has its own occurrenceID. The Occurrences associated with a given Organism can be discovered by the organismID they have in common. Every Occurrence MUST have a corresponding Event record in which the eventID is the same as the occurrenceID, and the eventType for the Event record MUST be OCCURRENCE.
Figure 7. Occurrences and their evidence in the Unified Model
With Locations, Protocols, and Events now in place, the "common model" tables associated with the Event-related tables can be populated. The values for the targetType fields of the "common model" tables MUST be EVENT or OCCURRENCE, and their targetIDs MUST correspond to the eventID or occurrenceID, depending on the table they are to be directly related to. Remember that Assertions about ephemeral characteristics of the Organism should be attached to the Occurrence record rather than to the Organism record.
In the UM, a Taxon can be expressed as a denormalized (flattened) construct with the (Darwin Core part of a) taxonomic classification in the same record, or as a normalized construct with the classification built of parent/child relationships of taxa of successive ranks. Feel free to use the construct that best matches how your data are structured. The table definition for Taxon from schema.sql is:
CREATE TABLE taxon (
  -- common to all
  taxon_id TEXT PRIMARY KEY,
  scientific_name TEXT NOT NULL,
  scientific_name_authorship TEXT,
  name_according_to TEXT,
  taxon_rank TEXT,
  taxon_source TEXT, -- From what taxonomic authority is the information taken
  scientific_name_id TEXT,
  taxon_remarks TEXT,
  -- normalized view
  parent_taxon_id TEXT REFERENCES taxon ON DELETE CASCADE,
  taxonomic_status TEXT,
  -- denormalized
  kingdom TEXT,
  phylum TEXT,
  class TEXT,
  "order" TEXT,
  family TEXT,
  subfamily TEXT,
  genus TEXT,
  subgenus TEXT,
  accepted_scientific_name TEXT -- populated only when scientific name is a synonym
);
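The two constructs can be sketched side by side as follows. This assumes schema.sql has been loaded; the taxon_id values and the taxon_rank strings are placeholders chosen for the example, not a required vocabulary.

```sql
-- Assumes schema.sql has been loaded. taxon_id and taxon_rank values are placeholders.

-- Normalized construct: each record points at its parent taxon.
INSERT INTO taxon (taxon_id, scientific_name, taxon_rank, parent_taxon_id) VALUES
    ('taxon-canidae', 'Canidae', 'family', NULL),
    ('taxon-canis', 'Canis', 'genus', 'taxon-canidae'),
    ('taxon-canis-latrans', 'Canis latrans', 'species', 'taxon-canis');

-- Denormalized construct: the classification is carried in the same record.
INSERT INTO taxon (taxon_id, scientific_name, taxon_rank,
                   kingdom, phylum, class, "order", family, genus)
VALUES ('taxon-canis-latrans-flat', 'Canis latrans', 'species',
        'Animalia', 'Chordata', 'Mammalia', 'Carnivora', 'Canidae', 'Canis');

-- Walk the normalized classification upward from a species to its ancestors.
WITH RECURSIVE lineage AS (
    SELECT taxon_id, scientific_name, taxon_rank, parent_taxon_id
    FROM taxon
    WHERE taxon_id = 'taxon-canis-latrans'
    UNION ALL
    SELECT p.taxon_id, p.scientific_name, p.taxon_rank, p.parent_taxon_id
    FROM taxon p
    JOIN lineage c ON p.taxon_id = c.parent_taxon_id
)
SELECT scientific_name, taxon_rank FROM lineage;
```

Note that "order" must stay double-quoted because ORDER is a reserved word in SQL.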
The "common model" tables associated with Taxon can now be populated. The values for the targetType fields of the "common model" tables MUST be TAXON and their targetIDs MUST correspond to the taxonID of the Taxon they are to be directly related to.
In the UM, an Identification applies to an Organism, though the IdentificationEvidence may consist of any number of MaterialEntities and/or DigitalEntities. An Organism can also have multiple Identifications, though at most one of these can be marked as 'accepted'. The Identification record itself consists of the verbatimIdentification string applied to the Organism and a taxonFormula from a controlled vocabulary that indicates the pattern of taxon names mixed with qualifiers in the verbatimIdentification. This allows for Identifications that are not strictly scientific names, but that can point to all of the real scientific names involved. For example, the hybrid verbatimIdentification Canis latrans x Canis lupus familiaris (see example below) isn't a scientificName, but its component parts Canis latrans and Canis lupus familiaris are. For reference, here is the statement to create the Identification table:
CREATE TABLE identification (
  identification_id TEXT PRIMARY KEY,
  identification_type TEXT NOT NULL,
  taxon_formula TEXT NOT NULL,
  verbatim_identification TEXT,
  type_status TEXT,
  identified_by TEXT,
  identified_by_id TEXT,
  date_identified TEXT,
  identification_references TEXT,
  identification_verification_status TEXT,
  identification_remarks TEXT,
  type_designation_type TEXT,
  type_designated_by TEXT
);
Figure 8. Identifications in the Unified Model
The recommended controlled vocabulary for taxonFormula can be found in the Arctos taxa_formula code table documentation, repeated here for convenience:
A
A / B intergrade
A ?
A aff.
A and B
A cf.
A or B
A ssp.
A x B
A {string}
For the hybrid verbatimIdentification Canis latrans x Canis lupus familiaris, the taxonFormula would be A x B. There are two taxonIDs involved, one for Canis latrans (the A in the taxonFormula) and one for Canis lupus familiaris (the B in the taxonFormula). We would expect to find Taxon records for these two taxa, and their taxonIDs would be used in two records of TaxonIdentification. Both TaxonIdentification records would include the identificationID for the Identification record that has verbatimIdentification Canis latrans x Canis lupus familiaris and taxonFormula A x B. The TaxonIdentification record corresponding to Canis latrans would have the taxonID for Canis latrans, and its taxonOrder would be 1 (because it is the first taxon that appears in the formula). The TaxonIdentification record corresponding to Canis lupus familiaris would have the taxonID for Canis lupus familiaris, and its taxonOrder would be 2 (because it is the second taxon that appears in the formula).
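The hybrid example can be sketched as the following set of records. This assumes schema.sql has been loaded. The identifiers and the identification_type value EXAMPLE_TYPE are placeholders. The taxon_identification table is not shown in this document, so its table and column names here are an assumption inferred from the field names identificationID, taxonID, and taxonOrder; check them against schema.sql before use.

```sql
-- Assumes schema.sql has been loaded. Identifiers and 'EXAMPLE_TYPE' are placeholders.
INSERT INTO taxon (taxon_id, scientific_name, taxon_rank) VALUES
    ('taxon-a', 'Canis latrans', 'species'),
    ('taxon-b', 'Canis lupus familiaris', 'subspecies');

INSERT INTO identification
    (identification_id, identification_type, taxon_formula, verbatim_identification)
VALUES
    ('identification-1', 'EXAMPLE_TYPE', 'A x B',
     'Canis latrans x Canis lupus familiaris');

-- ASSUMPTION: table and column names below are inferred from the field names
-- in the text (identificationID, taxonID, taxonOrder), not copied from schema.sql.
INSERT INTO taxon_identification (identification_id, taxon_id, taxon_order) VALUES
    ('identification-1', 'taxon-a', 1),  -- A: first taxon in the formula
    ('identification-1', 'taxon-b', 2);  -- B: second taxon in the formula
```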
The final modeling step is to populate the "common model" tables associated with Identification. The values for the targetType fields of the "common model" tables MUST be IDENTIFICATION and their targetIDs MUST correspond to the identificationID of the Identification they are to be directly related to.