Add documentation of GBIF required metadata #100

Open
CecSve opened this issue Oct 4, 2024 · 6 comments

@CecSve
Contributor

CecSve commented Oct 4, 2024

Based on the discussion here gbif/doc-freshwater-data-publishing-guide#20

It is unclear to publishers which metadata is required and which is not. There may be differences between EML (or the GBIF Metadata Profile) and the IPT's metadata editor interface that should be resolved.

@CecSve CecSve added the Data Publishing Issues relating to enhancing the 'Data Publishing' section of the technical documentation label Oct 4, 2024
@CecSve
Contributor Author

CecSve commented Oct 14, 2024

Perhaps a table specifying what is required by the EML profile and what is required by the IPT metadata editor, similar to this table. Additional information on which elements are EML-derived and which are GBIF-specific would be nice as well, similar to the logo we add in the definitions of this table to show whether a term is a GBIF term or a data standard term.

@MattBlissett
Member

We need to avoid publishing conflicting information.

There are terms with a strict technical requirement enforced by the data format — an EML document is invalid without them. These are specified in the XSD schema definition in the case of EML / GBIF's EML profile.

There are additional terms where the requirement is enforced by our API or other processing, e.g. licence.

And there are terms where we write that they are required, but there's no technical enforcement of this: https://www.gbif.org/data-quality-requirements-occurrences — occurrenceID, basisOfRecord, scientificName, eventDate.
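
As a rough sketch of how a single table could keep these three levels apart: the `Enforcement` enum, the `TERMS` mapping and `terms_by_level` are illustrative names only, not part of any GBIF codebase, and the entries are just the examples named above. Schema-level terms are left out here because they would be read from the XSD rather than listed by hand.

```python
from enum import Enum


class Enforcement(Enum):
    """How a 'required' term is actually enforced (illustrative names)."""
    SCHEMA = "document is invalid without it (XSD)"
    PROCESSING = "enforced by the API or other processing"
    DOCUMENTED = "written as required, no technical enforcement"


# Only the examples mentioned in this thread; nothing here is authoritative.
TERMS = {
    "licence": Enforcement.PROCESSING,
    "occurrenceID": Enforcement.DOCUMENTED,
    "basisOfRecord": Enforcement.DOCUMENTED,
    "scientificName": Enforcement.DOCUMENTED,
    "eventDate": Enforcement.DOCUMENTED,
}


def terms_by_level(level):
    """Return the sorted term names recorded at the given enforcement level."""
    return sorted(name for name, lvl in TERMS.items() if lvl is level)


if __name__ == "__main__":
    for level in Enforcement:
        print(f"{level.name}: {', '.join(terms_by_level(level)) or '(none listed)'}")
```

Keeping the level as an explicit column, rather than a single "required" flag, is one way to avoid the conflicting statements mentioned above.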

@CecSve CecSve self-assigned this Oct 14, 2024
@MattBlissett
Member

The GBIF Metadata Profile document could be migrated into the tech docs: https://ipt.gbif.org/manual/en/ipt/latest/gbif-metadata-profile

As it's in the IPT at present, it has complete Spanish and Japanese translations and we don't want to lose these.

@CecSve CecSve added this to the 2024 updates milestone Oct 15, 2024
@CecSve
Contributor Author

CecSve commented Oct 23, 2024

I will take relevant sections of the metadata profile guide and move them to a dedicated page on metadata in tech docs. I will first commit entire sections as is, then create new commits for edits, so it is possible to add translations from the original section.

The removed sections in the IPT manual will stay in during the transition phase and contain links to the relevant section in tech docs.

@CecSve
Contributor Author

CecSve commented Oct 24, 2024

I don't think it is possible with the current format we have for the eml-profile, but it would be nice if we could harvest the definitions from the schema itself instead of copy-pasting them. Would we need to generate XML schema documentation to transfer the descriptions in the current GBIF EML Profile?

@MattBlissett
Member

For the Darwin Core Archive core/extension XML I used a Python script to generate a snippet of AsciiDoctor: https://github.com/gbif/tech-docs/blob/main/en/data-use/modules/ROOT/partials/download-terms-tables.py
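
Something similar for the EML profile might look like the minimal sketch below. It is not the linked script: `schema_rows`, `asciidoc_table` and the `SAMPLE` schema are made up for illustration, and it assumes the definitions live in `xs:annotation/xs:documentation` elements inside the XSD. It does not resolve `ref=` elements, named types or choice groups, which a real schema would need.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical stand-in schema, not the real GBIF EML profile XSD.
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="dataset">
    <xs:complexType><xs:sequence>
      <xs:element name="title" type="xs:string">
        <xs:annotation><xs:documentation>Name of the resource.</xs:documentation></xs:annotation>
      </xs:element>
      <xs:element name="purpose" type="xs:string" minOccurs="0"/>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>"""


def schema_rows(xsd_text):
    """Yield (name, required, description) for each named xs:element."""
    root = ET.fromstring(xsd_text)
    for el in root.iter(XS + "element"):
        name = el.get("name")
        if name is None:  # ref="..." elements carry no name; skipped here
            continue
        # minOccurs defaults to 1 in XSD, so a missing attribute means required
        required = el.get("minOccurs", "1") != "0"
        doc = el.find(XS + "annotation/" + XS + "documentation")
        # Collapse whitespace so multi-line descriptions fit one table cell
        text = " ".join((doc.text or "").split()) if doc is not None else ""
        yield name, required, text


def asciidoc_table(rows):
    """Render rows as an AsciiDoc table snippet (no escaping of '|' in text)."""
    lines = ['[cols="1,1,3",options="header"]', "|===",
             "|Element |Required |Description"]
    for name, required, text in rows:
        lines += ["", f"|`{name}`", "|yes" if required else "|no", f"|{text}"]
    lines.append("|===")
    return "\n".join(lines)


if __name__ == "__main__":
    print(asciidoc_table(schema_rows(SAMPLE)))
```

This only reports what the schema itself enforces; requirements enforced by the API or stated only in documentation would still have to be added from another source.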
