Include structured citations in EML #4

Open
mdoering opened this issue Apr 27, 2023 · 12 comments

@mdoering
Member

mdoering commented Apr 27, 2023

For the custom GBIF extension of EML, it would be good to have a structured citation in addition to the citation string and identifier. That applies to both the main citation and the bibliography.

It will allow the registry to search and facet on journals and publishers, which is important to journals that participate in publishing treatment articles. For example:

<additionalMetadata>
	<metadata>
		<gbif>
			<citation 
				identifier="http://doi.org/10.5886/zw3aqw"
				type="Dataset"
				author="Brouillet L"
				title="Database of Vascular Plants of Canada (VASCAN)"
				version="37.12"
				issued="2023"
				publisher="Université de Montréal Biodiversity Centre"
				accessed="2023-04-27"
				url="https://doi.org/10.5886/zw3aqw"
			>Brouillet L (2023). Database of Vascular Plants of Canada (VASCAN). Version 37.12. Université de Montréal Biodiversity Centre. Checklist dataset https://doi.org/10.5886/zw3aqw accessed via GBIF.org on 2023-04-27.</citation>
			<bibliography>
				<citation 
					identifier="http://doi.org/10.1111/j.1095-8339.2009.00996.x"
					type="ARTICLE-JOURNAL"
					author="Angiosperm Phylogeny Group"
					title="An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG III"
					containerTitle="Botanical Journal of the Linnean Society"
					issued="2009"
					volume="161"
					page="105–121"
				>Angiosperm Phylogeny Group (2009) An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG III. Botanical Journal of the Linnean Society 161: 105–121. doi: 10.1111/j.1095-8339.2009.00996.x</citation>
				<citation identifier="http://doi.org/10.1111/j.1095-8339.2009.01002.x">Chase MW, Reveal JL (2009) A phylogenetic classification of land plants to accompany APG III. Botanical Journal of the Linnean Society 161 (2): 122–127. doi: 10.1111/j.1095-8339.2009.01002.x</citation>
			</bibliography>
		</gbif>
	</metadata>
</additionalMetadata>

List of main fields corresponding to CSL-JSON:

  • type: # CSL type, e.g. ARTICLE-JOURNAL, BOOK, CHAPTER, DATASET, WEBPAGE. See https://aurimasv.github.io/z2csl/typeMap.xml for a mapping of CSL types to field sets
  • doi: # a DOI
  • author: # author list
  • editor: # editor list
  • title: # primary title of the item
  • containerAuthor: # author(s) of the container holding the item (e.g. the book author for a book chapter). type=NAME[]
  • containerTitle: # title of the container holding the item (e.g. the book title for a book chapter, the journal title for a journal article)
  • issued: # date the item was issued/published in possibly truncated ISO format, e.g. 1998, 1998-05 or 1998-05-21. type=DATE
  • accessed: # date the item has been accessed. type=DATE
  • collectionTitle: # title of the collection holding the item (e.g. the series title for a book)
  • collectionEditor: # editor(s) of the collection holding the item (e.g. the series editor for a book). type=NAME[]
  • volume: # (container) volume holding the item (e.g. “2” when citing a chapter from book volume 2). type=NUMBER
  • issue: # (container) issue holding the item (e.g. “5” when citing a journal article from journal volume 2, issue 5). type=NUMBER
  • edition: # (container) edition holding the item (e.g. “3” when citing a chapter in the third edition of a book). type=NUMBER
  • page: # range of pages the item (e.g. a journal article) covers in a container (e.g. a journal issue)
  • publisher: # publisher
  • publisherPlace: # geographic location of the publisher
  • version: # version of the dataset/source
  • isbn: # International Standard Book Number
  • issn: # International Standard Serial Number
  • url: # link to webpage for electronic resources
  • note: # (short) inline note giving additional item details (e.g. a concise summary or commentary)
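
For illustration, the field list above could be declared as attributes in an XSD roughly as follows. This is only a hypothetical sketch, not the actual GBIF EML profile schema: the type names `citationType` and `truncatedDate` are invented here, `identifier` is taken from the example rather than the list, and typing every field other than the two dates as a plain string is an assumption.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical type: a possibly truncated ISO date such as 1998, 1998-05 or 1998-05-21 -->
  <xs:simpleType name="truncatedDate">
    <xs:union memberTypes="xs:gYear xs:gYearMonth xs:date"/>
  </xs:simpleType>

  <!-- Hypothetical type: the citation string as text content, structured fields as optional attributes -->
  <xs:complexType name="citationType">
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="identifier" type="xs:string"/>
        <xs:attribute name="type" type="xs:string"/>
        <xs:attribute name="doi" type="xs:string"/>
        <xs:attribute name="author" type="xs:string"/>
        <xs:attribute name="editor" type="xs:string"/>
        <xs:attribute name="title" type="xs:string"/>
        <xs:attribute name="containerAuthor" type="xs:string"/>
        <xs:attribute name="containerTitle" type="xs:string"/>
        <xs:attribute name="issued" type="truncatedDate"/>
        <xs:attribute name="accessed" type="truncatedDate"/>
        <xs:attribute name="collectionTitle" type="xs:string"/>
        <xs:attribute name="collectionEditor" type="xs:string"/>
        <!-- Kept as strings here (an assumption) so that values such as page ranges are not rejected -->
        <xs:attribute name="volume" type="xs:string"/>
        <xs:attribute name="issue" type="xs:string"/>
        <xs:attribute name="edition" type="xs:string"/>
        <xs:attribute name="page" type="xs:string"/>
        <xs:attribute name="publisher" type="xs:string"/>
        <xs:attribute name="publisherPlace" type="xs:string"/>
        <xs:attribute name="version" type="xs:string"/>
        <xs:attribute name="isbn" type="xs:string"/>
        <xs:attribute name="issn" type="xs:string"/>
        <xs:attribute name="url" type="xs:anyURI"/>
        <xs:attribute name="note" type="xs:string"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

</xs:schema>
```

All attributes are optional in this sketch, so a citation that carries only an identifier and the citation string would still validate.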
@mdoering
Member Author

This is similar to the source field in the ColDP metadata.
@thomasstjerne @timrobertson100 @ahahn-gbif @gsautter does that represent our discussion well?

@gsautter

gsautter commented May 2, 2023

@mdoering please have a look at the eml.xml in tb.plazi.org/GgServer/dwca/995CFFC54F0AB7277923CD0E036BB046.zip ... is that what you had in mind?

@mdoering
Member Author

mdoering commented May 2, 2023

Exactly, yes!

<citation author="Mondaca, José; Rebolledo, Guido; Vitali, Francesco" containerTitle="Insecta Mundi" doi="http://doi.org/10.5281/zenodo.7887620" issn="1942-1354" issue="979" issued="2023" page="1-5" title="Stictoleptura cordigera (Füssli, 1775) (Cerambycidae: Lepturinae: Lepturini), a new alien longhorn beetle introduced in Chile" volume="2023">Mondaca, José, Rebolledo, Guido, Vitali, Francesco (2023): Stictoleptura cordigera (Füssli, 1775) (Cerambycidae: Lepturinae: Lepturini), a new alien longhorn beetle introduced in Chile. Insecta Mundi 2023 (979): 1-5, DOI: http://doi.org/10.5281/zenodo.7887620</citation>

The only thing I wonder about is whether the main, single citation element should be used for this or one under bibliography. The main one is the dataset citation (i.e. type="Dataset"), so it's odd if it differs from the other metadata.

@gsautter

gsautter commented May 2, 2023

> The only thing I wonder about is whether the main, single citation element should be used for this or one under bibliography. The main one is the dataset citation (i.e. type="Dataset"), so it's odd if it differs from the other metadata.

Well, considering the fact that the citation element has always contained the citation of the source publication, I think that's a rather natural place to put the detail attributes as well ... on top of the fact that we've never used the bibliography element at all in eml.xml so far ... what sorts of alternatives do you have in mind?

@mdoering
Member Author

mdoering commented May 2, 2023

AFAIK GBIF currently ignores the citation element and produces its own citation from the rest of the metadata.
From GBIF:

Plazi.org taxonomic treatments database. Stictoleptura cordigera (Füssli, 1775) (Cerambycidae: Lepturinae: Lepturini), a new alien longhorn beetle introduced in Chile. Checklist dataset https://doi.org/10.15468/nkfvu8 accessed via GBIF.org on 2023-05-02.

Your provided citation:

Mondaca, José, Rebolledo, Guido, Vitali, Francesco (2023): Stictoleptura cordigera (Füssli, 1775) (Cerambycidae: Lepturinae: Lepturini), a new alien longhorn beetle introduced in Chile. Insecta Mundi 2023 (979): 1-5, DOI: http://doi.org/10.5281/zenodo.7887620

Not only do the authors differ, but so do the DOI and other parts, because GBIF tries to produce a citation for the dataset rather than for the article.
The ColDP metadata.json you kindly produce for ChecklistBank also includes the structured citation of the article as an entry in the source list, which corresponds to the bibliography list in EML.

I would be interested to hear from @ahahn-gbif and @timrobertson100, but I would recommend placing the pure & structured article citation inside the bibliography section of the EML and actually removing the other one. Basically this means moving the citation element from /eml/additionalMetadata/metadata/gbif/citation to /eml/additionalMetadata/metadata/gbif/bibliography/citation but keep it otherwise as it is now!
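
As a sketch of the suggested layout, using the citation element quoted earlier in this thread unchanged and only nesting it under bibliography (the surrounding wrapper elements follow the example in the opening post; anything else in the real eml.xml is omitted here):

```xml
<additionalMetadata>
	<metadata>
		<gbif>
			<!-- No direct gbif/citation child any more; the article citation lives in the bibliography -->
			<bibliography>
				<citation author="Mondaca, José; Rebolledo, Guido; Vitali, Francesco" containerTitle="Insecta Mundi" doi="http://doi.org/10.5281/zenodo.7887620" issn="1942-1354" issue="979" issued="2023" page="1-5" title="Stictoleptura cordigera (Füssli, 1775) (Cerambycidae: Lepturinae: Lepturini), a new alien longhorn beetle introduced in Chile" volume="2023">Mondaca, José, Rebolledo, Guido, Vitali, Francesco (2023): Stictoleptura cordigera (Füssli, 1775) (Cerambycidae: Lepturinae: Lepturini), a new alien longhorn beetle introduced in Chile. Insecta Mundi 2023 (979): 1-5, DOI: http://doi.org/10.5281/zenodo.7887620</citation>
			</bibliography>
		</gbif>
	</metadata>
</additionalMetadata>
```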

@gsautter

gsautter commented May 2, 2023

> Basically this means moving the citation element from /eml/additionalMetadata/metadata/gbif/citation to /eml/additionalMetadata/metadata/gbif/bibliography/citation but keep it otherwise as it is now!

Easy enough, merely an XML edit (in the eml.xml template) ... have another look at the eml.xml in http://tb.plazi.org/GgServer/dwca/995CFFC54F0AB7277923CD0E036BB046.zip ... is that what you suggested?

@mdoering
Member Author

mdoering commented May 2, 2023

yes, 100% that!

@gsautter

gsautter commented May 2, 2023

OK, thanks ... all DwC_As from Plazi should look like that from now on (unless I change back the eml.xml template).

@gsautter

gsautter commented May 3, 2023

@mdoering the post kind of implies you need something on top of the citation attributes I added in the past couple of days ... where exactly should those extra attributes go, and what should they be populated with?

@mdoering
Member Author

mdoering commented May 3, 2023

No, this is just to get what we discussed into a) the GBIF EML profile XSD b) let the GBIF registry read and store it and c) allow the dataset search to use some of that, e.g. the journal, publisher or issue date.

@gsautter

gsautter commented May 3, 2023

> No, this is just to get what we discussed into a) the GBIF EML profile XSD b) let the GBIF registry read and store it and c) allow the dataset search to use some of that, e.g. the journal, publisher or issue date.

Ah, OK ... take it this one is done from my end, then?

@mdoering
Member Author

mdoering commented May 3, 2023

> > No, this is just to get what we discussed into a) the GBIF EML profile XSD b) let the GBIF registry read and store it and c) allow the dataset search to use some of that, e.g. the journal, publisher or issue date.
>
> Ah, OK ... take it this one is done from my end, then?

YES!
