Maintaining authority files
Authority files are necessary to implement separate search indexes for entities such as works, people or places.
These entities are mentioned throughout the manuscript TEI files, and users will be able to find them by searching the text of those descriptions. But to control which entities are discoverable via browsing and dedicated search indexes, lists must be maintained in separate XML files. Identifiers, which must be unique within the catalogue, link entries in those authority files to the corresponding elements in the manuscript descriptions (e.g. author, persName, placeName, settlement, etc.) and form part of the URLs of the pages that will be created for them on the web site.
The controlled vocabulary aspect of an authority file allows variations to be indexed as synonyms. For example, if the name of an author of a work in a manuscript is recorded in its native-language form, then the authority file can provide Latinized and other versions. Place names can be disambiguated and historical spellings added. One version is chosen for display on the page that will be created for each entity on the web site, but it will not affect what is displayed in the manuscript description.
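As a rough illustration of how variant names could be indexed as synonyms, the following Python sketch reads a small person list and maps every recorded name form to the entry's identifier. The sample entry, the ID format, and the use of a type attribute with the values "display" and "variant" on persName are assumptions made for this example, not rules stated on this page, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical authority entry. The type="display"/"variant" convention
# is an assumption for this sketch, not something this page prescribes.
SAMPLE = """<listPerson xmlns="http://www.tei-c.org/ns/1.0">
  <person xml:id="person_0001">
    <persName type="display">Avicenna</persName>
    <persName type="variant">Ibn Sina</persName>
    <persName type="variant">Abu Ali Sina</persName>
  </person>
</listPerson>"""


def build_synonym_index(xml_text):
    """Return (synonyms, display): every name form mapped to its entry ID,
    and each entry ID mapped to the single name chosen for display."""
    root = ET.fromstring(xml_text)
    synonyms = {}
    display = {}
    for person in root.iter(f"{{{TEI_NS}}}person"):
        entry_id = person.get(XML_ID)
        for name in person.iter(f"{{{TEI_NS}}}persName"):
            text = "".join(name.itertext()).strip()
            if not text:
                continue
            # Case-folded keys so a search matches regardless of capitalisation.
            synonyms[text.casefold()] = entry_id
            if name.get("type") == "display":
                display[entry_id] = text
    return synonyms, display


synonyms, display = build_synonym_index(SAMPLE)
print(display[synonyms["ibn sina"]])
```

A search for any of the variant forms resolves to the same entry, while only the display form is shown on the entry's page.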
Additional information can be added, such as geographical coordinates for places, dates of birth and death for people, and references to other resources such as VIAF or Library of Congress vocabularies. These are not currently used by the web site, but features could be built on top of such information in the future.
Authority files should be kept in the root folder of your catalogue's GitHub repository (not in either the collections or processing subfolders) and named appropriately (e.g. "works.xml" for a works authority file).
The contents should be TEI documents, although not using the same customized schema as for manuscript descriptions. The following template can be used to start the file:
TODO
The entries added within it depend on the type of entity:
- Works should be represented by bibl elements within a listBibl parent element.
  - TODO: More rules
- People should be represented by person elements within a listPerson parent element.
  - TODO: More rules
- Places should be represented by place elements within a listPlace parent element.
  - Use a type attribute to indicate whether each is a 'settlement', a 'region', or a 'country'
  - TODO: More rules
- Organizations should be represented by org elements within a listOrg parent element.
  - TODO: More rules
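The rule for place entries can be checked mechanically. The Python sketch below parses a listPlace fragment and reports any place whose type attribute is missing or is not one of the three values named above. The sample entries and ID format are invented for illustration, the function name is hypothetical, and the fragment is not the full file template.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# The three values named in the list above.
ALLOWED_PLACE_TYPES = {"settlement", "region", "country"}

# Hypothetical fragment of a places authority file (IDs are made up).
SAMPLE = """<listPlace xmlns="http://www.tei-c.org/ns/1.0">
  <place xml:id="place_0001" type="settlement"><placeName>Oxford</placeName></place>
  <place xml:id="place_0002" type="country"><placeName>France</placeName></place>
  <place xml:id="place_0003" type="city"><placeName>Paris</placeName></place>
  <place xml:id="place_0004"><placeName>Tuscany</placeName></place>
</listPlace>"""


def places_with_bad_type(xml_text):
    """Return the IDs of place entries whose type is missing or not allowed."""
    root = ET.fromstring(xml_text)
    return [
        place.get(XML_ID)
        for place in root.iter(f"{{{TEI_NS}}}place")
        if place.get("type") not in ALLOWED_PLACE_TYPES
    ]


print(places_with_bad_type(SAMPLE))
```

Here the third entry uses a value outside the list and the fourth has no type at all, so both are reported.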
In all cases, an xml:id attribute must be specified, containing a unique ID, which cross-references the key attributes of the corresponding elements in manuscript descriptions.
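A minimal Python sketch of that cross-reference, assuming hypothetical sample data and ID formats: it collects the xml:id values in an authority fragment, reports any that are duplicated, and lists key values in a manuscript description fragment that match no authority entry. This only illustrates the relationship between xml:id and key. It is not the processing script used to build the web site, and the function names are invented.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical authority fragment; person_0002 is deliberately duplicated.
AUTHORITY = """<listPerson xmlns="http://www.tei-c.org/ns/1.0">
  <person xml:id="person_0001"><persName>Avicenna</persName></person>
  <person xml:id="person_0002"><persName>Galen</persName></person>
  <person xml:id="person_0002"><persName>Galenus</persName></person>
</listPerson>"""

# Hypothetical manuscript description fragment; person_0099 has no entry.
MANUSCRIPT = """<msItem xmlns="http://www.tei-c.org/ns/1.0">
  <author key="person_0001">Avicenna</author>
  <note>Copied by <persName key="person_0099">an unnamed scribe</persName></note>
</msItem>"""


def authority_ids(xml_text):
    """Return every xml:id value in the authority file, in document order."""
    root = ET.fromstring(xml_text)
    return [el.get(XML_ID) for el in root.iter() if el.get(XML_ID) is not None]


def duplicate_ids(ids):
    """Return the IDs that occur more than once (they must be unique)."""
    return sorted(i for i, count in Counter(ids).items() if count > 1)


def unresolved_keys(manuscript_xml, ids):
    """Return key values in the description that match no authority ID."""
    known = set(ids)
    root = ET.fromstring(manuscript_xml)
    keys = {el.get("key") for el in root.iter() if el.get("key") is not None}
    return sorted(keys - known)


ids = authority_ids(AUTHORITY)
print("duplicates:", duplicate_ids(ids))
print("unresolved:", unresolved_keys(MANUSCRIPT, ids))
```

Running a check like this before committing can catch both kinds of problem early, since a non-validating XML parser will not reject duplicate xml:id values on its own.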
If multiple people are going to be editing the authority files, conflicts are likely. TODO: Move stuff about resolving them here?
When you have decided which authority files you want to maintain, and begun to create them, raise an issue in your repository on GitHub to request the processing scripts required to read the authority files, cross-reference them with the manuscript descriptions, and build indexes to enable browsing on the web site.