From d81ea99a048e6731b417fd6ee1d50a09536c4dc9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Toma=C5=BE=20Erjavec?=
Date: Tue, 22 Aug 2023 13:53:48 +0200
Subject: [PATCH] Add state/@type to CHES variables and values.

---
 Schema/ParlaMint-schemaSpecs.odd.xml | 2 +
 Schema/ParlaMint.odd.rnc | 8 +-
 Schema/ParlaMint.odd.rng | 6 +-
 Schema/ParlaMint.odd.sch | 4 +-
 Schema/ParlaMint.odd.xml | 38 +-
 Schema/ParlaMint.rnc | 8 +-
 Schema/ParlaMint.rng | 5 +
 Schema/compile.log | 2 +
 Scripts/ches-tsv2tei.xsl | 12 +-
 TEI/ParlaMint-schemaSpecs.editing.odd.xml | 2 +
 TEI/ParlaMint-schemaSpecs.odd.xml | 2 +
 TEI/ParlaMint.odd.rnc | 8 +-
 TEI/ParlaMint.odd.rng | 6 +-
 TEI/ParlaMint.odd.sch | 4 +-
 TEI/ParlaMint.odd.xml | 38 +-
 docs/index.html | 981 +++++++++++-----------
 16 files changed, 593 insertions(+), 533 deletions(-)

diff --git a/Schema/ParlaMint-schemaSpecs.odd.xml b/Schema/ParlaMint-schemaSpecs.odd.xml
index 14e061539..b187ebef7 100644
--- a/Schema/ParlaMint-schemaSpecs.odd.xml
+++ b/Schema/ParlaMint-schemaSpecs.odd.xml
@@ -5826,6 +5826,8 @@
+
+
diff --git a/Schema/ParlaMint.odd.rnc b/Schema/ParlaMint.odd.rnc
index c5ca527a3..d84bb7287 100644
--- a/Schema/ParlaMint.odd.rnc
+++ b/Schema/ParlaMint.odd.rnc
@@ -11,7 +11,7 @@ namespace xi = "http://www.w3.org/2001/XInclude"
 namespace xlink = "http://www.w3.org/1999/xlink"
 namespace xsl = "http://www.w3.org/1999/XSL/Transform"
-# Schema generated from ODD source 2023-08-20T17:06:32Z. 2023-08-20.
+# Schema generated from ODD source 2023-08-22T11:41:50Z. 2023-08-22.
 # TEI Edition: Version 4.6.0a. Last updated on
 # 5th January 2023, revision 9074b9038
 # TEI Edition Location: https://www.tei-c.org/Vault/P5/Version 4.6.0a./
@@ -2623,6 +2623,12 @@ tei_state =
 |
 ##
 "CHES"
+ |
+ ##
+ "variable"
+ |
+ ##
+ "value"
 }?,
 empty
 }
diff --git a/Schema/ParlaMint.odd.rng b/Schema/ParlaMint.odd.rng
index d2a4e485e..7038971f7 100644
--- a/Schema/ParlaMint.odd.rng
+++ b/Schema/ParlaMint.odd.rng
@@ -5,7 +5,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink"
 datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes" ns="http://www.tei-c.org/ns/1.0">
+
-
+
diff --git a/Schema/ParlaMint.odd.xml b/Schema/ParlaMint.odd.xml
index 7d47b0d74..f10c03897 100644
--- a/Schema/ParlaMint.odd.xml
+++ b/Schema/ParlaMint.odd.xml
@@ -25,7 +25,7 @@ CLARIN
- 2023-08-20
+ 2023-08-22

This file is freely available and you are hereby authorised to copy, modify, and redistribute it in any way without further reference or permissions.

@@ -48,7 +48,7 @@
- Tomaž Erjavec: Change description of org/state.
+ Tomaž Erjavec: Change description of org/state.
 Tomaž Erjavec: Add org/@role='federatedState'.
 Tomaž Erjavec: Start work on section for MT.
 Tomaž Erjavec: allow orgName in affiliation.
@@ -71,7 +71,7 @@ corpora
- 2023-08-20
+ 2023-08-22

@@ -2037,27 +2037,29 @@ top-level state element gives the type of the state, i.e. CHES and the URL of the CSV source for the information. Its label gives the abbreviation of the political party name in CHES, which can, and often does, differ from its ParlaMint abbreviation.
- Each subordinate state then encodes one CHES variable, which is given, via the ana attribute,
- as the reference to the appropriate category defined in the CHES taxonomy. Finally, as CHES gives
- the variables according to years, the third level of state gives the time periods of the variable together
- with its numeric value in the n attribute, as illustrated in the example below:
+ Each subordinate state (of type variable) then encodes one CHES variable, which is
+ given, via the ana attribute, as the reference to the appropriate category defined in the CHES
+ taxonomy. Finally, as CHES gives
+ variables according to years, the third level of state (of type value) gives the
+ time periods of the variable together with its numeric value in the n attribute, as illustrated in the
+ example below:
-
-
-
-
-
+
+
+
+
+
-
-
-
-
-
+
+
+
+
+
 ...
diff --git a/Schema/ParlaMint.rnc b/Schema/ParlaMint.rnc
index 836e7a0b6..30a7c3871 100644
--- a/Schema/ParlaMint.rnc
+++ b/Schema/ParlaMint.rnc
@@ -382,8 +382,14 @@ stateType.val = ## Data from Chapel Hill Survey. (
- ## Data on political orientation.
+ ## A CHES variable.
 "CHES" |
+ ## A value of a CHES variable.
+ "variable"
+ |
+ ## Data on political orientation.
+ "value"
 | ## Data from Wikipedia. "politicalOrientation"
diff --git a/Schema/ParlaMint.rng b/Schema/ParlaMint.rng
index 126a270c4..d25a1513f 100644
--- a/Schema/ParlaMint.rng
+++ b/Schema/ParlaMint.rng
@@ -823,6 +823,11 @@ Data from Chapel Hill Survey. CHES
+ A CHES variable.
+ variable
+ A value of a CHES variable.
+ value
+ Data on political orientation.
 politicalOrientation Data from Wikipedia.
diff --git a/Schema/compile.log b/Schema/compile.log
index 960deecf4..7886fd294 100644
--- a/Schema/compile.log
+++ b/Schema/compile.log
@@ -7,3 +7,5 @@ java -jar /usr/share/java/trang.jar ParlaMint-listOrg.rng ParlaMint-listOrg.rnc
 java -jar /usr/share/java/trang.jar ParlaMint-listPerson.rng ParlaMint-listPerson.rnc
 java -jar /usr/share/java/trang.jar ParlaMint-taxonomy.rng ParlaMint-taxonomy.rnc
 make[1]: Leaving directory '/home/project/corpora/Parla/ParlaMint/ParlaMint/Schema'
+5.07user 0.62system 0:02.47elapsed 230%CPU (0avgtext+0avgdata 68200maxresident)k
+1664inputs+832outputs (0major+71739minor)pagefaults 0swaps
diff --git a/Scripts/ches-tsv2tei.xsl b/Scripts/ches-tsv2tei.xsl
index 0a198b821..9ca63fc65 100644
--- a/Scripts/ches-tsv2tei.xsl
+++ b/Scripts/ches-tsv2tei.xsl
@@ -30,10 +30,10 @@
-
-
-
-
+
+
+
+
@@ -226,13 +226,13 @@
-
+
-
+
diff --git a/TEI/ParlaMint-schemaSpecs.editing.odd.xml b/TEI/ParlaMint-schemaSpecs.editing.odd.xml
index 1442bc852..87977360b 100644
--- a/TEI/ParlaMint-schemaSpecs.editing.odd.xml
+++ b/TEI/ParlaMint-schemaSpecs.editing.odd.xml
@@ -3526,6 +3526,8 @@
+
+
diff --git a/TEI/ParlaMint-schemaSpecs.odd.xml b/TEI/ParlaMint-schemaSpecs.odd.xml
index 14e061539..b187ebef7 100644
--- a/TEI/ParlaMint-schemaSpecs.odd.xml
+++ b/TEI/ParlaMint-schemaSpecs.odd.xml
@@ -5826,6 +5826,8 @@
+
+
diff --git a/TEI/ParlaMint.odd.rnc b/TEI/ParlaMint.odd.rnc
index c5ca527a3..d84bb7287 100644
--- a/TEI/ParlaMint.odd.rnc
+++ b/TEI/ParlaMint.odd.rnc
@@ -11,7 +11,7 @@ namespace xi = "http://www.w3.org/2001/XInclude"
 namespace xlink = "http://www.w3.org/1999/xlink"
 namespace xsl = "http://www.w3.org/1999/XSL/Transform"
-# Schema generated from ODD source 2023-08-20T17:06:32Z. 2023-08-20.
+# Schema generated from ODD source 2023-08-22T11:41:50Z. 2023-08-22.
 # TEI Edition: Version 4.6.0a. Last updated on
 # 5th January 2023, revision 9074b9038
 # TEI Edition Location: https://www.tei-c.org/Vault/P5/Version 4.6.0a./
@@ -2623,6 +2623,12 @@ tei_state =
 |
 ##
 "CHES"
+ |
+ ##
+ "variable"
+ |
+ ##
+ "value"
 }?,
 empty
 }
diff --git a/TEI/ParlaMint.odd.rng b/TEI/ParlaMint.odd.rng
index d2a4e485e..7038971f7 100644
--- a/TEI/ParlaMint.odd.rng
+++ b/TEI/ParlaMint.odd.rng
@@ -5,7 +5,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink"
 datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes" ns="http://www.tei-c.org/ns/1.0">
+
-
+
diff --git a/TEI/ParlaMint.odd.xml b/TEI/ParlaMint.odd.xml
index 7d47b0d74..f10c03897 100644
--- a/TEI/ParlaMint.odd.xml
+++ b/TEI/ParlaMint.odd.xml
@@ -25,7 +25,7 @@ CLARIN
- 2023-08-20
+ 2023-08-22

This file is freely available and you are hereby authorised to copy, modify, and redistribute it in any way without further reference or permissions.

@@ -48,7 +48,7 @@
- Tomaž Erjavec: Change description of org/state.
+ Tomaž Erjavec: Change description of org/state.
 Tomaž Erjavec: Add org/@role='federatedState'.
 Tomaž Erjavec: Start work on section for MT.
 Tomaž Erjavec: allow orgName in affiliation.
@@ -71,7 +71,7 @@ corpora
- 2023-08-20
+ 2023-08-22

@@ -2037,27 +2037,29 @@ top-level state element gives the type of the state, i.e. CHES and the URL of the CSV source for the information. Its label gives the abbreviation of the political party name in CHES, which can, and often does, differ from its ParlaMint abbreviation.
- Each subordinate state then encodes one CHES variable, which is given, via the ana attribute,
- as the reference to the appropriate category defined in the CHES taxonomy. Finally, as CHES gives
- the variables according to years, the third level of state gives the time periods of the variable together
- with its numeric value in the n attribute, as illustrated in the example below:
+ Each subordinate state (of type variable) then encodes one CHES variable, which is
+ given, via the ana attribute, as the reference to the appropriate category defined in the CHES
+ taxonomy. Finally, as CHES gives
+ variables according to years, the third level of state (of type value) gives the
+ time periods of the variable together with its numeric value in the n attribute, as illustrated in the
+ example below:
-
-
-
-
-
+
+
+
+
+
-
-
-
-
-
+
+
+
+
+
 ...
diff --git a/docs/index.html b/docs/index.html
index e04238a2d..6f5c67dcb 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -1,4 +1,4 @@
-The structure and encoding of ParlaMint corpora
The structure and encoding of ParlaMint corpora
2023-08-20

Table of contents

1. Introduction

This document is meant to serve as a reference for the encoding of ParlaMint corpora of parliamentary proceedings. In order for the ParlaMint corpora to be interoperable (i.e. so that the same scripts can be used to process them), their structure is fairly rigid, both in terms of file names and folder structure, as well as their TEI XML encoding. This is not to say that all the corpora have to contain exactly the same information because we distinguish obligatory information, which all the corpora should contain, from that which is optional, and present only in the corpora for which it has been possible to gather it from the corpus sources.

This document is a specialisation of Parla-CLARIN, itself a customisation of the TEI Guidelines. But while Parla-CLARIN gives fairly general recommendations for encoding corpora of parliamentary proceedings, ParlaMint, as mentioned, is much stricter. This document gives very specific encoding recommendations without necessarily stating the reasons for their choice. It covers the overall structure of ParlaMint corpora, the metadata they contain, the encoding of transcriptions, and, for the linguistically annotated version, the encoding of word-level linguistic annotations, syntactic dependencies and named entities.

The document is not meant as a tutorial on TEI or ParlaMint, but as a reference to elements, their nesting and attributes exemplified by snippets from the existing ParlaMint corpora. Other sources can help in understanding the encoding of ParlaMint corpora:

  • The freely available paper:
    Erjavec, T., Ogrodniczuk, M., Osenova, P. et al. The ParlaMint corpora of parliamentary proceedings. Language Resources & Evaluation (2022). https://doi.org/10.1007/s10579-021-09574-0.
  • The Parla-CLARIN guidelines, which provide general guidelines for encoding parliamentary corpora in TEI; they also give links to the relevant chapters of the TEI Guidelines.
  • Samples of ParlaMint corpora, available in the Data/ directory of the ParlaMint GitHub repository, esp. useful as they give the complete picture of a ParlaMint corpus; note that the samples in the main branch are supposed to be publication-ready, while those in the data branch are work in progress.

The rest of these recommendations are structured as follows:

  • Chapter 2 explains the overall XML structure of a ParlaMint corpus, and introduces the distinction between the corpus root and corpus components;
  • Chapter 3 explains some general requirements and the file-naming conventions a ParlaMint corpus has to meet; it also introduces the top level elements and their attributes and the main pointing attributes;
  • Chapter 4 concentrates on the structure and encoding of the corpus metadata, such as the title information, documenting the source of the corpus, taxonomies used etc.;
  • Chapter 5 explains how and what information must be encoded about the persons giving the speeches and the (political) organisations they belong to;
  • Chapter 6 treats the encoding of the transcripts, including speeches and transcriber notes;
  • Chapter 7 details the addition of linguistic annotations to the corpus;
  • Chapter 9 introduces scripts to finalise, validate and convert a ParlaMint corpus to other formats;
  • Chapter 10 gives instructions on how to contribute samples of a ParlaMint corpus to GitHub;
  • Appendix A gives the formal specification of the Parla-CLARIN schema.

2. Overall corpus structure

2.1. XML structure

The parliamentary proceedings of one country or autonomous region constitute one ParlaMint corpus, which is stored as one XML document, with <teiCorpus> as its top-level element. It is composed of a <teiHeader>, giving the metadata for the corpus as a whole (further detailed in the Section on Corpus metadata), followed by a series of <TEI> elements that each contain one corpus component, as illustrated below:
+The structure and encoding of ParlaMint corpora
The structure and encoding of ParlaMint corpora
2023-08-22

Table of contents

1. Introduction

This document is meant to serve as a reference for the encoding of ParlaMint corpora of parliamentary proceedings. In order for the ParlaMint corpora to be interoperable (i.e. so that the same scripts can be used to process them), their structure is fairly rigid, both in terms of file names and folder structure, as well as their TEI XML encoding. This is not to say that all the corpora have to contain exactly the same information because we distinguish obligatory information, which all the corpora should contain, from that which is optional, and present only in the corpora for which it has been possible to gather it from the corpus sources.

This document is a specialisation of Parla-CLARIN, itself a customisation of the TEI Guidelines. But while Parla-CLARIN gives fairly general recommendations for encoding corpora of parliamentary proceedings, ParlaMint, as mentioned, is much stricter. This document gives very specific encoding recommendations without necessarily stating the reasons for their choice. It covers the overall structure of ParlaMint corpora, the metadata they contain, the encoding of transcriptions, and, for the linguistically annotated version, the encoding of word-level linguistic annotations, syntactic dependencies and named entities.

The document is not meant as a tutorial on TEI or ParlaMint, but as a reference to elements, their nesting and attributes exemplified by snippets from the existing ParlaMint corpora. Other sources can help in understanding the encoding of ParlaMint corpora:

  • The freely available paper:
    Erjavec, T., Ogrodniczuk, M., Osenova, P. et al. The ParlaMint corpora of parliamentary proceedings. Language Resources & Evaluation (2022). https://doi.org/10.1007/s10579-021-09574-0.
  • The Parla-CLARIN guidelines, which provide general guidelines for encoding parliamentary corpora in TEI; they also give links to the relevant chapters of the TEI Guidelines.
  • Samples of ParlaMint corpora, available in the Data/ directory of the ParlaMint GitHub repository, esp. useful as they give the complete picture of a ParlaMint corpus; note that the samples in the main branch are supposed to be publication-ready, while those in the data branch are work in progress.

The rest of these recommendations are structured as follows:

  • Chapter 2 explains the overall XML structure of a ParlaMint corpus, and introduces the distinction between the corpus root and corpus components;
  • Chapter 3 explains some general requirements and the file-naming conventions a ParlaMint corpus has to meet; it also introduces the top level elements and their attributes and the main pointing attributes;
  • Chapter 4 concentrates on the structure and encoding of the corpus metadata, such as the title information, documenting the source of the corpus, taxonomies used etc.;
  • Chapter 5 explains how and what information must be encoded about the persons giving the speeches and the (political) organisations they belong to;
  • Chapter 6 treats the encoding of the transcripts, including speeches and transcriber notes;
  • Chapter 7 details the addition of linguistic annotations to the corpus;
  • Chapter 9 introduces scripts to finalise, validate and convert a ParlaMint corpus to other formats;
  • Chapter 10 gives instructions on how to contribute samples of a ParlaMint corpus to GitHub;
  • Appendix A gives the formal specification of the Parla-CLARIN schema.

2. Overall corpus structure

2.1. XML structure

The parliamentary proceedings of one country or autonomous region constitute one ParlaMint corpus, which is stored as one XML document, with <teiCorpus> as its top-level element. It is composed of a <teiHeader>, giving the metadata for the corpus as a whole (further detailed in the Section on Corpus metadata), followed by a series of <TEI> elements that each contain one corpus component, as illustrated below:
 <!-- Corpus root -->
 <teiCorpus xmlns="http://www.tei-c.org/ns/1.0">
@@ -426,22 +426,31 @@
        2001, FDF decides to leave the alliance and chooses a new name, becoming DeFI.</note>
   </state>
  </state>
-</org>
Note also that a <state> may have a note that gives further free-text information about the orientation.

5.2.3.2. Encoding CHES metadata

The second type of metadata comes from the Chapel Hill Expert Surveys for Europe (CHES), either from the 1999-2019 edition, or from the 2019 edition. Here the top-level <state> element gives the type of the state, i.e. CHES, and the URL of the CSV source for the information. Its <label> gives the abbreviation of the political party name in CHES, which can, and often does, differ from its ParlaMint abbreviation. Each subordinate <state> then encodes one CHES variable, which is given, via the ana attribute, as the reference to the appropriate category defined in the CHES taxonomy. Finally, as CHES gives the variables according to years, the third level of <state> gives the time periods of the variable together with its numeric value in the n attribute, as illustrated in the example below:
<state type="CHES"
+</org>
Note also that a <state> may have a note that gives further free-text information about the orientation.

5.2.3.2. Encoding CHES metadata

The second type of metadata comes from the Chapel Hill Expert Surveys for Europe (CHES), either from the 1999-2019 edition, or from the 2019 edition. Here the top-level <state> element gives the type of the state, i.e. CHES, and the URL of the CSV source for the information. Its <label> gives the abbreviation of the political party name in CHES, which can, and often does, differ from its ParlaMint abbreviation. Each subordinate <state> (of type variable) then encodes one CHES variable, which is given, via the ana attribute, as the reference to the appropriate category defined in the CHES taxonomy. Finally, as CHES gives variables according to years, the third level of <state> (of type value) gives the time periods of the variable together with its numeric value in the n attribute, as illustrated in the example below:
<state type="CHES"
  source="https://www.chesdata.eu/s/1999-2019_CHES_dataset_meansv3.csv">
  <label>
-  <orgName full="abb" from="2002" to="2018">MR</orgName>
+  <orgName full="abb" from="2002" to="2018"
+   xml:lang="en">MR</orgName>
  </label>
- <state ana="#ches-lrgen">
-  <state from="2002" to="2005" n="6.35"/>
-  <state from="2006" to="2009" n="6.67"/>
-  <state from="2010" to="2013" n="7.0"/>
-  <state from="2014" to="2018" n="7.0"/>
+ <state type="variable" ana="#ches-lrgen">
+  <state type="value" from="2002" to="2005"
+   n="6.35"/>
+  <state type="value" from="2006" to="2009"
+   n="6.67"/>
+  <state type="value" from="2010" to="2013"
+   n="7.0"/>
+  <state type="value" from="2014" to="2018"
+   n="7.0"/>
  </state>
- <state ana="#ches-lrecon">
-  <state from="2002" to="2005" n="7.3"/>
-  <state from="2006" to="2009" n="7.5"/>
-  <state from="2010" to="2013" n="7.62"/>
-  <state from="2014" to="2018" n="7.60"/>
+ <state type="variable" ana="#ches-lrecon">
+  <state type="value" from="2002" to="2005"
+   n="7.3"/>
+  <state type="value" from="2006" to="2009"
+   n="7.5"/>
+  <state type="value" from="2010" to="2013"
+   n="7.62"/>
+  <state type="value" from="2014" to="2018"
+   n="7.60"/>
  </state>
 ...
@@ -741,7 +750,7 @@
    (<ref target="https://github.com/UKPLab/EasyNMT">https://github.com/UKPLab/EasyNMT</ref>)
    with OPUS-MT model bat
    (<ref target="https://github.com/Helsinki-NLP/Opus-MT">https://github.com/Helsinki-NLP/Opus-MT</ref>)</desc>
-</application>
This element should be given in the corpus root, together with all the other information on applications inside the application information (<appInfo>) element.

9. Validation and conversion

The chapter explains how to validate and finalise a ParlaMint corpus, and introduces scripts for converting a ParlaMint corpus to other, derived formats.

9.1. Validating ParlaMint corpora

The XML structure of ParlaMint corpora can be validated via RelaxNG schemas, which exist in two versions, one that was produced as a customisation of the TEI Guidelines, and a set of schemas that were made from scratch for ParlaMint.

The TEI customisation is written as a TEI ODD document, which is, in fact, the XML version of this document, and is available in the TEI/ directory of the ParlaMint GitHub repository. The XML contains not only the prose guidelines but also the formal specification of the TEI schema, given in Appendix A; in the on-line version this specification is converted to a reference to all the elements, attributes and classes used in ParlaMint corpora. The ODD document cannot be used directly for XML validation: it first has to be converted with the TEI XSLT stylesheets into a RelaxNG schema. This schema is also available in the same directory under the name of ParlaMint.rng (in RelaxNG XML syntax) and ParlaMint.rnc (in RelaxNG compact syntax). It should be used to check that ParlaMint component files validate against TEI.

However, it is difficult to constrain a TEI ODD-derived XML schema to allow only the kinds of nestings and attributes that should appear in a ParlaMint corpus, so this schema allows (and Appendix A lists) nestings of elements, as well as attributes, that are in fact forbidden in ParlaMint corpora.

For this reason, we have also developed a set of RelaxNG schemas from scratch, which allow only those elements, attributes and content models that are in fact valid for a ParlaMint corpus. There are altogether four such schemas: one for a "plain-text" corpus root, one for its corpus components, one for the linguistically annotated corpus root, and one for its components. These schemas can be found in the Schema/ directory of the ParlaMint GitHub repository, with the README file giving instructions on how to use them.

Validating with XML schemas checks the formal structure of XML files but is less successful in validating other aspects of conformance, such as the textual content or linking of pointer attributes. For this reason, we have also developed an XSLT script that assumes a schema-validated ParlaMint file on its input, and checks various other aspects of conformance. These validation scripts can be found in the Scripts/ directory of the ParlaMint GitHub repository, with the README file listing them.

It should be noted that it is not necessary to run the validation scripts directly, as the validation can be performed by the main Makefile of the project. The Makefile is self-documenting, i.e. to see how to use it, please run make help in the top level directory of the ParlaMint project.

While each contributor of a corpus should validate their files with the ParlaMint schemas and validation script, there also exist further stages of validation, which are also applied to ParlaMint corpora:

  • The corpora are converted to derived formats, in particular, the linguistically annotated version of the corpus to CoNLL-U and to the so-called vertical format for CQP-type concordancers. The Universal Dependencies project provides a program for validating the formatting and linguistic analyses in CoNLL-U files, and this validation is used on the CoNLL-U files derived from their XML source, up to level 2 conformance. The vertical files, on the other hand, are first compiled with manatee (the back end of (no)Sketch Engine) and this compilation can also expose various errors.
  • The last stage in validation is ‘human validation’ where e.g. simply looking at various produced metadata files or at the concordances of a corpus exposes errors.

9.2. Finalisation of corpora

While the vast majority of converting source encodings into the ParlaMint corpus format is left to the compilers of a corpus, there are a few metadata elements that can be produced by a common script on the basis of nearly finished corpora, which then results in the final version of the corpus for a particular release. This includes setting the date, edition and handle under which the corpus will be distributed, and also calculating the size of the corpus (cf. the Sections on Extents and on Tags declaration). The script for finalisation can be found in the Scripts/ directory of the ParlaMint GitHub repository and the README file briefly explains its function; more comments can be found in the script itself.

9.3. Conversions

A TEI encoded document is, in general, not meant to be used directly by software programs; rather, it serves as an interchange and storage format. The ParlaMint project has produced various scripts to down-convert the XML encoded corpora to other formats and they can be found in the Scripts/ directory of the ParlaMint GitHub repository, with the README file listing them and explaining their function. In short, the scripts convert the ParlaMint XML to plain text, to CoNLL-U, and to vertical format. There is also a script that takes a ParlaMint corpus and makes from it a sample for inclusion in the ParlaMint GitHub repository.

10. Contributing to ParlaMint

The ParlaMint GitHub repository contains these guidelines, the ParlaMint XML schemas, the scripts used to validate, finalise and convert the ParlaMint TEI XML corpora to derived formats, and samples of the ParlaMint corpora. There are four main branches in the repository:

  • main is the default branch used for the synchronisation of other branches. It is also used for releasing sample files that correspond to published corpora.
  • data serves as a pushing place for new sample files in ./Data/ParlaMint-XX directories.
  • devel: development of scripts and documentation.

The validation procedure for corpora is explained in the Section on Validating ParlaMint corpora, while the technical aspects of contributing corpora are further explained in the CONTRIBUTING file of the repository.

11. Acknowledgements

The work on these recommendations was funded by the CLARIN Research Infrastructure for Language Resources and Tools.

Appendix A Formal specification

Appendix A.1 Elements

Appendix A.1.1 <TEI>

<TEI> (TEI document) contains a single TEI-conformant document, combining a single TEI header with one or more members of the model.resource class. Multiple <TEI> elements may be combined within a <TEI> (or <teiCorpus>) element. [4. Default Text Structure 15.1. Varieties of Composite Text]
Module: textstructure (Formal specification)
Attributes: att.global.linking (synch, next, prev, @corresp)
  xml:id
    Status: Required
    Datatype: ID
  xml:lang
    Status: Required
    Datatype: teidata.language
  ana
    Status: Required
    Datatype: 1–∞ occurrences of teidata.pointer separated by whitespace
Contained by
  core: teiCorpus
May contain
  header: teiHeader
  textstructure: text
Note

This element is required. It is customary to specify the TEI namespace http://www.tei-c.org/ns/1.0 on it, for example: <TEI version="4.4.0" xml:lang="it" xmlns="http://www.tei-c.org/ns/1.0">.

Example
Example of a ParlaMint corpus component:
<TEI xml:id="ParlaMint-GB_2015-01-06-commons"
+</application>
This element should be given in the corpus root, together with all the other information on applications inside the application information (<appInfo>) element.

9. Validation and conversion

The chapter explains how to validate and finalise a ParlaMint corpus, and introduces scripts for converting a ParlaMint corpus to other, derived formats.

9.1. Validating ParlaMint corpora

The XML structure of ParlaMint corpora can be validated via RelaxNG schemas, which exist in two versions, one that was produced as a customisation of the TEI Guidelines, and a set of schemas that were made from scratch for ParlaMint.

The TEI customisation is written as a TEI ODD document, which is, in fact, the XML version of this document, and is available in the TEI/ directory of the ParlaMint GitHub repository. The XML contains not only the prose guidelines but also the formal specification of the TEI schema, given in Appendix A; in the on-line version this specification is converted to a reference to all the elements, attributes and classes used in ParlaMint corpora. The ODD document cannot be used directly for XML validation: it first has to be converted with the TEI XSLT stylesheets into a RelaxNG schema. This schema is also available in the same directory under the name of ParlaMint.rng (in RelaxNG XML syntax) and ParlaMint.rnc (in RelaxNG compact syntax). It should be used to check that ParlaMint component files validate against TEI.

However, it is difficult to constrain a TEI ODD-derived XML schema to allow only the kinds of nestings and attributes that should appear in a ParlaMint corpus, so this schema allows (and Appendix A lists) nestings of elements, as well as attributes, that are in fact forbidden in ParlaMint corpora.

For this reason, we have also developed a set of RelaxNG schemas from scratch, which allow only those elements, attributes and content models that are in fact valid for a ParlaMint corpus. There are altogether four such schemas: one for a "plain-text" corpus root, one for its corpus components, one for the linguistically annotated corpus root, and one for its components. These schemas can be found in the Schema/ directory of the ParlaMint GitHub repository, with the README file giving instructions on how to use them.

Validating with XML schemas checks the formal structure of XML files but is less successful in validating other aspects of conformance, such as the textual content or linking of pointer attributes. For this reason, we have also developed an XSLT script that assumes a schema-validated ParlaMint file on its input, and checks various other aspects of conformance. These validation scripts can be found in the Scripts/ directory of the ParlaMint GitHub repository, with the README file listing them.
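
As a rough illustration of the kind of check that goes beyond schema validation, the Python sketch below tests that every local pointer in the ana attribute of a <state> resolves to an xml:id declared in the same document. It is not the project's XSLT validation script: the inline sample document, the <category> identifiers, the deliberately dangling pointer #ches-missing and the function name unresolved_pointers are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical miniature document: a two-category taxonomy and one CHES state
# using the variable/value typing; '#ches-missing' is deliberately dangling.
SAMPLE = """
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <taxonomy>
    <category xml:id="ches-lrgen"/>
    <category xml:id="ches-lrecon"/>
  </taxonomy>
  <state type="CHES">
    <state type="variable" ana="#ches-lrgen">
      <state type="value" from="2002" to="2005" n="6.35"/>
    </state>
    <state type="variable" ana="#ches-missing">
      <state type="value" from="2002" to="2005" n="7.3"/>
    </state>
  </state>
</TEI>
"""


def unresolved_pointers(root):
    """Return local '#id' pointers in state/@ana that match no xml:id."""
    known_ids = {el.get(XML_ID) for el in root.iter() if el.get(XML_ID)}
    missing = []
    for state in root.iter(f"{{{TEI_NS}}}state"):
        # @ana may hold several whitespace-separated pointers.
        for pointer in state.get("ana", "").split():
            if pointer.startswith("#") and pointer[1:] not in known_ids:
                missing.append(pointer)
    return missing


if __name__ == "__main__":
    print(unresolved_pointers(ET.fromstring(SAMPLE)))
```

A schema can constrain the datatype of ana, but whether its target actually exists is a cross-reference property, which is why such checks are typically left to a separate script.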

It should be noted that it is not necessary to run the validation scripts directly, as the validation can be performed by the main Makefile of the project. The Makefile is self-documenting, i.e. to see how to use it, please run make help in the top level directory of the ParlaMint project.

While each contributor of a corpus should validate their files with the ParlaMint schemas and validation script, there also exist further stages of validation, which are also applied to ParlaMint corpora:

  • The corpora are converted to derived formats, in particular, the linguistically annotated version of the corpus to CoNLL-U and to the so-called vertical format for CQP-type concordancers. The Universal Dependencies project provides a program for validating the formatting and linguistic analyses in CoNLL-U files, and this validation is used on the CoNLL-U files derived from their XML source, up to level 2 conformance. The vertical files, on the other hand, are first compiled with manatee (the back end of (no)Sketch Engine) and this compilation can also expose various errors.
  • The last stage in validation is ‘human validation’ where e.g. simply looking at various produced metadata files or at the concordances of a corpus exposes errors.
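To give a feel for the most basic formatting check on the derived CoNLL-U files, here is a minimal Python sketch that flags token lines which do not have the ten tab-separated fields the CoNLL-U format requires. It is only a toy stand-in written for this example and is not the Universal Dependencies validator; the function name and sample data are assumptions.

```python
def malformed_conllu_lines(conllu_text):
    """Return 1-based numbers of token lines that lack exactly ten tab-separated fields."""
    bad = []
    for number, line in enumerate(conllu_text.splitlines(), start=1):
        # Blank lines separate sentences and '#' lines carry sentence-level comments.
        if not line or line.startswith("#"):
            continue
        if len(line.split("\t")) != 10:
            bad.append(number)
    return bad


# Hypothetical sample: the second token line is truncated to seven fields.
SAMPLE = (
    "# sent_id = demo.1\n"
    "1\tHello\thello\tINTJ\t_\t_\t0\troot\t_\t_\n"
    "2\tworld\tworld\tNOUN\t_\t_\t1\n"
    "\n"
)

print(malformed_conllu_lines(SAMPLE))
```

The real validator goes much further, checking for instance the annotation values themselves, so a file that passes this sketch may still fail validation.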

9.2. Finalisation of corpora

While the bulk of the conversion from source encodings into the ParlaMint corpus format is left to the compilers of a corpus, there are a few metadata elements that can be produced by a common script on the basis of nearly finished corpora, which then results in the final version of the corpus for a particular release. This includes setting the date, edition and handle under which the corpus will be distributed, and also calculating the size of the corpus (cf. the Sections on Extents and on Tags declaration). The script for finalisation can be found in the Scripts/ directory of the ParlaMint GitHub repository and the README file briefly explains its function; more comments can be found in the script itself.

9.3. Conversions

A TEI encoded document is, in general, not meant to be used directly by software programs; rather, it serves as an interchange and storage format. The ParlaMint project has produced various scripts to down-convert the XML encoded corpora to other formats and they can be found in the Scripts/ directory of the ParlaMint GitHub repository, with the README file listing them and explaining their function. In short, the scripts convert the ParlaMint XML to plain text, to CoNLL-U, and to vertical format. There is also a script that takes a ParlaMint corpus and produces a sample from it for inclusion in the ParlaMint GitHub repository.
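The idea of a down-conversion to plain text can be sketched in a few lines of Python. This is not one of the project's scripts: the function name and the tab-separated output layout are invented for this example, and the sketch assumes that speech text sits in <seg> elements inside <u> elements, each <seg> carrying an xml:id.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def segments_to_lines(xml_text):
    """Return one 'ID<TAB>text' line per <seg>, with whitespace normalised."""
    root = ET.fromstring(xml_text)
    lines = []
    for seg in root.iterfind(".//tei:seg", TEI_NS):
        # itertext() also collects text nested in child elements of the segment.
        text = " ".join("".join(seg.itertext()).split())
        lines.append(f"{seg.get(XML_ID, '')}\t{text}")
    return lines


# Hypothetical miniature corpus component.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0" xml:id="demo">
  <text><body><div type="debateSection">
    <u xml:id="demo.u1">
      <seg xml:id="demo.seg1">  Good   morning. </seg>
      <seg xml:id="demo.seg2">The sitting is <w>open</w>.</seg>
    </u>
  </div></body></text>
</TEI>"""

for line in segments_to_lines(SAMPLE):
    print(line)
```

Keeping the identifier next to each line of text means that the plain-text output can still be traced back to its place in the XML source.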

10. Contributing to ParlaMint

The ParlaMint GitHub repository contains these guidelines, the ParlaMint XML schemas, the scripts used to validate, finalise and convert the ParlaMint TEI XML corpora to derived formats, and samples of the ParlaMint corpora. There are four main branches in the repository:

  • main is the default branch used for the synchronisation of other branches. It is also used for releasing sample files that correspond to published corpora.
  • data serves as a pushing place for new sample files in ./Data/ParlaMint-XX directories.
  • devel is used for the development of scripts and documentation.

The validation procedure for corpora is explained in the Section on Validating ParlaMint corpora, while the technical aspects of contributing corpora are further explained in the CONTRIBUTING file of the repository.

11. Acknowledgements

The work on these recommendations was funded by the CLARIN Research Infrastructure for Language Resources and Tools.

Appendix A Formal specification

Appendix A.1 Elements

Appendix A.1.1 <TEI>

<TEI> (TEI document) contains a single TEI-conformant document, combining a single TEI header with one or more members of the model.resource class. Multiple <TEI> elements may be combined within a <TEI> (or <teiCorpus>) element. [4. Default Text Structure 15.1. Varieties of Composite Text]
Moduletextstructure — Formal specification
Attributesatt.global.linking (synch, next, prev, @corresp)
xml:id
StatusRequired
DatatypeID
xml:lang
StatusRequired
Datatypeteidata.language
ana
StatusRequired
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
Contained by
core: teiCorpus
May contain
header: teiHeader
textstructure: text
Note

This element is required. It is customary to specify the TEI namespace http://www.tei-c.org/ns/1.0 on it, for example: <TEI version="4.4.0" xml:lang="it" xmlns="http://www.tei-c.org/ns/1.0">.

ExampleExample of ParlaMint corpus component:
<TEI xml:id="ParlaMint-GB_2015-01-06-commons"  xml:lang="en" ana="#parla.sitting #reference" xmlns="http://www.tei-c.org/ns/1.0">  <teiHeader>...</teiHeader>  <text ana="#reference"> @@ -753,12 +762,12 @@ <sch:ns prefix="xs"  uri="http://www.w3.org/2001/XMLSchema"/>
Schematron
<sch:ns prefix="rng" - uri="http://relaxng.org/ns/structure/1.0"/>
Content model
+ uri="http://relaxng.org/ns/structure/1.0"/>
Content model
 <content>
  <elementRef key="teiHeader"/>
  <elementRef key="text"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element TEI
 {
    tei_att.global.linking.attribute.corresp,
@@ -767,16 +776,16 @@
    attribute ana { list { + } },
    tei_teiHeader,
    tei_text
-}

Appendix A.1.2 <addName>

<addName> (additional name) contains an additional name component, such as a nickname, epithet, or alias, or any other descriptive phrase used within a personal name. [13.2.1. Personal Names]
Modulenamesdates — Formal specification
Member of
Contained by
namesdates: persName
May containCharacter data only
Example
<persName> +}

Appendix A.1.2 <addName>

<addName> (additional name) contains an additional name component, such as a nickname, epithet, or alias, or any other descriptive phrase used within a personal name. [13.2.1. Personal Names]
Modulenamesdates — Formal specification
Member of
Contained by
namesdates: persName
May containCharacter data only
Example
<persName>  <surname>Möderndorfer</surname>  <forename>Jani</forename>  <addName>Janko</addName> -</persName>
Content model
+</persName>
Content model
 <content>
  <textNode/>
 </content>
-    
Schema Declaration
-element addName { text }

Appendix A.1.3 <affiliation>

<affiliation> (affiliation) contains an informal description of a person's present or past affiliation with some organisation, for example a political party or ministry. [15.2.2. The Participant Description]
Modulenamesdates — Formal specification
Attributesatt.global.analytic (@ana) att.datable.w3c (notBefore, notAfter, @when, @from, @to) att.canonical (key, @ref)
role
StatusRequired
Legal values are:
academician
alternateOfDelegation
associateMember
candidateChairman
constitutionalJudge
deputyHead
deputyMinister
head
member
minister
ministerDelegate
nonAttachedMember
observer
ombudsman
prosecutorGeneral
publicDefenderOfRights
replacement
representative
secretary
secretaryGeneral
secretaryOfState
verifier
vicePublicDefenderOfRights
Member of
Contained by
namesdates: person
May contain
namesdates: orgName roleName
Note

If included, the name of an organization may be tagged using either the <name> element as above, or the more specific <orgName> element.

Example
<person xml:id="AdamKalous.1979"> +
Schema Declaration
+element addName { text }

Appendix A.1.3 <affiliation>

<affiliation> (affiliation) contains an informal description of a person's present or past affiliation with some organisation, for example a political party or ministry. [15.2.2. The Participant Description]
Modulenamesdates — Formal specification
Attributesatt.global.analytic (@ana) att.datable.w3c (notBefore, notAfter, @when, @from, @to) att.canonical (key, @ref)
role
StatusRequired
Legal values are:
academician
alternateOfDelegation
associateMember
candidateChairman
constitutionalJudge
deputyHead
deputyMinister
head
member
minister
ministerDelegate
nonAttachedMember
observer
ombudsman
prosecutorGeneral
publicDefenderOfRights
replacement
representative
secretary
secretaryGeneral
secretaryOfState
verifier
vicePublicDefenderOfRights
Member of
Contained by
namesdates: person
May contain
namesdates: orgName roleName
Note

If included, the name of an organization may be tagged using either the <name> element as above, or the more specific <orgName> element.

Example
<person xml:id="AdamKalous.1979">  <persName>   <surname>Kalous</surname>   <forename>Adam</forename> @@ -818,7 +827,7 @@   to="2021-10-21T00:00:00">   <roleName xml:lang="en">MP</roleName>  </affiliation> -</person>
Example
<p>The affiliation element can also include an <att>ana</att> attribute, which points to the appropriate legislative period when the person was affiliated with the specified organisation:</p> +</person>
Example
<p>The affiliation element can also include an <att>ana</att> attribute, which points to the appropriate legislative period when the person was affiliated with the specified organisation:</p> <person xml:id="BahŽibertAnja">  <persName>   <surname>Bah</surname> @@ -839,14 +848,14 @@   from="2018-06-22" ana="#DZ.8">   <roleName xml:lang="en">MP</roleName>  </affiliation> -</person>
Content model
+</person>
Content model
 <content>
  <elementRef key="roleName" minOccurs="0"
   maxOccurs="unbounded"/>
  <elementRef key="orgName" minOccurs="0"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element affiliation
 {
    tei_att.global.analytic.attribute.ana,
@@ -882,13 +891,13 @@
    },
    tei_roleName*,
    tei_orgName*
-}

Appendix A.1.4 <appInfo>

<appInfo> (application information) records information about an application which has edited the TEI file. [2.3.11. The Application Information Element]
Moduleheader — Formal specification
Contained by
header: encodingDesc
May contain
header: application
Example
<appInfo> +}

Appendix A.1.4 <appInfo>

<appInfo> (application information) records information about an application which has edited the TEI file. [2.3.11. The Application Information Element]
Moduleheader — Formal specification
Contained by
header: encodingDesc
May contain
header: application
Example
<appInfo>  <application version="4.0"   ident="stanford-corenlp">   <label>Stanford CoreNLP</label>   <desc>Tokenisation, POS tagging, NER and dependency parsed using Stanford CoreNLP <ref target="https://stanfordnlp.github.io/CoreNLP/">https://stanfordnlp.github.io/CoreNLP/</ref>.</desc>  </application> -</appInfo>
Example
<appInfo> +</appInfo>
Example
<appInfo>  <application version="1.0"   ident="reldi-tokeniser">   <label>ReLDI tokeniser</label> @@ -901,13 +910,13 @@   ident="janes-ner">   <label>NER system for South Slavic languages</label>  </application> -</appInfo>
Content model
+</appInfo>
Content model
 <content>
  <elementRef key="application"
   minOccurs="1" maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
-element appInfo { tei_application+ }

Appendix A.1.5 <application>

<application> provides information about an application which has acted upon the document. [2.3.11. The Application Information Element]
Moduleheader — Formal specification
Attributes
identsupplies an identifier for the application, independent of its version number or display name.
StatusRequired
Datatypeteidata.name
versionsupplies a version number for the application, independent of its identifier or display name.
StatusRequired
Datatypeteidata.versionNumber
Contained by
header: appInfo
May contain
core: desc label
Example
<appInfo> +
Schema Declaration
+element appInfo { tei_application+ }

Appendix A.1.5 <application>

<application> provides information about an application which has acted upon the document. [2.3.11. The Application Information Element]
Moduleheader — Formal specification
Attributes
identsupplies an identifier for the application, independent of its version number or display name.
StatusRequired
Datatypeteidata.name
versionsupplies a version number for the application, independent of its identifier or display name.
StatusRequired
Datatypeteidata.versionNumber
Contained by
header: appInfo
May contain
core: desc label
Example
<appInfo>  <application version="1"   ident="app-stanza">   <label>Stanza</label> @@ -925,48 +934,48 @@   <desc xml:lang="en">    <ref target="http://conllu2teixml">CoNLL-U 2 TEI XML</ref>: converter from CoNLL-U format to (ParlaClarin/ParlaMint) Tei XML Format</desc>  </application> -</appInfo>
Example
<appInfo> +</appInfo>
Example
<appInfo>  <application version="4.0"   ident="stanford-corenlp">   <label>Stanford CoreNLP</label>   <desc>Tokenisation, POS tagging, NER and dependency parsed using Stanford CoreNLP <ref target="https://stanfordnlp.github.io/CoreNLP/">https://stanfordnlp.github.io/CoreNLP/</ref>.</desc>  </application> -</appInfo>
Content model
+</appInfo>
Content model
 <content>
  <elementRef key="label"/>
  <elementRef key="desc" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element application
 {
    attribute ident { text },
    attribute version { text },
    tei_label,
    tei_desc+
-}

Appendix A.1.6 <availability>

<availability> (availability) supplies information about the availability of a text, for example any restrictions on its use or distribution, its copyright status, any licence applying to it, etc. [2.2.4. Publication, Distribution, Licensing, etc.]
Moduleheader — Formal specification
Attributes
status
StatusRequired
Legal values are:
free
Contained by
May contain
core: p
header: licence
Note

A consistent format should be adopted

Example
<availability status="free"> +}

Appendix A.1.6 <availability>

<availability> (availability) supplies information about the availability of a text, for example any restrictions on its use or distribution, its copyright status, any licence applying to it, etc. [2.2.4. Publication, Distribution, Licensing, etc.]
Moduleheader — Formal specification
Attributes
status
StatusRequired
Legal values are:
free
Contained by
May contain
core: p
header: licence
Note

A consistent format should be adopted

Example
<availability status="free">  <licence>http://creativecommons.org/licenses/by/4.0/</licence>  <p xml:lang="hr">Ovaj rad je dostupan pod <ref target="http://creativecommons.org/licenses/by/4.0/">međunarodnom licencom Creative Commons Imenovanje 4.0</ref>  </p>  <p xml:lang="en">This work is licensed under the <ref target="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ref>  </p> -</availability>
Content model
+</availability>
Content model
 <content>
  <elementRef key="licence"/>
  <elementRef key="p" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
-element availability { attribute status { "free" }, tei_licence, tei_p+ }

Appendix A.1.7 <bibl>

<bibl> (bibliographic citation) contains a loosely-structured bibliographic citation of which the sub-components may or may not be explicitly tagged. [3.12.1. Methods of Encoding Bibliographic References and Lists of References 2.2.7. The Source Description 15.3.2. Declarable Elements]
Modulecore — Formal specification
Member of
Contained by
header: sourceDesc
May contain
Note

Contains phrase-level elements, together with any combination of elements from the model.biblPart class

Example
<bibl> +
Schema Declaration
+element availability { attribute status { "free" }, tei_licence, tei_p+ }

Appendix A.1.7 <bibl>

<bibl> (bibliographic citation) contains a loosely-structured bibliographic citation of which the sub-components may or may not be explicitly tagged. [3.12.1. Methods of Encoding Bibliographic References and Lists of References 2.2.7. The Source Description 15.3.2. Declarable Elements]
Modulecore — Formal specification
Member of
Contained by
header: sourceDesc
May contain
Note

Contains phrase-level elements, together with any combination of elements from the model.biblPart class

Example
<bibl>  <title type="main">Minutes of the National Assembly of the Republic of Bulgaria</title>  <date when="2020-03-11">2020-03-11</date> -</bibl>
Example
<bibl> +</bibl>
Example
<bibl>  <title type="main" xml:lang="en">https://www.tbmm.gov.tr/tutanak/donem24/yil2/bas/b013m.htm</title>  <edition xml:lang="en">Official session record</edition>  <publisher xml:lang="en">The Turkish Parliament</publisher>  <idno type="URI">https://www.tbmm.gov.tr/</idno>  <date when="2011-10-27">2011-10-27</date> -</bibl>
Content model
+</bibl>
Content model
 <content>
  <elementRef key="title" minOccurs="1"
   maxOccurs="unbounded"/>
@@ -982,42 +991,42 @@
    maxOccurs="1"/>
  </alternate>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element bibl
 {
    tei_title+,
    ( tei_edition? | tei_publisher? | tei_idno* | tei_date )+
-}

Appendix A.1.8 <birth>

<birth> (birth) contains information about a person's birth, such as its date and place. [15.2.2. The Participant Description]
Modulenamesdates — Formal specification
Attributes
whensupplies the value of the date or time in a standard form, e.g. yyyy-mm-dd.
Derived fromatt.datable.w3c
StatusRequired
Datatypeteidata.temporal.w3c
Contained by
namesdates: person
May contain
namesdates: placeName
Example
<person xml:id="ReinerŽeljko" n="1291"> ... +}

Appendix A.1.8 <birth>

<birth> (birth) contains information about a person's birth, such as its date and place. [15.2.2. The Participant Description]
Modulenamesdates — Formal specification
Attributes
whensupplies the value of the date or time in a standard form, e.g. yyyy-mm-dd.
Derived fromatt.datable.w3c
StatusRequired
Datatypeteidata.temporal.w3c
Contained by
namesdates: person
May contain
namesdates: placeName
Example
<person xml:id="ReinerŽeljko" n="1291"> ... <birth when="1953-05-28"/> -</person>
Example
<birth when="1966-03-22"> +</person>
Example
<birth when="1966-03-22">  <placeName ref="https://www.geonames.org/2523918">Palermo</placeName> -</birth>
Content model
+</birth>
Content model
 <content>
  <alternate minOccurs="1" maxOccurs="1">
   <elementRef key="placeName" minOccurs="0"
    maxOccurs="1"/>
  </alternate>
 </content>
-    
Schema Declaration
-element birth { attribute when { text }, ( tei_placeName? ) }

Appendix A.1.9 <body>

<body> (text body) contains the whole body of a single unitary text, excluding any front or back matter. [4. Default Text Structure]
Moduletextstructure — Formal specification
Contained by
textstructure: text
May contain
textstructure: div
Example
<body> +
Schema Declaration
+element birth { attribute when { text }, ( tei_placeName? ) }

Appendix A.1.9 <body>

<body> (text body) contains the whole body of a single unitary text, excluding any front or back matter. [4. Default Text Structure]
Moduletextstructure — Formal specification
Contained by
textstructure: text
May contain
textstructure: div
Example
<body>  <div type="debateSection">...</div>  <div type="debateSection">...</div> ... -</body>
Content model
+</body>
Content model
 <content>
  <elementRef key="div" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
-element body { tei_div+ }

Appendix A.1.10 <catDesc>

<catDesc> (category description) describes some category within a taxonomy or text typology, either in the form of a brief prose description or in terms of the situational parameters used by the TEI formal <textDesc>. [2.3.7. The Classification Declaration]
Moduleheader — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
Contained by
header: category
May contain
core: ref term
character data
Example
<category xml:id="parla.organisation"> +
Schema Declaration
+element body { tei_div+ }

Appendix A.1.10 <catDesc>

<catDesc> (category description) describes some category within a taxonomy or text typology, either in the form of a brief prose description or in terms of the situational parameters used by the TEI formal <textDesc>. [2.3.7. The Classification Declaration]
Moduleheader — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
Contained by
header: category
May contain
core: ref term
character data
Example
<category xml:id="parla.organisation">  <catDesc xml:lang="en">   <term>Organisation</term>  </catDesc>  <catDesc xml:lang="bg">   <term>Организация</term>  </catDesc> -</category>
Content model
+</category>
Content model
 <content>
  <sequence minOccurs="1" maxOccurs="1">
   <elementRef key="term"/>
@@ -1028,12 +1037,12 @@
   </alternate>
  </sequence>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element catDesc
 {
    tei_att.global.attribute.xmllang,
    ( tei_term, ( text | tei_ref )+ )
-}

Appendix A.1.11 <catRef>

<catRef> (category reference) specifies one or more defined categories within some taxonomy or text typology. [2.4.3. The Text Classification]
Moduleheader — Formal specification
Attributes
targetspecifies the destination of the reference by supplying one or more URI References
Derived fromatt.pointing
StatusRequired
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
schemeidentifies the classification scheme within which the set of categories concerned is defined, for example by a <taxonomy> element, or by some other resource.
StatusRequired
Datatypeteidata.pointer
Contained by
header: textClass
May containEmpty element
Note

The scheme attribute needs to be supplied only if more than one taxonomy has been declared.

Example
<textClass> +}

Appendix A.1.11 <catRef>

<catRef> (category reference) specifies one or more defined categories within some taxonomy or text typology. [2.4.3. The Text Classification]
Moduleheader — Formal specification
Attributes
targetspecifies the destination of the reference by supplying one or more URI References
Derived fromatt.pointing
StatusRequired
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
schemeidentifies the classification scheme within which the set of categories concerned is defined, for example by a <taxonomy> element, or by some other resource.
StatusRequired
Datatypeteidata.pointer
Contained by
header: textClass
May containEmpty element
Note

The scheme attribute needs to be supplied only if more than one taxonomy has been declared.

Example
<textClass>  <catRef scheme="#parla.legislature"   target="#parla.uni"/> </textClass> @@ -1048,46 +1057,46 @@    <term>Unicameralism</term>   </catDesc>  </category> -</taxonomy>
Content model
+</taxonomy>
Content model
 <content>
  <empty/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element catRef
 {
    attribute target { list { + } },
    attribute scheme { text },
    empty
-}

Appendix A.1.12 <category>

<category> (category) contains an individual descriptive category, possibly nested within a superordinate category, within a user-defined taxonomy. [2.3.7. The Classification Declaration]
Moduleheader — Formal specification
Attributes
xml:id(identifier) provides a unique identifier for the element bearing the attribute.
Derived fromatt.global
StatusRequired
DatatypeID
ana(analysis) indicates one or more elements containing interpretations of the element on which the ana attribute appears.
Derived fromatt.global.analytic
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
Contained by
May contain
Example
<category xml:id="parla.session"> +}

Appendix A.1.12 <category>

<category> (category) contains an individual descriptive category, possibly nested within a superordinate category, within a user-defined taxonomy. [2.3.7. The Classification Declaration]
Moduleheader — Formal specification
Attributes
xml:id(identifier) provides a unique identifier for the element bearing the attribute.
Derived fromatt.global
StatusRequired
DatatypeID
ana(analysis) indicates one or more elements containing interpretations of the element on which the ana attribute appears.
Derived fromatt.global.analytic
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
Contained by
May contain
Example
<category xml:id="parla.session">  <catDesc xml:lang="en">   <term>Session</term>: A parliamentary year, which always begins on the first Tuesday in October at 12.00 o’clock noon and ends on the same date at the same time the following year. However, parliamentary work at Christiansborg is organised in such a way that it primarily takes place from October to June.</catDesc> -</category>
Example
<category xml:id="parla.term"> +</category>
Example
<category xml:id="parla.term">  <catDesc xml:lang="nl">   <term>Zittingsperiode</term>  </catDesc>  <catDesc xml:lang="en">   <term>Legislative period</term>  </catDesc> -</category>
Content model
+</category>
Content model
 <content>
  <elementRef key="catDesc" minOccurs="1"
   maxOccurs="unbounded"/>
  <elementRef key="category" minOccurs="0"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element category
 {
    attribute xml:id { text },
    attribute ana { list { + } }?,
    tei_catDesc+,
    tei_category*
-}

Appendix A.1.13 <change>

<change> (change) documents a change or set of changes made during the production of a source document, or during the revision of an electronic file. [2.6. The Revision Description 2.4.1. Creation 11.7. Identifying Changes and Revisions]
Moduleheader — Formal specification
Attributesatt.datable.w3c (notBefore, notAfter, from, to, @when)
Contained by
header: revisionDesc
May contain
core: name
character data
Note

The who attribute may be used to point to any other element, but will typically specify a <respStmt> or <person> element elsewhere in the header, identifying the person responsible for the change and their role in making it.

It is recommended that changes be recorded with the most recent first. The status attribute may be used to indicate the status of a document following the change documented.

Example
<revisionDesc> +}

Appendix A.1.13 <change>

<change> (change) documents a change or set of changes made during the production of a source document, or during the revision of an electronic file. [2.6. The Revision Description 2.4.1. Creation 11.7. Identifying Changes and Revisions]
Moduleheader — Formal specification
Attributesatt.datable.w3c (notBefore, notAfter, from, to, @when)
Contained by
header: revisionDesc
May contain
core: name
character data
Note

The who attribute may be used to point to any other element, but will typically specify a <respStmt> or <person> element elsewhere in the header, identifying the person responsible for the change and their role in making it.

It is recommended that changes be recorded with the most recent first. The status attribute may be used to indicate the status of a document following the change documented.

Example
<revisionDesc>  <change when="2021-01-28">   <name>Tommaso Agnoloni</name>: Generated corpus in ParlaMint.</change>  <change when="2021-02-26">   <name>Tommaso Agnoloni</name>, <name>Francesca Frontini</name>: Corpus revision, fixing</change> -</revisionDesc>
Content model
+</revisionDesc>
Content model
 <content>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
@@ -1095,13 +1104,13 @@
   <textNode/>
  </alternate>
 </content>
-    
Schema Declaration
-element change { tei_att.datable.w3c.attribute.when, ( tei_name | text )+ }

Appendix A.1.14 <classDecl>

<classDecl> (classification declarations) contains taxonomies defining classificatory codes used elsewhere in the text. Note that the taxonomies are in ParlaMint typically stored in separate files. [2.3.7. The Classification Declaration 2.3. The Encoding Description]
Moduleheader — Formal specification
Contained by
header: encodingDesc
May contain
derived-module-parlamint: include
header: taxonomy
Example
<classDecl> <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" +
Schema Declaration
+element change { tei_att.datable.w3c.attribute.when, ( tei_name | text )+ }

Appendix A.1.14 <classDecl>

<classDecl> (classification declarations) contains taxonomies defining classificatory codes used elsewhere in the text. Note that the taxonomies are in ParlaMint typically stored in separate files. [2.3.7. The Classification Declaration 2.3. The Encoding Description]
Moduleheader — Formal specification
Contained by
header: encodingDesc
May contain
derived-module-parlamint: include
header: taxonomy
Example
<classDecl> <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ParlaMint-SI-taxonomy-parla.legislature.xml"/> <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ParlaMint-SI-taxonomy.xml-speaker_types"/> ... -</classDecl>
Content model
+</classDecl>
Content model
 <content>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
@@ -1109,18 +1118,18 @@
   <elementRef key="include"/>
  </alternate>
 </content>
-    
Schema Declaration
-element classDecl { ( tei_taxonomy | tei_include )+ }

Appendix A.1.15 <correction>

<correction> (correction principles) states how and under what circumstances corrections have been made in the text. [2.3.3. The Editorial Practices Declaration 15.3.2. Declarable Elements]
Moduleheader — Formal specification
Contained by
May contain
core: p
Note

May be used to note the results of proof reading the text against its original, indicating (for example) whether discrepancies have been silently rectified, or recorded using the editorial tags described in section 3.5. Simple Editorial Changes.

Example
<editorialDecl> +
Schema Declaration
+element classDecl { ( tei_taxonomy | tei_include )+ }

Appendix A.1.15 <correction>

<correction> (correction principles) states how and under what circumstances corrections have been made in the text. [2.3.3. The Editorial Practices Declaration 15.3.2. Declarable Elements]
Moduleheader — Formal specification
Contained by
May contain
core: p
Note

May be used to note the results of proof reading the text against its original, indicating (for example) whether discrepancies have been silently rectified, or recorded using the editorial tags described in section 3.5. Simple Editorial Changes.

Example
<editorialDecl>  <correction>   <p>No correction of source texts was performed.</p>  </correction> -</editorialDecl>
Content model
+</editorialDecl>
Content model
 <content>
  <elementRef key="p" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
-element correction { tei_p+ }

Appendix A.1.16 <date>

<date> (date) contains a date in any format. [3.6.4. Dates and Times 2.2.4. Publication, Distribution, Licensing, etc. 2.6. The Revision Description 3.12.2.4. Imprint, Size of a Document, and Reprint Information 15.2.3. The Setting Description 13.4. Dates]
Modulecore — Formal specification
Attributesatt.typed (@type, @subtype) att.global (n, xml:base, xml:space, @xml:id, @xml:lang) att.global.analytic (@ana) att.datable.w3c (notBefore, notAfter, @when, @from, @to)
Member of
Contained by
analysis: s
corpus: setting
May contain
analysis: pc w
core: date
character data
ExampleThe element <date> gives the date in the when attribute in the ISO 8601 format, while the textual content is not constrained:
<date when="2021-06-08">2021-06-08</date>
ExampleThe textual content can be given according to the conventions used in the local language:
<date when="2018-04-13" xml:lang="sl">13.4.2018</date>
Content model
+    
Schema Declaration
+element correction { tei_p+ }

Appendix A.1.16 <date>

<date> (date) contains a date in any format. [3.6.4. Dates and Times 2.2.4. Publication, Distribution, Licensing, etc. 2.6. The Revision Description 3.12.2.4. Imprint, Size of a Document, and Reprint Information 15.2.3. The Setting Description 13.4. Dates]
Modulecore — Formal specification
Attributesatt.typed (@type, @subtype) att.global (n, xml:base, xml:space, @xml:id, @xml:lang) att.global.analytic (@ana) att.datable.w3c (notBefore, notAfter, @when, @from, @to)
Member of
Contained by
analysis: s
corpus: setting
May contain
analysis: pc w
core: date
character data
ExampleThe element <date> gives the date in the when attribute in the ISO 8601 format, while the textual content is not constrained:
<date when="2021-06-08">2021-06-08</date>
ExampleThe textual content can be given according to the conventions used in the local language:
<date when="2018-04-13" xml:lang="sl">13.4.2018</date>
Content model
 <content>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
@@ -1130,7 +1139,7 @@
   <textNode/>
  </alternate>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element date
 {
    tei_att.global.attribute.xmlid,
@@ -1141,15 +1150,15 @@
    tei_att.datable.w3c.attribute.to,
    tei_att.typed.attributes,
    ( tei_w | tei_pc | tei_date | text )+
-}

Appendix A.1.17 <death>

<death> (death) contains information about a person's death, such as its date and place. [15.2.2. The Participant Description]
Modulenamesdates — Formal specification
Attributes
whensupplies the value of the date or time in a standard form, e.g. yyyy-mm-dd.
Derived fromatt.datable.w3c
StatusRequired
Datatypeteidata.temporal.w3c
Contained by
namesdates: person
May contain
namesdates: placeName
Example
<death when="2020-12-29"/>
Content model
+}

Appendix A.1.17 <death>

<death> (death) contains information about a person's death, such as its date and place. [15.2.2. The Participant Description]
Modulenamesdates — Formal specification
Attributes
whensupplies the value of the date or time in a standard form, e.g. yyyy-mm-dd.
Derived fromatt.datable.w3c
StatusRequired
Datatypeteidata.temporal.w3c
Contained by
namesdates: person
May contain
namesdates: placeName
Example
<death when="2020-12-29"/>
Content model
 <content>
  <alternate minOccurs="1" maxOccurs="1">
   <elementRef key="placeName" minOccurs="0"
    maxOccurs="1"/>
  </alternate>
 </content>
-    
Schema Declaration
-element death { attribute when { text }, ( tei_placeName? ) }

Appendix A.1.18 <desc>

<desc> (description) contains a short description of the purpose, function, or use of its parent element, or when the parent is a documentation element, describes or defines the object being documented. [22.4.1. Description of Components]
Modulecore — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
Member of
Contained by
core: gap
namesdates: org
May contain
core: ref term
character data
Note

When used in a specification element such as <elementSpec>, TEI convention requires that this be expressed as a finite clause, begining with an active verb.

Example
<p>Example of <gi>desc</gi> elements for transcriber comments:</p> +
Schema Declaration
+element death { attribute when { text }, ( tei_placeName? ) }

Appendix A.1.18 <desc>

<desc> (description) contains a short description of the purpose, function, or use of its parent element, or when the parent is a documentation element, describes or defines the object being documented. [22.4.1. Description of Components]
Modulecore — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
Member of
Contained by
core: gap
namesdates: org
May contain
core: ref term
character data
Note

When used in a specification element such as <elementSpec>, TEI convention requires that this be expressed as a finite clause, begining with an active verb.

Example
<p>Example of <gi>desc</gi> elements for transcriber comments:</p> <gap reason="inaudible">  <desc>speaker spoke too quietly, not understood</desc> </gap> @@ -1167,7 +1176,7 @@ <incident type="action">  <desc>minute of silence</desc> -</incident>
ExampleExample of <desc> elements used as a part of taxonomy:
<taxonomy xml:id="parla.legislature"> +</incident>
ExampleExample of <desc> elements used as a part of taxonomy:
<taxonomy xml:id="parla.legislature">  <desc xml:lang="sl">   <term>Zakonodajna oblast</term>  </desc> @@ -1176,7 +1185,7 @@  </desc> ... -</taxonomy>
ExampleElement <desc> can also be used to describe tool(s) used to linguistically annotate the corpus:
<application version="1.0" +</taxonomy>
ExampleElement <desc> can also be used to describe tool(s) used to linguistically annotate the corpus:
<application version="1.0"  ident="reldi-tokeniser">  <label>ReLDI tokeniser</label>  <desc xml:lang="en">Tokenisation and sentence segmentation with ReLDI tokeniser, available from <ref target="https://github.com/clarinsi/reldi-tokeniser">https://github.com/clarinsi/reldi-tokeniser</ref>.</desc> @@ -1187,7 +1196,7 @@ that is being deprecated: that is, only an element that has a @validUntil attribute should have a child <desc type="deprecationInfo">.</sch:assert> -</sch:rule>
Content model
+</sch:rule>
Content model
 <content>
  <sequence minOccurs="1" maxOccurs="1">
   <elementRef minOccurs="0" key="term"/>
@@ -1198,12 +1207,12 @@
   </alternate>
  </sequence>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element desc
 {
    tei_att.global.attribute.xmllang,
    ( tei_term?, ( text | tei_ref )+ )
-}

Appendix A.1.19 <div>

<div> (text division) contains division of the body a corpus component. [4.1. Divisions of the Body]
Moduletextstructure — Formal specification
Attributesatt.global (xml:base, xml:space, @xml:id, @n, @xml:lang) att.global.linking (synch, next, prev, @corresp) att.typed (type, @subtype)
type
StatusRequired
Legal values are:
debateSection
General purpose text division for all parts of parliamentary proceedings. It should include at least one utterance. If needed, the @subtype attribute can be used for additional content classification.
commentSection
A special purpose text division used as a container for transcriber comments. Should not contain any utterances. If needed, the @subtype attribute can be used for additional content classification.
Contained by
textstructure: body
May contain
Example
<div type="debateSection"> +}

Appendix A.1.19 <div>

<div> (text division) contains division of the body a corpus component. [4.1. Divisions of the Body]
Moduletextstructure — Formal specification
Attributesatt.global (xml:base, xml:space, @xml:id, @n, @xml:lang) att.global.linking (synch, next, prev, @corresp) att.typed (type, @subtype)
type
StatusRequired
Legal values are:
debateSection
General purpose text division for all parts of parliamentary proceedings. It should include at least one utterance. If needed, the @subtype attribute can be used for additional content classification.
commentSection
A special purpose text division used as a container for transcriber comments. Should not contain any utterances. If needed, the @subtype attribute can be used for additional content classification.
Contained by
textstructure: body
May contain
Example
<div type="debateSection">  <head>Devolution of Power (Cities)</head>  <u xml:id="ParlaMint-GB_2015-01-06-commons.u1">...</u>  <u xml:id="ParlaMint-GB_2015-01-06-commons.u2">...</u> @@ -1213,7 +1222,7 @@ <sch:report test="(ancestor::tei:l or ancestor::tei:lg) and not(ancestor::tei:floatingText)"> Abstract model violation: Lines may not contain higher-level structural elements such as div, unless div is a descendant of floatingText. </sch:report>
Schematron
<sch:report test="(ancestor::tei:p or ancestor::tei:ab) and not(ancestor::tei:floatingText)"> Abstract model violation: p and ab may not contain higher-level structural elements such as div, unless div is a descendant of floatingText. -</sch:report>
Content model
+</sch:report>
Content model
 <content>
  <elementRef key="head" minOccurs="0"
   maxOccurs="unbounded"/>
@@ -1228,7 +1237,7 @@
   <elementRef key="u"/>
  </alternate>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element div
 {
    tei_att.global.attribute.xmlid,
@@ -1247,20 +1256,20 @@
     | tei_pb
     | tei_u
    )+
-}

Appendix A.1.20 <edition>

<edition> (edition) describes the particularities of one edition of a text. [2.2.2. The Edition Statement]
Moduleheader — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
Contained by
core: bibl
header: editionStmt
May containCharacter data only
Example
<edition>2.1</edition>
Content model
+}

Appendix A.1.20 <edition>

<edition> (edition) describes the particularities of one edition of a text. [2.2.2. The Edition Statement]
Moduleheader — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
Contained by
core: bibl
header: editionStmt
May containCharacter data only
Example
<edition>2.1</edition>
Content model
 <content>
  <textNode/>
 </content>
-    
Schema Declaration
-element edition { tei_att.global.attribute.xmllang, text }

Appendix A.1.21 <editionStmt>

<editionStmt> (edition statement) groups information relating to one edition of a text. [2.2.2. The Edition Statement 2.2. The File Description]
Moduleheader — Formal specification
Contained by
header: fileDesc
May contain
header: edition
Example
<editionStmt> +
Schema Declaration
+element edition { tei_att.global.attribute.xmllang, text }

Appendix A.1.21 <editionStmt>

<editionStmt> (edition statement) groups information relating to one edition of a text. [2.2.2. The Edition Statement 2.2. The File Description]
Moduleheader — Formal specification
Contained by
header: fileDesc
May contain
header: edition
Example
<editionStmt>  <edition>2.1</edition> -</editionStmt>
Content model
+</editionStmt>
Content model
 <content>
  <elementRef key="edition" minOccurs="1"
   maxOccurs="1"/>
 </content>
-    
Schema Declaration
-element editionStmt { tei_edition }

Appendix A.1.22 <editorialDecl>

<editorialDecl> (editorial practice declaration) provides details of editorial principles and practices applied during the encoding of a text. [2.3.3. The Editorial Practices Declaration 2.3. The Encoding Description 15.3.2. Declarable Elements]
Moduleheader — Formal specification
Contained by
header: encodingDesc
May contain
Example
<editorialDecl> +
Schema Declaration
+element editionStmt { tei_edition }

Appendix A.1.22 <editorialDecl>

<editorialDecl> (editorial practice declaration) provides details of editorial principles and practices applied during the encoding of a text. [2.3.3. The Editorial Practices Declaration 2.3. The Encoding Description 15.3.2. Declarable Elements]
Moduleheader — Formal specification
Contained by
header: encodingDesc
May contain
Example
<editorialDecl>  <correction>   <p>No correction of source texts was performed.</p>  </correction> @@ -1276,7 +1285,7 @@  <segmentation>   <p>The texts are segmented into utterances (contributions) and segments (corresponding to paragraphs in the source transcription).</p>  </segmentation> -</editorialDecl>
Content model
+</editorialDecl>
Content model
 <content>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
@@ -1287,7 +1296,7 @@
   <elementRef key="segmentation"/>
  </alternate>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element editorialDecl
 {
    (
@@ -1297,11 +1306,11 @@
     | tei_quotation
     | tei_segmentation
    )+
-}

Appendix A.1.23 <education>

<education> (education) contains a description of the educational experience of a person. [15.2.2. The Participant Description]
Modulenamesdates — Formal specification
Attributesatt.global (xml:id, xml:base, xml:space, @n, @xml:lang) att.datable.w3c (notBefore, notAfter, @when, @from, @to)
Contained by
namesdates: person
May containCharacter data only
Example
<education>Bachelor of Science, Electrical and Information Technology Engineer</education>
Content model
+}

Appendix A.1.23 <education>

<education> (education) contains a description of the educational experience of a person. [15.2.2. The Participant Description]
Modulenamesdates — Formal specification
Attributesatt.global (xml:id, xml:base, xml:space, @n, @xml:lang) att.datable.w3c (notBefore, notAfter, @when, @from, @to)
Contained by
namesdates: person
May containCharacter data only
Example
<education>Bachelor of Science, Electrical and Information Technology Engineer</education>
Content model
 <content>
  <textNode/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element education
 {
    tei_att.global.attribute.n,
@@ -1310,12 +1319,12 @@
    tei_att.datable.w3c.attribute.from,
    tei_att.datable.w3c.attribute.to,
    text
-}

Appendix A.1.24 <email>

<email> (electronic mail address) contains an email address identifying a location to which email messages can be delivered. [3.6.2. Addresses]
Modulecore — Formal specification
Attributesatt.global (n, xml:base, xml:space, @xml:id, @xml:lang) att.global.analytic (@ana)
Member of
Contained by
core: unit
May contain
analysis: pc w
character data
Note

The format of a modern Internet email address is defined in RFC 2822

ExampleThe element can be used for fine-grained Named Entities which include e-mail addresses:
<email ana="ne:me" +}

Appendix A.1.24 <email>

<email> (electronic mail address) contains an email address identifying a location to which email messages can be delivered. [3.6.2. Addresses]
Modulecore — Formal specification
Attributesatt.global (n, xml:base, xml:space, @xml:id, @xml:lang) att.global.analytic (@ana)
Member of
Contained by
core: unit
May contain
analysis: pc w
character data
Note

The format of a modern Internet email address is defined in RFC 2822

ExampleThe element can be used for fine-grained Named Entities which include e-mail addresses:
<email ana="ne:me"  xml:id="ParlaMint-CZ_2014-12-09-ps2013-023-05-003-133.ne87">  <w xml:id="ParlaMint-CZ_2014-12-09-ps2013-023-05-003-133.u4.p9.s3.w13"   lemma="namraza@cd.cz"   msd="UPosTag=NOUN|Case=Gen|Gender=Fem|Number=Plur|Polarity=Pos">namraza@cd.cz</w> -</email>
Content model
+</email>
Content model
 <content>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
@@ -1324,19 +1333,19 @@
   <textNode/>
  </alternate>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element email
 {
    tei_att.global.attribute.xmlid,
    tei_att.global.attribute.xmllang,
    tei_att.global.analytic.attribute.ana,
    ( tei_w | tei_pc | text )+
-}

Appendix A.1.25 <encodingDesc>

<encodingDesc> (encoding description) documents the relationship between an electronic text and the source or sources from which it was derived. [2.3. The Encoding Description 2.1.1. The TEI Header and Its Components]
Moduleheader — Formal specification
Contained by
header: teiHeader
May contain
ExampleGeneral structure of an encoding description:
<encodingDesc> +}

Appendix A.1.25 <encodingDesc>

<encodingDesc> (encoding description) documents the relationship between an electronic text and the source or sources from which it was derived. [2.3. The Encoding Description 2.1.1. The TEI Header and Its Components]
Moduleheader — Formal specification
Contained by
header: teiHeader
May contain
ExampleGeneral structure of an encoding description:
<encodingDesc>  <projectDesc>...</projectDesc>  <editorialDecl>...</editorialDecl>  <tagsDecl>...</tagsDecl>  <classDecl>...</classDecl> -</encodingDesc>
ExampleStructure of an encoding description for unannotated corpus root:
<encodingDesc> +</encodingDesc>
ExampleStructure of an encoding description for unannotated corpus root:
<encodingDesc>  <projectDesc>   <p xml:lang="sl">    <ref target="https://www.clarin.eu/content/parlamint">ParlaMint</ref> @@ -1359,7 +1368,7 @@   </namespace>  </tagsDecl>  <classDecl>...</classDecl> -</encodingDesc>
ExampleExample of encoding description of an annotated corpus root. The structure includes two additional elements, <listPrefixDef> and <appInfo>.
<encodingDesc> +</encodingDesc>
ExampleExample of encoding description of an annotated corpus root. The structure includes two additional elements, <listPrefixDef> and <appInfo>.
<encodingDesc>  <projectDesc>... </projectDesc>  <editorialDecl>...</editorialDecl>  <tagsDecl>...</tagsDecl> @@ -1374,10 +1383,10 @@  <appInfo>   <application>...</application>  </appInfo> -</encodingDesc>
ExampleExample of encoding description of a corpus component (annotated or unannotated). In contrast to the corpus root, the encoding description of a corpus component contains only two elements, namely, the <projectDesc> and the <tagsDecl>.
<encodingDesc> +</encodingDesc>
ExampleExample of encoding description of a corpus component (annotated or unannotated). In contrast to the corpus root, the encoding description of a corpus component contains only two elements, namely, the <projectDesc> and the <tagsDecl>.
<encodingDesc>  <projectDesc>...</projectDesc>  <tagsDecl>...</tagsDecl> -</encodingDesc>
Content model
+</encodingDesc>
Content model
 <content>
  <elementRef key="projectDesc"/>
  <elementRef key="editorialDecl"
@@ -1390,7 +1399,7 @@
  <elementRef key="appInfo" minOccurs="0"
   maxOccurs="1"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element encodingDesc
 {
    tei_projectDesc,
@@ -1399,57 +1408,57 @@
    tei_classDecl?,
    tei_listPrefixDef?,
    tei_appInfo?
-}

Appendix A.1.26 <equipment>

<equipment> (equipment) provides technical details of the equipment and media used for an audio or video recording used as the source for a spoken text. [8.2. Documenting the Source of Transcribed Speech 15.3.2. Declarable Elements]
Modulespoken — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @next, @prev)) (att.global.analytic (@ana)) (att.global.responsibility (@resp)) (att.global.source (@source)) att.declarable (@default)
Contained by
May contain
core: p
Example
<equipment> +}

Appendix A.1.26 <equipment>

<equipment> (equipment) provides technical details of the equipment and media used for an audio or video recording used as the source for a spoken text. [8.2. Documenting the Source of Transcribed Speech 15.3.2. Declarable Elements]
Modulespoken — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @next, @prev)) (att.global.analytic (@ana)) (att.global.responsibility (@resp)) (att.global.source (@source)) att.declarable (@default)
Contained by
May contain
core: p
Example
<equipment>  <p>"Hi-8" 8 mm NTSC camcorder with integral directional    microphone and windshield and stereo digital sound    recording channel.  </p> -</equipment>
Example
<equipment> +</equipment>
Example
<equipment>  <p>8-track analogue transfer mixed down to 19 cm/sec audio    tape for cassette mastering</p> -</equipment>
Content model
+</equipment>
Content model
 <content>
  <classRef key="model.pLike" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element equipment
 {
    tei_att.global.attributes,
    tei_att.declarable.attributes,
    tei_model.pLike+
-}

Appendix A.1.27 <equipment>

<equipment> (equipment) provides technical details of the equipment and media used for an audio or video recording used as the source for a spoken text.
Modulespoken — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @next, @prev)) (att.global.analytic (@ana)) (att.global.responsibility (@resp)) (att.global.source (@source)) att.declarable (@default)
Contained by
May contain
core: p
Example
<equipment> +}

Appendix A.1.27 <equipment>

<equipment> (equipment) provides technical details of the equipment and media used for an audio or video recording used as the source for a spoken text.
Modulespoken — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @next, @prev)) (att.global.analytic (@ana)) (att.global.responsibility (@resp)) (att.global.source (@source)) att.declarable (@default)
Contained by
May contain
core: p
Example
<equipment>  <p>"Hi-8" 8 mm NTSC camcorder with integral directional    microphone and windshield and stereo digital sound    recording channel.  </p> -</equipment>
Content model
+</equipment>
Content model
 <content>
  <classRef key="model.pLike" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element equipment
 {
    tei_att.global.attributes,
    tei_att.declarable.attributes,
    tei_model.pLike+
-}

Appendix A.1.28 <event>

<event> (event) contains data relating to any kind of significant event associated with a person, place, or organisation. [13.3.1. Basic Principles]
Modulenamesdates — Formal specification
Attributesatt.global (n, xml:lang, xml:base, xml:space, @xml:id) att.datable.w3c (notBefore, notAfter, @when, @from, @to)
Contained by
namesdates: listEvent org
May contain
core: label
Example
<event xml:id="PoGB.55" from="2010-05-18" +}

Appendix A.1.28 <event>

<event> (event) contains data relating to any kind of significant event associated with a person, place, or organisation. [13.3.1. Basic Principles]
Modulenamesdates — Formal specification
Attributesatt.global (n, xml:lang, xml:base, xml:space, @xml:id) att.datable.w3c (notBefore, notAfter, @when, @from, @to)
Contained by
namesdates: listEvent org
May contain
core: label
Example
<event xml:id="PoGB.55" from="2010-05-18"  to="2015-03-30">  <label>Fifty-fifth Parliament of the United Kingdom</label> -</event>
Example
<org xml:id="government.HR" +</event>
Example
<org xml:id="government.HR"  role="government">  <orgName xml:lang="hr" full="yes">Vlada Republike Hrvatske</orgName>  <orgName xml:lang="en" full="yes">Government of the Republic of Croatia</orgName>  <event from="1990-05-30">   <label xml:lang="en">existence</label>  </event> -</org>
Content model
+</org>
Content model
 <content>
  <elementRef key="label" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element event
 {
    tei_att.global.attribute.xmlid,
@@ -1457,7 +1466,7 @@
    tei_att.datable.w3c.attribute.from,
    tei_att.datable.w3c.attribute.to,
    tei_label+
-}

Appendix A.1.29 <extent>

<extent> (extent) describes the approximate size of a text stored on some carrier medium or of some other object, digital or non-digital, specified in any convenient units. [2.2.3. Type and Extent of File 2.2. The File Description 3.12.2.4. Imprint, Size of a Document, and Reprint Information 10.7.1. Object Description]
Moduleheader — Formal specification
Contained by
header: fileDesc
May contain
core: measure
Example
<extent> +}

Appendix A.1.29 <extent>

<extent> (extent) describes the approximate size of a text stored on some carrier medium or of some other object, digital or non-digital, specified in any convenient units. [2.2.3. Type and Extent of File 2.2. The File Description 3.12.2.4. Imprint, Size of a Document, and Reprint Information 10.7.1. Object Description]
Moduleheader — Formal specification
Contained by
header: fileDesc
May contain
core: measure
Example
<extent>  <measure unit="speeches" quantity="75122"   xml:lang="sl">75.122 govorov</measure>  <measure unit="speeches" quantity="75122" @@ -1466,29 +1475,29 @@   xml:lang="sl">20.190.034 besed</measure>  <measure unit="words" quantity="20190034"   xml:lang="en">20,190,034 words</measure> -</extent>
Content model
+</extent>
Content model
 <content>
  <elementRef key="measure" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
-element extent { tei_measure+ }

Appendix A.1.30 <figure>

<figure> (figure) groups elements representing or containing graphic information such as an illustration, formula, or figure. [14.4. Specific Elements for Graphic Images]
Modulefigures — Formal specification
Member of
Contained by
namesdates: person
May contain
Example
<figure> +
Schema Declaration
+element extent { tei_measure+ }

Appendix A.1.30 <figure>

<figure> (figure) groups elements representing or containing graphic information such as an illustration, formula, or figure. [14.4. Specific Elements for Graphic Images]
Modulefigures — Formal specification
Member of
Contained by
namesdates: person
May contain
Example
<figure>  <graphic url="https://www.psp.cz/eknih/cdrom/2017ps/eknih/2017ps/poslanci/i6497.jpg"/> -</figure>
Content model
+</figure>
Content model
 <content>
  <elementRef key="head" minOccurs="0"
   maxOccurs="1"/>
  <elementRef key="graphic" minOccurs="1"
   maxOccurs="1"/>
 </content>
-    
Schema Declaration
-element figure { tei_head?, tei_graphic }

Appendix A.1.31 <fileDesc>

<fileDesc> (file description) contains a full bibliographic description of an electronic file. [2.2. The File Description 2.1.1. The TEI Header and Its Components]
Moduleheader — Formal specification
Contained by
header: teiHeader
May contain
Note

The major source of information for those seeking to create a catalogue entry or bibliographic citation for an electronic file. As such, it provides a title and statements of responsibility together with details of the publication or distribution of the file, of any series to which it belongs, and detailed bibliographic notes for matters not addressed elsewhere in the header. It also contains a full bibliographic description for the source or sources from which the electronic text was derived.

ExampleBasic structure of the <fileDesc> element:
<fileDesc> +
Schema Declaration
+element figure { tei_head?, tei_graphic }

Appendix A.1.31 <fileDesc>

<fileDesc> (file description) contains a full bibliographic description of an electronic file. [2.2. The File Description 2.1.1. The TEI Header and Its Components]
Moduleheader — Formal specification
Contained by
header: teiHeader
May contain
Note

The major source of information for those seeking to create a catalogue entry or bibliographic citation for an electronic file. As such, it provides a title and statements of responsibility together with details of the publication or distribution of the file, of any series to which it belongs, and detailed bibliographic notes for matters not addressed elsewhere in the header. It also contains a full bibliographic description for the source or sources from which the electronic text was derived.

ExampleBasic structure of the <fileDesc> element:
<fileDesc>  <titleStmt>...</titleStmt>  <editionStmt>...</editionStmt>  <extent>...</extent>  <publicationStmt>...</publicationStmt>  <sourceDesc>...</sourceDesc> -</fileDesc>
ExampleExample of the <fileDesc> element in a corpus root:
<fileDesc> +</fileDesc>
ExampleExample of the <fileDesc> element in a corpus root:
<fileDesc>  <titleStmt>   <title type="main" xml:lang="en">Dutch parliamentary corpus ParlaMint-NL [ParlaMint]</title>   <title type="main" xml:lang="nl">Corpus van het Nederlandse Parlement ParlaMint-NL [ParlaMint]</title> @@ -1550,7 +1559,7 @@    <date from="2014-04-16" to="2020-10-14">2014-04-16 - 2020-10-14</date>   </bibl>  </sourceDesc> -</fileDesc>
ExampleExample of the <fileDesc> element in a corpus component:
<fileDesc> +</fileDesc>
ExampleExample of the <fileDesc> element in a corpus component:
<fileDesc>  <titleStmt>   <title type="main" xml:lang="en">Dutch parliamentary corpus ParlaMint-NL, Lower House 2014-04-16 [ParlaMint]</title>   <title type="main" xml:lang="nl">Corpus van het Nederlandse parlement ParlaMint-NL, Tweede Kamer 2014-04-16 [ParlaMint]</title> @@ -1602,7 +1611,7 @@    <date when="2014-04-16">2014-04-16</date>   </bibl>  </sourceDesc> -</fileDesc>
Content model
+</fileDesc>
Content model
 <content>
  <elementRef key="titleStmt"/>
  <elementRef key="editionStmt"/>
@@ -1610,7 +1619,7 @@
  <elementRef key="publicationStmt"/>
  <elementRef key="sourceDesc"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element fileDesc
 {
    tei_titleStmt,
@@ -1618,65 +1627,65 @@
    tei_extent,
    tei_publicationStmt,
    tei_sourceDesc
-}

Appendix A.1.32 <forename>

<forename> (forename) contains a forename, given or baptismal name. [13.2.1. Personal Names]
Modulenamesdates — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @next, @prev)) (att.global.analytic (@ana)) (att.global.responsibility (@resp)) (att.global.source (@source)) att.personal (@full) (att.naming (@role) (att.canonical (@key, @ref)) ) att.typed (@type, @subtype)
Member of
Contained by
namesdates: persName
May containCharacter data only
Example
<persName> +}

Appendix A.1.32 <forename>

<forename> (forename) contains a forename, given or baptismal name. [13.2.1. Personal Names]
Modulenamesdates — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @next, @prev)) (att.global.analytic (@ana)) (att.global.responsibility (@resp)) (att.global.source (@source)) att.personal (@full) (att.naming (@role) (att.canonical (@key, @ref)) ) att.typed (@type, @subtype)
Member of
Contained by
namesdates: persName
May containCharacter data only
Example
<persName>  <surname>Bongiorno</surname>  <forename>Giulia</forename> -</persName>
Content model
+</persName>
Content model
 <content>
  <textNode/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element forename
 {
    tei_att.global.attributes,
    tei_att.personal.attributes,
    tei_att.typed.attributes,
    text
-}

Appendix A.1.33 <funder>

<funder> (funding body) specifies the name of an individual, institution, or organisation responsible for the funding of a project or text. [2.2.1. The Title Statement]
Moduleheader — Formal specification
Contained by
header: titleStmt
May contain
core: ref
namesdates: orgName
Note

Funders provide financial support for a project; they are distinct from sponsors (see element <sponsor>), who provide intellectual support and authority.

Example
<funder> +}

Appendix A.1.33 <funder>

<funder> (funding body) specifies the name of an individual, institution, or organisation responsible for the funding of a project or text. [2.2.1. The Title Statement]
Moduleheader — Formal specification
Contained by
header: titleStmt
May contain
core: ref
namesdates: orgName
Note

Funders provide financial support for a project; they are distinct from sponsors (see element <sponsor>), who provide intellectual support and authority.

Example
<funder>  <orgName xml:lang="es">CLARIN infraestructura de investigación científica</orgName>  <orgName xml:lang="en">The CLARIN research infrastructure</orgName> -</funder>
Content model
+</funder>
Content model
 <content>
  <elementRef key="orgName" minOccurs="1"
   maxOccurs="unbounded"/>
  <elementRef key="ref" minOccurs="0"
   maxOccurs="1"/>
 </content>
-    
Schema Declaration
-element funder { tei_orgName+, tei_ref? }

Appendix A.1.34 <gap>

<gap> (gap) indicates a point where material has been omitted in a transcription, whether for editorial reasons described in the TEI header, as part of sampling practice, or because the material is illegible, invisible, or inaudible. [3.5.3. Additions, Deletions, and Omissions]
Modulecore — Formal specification
Attributes
reason
StatusRecommended
Legal values are:
inaudible
editorial
foreign
Member of
Contained by
analysis: s
core: unit
linking: seg
spoken: u
textstructure: div
May contain
core: desc
Note

The <gap>, <unclear>, and <del> core tag elements may be closely allied in use with the <damage> and <supplied> elements, available when using the additional tagset for transcription of primary sources. See section 11.3.3.2. Use of the gap, del, damage, unclear, and supplied Elements in Combination for discussion of which element is appropriate for which circumstance.

The <gap> tag simply signals the editors decision to omit or inability to transcribe a span of text. Other information, such as the interpretation that text was deliberately erased or covered, should be indicated using the relevant tags, such as <del> in the case of deliberate deletion.

Example
<gap reason="inaudible"> +
Schema Declaration
+element funder { tei_orgName+, tei_ref? }

Appendix A.1.34 <gap>

<gap> (gap) indicates a point where material has been omitted in a transcription, whether for editorial reasons described in the TEI header, as part of sampling practice, or because the material is illegible, invisible, or inaudible. [3.5.3. Additions, Deletions, and Omissions]
Modulecore — Formal specification
Attributes
reason
StatusRecommended
Legal values are:
inaudible
editorial
foreign
Member of
Contained by
analysis: s
core: unit
linking: seg
spoken: u
textstructure: div
May contain
core: desc
Note

The <gap>, <unclear>, and <del> core tag elements may be closely allied in use with the <damage> and <supplied> elements, available when using the additional tagset for transcription of primary sources. See section 11.3.3.2. Use of the gap, del, damage, unclear, and supplied Elements in Combination for discussion of which element is appropriate for which circumstance.

The <gap> tag simply signals the editors decision to omit or inability to transcribe a span of text. Other information, such as the interpretation that text was deliberately erased or covered, should be indicated using the relevant tags, such as <del> in the case of deliberate deletion.

Example
<gap reason="inaudible">  <desc>microphone muted</desc> -</gap>
Example
<gap reason="editorial"> +</gap>
Example
<gap reason="editorial">  <desc xml:lang="de">Zitierte Druckfassung entfernt</desc>  <desc xml:lang="en">Quoted printed matter omitted</desc> -</gap>
Example
<gap reason="foreign"> +</gap>
Example
<gap reason="foreign">  <desc xml:lang="und">Huliniahuanngittunga</desc> -</gap>
Content model
+</gap>
Content model
 <content>
  <elementRef key="desc" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element gap
 {
    attribute reason { "inaudible" | "editorial" | "foreign" }?,
    tei_desc+
-}

Appendix A.1.35 <graphic>

<graphic> (graphic) indicates the location of a graphic or illustration, either forming part of a text, or providing an image of it. [3.10. Graphics and Other Non-textual Components 11.1. Digital Facsimiles]
Modulecore — Formal specification
Attributesatt.resourced (@url) att.media (width, height, @scale)
Member of
Contained by
figures: figure
May containEmpty element
Note

The mimeType attribute should be used to supply the MIME media type of the image specified by the url attribute.

Within the body of a text, a <graphic> element indicates the presence of a graphic component in the source itself. Within the context of a <facsimile> or <sourceDoc> element, however, a <graphic> element provides an additional digital representation of some part of the source being encoded.

Example
<figure> +}

Appendix A.1.35 <graphic>

<graphic> (graphic) indicates the location of a graphic or illustration, either forming part of a text, or providing an image of it. [3.10. Graphics and Other Non-textual Components 11.1. Digital Facsimiles]
Modulecore — Formal specification
Attributesatt.resourced (@url) att.media (width, height, @scale)
Member of
Contained by
figures: figure
May containEmpty element
Note

The mimeType attribute should be used to supply the MIME media type of the image specified by the url attribute.

Within the body of a text, a <graphic> element indicates the presence of a graphic component in the source itself. Within the context of a <facsimile> or <sourceDoc> element, however, a <graphic> element provides an additional digital representation of some part of the source being encoded.

Example
<figure>  <graphic url="https://www.dekamer.be//site/wwwroot/images/cv/06595.gif"/> -</figure>
Content model
+</figure>
Content model
 <content>
  <empty/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element graphic
 {
    tei_att.media.attribute.scale,
    tei_att.resourced.attributes,
    empty
-}

Appendix A.1.36 <head>

<head> (heading) contains any type of heading, for example the title of a section, or the heading of a list, glossary, manuscript description, etc. [4.2.1. Headings and Trailers]
Modulecore — Formal specification
Attributesatt.global (n, xml:base, xml:space, @xml:id, @xml:lang) att.typed (subtype, @type)
Contained by
figures: figure
textstructure: div
May containCharacter data only
Note

The <head> element is used for headings at all levels; software which treats (e.g.) chapter headings, section headings, and list titles differently must determine the proper processing of a <head> element based on its structural position. A <head> occurring as the first element of a list is the title of that list; one occurring as the first element of a <div1> is the title of that chapter or section.

ExampleThe most common use for the <head> element is to mark the headings of sections:
<div type="debateSection"> +}

Appendix A.1.36 <head>

<head> (heading) contains any type of heading, for example the title of a section, or the heading of a list, glossary, manuscript description, etc. [4.2.1. Headings and Trailers]
Modulecore — Formal specification
Attributesatt.global (n, xml:base, xml:space, @xml:id, @xml:lang) att.typed (subtype, @type)
Contained by
figures: figure
textstructure: div
May containCharacter data only
Note

The <head> element is used for headings at all levels; software which treats (e.g.) chapter headings, section headings, and list titles differently must determine the proper processing of a <head> element based on its structural position. A <head> occurring as the first element of a list is the title of that list; one occurring as the first element of a <div1> is the title of that chapter or section.

ExampleThe most common use for the <head> element is to mark the headings of sections:
<div type="debateSection">  <head>Regulation of Health and Social Care Professions Etc. Bill [HL]</head> ... -</div>
ExampleThe <head> element may also be used to give the title to specialised lists:
<listEvent> +</div>
ExampleThe <head> element may also be used to give the title to specialised lists:
<listEvent>  <head xml:lang="nl">Zittingsperiode</head>  <head xml:lang="en">Legislative period</head>  <event to="2007-05-02" from="2003-06-05" @@ -1686,46 +1695,46 @@  </event> ... -</listEvent>
Content model
+</listEvent>
Content model
 <content>
  <textNode/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element head
 {
    tei_att.global.attribute.xmlid,
    tei_att.global.attribute.xmllang,
    tei_att.typed.attribute.type,
    text
-}

Appendix A.1.37 <hyphenation>

<hyphenation> (hyphenation) summarizes the way in which hyphenation in a source text has been treated in an encoded version of it. [2.3.3. The Editorial Practices Declaration 15.3.2. Declarable Elements]
Moduleheader — Formal specification
Contained by
May contain
core: p
Example
<editorialDecl> ... +}

Appendix A.1.37 <hyphenation>

<hyphenation> (hyphenation) summarizes the way in which hyphenation in a source text has been treated in an encoded version of it. [2.3.3. The Editorial Practices Declaration 15.3.2. Declarable Elements]
Moduleheader — Formal specification
Contained by
May contain
core: p
Example
<editorialDecl> ... <hyphenation>   <p xml:lang="en">No end-of-line hyphens were present in the source.</p>  </hyphenation> ... -</editorialDecl>
Content model
+</editorialDecl>
Content model
 <content>
  <elementRef key="p" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
-element hyphenation { tei_p+ }

Appendix A.1.38 <idno>

<idno> (identifier) supplies an identifier used to identify some object, such as a person or organisation. If it is a URL, it should have @type="URI". [13.3.1. Basic Principles 2.2.4. Publication, Distribution, Licensing, etc. 2.2.5. The Series Statement 3.12.2.4. Imprint, Size of a Document, and Reprint Information]
Moduleheader — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
type categorizes the identifier.
StatusRequired
Legal values are:
URI
Uniform Resource Identifier. In ParlaMint it should be a resolvable URL, with the subtype classifying the type of web site.
VIAF
The URL assigned by the Virtual International Authority File (VIAF), which links the different names used in catalogs around the world for the same entity.
subtype
StatusOptional
Legal values are:
handle
The permanent identifier of type handle.
government
A governmental web site.
politicalParty
The web site of a political party.
parliament
A web site of the parliament.
ministry
The web site of a ministry.
personal
The personal web site of a person.
business
A web site belonging to a business.
publicService
The web site of a public service.
wikimedia
A web site of Wikimedia, e.g. Wikipedia.
facebook
A Facebook web site.
twitter
A Twitter web site.
tiktok
A TikTok web site.
instagram
An Instagram web site.
Note

This attribute should always be used with type="URI".

Member of
Contained by
core: bibl
namesdates: org person
May containCharacter data only
Note

<idno> should be used for labels which identify an object or concept in a formal cataloguing system such as a database or an RDF store, or in a distributed system such as the World Wide Web. Some suggested values for type on <idno> are ISBN, ISSN, DOI, and URI.

Example
<publicationStmt> ... +
Schema Declaration
+element hyphenation { tei_p+ }

Appendix A.1.38 <idno>

<idno> (identifier) supplies an identifier used to identify some object, such as a person or organisation. If it is a URL, it should have @type="URI". [13.3.1. Basic Principles 2.2.4. Publication, Distribution, Licensing, etc. 2.2.5. The Series Statement 3.12.2.4. Imprint, Size of a Document, and Reprint Information]
Moduleheader — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
type categorizes the identifier.
StatusRequired
Legal values are:
URI
Uniform Resource Identifier. In ParlaMint it should be a resolvable URL, with the subtype classifying the type of web site.
VIAF
The URL assigned by the Virtual International Authority File (VIAF), which links the different names used in catalogs around the world for the same entity.
subtype
StatusOptional
Legal values are:
handle
The permanent identifier of type handle.
government
A governmental web site.
politicalParty
The web site of a political party.
parliament
A web site of the parliament.
ministry
The web site of a ministry.
personal
The personal web site of a person.
business
A web site belonging to a business.
publicService
The web site of a public service.
wikimedia
A web site of Wikimedia, e.g. Wikipedia.
facebook
A Facebook web site.
twitter
A Twitter web site.
tiktok
A TikTok web site.
instagram
An Instagram web site.
Note

This attribute should always be used with type="URI".

Member of
Contained by
core: bibl
namesdates: org person
May containCharacter data only
Note

<idno> should be used for labels which identify an object or concept in a formal cataloguing system such as a database or an RDF store, or in a distributed system such as the World Wide Web. Some suggested values for type on <idno> are ISBN, ISSN, DOI, and URI.

Example
<publicationStmt> ... <idno type="URI" subtype="handle">http://hdl.handle.net/11356/1432</idno> ... -</publicationStmt>
Example
<sourceDesc> +</publicationStmt>
Example
<sourceDesc>  <bibl>   <title type="main" xml:lang="sl">Zapisi sej Državnega zbora Republike Slovenije</title>    ...  <idno type="URI">https://www.dz-rs.si</idno>    ...  </bibl> -</sourceDesc>
Example
<idno type="URI" subtype="wikimedia" +</sourceDesc>
Example
<idno type="URI" subtype="wikimedia"  xml:lang="sl">https://sl.wikipedia.org/wiki/Pozitivna_Slovenija</idno> <idno type="URI" subtype="wikimedia" - xml:lang="en">https://en.wikipedia.org/wiki/Positive_Slovenia</idno>
Content model
xml:lang="en">https://en.wikipedia.org/wiki/Positive_Slovenia</idno>
Content model
 <content>
  <textNode/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element idno
 {
    tei_att.global.attribute.xmllang,
@@ -1747,18 +1756,18 @@
     | "instagram"
    }?,
    text
-}

Appendix A.1.39 <incident>

<incident> (incident) marks any phenomenon or occurrence, not necessarily vocalized or communicative, for example incidental noises or other events affecting communication. [8.3.3. Vocal, Kinesic, Incident]
Modulespoken — Formal specification
Attributesatt.ascribed (@who) att.global (xml:base, xml:space, @xml:id, @n, @xml:lang) att.typed (type, @subtype)
type
StatusRecommended
Legal values are:
action
incident
leaving
entering
break
pause
sound
editorial
Member of
Contained by
analysis: s
core: unit
linking: seg
spoken: u
textstructure: div
May contain
core: desc
Example
<incident type="action"> +}

Appendix A.1.39 <incident>

<incident> (incident) marks any phenomenon or occurrence, not necessarily vocalized or communicative, for example incidental noises or other events affecting communication. [8.3.3. Vocal, Kinesic, Incident]
Modulespoken — Formal specification
Attributesatt.ascribed (@who) att.global (xml:base, xml:space, @xml:id, @n, @xml:lang) att.typed (type, @subtype)
type
StatusRecommended
Legal values are:
action
incident
leaving
entering
break
pause
sound
editorial
Member of
Contained by
analysis: s
core: unit
linking: seg
spoken: u
textstructure: div
May contain
core: desc
Example
<incident type="action">  <desc>He stands and with him the whole Assembly</desc> -</incident>
Example
<incident type="sound"> +</incident>
Example
<incident type="sound">  <desc>The Assembly observed a minute of silence. Applause.</desc> -</incident>
Example
<incident type="entering"> +</incident>
Example
<incident type="entering">  <desc>Arrival of the President of the Republic of Poland</desc> -</incident>
Content model
+</incident>
Content model
 <content>
  <elementRef key="desc" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element incident
 {
    tei_att.global.attribute.xmlid,
@@ -1778,7 +1787,7 @@
     | "editorial"
    }?,
    tei_desc+
-}

Appendix A.1.40 <include>

<include> is an element from the XML namespace of the XML Inclusions (XInclude) W3C recommendation. It is used to include in a ParlaMint <teiCorpus> root file those elements of the corpus that are stored as separate files. These are the <TEI> corpus components and parts of the corpus root <teiHeader>: <listPerson> and <listOrg> inside <particDesc>, and <taxonomy> inside <classDecl>.
Namespacehttp://www.w3.org/2001/XInclude
Modulederived-module-parlamint
Attributes
href
StatusOptional
Datatypeteidata.pointer
Contained by
core: teiCorpus
corpus: particDesc
header: classDecl
May containEmpty element
ExampleUsing XInclude in ParlaMint to include corpus components into the corpus root:
<teiCorpus xml:lang="en" +}

Appendix A.1.40 <include>

<include> is an element from the XML namespace of the XML Inclusions (XInclude) W3C recommendation. It is used to include in a ParlaMint <teiCorpus> root file those elements of the corpus that are stored as separate files. These are the <TEI> corpus components and parts of the corpus root <teiHeader>: <listPerson> and <listOrg> inside <particDesc>, and <taxonomy> inside <classDecl>.
Namespacehttp://www.w3.org/2001/XInclude
Modulederived-module-parlamint
Attributes
href
StatusOptional
Datatypeteidata.pointer
Contained by
core: teiCorpus
corpus: particDesc
header: classDecl
May containEmpty element
ExampleUsing XInclude in ParlaMint to include corpus components into the corpus root:
<teiCorpus xml:lang="en"  xml:id="ParlaMint-GB" xmlns="http://www.tei-c.org/ns/1.0">  <teiHeader> ...TEI header of the corpus...  </teiHeader> @@ -1788,18 +1797,18 @@ href="2015/ParlaMint-GB_2015-01-06-commons.xml"/> ... -</teiCorpus>
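As an illustration of how a tool might discover the component files referenced from a corpus root, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The function name `list_includes` and the reduced sample document are assumptions made for this example; only the XInclude namespace and the `href` value are taken from the documentation above.

```python
import xml.etree.ElementTree as ET

# XInclude namespace, as given in the <include> specification above.
XI_NS = "http://www.w3.org/2001/XInclude"


def list_includes(root_xml: str) -> list[str]:
    """Return the href of every xi:include element, in document order."""
    root = ET.fromstring(root_xml)
    return [
        el.get("href")
        for el in root.iter(f"{{{XI_NS}}}include")
        if el.get("href")  # href is optional in the schema, so skip empty ones
    ]


# Hypothetical, heavily reduced corpus root used only for this sketch.
SAMPLE_ROOT = """<teiCorpus xmlns="http://www.tei-c.org/ns/1.0"
    xmlns:xi="http://www.w3.org/2001/XInclude"
    xml:id="ParlaMint-GB" xml:lang="en">
  <teiHeader/>
  <xi:include href="2015/ParlaMint-GB_2015-01-06-commons.xml"/>
</teiCorpus>"""

print(list_includes(SAMPLE_ROOT))
```

The sketch only lists the references; it does not resolve them. An XInclude-aware parser would be needed to build the fully expanded corpus.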

Appendix A.1.41 <kinesic>

<kinesic> (kinesic) marks any communicative phenomenon, not necessarily vocalized, for example a gesture, frown, etc. [8.3.3. Vocal, Kinesic, Incident]
Modulespoken — Formal specification
Attributesatt.global (xml:base, xml:space, @xml:id, @n, @xml:lang) att.typed (type, @subtype) att.ascribed (@who)
type
StatusRecommended
Legal values are:
kinesic
applause
ringing
signal
playback
gesture
smiling
laughter
snapping
noise
Member of
Contained by
analysis: s
core: unit
linking: seg
spoken: u
textstructure: div
May contain
core: desc
Example
<kinesic type="signal"> +</teiCorpus>

Appendix A.1.41 <kinesic>

<kinesic> (kinesic) marks any communicative phenomenon, not necessarily vocalized, for example a gesture, frown, etc. [8.3.3. Vocal, Kinesic, Incident]
Modulespoken — Formal specification
Attributesatt.global (xml:base, xml:space, @xml:id, @n, @xml:lang) att.typed (type, @subtype) att.ascribed (@who)
type
StatusRecommended
Legal values are:
kinesic
applause
ringing
signal
playback
gesture
smiling
laughter
snapping
noise
Member of
Contained by
analysis: s
core: unit
linking: seg
spoken: u
textstructure: div
May contain
core: desc
Example
<kinesic type="signal">  <desc>sign for the end of discussion</desc> -</kinesic>
Example
<kinesic type="laughter"> +</kinesic>
Example
<kinesic type="laughter">  <desc xml:lang="hr">smijeh.</desc> -</kinesic>
Example
<kinesic type="applause"> +</kinesic>
Example
<kinesic type="applause">  <desc xml:lang="sl">ploskanje</desc> -</kinesic>
Content model
+</kinesic>
Content model
 <content>
  <elementRef key="desc" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element kinesic
 {
    tei_att.global.attribute.xmlid,
@@ -1821,7 +1830,7 @@
     | "noise"
    }?,
    tei_desc+
-}

Appendix A.1.42 <label>

<label> (label) contains any label or heading used to identify part of a text, typically but not exclusively in a list or glossary. [3.8. Lists]
Modulecore — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
Member of
Contained by
header: application
namesdates: event state
May contain
namesdates: orgName
character data
ExampleLabels denote the existence of organisations and connected events:
<org xml:id="DZ" role="parliament" +}

Appendix A.1.42 <label>

<label> (label) contains any label or heading used to identify part of a text, typically but not exclusively in a list or glossary. [3.8. Lists]
Modulecore — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
Member of
Contained by
header: application
namesdates: event state
May contain
namesdates: orgName
character data
ExampleLabels denote the existence of organisations and connected events:
<org xml:id="DZ" role="parliament"  ana="#parla.national #parla.lower">  <orgName xml:lang="sl" full="yes">Državni zbor Republike Slovenije</orgName>  <orgName xml:lang="en" full="yes">National Assembly of the Republic of Slovenia</orgName> @@ -1842,11 +1851,11 @@    <label xml:lang="en">Term 8</label>   </event>  </listEvent> -</org>
ExampleLabels may also be used to give a name to the tools used in compiling the corpus:
<application ident="int-tagger" +</org>
ExampleLabels may also be used to give a name to the tools used in compiling the corpus:
<application ident="int-tagger"  version="1.0">  <label>INT Tagger, lemmatizer and Tokenizer</label>  <desc xml:lang="en">INT Tagger, lemmatizer and Tokenizer for modern Dutch, based on old-school machine learning (SVM). It provides the legacy PoS tags (encoded in w/@ana) and the lemmata for Dutch. Not publicly available.</desc> -</application>
ExampleLabels may also be used for other structured list items:
<listEvent> +</application>
ExampleLabels may also be used for other structured list items:
<listEvent>  <head xml:lang="lv">Saeimas sasaukumi</head>  <head xml:lang="en">Legislative period</head>  <event xml:id="PT.12" from="2014-11-04" @@ -1858,29 +1867,29 @@   <label xml:lang="lv">13. Saeima</label>   <label xml:lang="en">Term 13</label>  </event> -</listEvent>
Content model
+</listEvent>
Content model
 <content>
  <alternate minOccurs="1" maxOccurs="1">
   <textNode/>
   <elementRef key="orgName"/>
  </alternate>
 </content>
-    
Schema Declaration
-element label { tei_att.global.attribute.xmllang, ( text | tei_orgName ) }

Appendix A.1.43 <langUsage>

<langUsage> (language usage) describes the languages, sublanguages, registers, dialects, etc. represented within a text. [2.4.2. Language Usage 2.4. The Profile Description 15.3.2. Declarable Elements]
Moduleheader — Formal specification
Contained by
header: profileDesc
May contain
header: language
Example
<langUsage> +
Schema Declaration
+element label { tei_att.global.attribute.xmllang, ( text | tei_orgName ) }

Appendix A.1.43 <langUsage>

<langUsage> (language usage) describes the languages, sublanguages, registers, dialects, etc. represented within a text. [2.4.2. Language Usage 2.4. The Profile Description 15.3.2. Declarable Elements]
Moduleheader — Formal specification
Contained by
header: profileDesc
May contain
header: language
Example
<langUsage>  <language ident="sl" xml:lang="sl">slovenski</language>  <language ident="en" xml:lang="sl">angleški</language>  <language ident="sl" xml:lang="en">Slovenian</language>  <language ident="en" xml:lang="en">English</language> -</langUsage>
Content model
+</langUsage>
Content model
 <content>
  <elementRef key="language" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
-element langUsage { tei_language+ }

Appendix A.1.44 <language>

<language> (language) characterizes a single language or sublanguage used within a text. [2.4.2. Language Usage]
Moduleheader — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
ident (identifier) supplies a language code constructed as defined in BCP 47 which is used to identify the language documented by this element, and which is referenced by the global xml:lang attribute.
StatusRequired
Datatypeteidata.language
usage specifies the approximate percentage (by volume) of the text which uses this language.
StatusOptional
DatatypenonNegativeInteger
Contained by
header: langUsage
May containCharacter data only
Note

Particularly for sublanguages, an informal prose characterization should be supplied as content for the element.

Example
<langUsage> +
Schema Declaration
+element langUsage { tei_language+ }

Appendix A.1.44 <language>

<language> (language) characterizes a single language or sublanguage used within a text. [2.4.2. Language Usage]
Moduleheader — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
ident (identifier) supplies a language code constructed as defined in BCP 47 which is used to identify the language documented by this element, and which is referenced by the global xml:lang attribute.
StatusRequired
Datatypeteidata.language
usage specifies the approximate percentage (by volume) of the text which uses this language.
StatusOptional
DatatypenonNegativeInteger
Contained by
header: langUsage
May containCharacter data only
Note

Particularly for sublanguages, an informal prose characterization should be supplied as content for the element.

Example
<langUsage>  <language ident="es" xml:lang="es">Español</language>  <language ident="es" xml:lang="en">Spanish</language> -</langUsage>
Example
<langUsage> +</langUsage>
Example
<langUsage>  <language ident="bg-Latn" xml:lang="en">Bulgarian in Latin script</language>  <language ident="bg" xml:lang="bg">български</language>  <language ident="bg" xml:lang="en">Bulgarian</language> @@ -1888,29 +1897,29 @@  <language ident="en" xml:lang="en">English</language>  <language ident="fr" xml:lang="bg">френски</language>  <language ident="fr" xml:lang="en">French</language> -</langUsage>
Content model
+</langUsage>
Content model
 <content>
  <textNode/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element language
 {
    tei_att.global.attribute.xmllang,
    attribute ident { text },
    attribute usage { text }?,
    text
-}

Appendix A.1.45 <licence>

<licence> contains information about a licence or other legal agreement applicable to the text. [2.2.4. Publication, Distribution, Licensing, etc.]
Moduleheader — Formal specification
Contained by
header: availability
May contain
XSD anyURI
Note

A <licence> element should be supplied for each licence agreement applicable to the text in question. The target attribute may be used to reference a full version of the licence. The when, notBefore, notAfter, from or to attributes may be used in combination to indicate the date or dates of applicability of the licence.

ExampleThe <licence> element specifies the fixed-value CC BY 4.0 URL, and the following paragraph gives a prose description of the licence:
<licence>http://creativecommons.org/licenses/by/4.0/</licence> +}

Appendix A.1.45 <licence>

<licence> contains information about a licence or other legal agreement applicable to the text. [2.2.4. Publication, Distribution, Licensing, etc.]
Moduleheader — Formal specification
Contained by
header: availability
May contain
XSD anyURI
Note

A <licence> element should be supplied for each licence agreement applicable to the text in question. The target attribute may be used to reference a full version of the licence. The when, notBefore, notAfter, from or to attributes may be used in combination to indicate the date or dates of applicability of the licence.

ExampleThe <licence> element specifies the fixed-value CC BY 4.0 URL, and the following paragraph gives a prose description of the licence:
<licence>http://creativecommons.org/licenses/by/4.0/</licence> <p>This work is licensed under the <ref target="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ref> -</p>
ExampleThe textual information on the licence can be given in more than one language:
<licence>http://creativecommons.org/licenses/by/4.0/</licence> +</p>
ExampleThe textual information on the licence can be given in more than one language:
<licence>http://creativecommons.org/licenses/by/4.0/</licence> <p xml:lang="hr">Ovaj rad je dostupan pod <ref target="http://creativecommons.org/licenses/by/4.0/">međunarodnom licencom Creative Commons Imenovanje 4.0</ref> </p> <p xml:lang="en">This work is licensed under the <ref target="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ref> -</p>
Content model
+</p>
Content model
 <content>
  <dataRef name="anyURI"/>
 </content>
-    
Schema Declaration
-element licence { xsd:anyURI }

Appendix A.1.47 <linkGrp>

<linkGrp> (link group) defines a collection of associations or hypertextual links. [16.1. Links]
Modulelinking — Formal specification
Attributes
targFunc
StatusRequired
Legal values are:
head argument
type
StatusRequired
Legal values are:
UD-SYN
Member of
Contained by
analysis: s
May contain
linking: link
Note

May contain one or more <link> or <ptr> elements.

A web or link group is an administrative convenience, which should be used to collect a set of links together for any purpose, not simply to supply a default value for the type attribute.

ExampleSyntactic analysis is stored in a link group (the <linkGrp> element), which is composed of <link> elements. For readability, the example below is given without the word-level linguistic attributes and with shortened IDs:
<s xml:id="ParlaMint-GB_2021-01-06.seg393.8"> +
Schema Declaration
+element link { attribute ana { text }, attribute target { list { ? } }, empty }

Appendix A.1.47 <linkGrp>

<linkGrp> (link group) defines a collection of associations or hypertextual links. [16.1. Links]
Modulelinking — Formal specification
Attributes
targFunc
StatusRequired
Legal values are:
head argument
type
StatusRequired
Legal values are:
UD-SYN
Member of
Contained by
analysis: s
May contain
linking: link
Note

May contain one or more <link> or <ptr> elements.

A web or link group is an administrative convenience, which should be used to collect a set of links together for any purpose, not simply to supply a default value for the type attribute.

ExampleSyntactic analysis is stored in a link group (the <linkGrp> element), which is composed of <link> elements. For readability, the example below is given without the word-level linguistic attributes and with shortened IDs:
<s xml:id="ParlaMint-GB_2021-01-06.seg393.8">  <w xml:id="ParlaMint-GB_2021-01-06.seg393.8.1">I</w>  <w xml:id="ParlaMint-GB_2021-01-06.seg393.8.2">support</w>  <w xml:id="ParlaMint-GB_2021-01-06.seg393.8.3">the</w> @@ -1957,18 +1966,18 @@   <link ana="ud-syn:punct"    target="#ParlaMint-GB_2021-01-06.seg393.8.2 #ParlaMint-GB_2021-01-06.seg393.8.5"/>  </linkGrp> -</s>
Content model
+</s>
Content model
 <content>
  <elementRef maxOccurs="unbounded"
   key="link"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element linkGrp
 {
    attribute targFunc { "head argument" },
    attribute type { "UD-SYN" },
    tei_link+
-}
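To make the `targFunc="head argument"` convention concrete, the following sketch reads a sentence and turns each `<link>` into a (relation, head, argument) triple. It is a minimal illustration, not part of the ParlaMint tooling: the function name `dependency_triples`, the sample sentence, its shortened IDs, and the `nsubj` relation in the sample are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"


def dependency_triples(sentence_xml: str) -> list[tuple[str, str, str]]:
    """Extract (relation, head_id, argument_id) from UD-SYN link groups."""
    sentence = ET.fromstring(sentence_xml)
    triples = []
    for group in sentence.iter(TEI + "linkGrp"):
        if group.get("type") != "UD-SYN":
            continue
        for link in group.iter(TEI + "link"):
            # targFunc="head argument": first pointer is the head,
            # second pointer is the argument (dependent).
            head, argument = link.get("target").split()
            # ana="ud-syn:punct" -> relation name after the prefix.
            relation = link.get("ana").partition(":")[2]
            triples.append((relation, head.lstrip("#"), argument.lstrip("#")))
    return triples


# Hypothetical two-word sentence with shortened IDs, for illustration only.
SAMPLE_SENTENCE = """<s xmlns="http://www.tei-c.org/ns/1.0" xml:id="s1">
  <w xml:id="s1.1">I</w>
  <w xml:id="s1.2">support</w>
  <linkGrp targFunc="head argument" type="UD-SYN">
    <link ana="ud-syn:nsubj" target="#s1.2 #s1.1"/>
  </linkGrp>
</s>"""

print(dependency_triples(SAMPLE_SENTENCE))
```

Stripping the leading `#` assumes that all targets are local references, as they are in the example above.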

Appendix A.1.48 <listEvent>

<listEvent> (list of events) contains a list of descriptions, each of which provides information about an identifiable event. [13.3.1. Basic Principles]
Modulenamesdates — Formal specification
Member of
Contained by
namesdates: org
May contain
core: head
namesdates: event
Example
<listEvent> +}

Appendix A.1.48 <listEvent>

<listEvent> (list of events) contains a list of descriptions, each of which provides information about an identifiable event. [13.3.1. Basic Principles]
Modulenamesdates — Formal specification
Member of
Contained by
namesdates: org
May contain
core: head
namesdates: event
Example
<listEvent>  <event xml:id="GOV.11" from="2013-03-20"   to="2014-09-18">   <label xml:lang="sl">11. vlada Republike Slovenije (20. marec 2013 - 18. september 2014)</label> @@ -1979,7 +1988,7 @@   <label xml:lang="sl">14. vlada Republike Slovenije (13. marec 2020 - danes)</label>   <label xml:lang="en">14th Government of the Republic of Slovenia (March 13, 2020 - today)</label>  </event> -</listEvent>
Example
<org ana="#parla.national #parla.upper" +</listEvent>
Example
<org ana="#parla.national #parla.upper"  role="parliament" xml:id="LEG">  <orgName full="yes" xml:lang="it">Senato della Repubblica Italiana</orgName>  <orgName full="yes" xml:lang="it">Senate of the Republic of Italy</orgName> @@ -1995,7 +2004,7 @@    <label xml:lang="en">XVIII Legislative Term</label>   </event>  </listEvent> -</org>
Content model
+</org>
Content model
 <content>
  <sequence minOccurs="1" maxOccurs="1">
   <elementRef key="head" minOccurs="0"
@@ -2004,8 +2013,8 @@
    maxOccurs="unbounded"/>
  </sequence>
 </content>
-    
Schema Declaration
-element listEvent { tei_head*, tei_event* }

Appendix A.1.49 <listOrg>

<listOrg> (list of organizations) contains a list of elements, each of which provides information about an identifiable organisation. [13.2.2. Organizational Names]
Modulenamesdates — Formal specification
Attributesatt.global (n, xml:base, xml:space, @xml:id, @xml:lang)
Member of
Contained by
corpus: particDesc
May contain
core: head
namesdates: listRelation org
Note

The type attribute may be used to distinguish lists of organizations of a particular type if convenient.

Example
<listOrg> +
Schema Declaration
+element listEvent { tei_head*, tei_event* }

Appendix A.1.49 <listOrg>

<listOrg> (list of organizations) contains a list of elements, each of which provides information about an identifiable organisation. [13.2.2. Organizational Names]
Modulenamesdates — Formal specification
Attributesatt.global (n, xml:base, xml:space, @xml:id, @xml:lang)
Member of
Contained by
corpus: particDesc
May contain
core: head
namesdates: listRelation org
Note

The type attribute may be used to distinguish lists of organizations of a particular type if convenient.

Example
<listOrg>  <org xml:id="government.GB"   role="government"> ...  </org> @@ -2017,7 +2026,7 @@ ... <listRelation> ...  </listRelation> -</listOrg>
Content model
+</listOrg>
Content model
 <content>
  <sequence minOccurs="1" maxOccurs="1">
   <elementRef key="head" minOccurs="0"
@@ -2028,13 +2037,13 @@
    minOccurs="0" maxOccurs="1"/>
  </sequence>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element listOrg
 {
    tei_att.global.attribute.xmlid,
    tei_att.global.attribute.xmllang,
    ( tei_head*, tei_org+, tei_listRelation? )
-}

Appendix A.1.50 <listPerson>

<listPerson> (list of persons) contains a list of descriptions, each of which provides information about an identifiable person or a group of people, for example the participants in a language interaction, or the people referred to in a historical source. [13.3.2. The Person Element 15.2. Contextual Information 2.4. The Profile Description 15.3.2. Declarable Elements]
Modulenamesdates — Formal specification
Attributesatt.global (n, xml:base, xml:space, @xml:id, @xml:lang)
Member of
Contained by
corpus: particDesc
May contain
core: head
namesdates: person
Note

The type attribute may be used to distinguish lists of people of a particular type if convenient.

Example
<listPerson> +}

Appendix A.1.50 <listPerson>

<listPerson> (list of persons) contains a list of descriptions, each of which provides information about an identifiable person or a group of people, for example the participants in a language interaction, or the people referred to in a historical source. [13.3.2. The Person Element 15.2. Contextual Information 2.4. The Profile Description 15.3.2. Declarable Elements]
Modulenamesdates — Formal specification
Attributesatt.global (n, xml:base, xml:space, @xml:id, @xml:lang)
Member of
Contained by
corpus: particDesc
May contain
core: head
namesdates: person
Note

The type attribute may be used to distinguish lists of people of a particular type if convenient.

Example
<listPerson>  <head>List of speakers</head>  <person xml:id="SayeedaWarsi"> ...  </person> @@ -2042,7 +2051,7 @@  </person> ... -</listPerson>
Content model
+</listPerson>
Content model
 <content>
  <sequence minOccurs="1" maxOccurs="1">
   <elementRef key="head" minOccurs="0"
@@ -2051,13 +2060,13 @@
    maxOccurs="unbounded"/>
  </sequence>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element listPerson
 {
    tei_att.global.attribute.xmlid,
    tei_att.global.attribute.xmllang,
    ( tei_head*, tei_person+ )
-}

Appendix A.1.51 <listPrefixDef>

<listPrefixDef> (list of prefix definitions) contains a list of definitions of prefixing schemes used in teidata.pointer values, showing how abbreviated URIs using each scheme may be expanded into full URIs. [16.2.3. Using Abbreviated Pointers]
Moduleheader — Formal specification
Contained by
header: encodingDesc
May contain
header: prefixDef
Example: In this example, two private URI scheme prefixes are defined and patterns are provided for dereferencing them. Each prefix is also supplied with a human-readable explanation in a <p> element.
<listPrefixDef> +}

Appendix A.1.51 <listPrefixDef>

<listPrefixDef> (list of prefix definitions) contains a list of definitions of prefixing schemes used in teidata.pointer values, showing how abbreviated URIs using each scheme may be expanded into full URIs. [16.2.3. Using Abbreviated Pointers]
Moduleheader — Formal specification
Contained by
header: encodingDesc
May contain
header: prefixDef
Example: In this example, two private URI scheme prefixes are defined and patterns are provided for dereferencing them. Each prefix is also supplied with a human-readable explanation in a <p> element.
<listPrefixDef>  <prefixDef ident="ud-syn"   matchPattern="(.+)" replacementPattern="#$1">   <p>Private URIs with this prefix point to elements giving their name. In this document they are simply local references into the UD-SYN taxonomy categories in the corpus root TEI header.</p> @@ -2066,13 +2075,13 @@   replacementPattern="#NER.cnec2.0.$1">   <p>Taxonomy for named entities (cnec2.0)</p>  </prefixDef> -</listPrefixDef>
Content model
+</listPrefixDef>
Content model
 <content>
  <elementRef key="prefixDef" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
-element listPrefixDef { tei_prefixDef+ }

Appendix A.1.52 <listRelation>

<listRelation> provides information about relationships identified amongst people, places, and organisations, either informally as prose or as formally expressed relation links. [13.3.2.3. Personal Relationships]
Modulenamesdates — Formal specification
Member of
Contained by
namesdates: listOrg
May contain
namesdates: relation
Note

May contain a prose description organized as paragraphs, or a sequence of <relation> elements.

Example
<listOrg> +
Schema Declaration
+element listPrefixDef { tei_prefixDef+ }

Appendix A.1.52 <listRelation>

<listRelation> provides information about relationships identified amongst people, places, and organisations, either informally as prose or as formally expressed relation links. [13.3.2.3. Personal Relationships]
Modulenamesdates — Formal specification
Member of
Contained by
namesdates: listOrg
May contain
namesdates: relation
Note

May contain a prose description organized as paragraphs, or a sequence of <relation> elements.

Example
<listOrg>  <org role="parliamentaryGroup"   xml:id="party.LD">   <orgName full="yes">Liberal Democrat</orgName> @@ -2102,31 +2111,31 @@   <relation>...</relation>    ...  </listRelation> -</listOrg>
Content model
+</listOrg>
Content model
 <content>
  <elementRef key="relation" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
-element listRelation { tei_relation+ }

Appendix A.1.53 <measure>

<measure> (measure) contains a word or phrase referring to some quantity of an object or commodity, usually comprising a number, a unit, and a commodity name. [3.6.3. Numbers and Measures]
Modulecore — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
unit
StatusRequired
Legal values are:
speeches
words
tokens
optional value
quantity(quantity) specifies the number of the specified units that comprise the measurement
Derived fromatt.measurement
StatusRequired
Datatypeteidata.numeric
Member of
Contained by
header: extent
May containCharacter data only
Example
<measure unit="speeches" quantity="75122" +
Schema Declaration
+element listRelation { tei_relation+ }

Appendix A.1.53 <measure>

<measure> (measure) contains a word or phrase referring to some quantity of an object or commodity, usually comprising a number, a unit, and a commodity name. [3.6.3. Numbers and Measures]
Modulecore — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
unit
StatusRequired
Legal values are:
speeches
words
tokens
optional value
quantity(quantity) specifies the number of the specified units that comprise the measurement
Derived fromatt.measurement
StatusRequired
Datatypeteidata.numeric
Member of
Contained by
header: extent
May containCharacter data only
Example
<measure unit="speeches" quantity="75122"  xml:lang="sl">75.122 govorov</measure> <measure unit="speeches" quantity="75122"  xml:lang="en">75,122 speeches</measure> <measure unit="words" quantity="20190034"  xml:lang="sl">20.190.034 besed</measure> <measure unit="words" quantity="20190034" - xml:lang="en">20,190,034 words</measure>
Content model
xml:lang="en">20,190,034 words</measure>
Content model
 <content>
  <textNode/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element measure
 {
    tei_att.global.attribute.xmllang,
    attribute unit { "speeches" | "words" | "tokens" },
    attribute quantity { text },
    text
-}

Appendix A.1.54 <media>

<media> indicates the location of any form of external media such as an audio or video clip etc. [3.10. Graphics and Other Non-textual Components]
Modulecore — Formal specification
Attributesatt.resourced (@url) att.global (n, xml:lang, xml:base, xml:space, @xml:id) att.global.source (@source)
mimeType(MIME media type) specifies the applicable multimedia internet mail extension (MIME) media type
Derived fromatt.internetMedia
StatusRequired
Datatype1–∞ occurrences of teidata.word separated by whitespace
Member of
Contained by
spoken: recording
May containEmpty element
Note

The attributes available for this element are not appropriate in all cases. For example, it makes no sense to specify the temporal duration of a graphic. Such errors are not currently detected.

The mimeType attribute must be used to specify the MIME media type of the resource specified by the url attribute.

Example
<recording type="audio"> +}

Appendix A.1.54 <media>

<media> indicates the location of any form of external media such as an audio or video clip etc. [3.10. Graphics and Other Non-textual Components]
Modulecore — Formal specification
Attributesatt.resourced (@url) att.global (n, xml:lang, xml:base, xml:space, @xml:id) att.global.source (@source)
mimeType(MIME media type) specifies the applicable multimedia internet mail extension (MIME) media type
Derived fromatt.internetMedia
StatusRequired
Datatype1–∞ occurrences of teidata.word separated by whitespace
Member of
Contained by
spoken: recording
May containEmpty element
Note

The attributes available for this element are not appropriate in all cases. For example, it makes no sense to specify the temporal duration of a graphic. Such errors are not currently detected.

The mimeType attribute must be used to specify the MIME media type of the resource specified by the url attribute.

Example
<recording type="audio">  <media xml:id="ps2013-009-01-001-001.audio1"   mimeType="audio/mp3"   source="https://www.psp.cz/eknih/2013ps/audio/2014/05/07/2014050713581412.mp3" @@ -2141,11 +2150,11 @@   url="2013ps/audio/2014/05/07/2014050714181432.mp3"/> ... -</recording>
Content model
+</recording>
Content model
 <content>
  <empty/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element media
 {
    tei_att.global.attribute.xmlid,
@@ -2153,14 +2162,14 @@
    tei_att.resourced.attributes,
    attribute mimeType { list { + } },
    empty
-}

Appendix A.1.55 <meeting>

<meeting> contains the formalized descriptive title for a meeting or conference, for use in a bibliographic description for an item derived from such a meeting, or as a heading or preamble to publications emanating from it. [3.12.2.2. Titles, Authors, and Editors]
Modulecore — Formal specification
Attributesatt.global (xml:id, xml:base, xml:space, @n, @xml:lang) att.global.linking (synch, next, prev, @corresp) att.global.analytic (@ana)
Contained by
header: titleStmt
May containCharacter data only
Example: The particular sessions that the corpus or corpus component contains are encoded with <meeting>:
<meeting n="7" corresp="#DZ" +}

Appendix A.1.55 <meeting>

<meeting> contains the formalized descriptive title for a meeting or conference, for use in a bibliographic description for an item derived from such a meeting, or as a heading or preamble to publications emanating from it. [3.12.2.2. Titles, Authors, and Editors]
Modulecore — Formal specification
Attributesatt.global (xml:id, xml:base, xml:space, @n, @xml:lang) att.global.linking (synch, next, prev, @corresp) att.global.analytic (@ana)
Contained by
header: titleStmt
May containCharacter data only
Example: The particular sessions that the corpus or corpus component contains are encoded with <meeting>:
<meeting n="7" corresp="#DZ"  ana="#parla.lower #parla.term #DZ.7">7. mandat</meeting> <meeting n="8" corresp="#DZ" - ana="#parla.lower #parla.term #DZ.8">8. mandat</meeting>
Content model
ana="#parla.lower #parla.term #DZ.8">8. mandat</meeting>
Content model
 <content>
  <textNode/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element meeting
 {
    tei_att.global.attribute.n,
@@ -2168,14 +2177,14 @@
    tei_att.global.linking.attribute.corresp,
    tei_att.global.analytic.attribute.ana,
    text
-}

Appendix A.1.56 <name>

<name> (name, proper noun) contains a proper noun or noun phrase. [3.6.1. Referring Strings]
Modulecore — Formal specification
Attributesatt.global (n, xml:base, xml:space, @xml:id, @xml:lang) att.global.analytic (@ana) att.personal (@full) att.canonical (@key, @ref) att.typed (type, @subtype)
type
StatusOptional
Legal values are:
PER
LOC
ORG
MISC
city
country
address
org
place
Member of
Contained by
analysis: s
core: name unit
corpus: setting
header: change
namesdates: placeName
May contain
analysis: pc w
core: date name num pb
character data
Note

Proper nouns referring to people, places, and organizations may be tagged instead with <persName>, <placeName>, or <orgName>, when the TEI module for names and dates is included.

Example: The <name> element is used in the TEI header to specify the location of the parliament:
<name type="place">Westminster</name> +}

Appendix A.1.56 <name>

<name> (name, proper noun) contains a proper noun or noun phrase. [3.6.1. Referring Strings]
Modulecore — Formal specification
Attributesatt.global (n, xml:base, xml:space, @xml:id, @xml:lang) att.global.analytic (@ana) att.personal (@full) att.canonical (@key, @ref) att.typed (type, @subtype)
type
StatusOptional
Legal values are:
PER
LOC
ORG
MISC
city
country
address
org
place
Member of
Contained by
analysis: s
core: name unit
corpus: setting
header: change
namesdates: placeName
May contain
analysis: pc w
core: date name num pb
character data
Note

Proper nouns referring to people, places, and organizations may be tagged instead with <persName>, <placeName>, or <orgName>, when the TEI module for names and dates is included.

Example: The <name> element is used in the TEI header to specify the location of the parliament:
<name type="place">Westminster</name> <name type="city">London</name> -<name type="country" key="GB">U.K.</name>
Example: The element is used in the TEI header to record the person responsible for a change:
<revisionDesc> +<name type="country" key="GB">U.K.</name>
Example: The element is used in the TEI header to record the person responsible for a change:
<revisionDesc>  <change when="2021-06-11">   <name>Tomaž Erjavec</name>: Finalized encoding.</change>  <change when="2021-05-28">   <name>Tomaž Erjavec</name>: Built corpus.</change> -</revisionDesc>
Example: The element is also used to mark up Named Entities in the linguistically analysed corpus, in which case it should have the type attribute with one of the allowed values. It can also have a ref attribute to link it to a definition:
... +</revisionDesc>
Example: The element is also used to mark up Named Entities in the linguistically analysed corpus, in which case it should have the type attribute with one of the allowed values. It can also have a ref attribute to link it to a definition:
... <w lemma="and" msd="UPosTag=CCONJ">and</w> <name type="ORG"  ref="https://en.wikipedia.org/wiki/Westminster"> @@ -2184,7 +2193,7 @@ </name> <w lemma="," msd="UPosTag=PUNCT">,</w> ... -
Content model
+
Content model
 <content>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
@@ -2197,7 +2206,7 @@
   <textNode/>
  </alternate>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element name
 {
    tei_att.global.attribute.xmlid,
@@ -2220,7 +2229,7 @@
     | "place"
    }?,
    ( tei_w | tei_pc | tei_name | tei_date | tei_num | tei_pb | text )+
-}

Appendix A.1.58 <namespace>

<namespace> (namespace) supplies the formal name of the namespace to which the elements documented by its children belong. [2.3.4. The Tagging Declaration]
Moduleheader — Formal specification
Attributes
name
StatusRequired
Legal values are:
http://www.tei-c.org/ns/1.0
Contained by
header: tagsDecl
May contain
header: tagUsage
ExampleTo distinguish the TEI elements from the possible use of elements from other namespaces, a <namespace> element giving the TEI namespace is introduced first:
<tagsDecl> +
Schema Declaration
+element nameLink { text }

Appendix A.1.58 <namespace>

<namespace> (namespace) supplies the formal name of the namespace to which the elements documented by its children belong. [2.3.4. The Tagging Declaration]
Moduleheader — Formal specification
Attributes
name
StatusRequired
Legal values are:
http://www.tei-c.org/ns/1.0
Contained by
header: tagsDecl
May contain
header: tagUsage
ExampleTo distinguish the TEI elements from the possible use of elements from other namespaces, a <namespace> element giving the TEI namespace is introduced first:
<tagsDecl>  <namespace name="http://www.tei-c.org/ns/1.0">   <tagUsage gi="text" occurs="414"/>   <tagUsage gi="body" occurs="414"/> @@ -2249,28 +2258,28 @@   <tagUsage gi="kinesic" occurs="560"/>   <tagUsage gi="desc" occurs="10234"/>  </namespace> -</tagsDecl>
Content model
+</tagsDecl>
Content model
 <content>
  <elementRef key="tagUsage" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element namespace
 {
    attribute name { "http://www.tei-c.org/ns/1.0" },
    tei_tagUsage+
-}

Appendix A.1.59 <normalization>

<normalization> (normalization) indicates the extent of normalization or regularization of the original source carried out in converting it to electronic form. [2.3.3. The Editorial Practices Declaration 15.3.2. Declarable Elements]
Moduleheader — Formal specification
Contained by
May contain
core: p
Example
<editorialDecl> ... +}

Appendix A.1.59 <normalization>

<normalization> (normalization) indicates the extent of normalization or regularization of the original source carried out in converting it to electronic form. [2.3.3. The Editorial Practices Declaration 15.3.2. Declarable Elements]
Moduleheader — Formal specification
Contained by
May contain
core: p
Example
<editorialDecl> ... <normalization>   <p xml:lang="en">Text has not been normalised, except for spacing.</p>  </normalization> ... -</editorialDecl>
Content model
+</editorialDecl>
Content model
 <content>
  <elementRef key="p" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
-element normalization { tei_p+ }

Appendix A.1.60 <note>

<note> (note) contains a note or annotation. [3.9.1. Notes and Simple Annotation 2.2.6. The Notes Statement 3.12.2.8. Notes and Statement of Language 9.3.5.4. Notes within Entries]
Modulecore — Formal specification
Attributesatt.global (xml:base, xml:space, @xml:id, @n, @xml:lang) att.typed (type, @subtype)
type
StatusRecommended
Sample values include:
narrative
Description in the third person of events taking place in the meeting, e.g. "Mr X. takes the Chair".
summary
Summaries of speeches that are individually not interesting, e.g. "Question put and agreed to".
speaker
Name, role and possible description of a person doing the speech
vote
Outcome of a vote
location
The location of the speaker, who was not on the podium
date
Date of the session
president
Chairman of a meeting
comment
Comment of parliamentary reporter
time
Date and time of the beginning and end of the debate
quorum
The presence of the members of parliament
debate
Comments on the conduct of debates
Member of
Contained by
analysis: s
core: unit
linking: seg
namesdates: state
spoken: u
textstructure: div
May contain
core: pb time
character data
Example: The <note> element is used to encode transcriber comments such as who spoke, what the time was, interruptions, notes on what is happening in the chamber, results of voting, etc.:
<note type="speaker">The president, Dr. Milan Brglez:</note> +
Schema Declaration
+element normalization { tei_p+ }

Appendix A.1.60 <note>

<note> (note) contains a note or annotation. [3.9.1. Notes and Simple Annotation 2.2.6. The Notes Statement 3.12.2.8. Notes and Statement of Language 9.3.5.4. Notes within Entries]
Modulecore — Formal specification
Attributesatt.global (xml:base, xml:space, @xml:id, @n, @xml:lang) att.typed (type, @subtype)
type
StatusRecommended
Sample values include:
narrative
Description in the third person of events taking place in the meeting, e.g. "Mr X. takes the Chair".
summary
Summaries of speeches that are individually not interesting, e.g. "Question put and agreed to".
speaker
Name, role and possible description of a person doing the speech
vote
Outcome of a vote
location
The location of the speaker, who was not on the podium
date
Date of the session
president
Chairman of a meeting
comment
Comment of parliamentary reporter
time
Date and time of the beginning and end of the debate
quorum
The presence of the members of parliament
debate
Comments on the conduct of debates
Member of
Contained by
analysis: s
core: unit
linking: seg
namesdates: state
spoken: u
textstructure: div
May contain
core: pb time
character data
Example: The <note> element is used to encode transcriber comments such as who spoke, what the time was, interruptions, notes on what is happening in the chamber, results of voting, etc.:
<note type="speaker">The president, Dr. Milan Brglez:</note> ... <note type="time">The session began at 10 o'clock.</note> ... @@ -2278,13 +2287,13 @@ ... <note type="vote-noes">2 voted against the adoption of the measure.</note> ... -
Example: The <note> element can contain a <time> element to specify the date and time recorded in the note; it can also contain a page break, <pb>:
<note type="time">The session began <pb/> at <time when="2016-04-13T10:00:00">10 o'clock</time>.</note>
Example: The <note> element may also be used to mark any additional information on debate sections:
<div type="debateSection"> +
Example: The <note> element can contain a <time> element to specify the date and time recorded in the note; it can also contain a page break, <pb>:
<note type="time">The session began <pb/> at <time when="2016-04-13T10:00:00">10 o'clock</time>.</note>
Example: The <note> element may also be used to mark any additional information on debate sections:
<div type="debateSection">  <head>Business Before Questions</head>  <note>Death of a Member</note>  <u xml:id="ParlaMint-GB_2019-02-18-commons.u1">...</u> ... <note>End of debateSection.</note> -</div>
Content model
+</div>
Content model
 <content>
  <alternate minOccurs="0"
   maxOccurs="unbounded">
@@ -2293,7 +2302,7 @@
   <elementRef key="time"/>
  </alternate>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element note
 {
    tei_att.global.attribute.xmlid,
@@ -2302,12 +2311,12 @@
    tei_att.typed.attribute.subtype,
    attribute type { text }?,
    ( text | tei_pb | tei_time )*
-}

Appendix A.1.61 <num>

<num> (number) contains a number, written in any form. [3.6.3. Numbers and Measures]
Modulecore — Formal specification
Attributesatt.global (n, xml:base, xml:space, @xml:id, @xml:lang) att.global.analytic (@ana) att.typed (type, @subtype)
typeindicates the type of numeric value.
Derived fromatt.typed
StatusOptional
Datatypeteidata.enumerated
Suggested values include:
cardinal
absolute number, e.g. 21, 21.5
ordinal
ordinal number, e.g. 21st
fraction
fraction, e.g. one half or three-quarters
percentage
a percentage
Note

If a different typology is desired, other values can be used for this attribute.

Member of
Contained by
analysis: s
core: name unit
May contain
analysis: pc w
character data
Note

Detailed analyses of quantities and units of measure in historical documents may also use the feature structure mechanism described in chapter 18. Feature Structures. The <num> element is intended for use in simple applications.

Example: The element can be used for fine-grained Named Entities which include numbers:
<num ana="ne:n_" +}

Appendix A.1.61 <num>

<num> (number) contains a number, written in any form. [3.6.3. Numbers and Measures]
Modulecore — Formal specification
Attributesatt.global (n, xml:base, xml:space, @xml:id, @xml:lang) att.global.analytic (@ana) att.typed (type, @subtype)
typeindicates the type of numeric value.
Derived fromatt.typed
StatusOptional
Datatypeteidata.enumerated
Suggested values include:
cardinal
absolute number, e.g. 21, 21.5
ordinal
ordinal number, e.g. 21st
fraction
fraction, e.g. one half or three-quarters
percentage
a percentage
Note

If a different typology is desired, other values can be used for this attribute.

Member of
Contained by
analysis: s
core: name unit
May contain
analysis: pc w
character data
Note

Detailed analyses of quantities and units of measure in historical documents may also use the feature structure mechanism described in chapter 18. Feature Structures. The <num> element is intended for use in simple applications.

Example: The element can be used for fine-grained Named Entities which include numbers:
<num ana="ne:n_"  xml:id="ParlaMint-CZ_2018-11-13-ps2017-020-09-004-010.ne138">  <w xml:id="ParlaMint-CZ_2018-11-13-ps2017-020-09-004-010.u6.p17.s3.w12"   lemma="428"   msd="UPosTag=NUM|NumForm=Digit|NumType=Card" join="right">428</w> -</num>
Content model
+</num>
Content model
 <content>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
@@ -2316,7 +2325,7 @@
   <textNode/>
  </alternate>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element num
 {
    tei_att.global.attribute.xmlid,
@@ -2325,7 +2334,7 @@
    tei_att.typed.attribute.subtype,
    attribute type { "cardinal" | "ordinal" | "fraction" | "percentage" }?,
    ( tei_w | tei_pc | text )+
-}

Appendix A.1.62 <occupation>

<occupation> (occupation) contains an informal description of a person's trade, profession or occupation. [15.2.2. The Participant Description]
Modulenamesdates — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang) att.datable.w3c (notBefore, notAfter, @when, @from, @to)
Contained by
namesdates: person
May containCharacter data only
Note

The content of this element may be used as an alternative to the more formal specification made possible by its attributes; it may also be used to supplement the formal specification with commentary or clarification.

Example
<person n="2678" xml:id="SimeonovValeri"> +}

Appendix A.1.62 <occupation>

<occupation> (occupation) contains an informal description of a person's trade, profession or occupation. [15.2.2. The Participant Description]
Modulenamesdates — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang) att.datable.w3c (notBefore, notAfter, @when, @from, @to)
Contained by
namesdates: person
May containCharacter data only
Note

The content of this element may be used as an alternative to the more formal specification made possible by its attributes; it may also be used to supplement the formal specification with commentary or clarification.

Example
<person n="2678" xml:id="SimeonovValeri">  <persName xml:lang="bg">   <forename>Валери</forename>   <surname>Симеонов</surname> @@ -2338,11 +2347,11 @@  <occupation>политик</occupation> ... -</person>
Content model
+</person>
Content model
 <content>
  <textNode/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element occupation
 {
    tei_att.global.attribute.xmllang,
@@ -2350,7 +2359,7 @@
    tei_att.datable.w3c.attribute.from,
    tei_att.datable.w3c.attribute.to,
    text
-}

Appendix A.1.63 <org>

<org> (organization) provides information about an identifiable organisation such as the government, political party, ministry etc. [13.3.3. Organizational Data]
Modulenamesdates — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang) att.global.analytic (@ana)
xml:id(identifier) provides a unique identifier for the element bearing the attribute.
Derived fromatt.global
StatusRequired
DatatypeID
role
StatusRequired
Legal values are:
country
federatedState
republic
government
ministry
parliament
politicalParty
parliamentaryGroup
conferenceOfChairs
boardOfParliament
ngo
institution
senate
committee
subcommittee
commission
delegation
supervisoryBoard
workingGroup
interparliamentaryFriendshipGroup
nationalCouncil
chamberOfThePeople
chamberOfTheNations
europeanCommission
europeanParliament
europeanInstitution
internationalOrganisation
boardOfDirectors
ethnicCommunity
Contained by
namesdates: listOrg
May contain
core: desc head
header: idno
Example
<org xml:id="government.BE" +}

Appendix A.1.63 <org>

<org> (organization) provides information about an identifiable organisation such as the government, political party, ministry etc. [13.3.3. Organizational Data]
Modulenamesdates — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang) att.global.analytic (@ana)
xml:id(identifier) provides a unique identifier for the element bearing the attribute.
Derived fromatt.global
StatusRequired
DatatypeID
role
StatusRequired
Legal values are:
country
federatedState
republic
government
ministry
parliament
politicalParty
parliamentaryGroup
conferenceOfChairs
boardOfParliament
ngo
institution
senate
committee
subcommittee
commission
delegation
supervisoryBoard
workingGroup
interparliamentaryFriendshipGroup
nationalCouncil
chamberOfThePeople
chamberOfTheNations
europeanCommission
europeanParliament
europeanInstitution
internationalOrganisation
boardOfDirectors
ethnicCommunity
Contained by
namesdates: listOrg
May contain
core: desc head
header: idno
Example
<org xml:id="government.BE"  role="government">  <orgName xml:lang="en" full="yes">Federal Government of Belgium</orgName>  <orgName xml:lang="nl" full="yes">Federale regering</orgName> @@ -2365,7 +2374,7 @@  </event> ... -</org>
Example
<org xml:id="party.PS2" +</org>
Example
<org xml:id="party.PS2"  role="parliamentaryGroup">  <orgName full="yes" xml:lang="sl">Pozitivna Slovenija</orgName>  <orgName full="yes" xml:lang="en">Positive Slovenia</orgName> @@ -2377,7 +2386,7 @@   subtype="wikimedia">https://sl.wikipedia.org/wiki/Pozitivna_Slovenija</idno>  <idno type="URI" xml:lang="en"   subtype="wikimedia">https://en.wikipedia.org/wiki/Positive_Slovenia</idno> -</org>
Content model
+</org>
Content model
 <content>
  <sequence minOccurs="1" maxOccurs="1">
   <elementRef key="head" minOccurs="0"
@@ -2396,7 +2405,7 @@
    maxOccurs="unbounded"/>
  </sequence>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element org
 {
    tei_att.global.attribute.xmllang,
@@ -2443,19 +2452,19 @@
       tei_listEvent?,
       tei_state*
    )
-}

Appendix A.1.64 <orgName>

<orgName> (organization name) contains an organisational name. [13.2.2. Organizational Names]
Modulenamesdates — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang) att.canonical (key, @ref)
fromindicates the starting point of the period in standard form, e.g. yyyy-mm-dd.
Derived fromatt.datable.w3c
StatusOptional
Datatypeteidata.temporal.w3c
Note

Used when "the same" party changes its name

toindicates the ending point of the period in standard form, e.g. yyyy-mm-dd.
Derived fromatt.datable.w3c
StatusOptional
Datatypeteidata.temporal.w3c
Note

Used when "the same" party changes its name

full
StatusOptional
Legal values are:
yes
abb
Member of
Contained by
header: funder
namesdates: affiliation org
May containCharacter data only
Example
<funder> +}

Appendix A.1.64 <orgName>

<orgName> (organization name) contains an organisational name. [13.2.2. Organizational Names]
Modulenamesdates — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang) att.canonical (key, @ref)
fromindicates the starting point of the period in standard form, e.g. yyyy-mm-dd.
Derived fromatt.datable.w3c
StatusOptional
Datatypeteidata.temporal.w3c
Note

Used when "the same" party changes its name

toindicates the ending point of the period in standard form, e.g. yyyy-mm-dd.
Derived fromatt.datable.w3c
StatusOptional
Datatypeteidata.temporal.w3c
Note

Used when "the same" party changes its name

full
StatusOptional
Legal values are:
yes
abb
Member of
Contained by
header: funder
namesdates: affiliation org
May containCharacter data only
Example
<funder>  <orgName xml:lang="en">The CLARIN research infrastructure</orgName>  <orgName xml:lang="sl">Raziskovalna infrastruktura CLARIN</orgName> -</funder>
Example
<org xml:id="party.PS1" +</funder>
Example
<org xml:id="party.PS1"  role="parliamentaryGroup">  <orgName full="yes" xml:lang="en">Positive Slovenia</orgName>  <orgName full="yes" xml:lang="sl">Pozitivna Slovenija</orgName>  <orgName full="abb" xml:lang="sl">PS</orgName> -</org>
Content model
+</org>
Content model
 <content>
  <textNode/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element orgName
 {
    tei_att.global.attribute.xmllang,
@@ -2464,11 +2473,11 @@
    attribute to { text }?,
    attribute full { "yes" | "abb" }?,
    text
-}

Appendix A.1.65 <p>

<p> (paragraph) marks paragraphs in prose. [3.1. Paragraphs 7.2.5. Speech Contents]
Modulecore — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
Member of
Contained by
May contain
core: ref
character data
Example
<projectDesc> +}

Appendix A.1.65 <p>

<p> (paragraph) marks paragraphs in prose. [3.1. Paragraphs 7.2.5. Speech Contents]
Modulecore — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
Member of
Contained by
May contain
core: ref
character data
Example
<projectDesc>  <p>   <ref target="https://www.clarin.eu/content/parlamint">ParlaMint</ref>  </p> -</projectDesc>
Example
<availability status="free"> +</projectDesc>
Example
<availability status="free">  <licence>http://creativecommons.org/licenses/by/4.0/</licence>  <p>This work is licensed under the <ref target="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ref>.</p>  <p>This work is also licensed under the <ref target="https://www.parliament.uk/site-information/copyright-parliament/open-parliament-licence/">Open Parliament Licence v3.0</ref>.</p> @@ -2480,7 +2489,7 @@ </sch:report>
Schematron
<sch:report test="(ancestor::tei:l or ancestor::tei:lg) and not( ancestor::tei:floatingText |parent::tei:figure |parent::tei:note )"> Abstract model violation: Lines may not contain higher-level structural elements such as div, p, or ab, unless p is a child of figure or note, or is a descendant of floatingText. -</sch:report>
Content model
+</sch:report>
Content model
 <content>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
@@ -2488,12 +2497,12 @@
   <textNode/>
  </alternate>
 </content>
-    
Schema Declaration
-element p { tei_att.global.attribute.xmllang, ( tei_ref | text )+ }

Appendix A.1.66 <particDesc>

<particDesc> (participation description) describes the identifiable speakers and organisations in a ParlaMint corpus. This information is given in the corpus root teiHeader. Note that the listPerson and listOrg elements are typically stored in separate files. [15.2. Contextual Information]
Modulecorpus — Formal specification
Contained by
header: profileDesc
May contain
derived-module-parlamint: include
namesdates: listOrg listPerson
Note

May contain a prose description organized as paragraphs, or a structured list of persons and person groups, with an optional formal specification of any relationships amongst them.

Example
<particDesc> <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" +
Schema Declaration
+element p { tei_att.global.attribute.xmllang, ( tei_ref | text )+ }

Appendix A.1.66 <particDesc>

<particDesc> (participation description) describes the identifiable speakers and organisations in a ParlaMint corpus. This information is given in the corpus root teiHeader. Note that the listPerson and listOrg elements are typically stored in separate files. [15.2. Contextual Information]
Modulecorpus — Formal specification
Contained by
header: profileDesc
May contain
derived-module-parlamint: include
namesdates: listOrg listPerson
Note

May contain a prose description organized as paragraphs, or a structured list of persons and person groups, with an optional formal specification of any relationships amongst them.

Example
<particDesc> <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ParlaMint-SI-listOrg.xml"/> <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ParlaMint-SI-listPerson.xml"/> -</particDesc>
Content model
+</particDesc>
Content model
 <content>
  <sequence minOccurs="1" maxOccurs="1">
   <alternate minOccurs="1" maxOccurs="1">
@@ -2506,22 +2515,22 @@
   </alternate>
  </sequence>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element particDesc
 {
    ( tei_listOrg | tei_include ), ( tei_listPerson | tei_include )
-}

Appendix A.1.67 <pb>

<pb> (page beginning) marks the beginning of a new page in a paginated document. [3.11.3. Milestone Elements]
Modulecore — Formal specification
Attributesatt.global (xml:lang, xml:base, xml:space, @xml:id, @n) att.global.linking (synch, next, prev, @corresp) att.global.source (@source)
Member of
Contained by
analysis: s
core: name note
linking: seg
spoken: u
textstructure: div
May containEmpty element
Note

A <pb> element should appear at the start of the page which it identifies. The global n attribute indicates the number or other value associated with this page. This will normally be the page number or signature printed on it, since the physical sequence number is implicit in the presence of the <pb> element itself.

The type attribute may be used to characterize the page break in any respect. The more specialized attributes break, ed, or edRef should be preferred when the intent is to indicate whether or not the page break is word-breaking, or to note the source from which it derives.

Example
<body> +}

Appendix A.1.67 <pb>

<pb> (page beginning) marks the beginning of a new page in a paginated document. [3.11.3. Milestone Elements]
Modulecore — Formal specification
Attributesatt.global (xml:lang, xml:base, xml:space, @xml:id, @n) att.global.linking (synch, next, prev, @corresp) att.global.source (@source)
Member of
Contained by
analysis: s
core: name note
linking: seg
spoken: u
textstructure: div
May containEmpty element
Note

A <pb> element should appear at the start of the page which it identifies. The global n attribute indicates the number or other value associated with this page. This will normally be the page number or signature printed on it, since the physical sequence number is implicit in the presence of the <pb> element itself.

The type attribute may be used to characterize the page break in any respect. The more specialized attributes break, ed, or edRef should be preferred when the intent is to indicate whether or not the page break is word-breaking, or to note the source from which it derives.

Example
<body>  <div type="debateSection">   <pb source="https://www.psp.cz/eknih/2013ps/stenprot/017schuz/s017357.htm"    n="1"    xml:id="ParlaMint-CZ_2014-10-01-ps2013-017-09-003-036.pb1" corresp="#ps2013-017-09-003-036.audio1"/>    ...  </div> -</body>
Content model
+</body>
Content model
 <content>
  <empty/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element pb
 {
    tei_att.global.attribute.xmlid,
@@ -2529,7 +2538,7 @@
    tei_att.global.linking.attribute.corresp,
    tei_att.global.source.attribute.source,
    empty
-}

Appendix A.1.68 <pc>

<pc> (punctuation character) contains a character or string of characters regarded as constituting a single punctuation mark. [17.1.2. Below the Word Level 17.4.2. Lightweight Linguistic Annotation]
Moduleanalysis — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang) att.global.analytic (@ana) att.linguistic (lemma, msd, @pos, @join) att.lexicographic.normalized (@norm)
xml:id
StatusRequired
DatatypeID
msd
StatusRequired
Datatypeteidata.text
Member of
Contained by
analysis: s
May containCharacter data only
Example
<s> +}

Appendix A.1.68 <pc>

<pc> (punctuation character) contains a character or string of characters regarded as constituting a single punctuation mark. [17.1.2. Below the Word Level 17.4.2. Lightweight Linguistic Annotation]
Moduleanalysis — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang) att.global.analytic (@ana) att.linguistic (lemma, msd, @pos, @join) att.lexicographic.normalized (@norm)
xml:id
StatusRequired
DatatypeID
msd
StatusRequired
Datatypeteidata.text
Member of
Contained by
analysis: s
May containCharacter data only
Example
<s>  <w lemma="I"   msd="UPosTag=PRON|Case=Nom|Number=Sing|Person=1|PronType=Prs" pos="PRP">I</w>  <w lemma="support" @@ -2539,11 +2548,11 @@  <w lemma="amendment"   msd="UPosTag=NOUN|Number=Sing" pos="NN" join="right">amendment</w>  <pc msd="UPosTag=PUNCT" pos=".">.</pc> -</s>
Content model
+</s>
Content model
 <content>
  <textNode/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element pc
 {
    tei_att.global.attribute.xmllang,
@@ -2554,13 +2563,13 @@
    attribute xml:id { text },
    attribute msd { text },
    text
-}

Appendix A.1.69 <persName>

<persName> (personal name) contains a proper noun or proper-noun phrase referring to a person, possibly including one or more of the person's forenames, surnames, honorifics, added names, etc. [13.2.1. Personal Names]
Modulenamesdates — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang) att.datable.w3c (when, notBefore, notAfter, @from, @to) att.canonical (key, @ref)
Member of
Contained by
core: respStmt
namesdates: person
May contain
core: term
character data
Note

Special persons (like 'anonymous', 'group' etc.) have their name in <term>.

Example
<persName> +}

Appendix A.1.69 <persName>

<persName> (personal name) contains a proper noun or proper-noun phrase referring to a person, possibly including one or more of the person's forenames, surnames, honorifics, added names, etc. [13.2.1. Personal Names]
Modulenamesdates — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang) att.datable.w3c (when, notBefore, notAfter, @from, @to) att.canonical (key, @ref)
Member of
Contained by
core: respStmt
namesdates: person
May contain
core: term
character data
Note

Special persons (like 'anonymous', 'group' etc.) have their name in <term>.

Example
<persName>  <surname>Broekers-Knol</surname>  <forename>Ankie</forename> -</persName>
Example
<respStmt> +</persName>
Example
<respStmt>  <persName>Matthew Coole</persName>  <resp>TEI corpus encoding</resp> -</respStmt>
Content model
+</respStmt>
Content model
 <content>
  <alternate minOccurs="1" maxOccurs="1">
   <alternate minOccurs="1"
@@ -2585,7 +2594,7 @@
   </alternate>
  </alternate>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element persName
 {
    tei_att.global.attribute.xmllang,
@@ -2603,7 +2612,7 @@
     | tei_term+
     | ( text )
    )
-}

Appendix A.1.70 <person>

<person> (person) provides information about an identifiable individual, for example a participant in a language interaction, or a person referred to in a historical source. [13.3.2. The Person Element 15.2.2. The Participant Description]
Modulenamesdates — Formal specification
Attributesatt.global (xml:base, xml:space, @xml:id, @n, @xml:lang)
Contained by
namesdates: listPerson
May contain
Note

May contain either a prose description organized as paragraphs, or a sequence of more specific demographic elements drawn from the model.personPart class.

Example
<person xml:id="AliciaKearns"> +}

Appendix A.1.70 <person>

<person> (person) provides information about an identifiable individual, for example a participant in a language interaction, or a person referred to in a historical source. [13.3.2. The Person Element 15.2.2. The Participant Description]
Modulenamesdates — Formal specification
Attributesatt.global (xml:base, xml:space, @xml:id, @n, @xml:lang)
Contained by
namesdates: listPerson
May contain
Note

May contain either a prose description organized as paragraphs, or a sequence of more specific demographic elements drawn from the model.personPart class.

Example
<person xml:id="AliciaKearns">  <persName>   <forename>Alicia</forename>   <forename>Alexandra Martha</forename> @@ -2615,7 +2624,7 @@  <affiliation from="2019-12-12"   ref="#party.CON" role="member"/>  <idno subtype="contact" type="URI">https://members.parliament.uk/member/4805/contact</idno> -</person>
Example
<person xml:id="AdamowiczPiotr"> +</person>
Example
<person xml:id="AdamowiczPiotr">  <persName>   <forename>Piotr</forename>   <surname>Adamowicz</surname> @@ -2623,7 +2632,7 @@  <birth when="1961-06-26">26.06.1961</birth>  <sex value="M"/>  <affiliation role="member" ref="#party.KO"/> -</person>
Content model
+</person>
Content model
 <content>
  <sequence minOccurs="1" maxOccurs="1">
   <elementRef key="persName" minOccurs="1"
@@ -2649,7 +2658,7 @@
   </alternate>
  </sequence>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element person
 {
    tei_att.global.attribute.xmlid,
@@ -2668,11 +2677,11 @@
        | tei_figure*
       )*
    )
-}

Appendix A.1.71 <placeName>

<placeName> (place name) contains an absolute or relative place name. [13.2.3. Place Names]
Modulenamesdates — Formal specification
Attributesatt.canonical (key, @ref)
Member of
Contained by
namesdates: birth death
May contain
core: name
character data
Example
<birth when="1966-03-22"> +}

Appendix A.1.71 <placeName>

<placeName> (place name) contains an absolute or relative place name. [13.2.3. Place Names]
Modulenamesdates — Formal specification
Attributesatt.canonical (key, @ref)
Member of
Contained by
namesdates: birth death
May contain
core: name
character data
Example
<birth when="1966-03-22">  <placeName ref="https://www.geonames.org/2523918">Palermo</placeName> -</birth>
Example
<birth when="1952-06-06"> +</birth>
Example
<birth when="1952-06-06">  <placeName>Tours-Saint-Symphorien, Indre-et-Loire</placeName> -</birth>
Content model
+</birth>
Content model
 <content>
  <alternate minOccurs="1" maxOccurs="1">
   <elementRef key="name" minOccurs="0"
@@ -2680,28 +2689,28 @@
   <textNode/>
  </alternate>
 </content>
-    
Schema Declaration
-element placeName { tei_att.canonical.attribute.ref, ( tei_name? | text ) }

Appendix A.1.72 <prefixDef>

<prefixDef> (prefix definition) defines a prefixing scheme used in teidata.pointer values, showing how abbreviated URIs using the scheme may be expanded into full URIs. [16.2.3. Using Abbreviated Pointers]
Moduleheader — Formal specification
Attributes
matchPatternspecifies a regular expression against which the values of other attributes can be matched.
Derived fromatt.patternReplacement
StatusRequired
Datatypeteidata.pattern
replacementPatternspecifies a ‘replacement pattern’, that is, the skeleton of a relative or absolute URI containing references to groups in the matchPattern which, once subpattern substitution has been performed, complete the URI.
Derived fromatt.patternReplacement
StatusRequired
Datatypeteidata.replacement
Note

Using TEI-defined XPointer schemes is not allowed.

identsupplies a name which functions as the prefix for an abbreviated pointing scheme such as a private URI scheme. The prefix constitutes the text preceding the first colon.
StatusRequired
Datatypeteidata.prefix
Note

The value is limited to teidata.prefix so that it may be mapped directly to a URI prefix.

Contained by
May contain
core: p
Note

The abbreviated pointer may be dereferenced to produce either an absolute or a relative URI reference. In the latter case it is combined with the value of xml:base in force at the place where the pointing attribute occurs to form an absolute URI in the usual manner as prescribed by XML Base.

Example
<prefixDef ident="mte" matchPattern="(.+)" +
Schema Declaration
+element placeName { tei_att.canonical.attribute.ref, ( tei_name? | text ) }

Appendix A.1.72 <prefixDef>

<prefixDef> (prefix definition) defines a prefixing scheme used in teidata.pointer values, showing how abbreviated URIs using the scheme may be expanded into full URIs. [16.2.3. Using Abbreviated Pointers]
Moduleheader — Formal specification
Attributes
matchPatternspecifies a regular expression against which the values of other attributes can be matched.
Derived fromatt.patternReplacement
StatusRequired
Datatypeteidata.pattern
replacementPatternspecifies a ‘replacement pattern’, that is, the skeleton of a relative or absolute URI containing references to groups in the matchPattern which, once subpattern substitution has been performed, complete the URI.
Derived fromatt.patternReplacement
StatusRequired
Datatypeteidata.replacement
Note

Using TEI-defined XPointer schemes is not allowed.

identsupplies a name which functions as the prefix for an abbreviated pointing scheme such as a private URI scheme. The prefix constitutes the text preceding the first colon.
StatusRequired
Datatypeteidata.prefix
Note

The value is limited to teidata.prefix so that it may be mapped directly to a URI prefix.

Contained by
May contain
core: p
Note

The abbreviated pointer may be dereferenced to produce either an absolute or a relative URI reference. In the latter case it is combined with the value of xml:base in force at the place where the pointing attribute occurs to form an absolute URI in the usual manner as prescribed by XML Base.

Example
<prefixDef ident="mte" matchPattern="(.+)"  replacementPattern="http://nl.ijs.si/ME/V6/msd/tables/msd-fslib-hbs.xml#$1">  <p xml:lang="en">Private URIs with this prefix point to feature-structure elements defining the Serbocroatian MULTEXT-East Version 6 MSDs.</p> -</prefixDef>
Content model
+</prefixDef>
Content model
 <content>
  <elementRef key="p" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element prefixDef
 {
    attribute matchPattern { text },
    attribute replacementPattern { text },
    attribute ident { text },
    tei_p+
-}

Appendix A.1.73 <profileDesc>

<profileDesc> (text-profile description) provides a detailed description of non-bibliographic aspects of a text, specifically the languages and sublanguages used, the situation in which it was produced, the participants and their setting. [2.4. The Profile Description 2.1.1. The TEI Header and Its Components]
Moduleheader — Formal specification
Contained by
header: teiHeader
May contain
Note

Although the content model permits it, it is rarely meaningful to supply multiple occurrences for any of the child elements of <profileDesc> unless these are documenting multiple texts.

ExampleGeneral structure of the element <profileDesc>:
<profileDesc> +}

Appendix A.1.73 <profileDesc>

<profileDesc> (text-profile description) provides a detailed description of non-bibliographic aspects of a text, specifically the languages and sublanguages used, the situation in which it was produced, the participants and their setting. [2.4. The Profile Description 2.1.1. The TEI Header and Its Components]
Moduleheader — Formal specification
Contained by
header: teiHeader
May contain
Note

Although the content model permits it, it is rarely meaningful to supply multiple occurrences for any of the child elements of <profileDesc> unless these are documenting multiple texts.

ExampleGeneral structure of the element <profileDesc>:
<profileDesc>  <settingDesc>...</settingDesc>  <textClass>...</textClass>  <particDesc>...</particDesc>  <langUsage>...</langUsage> -</profileDesc>
ExampleProfile description of a corpus root:
<profileDesc> +</profileDesc>
ExampleProfile description of a corpus root:
<profileDesc>  <settingDesc>   <setting>    <name type="address">Šubičeva ulica 4</name> @@ -2730,7 +2739,7 @@    <language ident="en" xml:lang="en">English</language>   </langUsage>  </langUsage> -</profileDesc>
ExampleProfile description for a corpus component. In contrast to the corpus root, only the first child element, <settingDesc>, is used in corpus components.
<profileDesc> +</profileDesc>
ExampleProfile description for a corpus component. In contrast to the corpus root, only the first child element, <settingDesc>, is used in corpus components.
<profileDesc>  <settingDesc>   <setting>    <name type="city">Ljubljana</name> @@ -2739,7 +2748,7 @@     ana="#parla.sitting">28.8.2014</date>   </setting>  </settingDesc> -</profileDesc>
Content model
+</profileDesc>
Content model
 <content>
  <elementRef key="settingDesc"/>
  <elementRef key="textClass" minOccurs="0"
@@ -2749,37 +2758,37 @@
  <elementRef key="langUsage" minOccurs="0"
   maxOccurs="1"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element profileDesc
 {
    tei_settingDesc,
    tei_textClass?,
    tei_particDesc?,
    tei_langUsage?
-}

Appendix A.1.74 <projectDesc>

<projectDesc> (project description) describes in detail the aim or purpose for which an electronic file was encoded, together with any other relevant information concerning the process by which it was assembled or collected. [2.3.1. The Project Description 2.3. The Encoding Description 15.3.2. Declarable Elements]
Moduleheader — Formal specification
Contained by
header: encodingDesc
May contain
core: p
Example
<projectDesc> +}

Appendix A.1.74 <projectDesc>

<projectDesc> (project description) describes in detail the aim or purpose for which an electronic file was encoded, together with any other relevant information concerning the process by which it was assembled or collected. [2.3.1. The Project Description 2.3. The Encoding Description 15.3.2. Declarable Elements]
Moduleheader — Formal specification
Contained by
header: encodingDesc
May contain
core: p
Example
<projectDesc>  <p xml:lang="sl">Glavni cilji projekta <ref target="https://www.clarin.eu/content/parlamint">ParlaMint</ref> so    (1) izdelati večjezično množico na enak način kodiranih korpusov    zapiskov parlamentarnih sej, ...</p>  <p xml:lang="en">The <ref target="https://www.clarin.eu/content/parlamint">ParlaMint</ref>    project aims to (1) create a multilingual set of uniformly encoded    comparable corpora of parliamentary proceedings, ...</p> -</projectDesc>
Content model
+</projectDesc>
Content model
 <content>
  <elementRef key="p" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
-element projectDesc { tei_p+ }

Appendix A.1.75 <pubPlace>

<pubPlace> (publication place) contains the name of the place where a bibliographic item was published. [3.12.2.4. Imprint, Size of a Document, and Reprint Information]
Modulecore — Formal specification
Contained by
May contain
core: ref
character data
Example
<pubPlace> +
Schema Declaration
+element projectDesc { tei_p+ }

Appendix A.1.75 <pubPlace>

<pubPlace> (publication place) contains the name of the place where a bibliographic item was published. [3.12.2.4. Imprint, Size of a Document, and Reprint Information]
Modulecore — Formal specification
Contained by
May contain
core: ref
character data
Example
<pubPlace>  <ref target="https://github.com/clarin-eric/ParlaMint">https://github.com/clarin-eric/ParlaMint</ref> -</pubPlace>
Content model
+</pubPlace>
Content model
 <content>
  <alternate minOccurs="1" maxOccurs="1">
   <elementRef key="ref"/>
   <textNode/>
  </alternate>
 </content>
-    
Schema Declaration
-element pubPlace { tei_ref | text }

Appendix A.1.76 <publicationStmt>

<publicationStmt> (publication statement) groups information concerning the publication or distribution of an electronic or other text. [2.2.4. Publication, Distribution, Licensing, etc. 2.2. The File Description]
Moduleheader — Formal specification
Contained by
header: fileDesc
May contain
Note

Where a publication statement contains several members of the model.publicationStmtPart.agency or model.publicationStmtPart.detail classes rather than one or more paragraphs or anonymous blocks, care should be taken to ensure that the repeated elements are presented in a meaningful order. It is a conformance requirement that elements supplying information about publication place, address, identifier, availability, and date be given following the name of the publisher, distributor, or authority concerned, and preferably in that order.

Example
<publicationStmt> +
Schema Declaration
+element pubPlace { tei_ref | text }

Appendix A.1.76 <publicationStmt>

<publicationStmt> (publication statement) groups information concerning the publication or distribution of an electronic or other text. [2.2.4. Publication, Distribution, Licensing, etc. 2.2. The File Description]
Moduleheader — Formal specification
Contained by
header: fileDesc
May contain
Note

Where a publication statement contains several members of the model.publicationStmtPart.agency or model.publicationStmtPart.detail classes rather than one or more paragraphs or anonymous blocks, care should be taken to ensure that the repeated elements are presented in a meaningful order. It is a conformance requirement that elements supplying information about publication place, address, identifier, availability, and date be given following the name of the publisher, distributor, or authority concerned, and preferably in that order.

Example
<publicationStmt>  <publisher>   <orgName xml:lang="sl">Raziskovalna infrastrukutra CLARIN</orgName>   <orgName xml:lang="en">CLARIN research infrastructure</orgName> @@ -2792,7 +2801,7 @@   <p xml:lang="en">This work is licensed under the <ref target="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ref>.</p>  </availability>  <date when="2021-06-11">11. 6. 2021</date> -</publicationStmt>
Content model
+</publicationStmt>
Content model
 <content>
  <elementRef key="publisher"/>
  <elementRef key="idno"/>
@@ -2801,7 +2810,7 @@
  <elementRef key="availability"/>
  <elementRef key="date"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element publicationStmt
 {
    tei_publisher,
@@ -2809,10 +2818,10 @@
    tei_pubPlace?,
    tei_availability,
    tei_date
-}

Appendix A.1.77 <publisher>

<publisher> (publisher) provides the name of the organisation responsible for the publication or distribution of a bibliographic item. [3.12.2.4. Imprint, Size of a Document, and Reprint Information 2.2.4. Publication, Distribution, Licensing, etc.]
Modulecore — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
Contained by
core: bibl
May contain
core: ref
namesdates: orgName
character data
Note

Use the full form of the name by which a company is usually referred to, rather than any abbreviation of it which may appear on a title page

Example
<publisher> +}

Appendix A.1.77 <publisher>

<publisher> (publisher) provides the name of the organisation responsible for the publication or distribution of a bibliographic item. [3.12.2.4. Imprint, Size of a Document, and Reprint Information 2.2.4. Publication, Distribution, Licensing, etc.]
Modulecore — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
Contained by
core: bibl
May contain
core: ref
namesdates: orgName
character data
Note

Use the full form of the name by which a company is usually referred to, rather than any abbreviation of it which may appear on a title page

Example
<publisher>  <orgName>CLARIN research infrastructure</orgName>  <ref target="https://www.clarin.eu/">www.clarin.eu</ref> -</publisher>
Content model
+</publisher>
Content model
 <content>
  <alternate minOccurs="1" maxOccurs="1">
   <sequence minOccurs="1" maxOccurs="1">
@@ -2824,34 +2833,34 @@
   <textNode/>
  </alternate>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element publisher
 {
    tei_att.global.attribute.xmllang,
    ( ( tei_orgName+, tei_ref? ) | text )
-}

Appendix A.1.78 <quotation>

<quotation> (quotation) specifies editorial practice adopted with respect to quotation marks in the original. [2.3.3. The Editorial Practices Declaration 15.3.2. Declarable Elements]
Moduleheader — Formal specification
Contained by
May contain
core: p
Example
<editorialDecl> ... +}

Appendix A.1.78 <quotation>

<quotation> (quotation) specifies editorial practice adopted with respect to quotation marks in the original. [2.3.3. The Editorial Practices Declaration 15.3.2. Declarable Elements]
Moduleheader — Formal specification
Contained by
May contain
core: p
Example
<editorialDecl> ... <quotation>   <p xml:lang="en">Quotation marks have been left in the text and are not explicitly marked up.</p>  </quotation> </editorialDecl>
Schematron
-<sch:report test="not(@marks) and not (tei:p)">On <sch:name/>, either the @marks attribute should be used, or a paragraph of description provided</sch:report>
Content model
+<sch:report test="not(@marks) and not (tei:p)">On <sch:name/>, either the @marks attribute should be used, or a paragraph of description provided</sch:report>
Content model
 <content>
  <elementRef key="p" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
-element quotation { tei_p+ }

Appendix A.1.79 <recording>

<recording> (recording event) provides details of an audio or video recording event used as the source of a spoken text, either directly or from a public broadcast. [8.2. Documenting the Source of Transcribed Speech 15.3.2. Declarable Elements]
Modulespoken — Formal specification
Attributes
typethe kind of recording.
Derived fromatt.typed
StatusOptional
Datatypeteidata.enumerated
Legal values are:
audio
audio recording[Default]
video
audio and video recording
Contained by
May contain
core: media
Note

The dur attribute is used to indicate the original duration of the recording.

Example
<recording type="audio"> +
Schema Declaration
+element quotation { tei_p+ }

Appendix A.1.79 <recording>

<recording> (recording event) provides details of an audio or video recording event used as the source of a spoken text, either directly or from a public broadcast. [8.2. Documenting the Source of Transcribed Speech 15.3.2. Declarable Elements]
Modulespoken — Formal specification
Attributes
typethe kind of recording.
Derived fromatt.typed
StatusOptional
Datatypeteidata.enumerated
Legal values are:
audio
audio recording[Default]
video
audio and video recording
Contained by
May contain
core: media
Note

The dur attribute is used to indicate the original duration of the recording.

Example
<recording type="audio">  <media xml:id="ps2013-044-02-000-000.audio1"   mimeType="audio/mp3"   source="https://www.psp.cz/eknih/2013ps/audio/2016/04/13/2016041308580912.mp3"   url="2013ps/audio/2016/04/13/2016041308580912.mp3"/> -</recording>
Content model
+</recording>
Content model
 <content>
  <elementRef key="media" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
-element recording { attribute type { "audio" | "video" }?, tei_media+ }

Appendix A.1.80 <recordingStmt>

<recordingStmt> (recording statement) describes a set of recordings used as the basis for transcription of a spoken text. [8.2. Documenting the Source of Transcribed Speech 2.2.7. The Source Description]
Modulespoken — Formal specification
Contained by
header: sourceDesc
May contain
spoken: recording
Example
<recordingStmt> +
Schema Declaration
+element recording { attribute type { "audio" | "video" }?, tei_media+ }

Appendix A.1.80 <recordingStmt>

<recordingStmt> (recording statement) describes a set of recordings used as the basis for transcription of a spoken text. [8.2. Documenting the Source of Transcribed Speech 2.2.7. The Source Description]
Modulespoken — Formal specification
Contained by
header: sourceDesc
May contain
spoken: recording
Example
<recordingStmt>  <recording type="audio">   <media xml:id="ps2017-020-09-004-010.audio1"    mimeType="audio/mp3" @@ -2867,13 +2876,13 @@    url="2017ps/audio/2018/11/13/2018111318281842.mp3"/>    ...  </recording> -</recordingStmt>
Content model
+</recordingStmt>
Content model
 <content>
  <elementRef key="recording" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
-element recordingStmt { tei_recording+ }

Appendix A.1.81 <ref>

<ref> (reference) defines a reference to another location, possibly modified by additional text or comment. [3.7. Simple Links and Cross-References 16.1. Links]
Modulecore — Formal specification
Attributes
targetspecifies the destination of the reference by supplying one or more URI References
Derived fromatt.pointing
StatusRecommended
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
Member of
Contained by
May containCharacter data only
Note

The target and cRef attributes are mutually exclusive.

Example
<projectDesc> +
Schema Declaration
+element recordingStmt { tei_recording+ }

Appendix A.1.81 <ref>

<ref> (reference) defines a reference to another location, possibly modified by additional text or comment. [3.7. Simple Links and Cross-References 16.1. Links]
Modulecore — Formal specification
Attributes
targetspecifies the destination of the reference by supplying one or more URI References
Derived fromatt.pointing
StatusRecommended
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
Member of
Contained by
May containCharacter data only
Note

The target and cRef attributes are mutually exclusive.

Example
<projectDesc>  <p>   <ref target="https://www.clarin.eu/content/parlamint">ParlaMint</ref> is a    project that aims to create a multilingual set of comparable corpora of @@ -2882,26 +2891,26 @@ </projectDesc>
Schematron
<sch:report test="@target and @cRef">Only one of the attributes @target' and @cRef' may be supplied on <sch:name/> -</sch:report>
Content model
+</sch:report>
Content model
 <content>
  <textNode/>
 </content>
-    
Schema Declaration
-element ref { attribute target { list { + } }?, text }

Appendix A.1.82 <relation>

<relation> (relationship) describes a relationship between two organisations. [13.3.2.3. Personal Relationships]
Modulenamesdates — Formal specification
Attributesatt.global.analytic (@ana) att.datable.w3c (notBefore, notAfter, @when, @from, @to)
name
StatusRequired
Legal values are:
coalition
opposition
renaming
successor
representing
activeidentifies the ‘active’ participants in a non-mutual relationship, or all the participants in a mutual one.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
mutualsupplies a list of participants amongst all of whom the relationship holds equally.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
passiveidentifies the ‘passive’ participants in a non-mutual relationship.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
Contained by
namesdates: listRelation
May containEmpty element
Note

Only one of the attributes active and mutual may be supplied; the attribute passive may be supplied only if the attribute active is supplied. Not all of these constraints can be enforced in all schema languages.

ExampleSpecification of coalition and opposition political parties (or parliamentary groups) in a given time period and legislative period:
<relation name="coalition" +
Schema Declaration
+element ref { attribute target { list { + } }?, text }

Appendix A.1.82 <relation>

<relation> (relationship) describes a relationship between two organisations. [13.3.2.3. Personal Relationships]
Modulenamesdates — Formal specification
Attributesatt.global.analytic (@ana) att.datable.w3c (notBefore, notAfter, @when, @from, @to)
name
StatusRequired
Legal values are:
coalition
opposition
renaming
successor
representing
activeidentifies the ‘active’ participants in a non-mutual relationship, or all the participants in a mutual one.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
mutualsupplies a list of participants amongst all of whom the relationship holds equally.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
passiveidentifies the ‘passive’ participants in a non-mutual relationship.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
Contained by
namesdates: listRelation
May containEmpty element
Note

Only one of the attributes active and mutual may be supplied; the attribute passive may be supplied only if the attribute active is supplied. Not all of these constraints can be enforced in all schema languages.
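
The two constraints in the note above can be sketched in Python, mirroring the Schematron reports for this element. This is only an illustrative check, not part of the ParlaMint tooling: the function name `relation_problems` and the two sample elements are assumptions, and the samples are written without a namespace for brevity.

```python
import xml.etree.ElementTree as ET


def relation_problems(xml_text):
    """Return the constraint violations found on a single <relation> element."""
    rel = ET.fromstring(xml_text)
    problems = []
    # @active and @mutual are alternatives: at most one may be present.
    if "active" in rel.attrib and "mutual" in rel.attrib:
        problems.append("active and mutual are both supplied")
    # @passive only makes sense next to @active.
    if "passive" in rel.attrib and "active" not in rel.attrib:
        problems.append("passive is supplied without active")
    return problems


# Hypothetical sample data, loosely modelled on the examples in this section.
ok_relation = '<relation name="coalition" mutual="#MR #OpenVld" from="2014-10-11" to="2018-12-09"/>'
bad_relation = '<relation name="opposition" mutual="#MR" passive="#government.BE"/>'

print(relation_problems(ok_relation))
print(relation_problems(bad_relation))
```

As the note says, not every schema language can express these rules, which is why a separate check of this kind (Schematron in the actual schema) is needed alongside the RELAX NG declaration.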

ExampleSpecification of coalition and opposition political parties (or parliamentary groups) in a given time period and legislative period:
<relation name="coalition"  mutual="#MR #OpenVld #N-VA #CD_en_Vfrom="2014-10-11to="2018-12-09"  ana="#period_54"/> <relation name="opposition"  active="#Ecolo #cdH #DéFi #Vuye_Wouters #sp.a #PP #PS #PTB #FDFpassive="#government.BE" - from="2014-10-11to="2018-12-09ana="#period_54"/>
ExampleSpecification of parliamentary group representing political parties in the parliament:
<relation name="representing" + from="2014-10-11to="2018-12-09ana="#period_54"/>
ExampleSpecification of parliamentary group representing political parties in the parliament:
<relation name="representing"  active="#parliamentaryGroup.CSSD.1107"  passive="#politicalParty.CSSD.153 #politicalParty.ENO.1from="2013-10-29to="2017-10-26"/>
Schematron
<sch:assert test="@ref or @key or @name">One of the attributes 'name', 'ref' or 'key' must be supplied</sch:assert>
Schematron
<sch:report test="@active and @mutual">Only one of the attributes @active and @mutual may be supplied</sch:report>
Schematron
-<sch:report test="@passive and not(@active)">the attribute 'passive' may be supplied only if the attribute 'active' is supplied</sch:report>
Content model
+<sch:report test="@passive and not(@active)">the attribute 'passive' may be supplied only if the attribute 'active' is supplied</sch:report>
Content model
 <content>
  <empty/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element relation
 {
    tei_att.global.analytic.attribute.ana,
@@ -2915,19 +2924,19 @@
    ( attribute active { list { + } }? | attribute mutual { list { + } }? ),
    attribute passive { list { + } }?,
    empty
-}

Appendix A.1.83 <resp>

<resp> (responsibility) contains a phrase describing the nature of a person's intellectual responsibility, or an organisation's role in the production or distribution of a work. [3.12.2.2. Titles, Authors, and Editors 2.2.1. The Title Statement 2.2.2. The Edition Statement 2.2.5. The Series Statement]
Modulecore — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
Contained by
core: respStmt
May containCharacter data only
Note

The attribute ref, inherited from the class att.canonical may be used to indicate the kind of responsibility in a normalized form by referring directly to a standardized list of responsibility types, such as that maintained by a naming authority, for example the list maintained at http://www.loc.gov/marc/relators/relacode.html for bibliographic usage.

Example
<respStmt> +}

Appendix A.1.83 <resp>

<resp> (responsibility) contains a phrase describing the nature of a person's intellectual responsibility, or an organisation's role in the production or distribution of a work. [3.12.2.2. Titles, Authors, and Editors 2.2.1. The Title Statement 2.2.2. The Edition Statement 2.2.5. The Series Statement]
Modulecore — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
Contained by
core: respStmt
May containCharacter data only
Note

The attribute ref, inherited from the class att.canonical may be used to indicate the kind of responsibility in a normalized form by referring directly to a standardized list of responsibility types, such as that maintained by a naming authority, for example the list maintained at http://www.loc.gov/marc/relators/relacode.html for bibliographic usage.

Example
<respStmt>  <persName>Andrej Pančur</persName>  <resp>Kodiranje TEI</resp>  <resp xml:lang="en">TEI corpus encoding</resp> -</respStmt>
Content model
+</respStmt>
Content model
 <content>
  <textNode/>
 </content>
-    
Schema Declaration
-element resp { tei_att.global.attribute.xmllang, text }

Appendix A.1.84 <respStmt>

<respStmt> (statement of responsibility) supplies a statement of responsibility for the intellectual content of a text, edition, recording, or series, where the specialized elements for authors, editors, etc. do not suffice or do not apply. May also be used to encode information about individuals or organisations which have played a role in the production or distribution of a bibliographic work. [3.12.2.2. Titles, Authors, and Editors 2.2.1. The Title Statement 2.2.2. The Edition Statement 2.2.5. The Series Statement]
Modulecore — Formal specification
Contained by
header: titleStmt
May contain
core: resp
namesdates: persName
Example
<respStmt> +
Schema Declaration
+element resp { tei_att.global.attribute.xmllang, text }

Appendix A.1.84 <respStmt>

<respStmt> (statement of responsibility) supplies a statement of responsibility for the intellectual content of a text, edition, recording, or series, where the specialized elements for authors, editors, etc. do not suffice or do not apply. May also be used to encode information about individuals or organisations which have played a role in the production or distribution of a bibliographic work. [3.12.2.2. Titles, Authors, and Editors 2.2.1. The Title Statement 2.2.2. The Edition Statement 2.2.5. The Series Statement]
Modulecore — Formal specification
Contained by
header: titleStmt
May contain
core: resp
namesdates: persName
Example
<respStmt>  <persName>Matthew Coole</persName>  <resp>Data retrieval, Parla-CLARIN TEI XML corpus encoding and linguistic annotation.</resp> -</respStmt>
Example
<respStmt> +</respStmt>
Example
<respStmt>  <persName ref="https://orcid.org/0000-0003-3063-2239">Tommaso Agnoloni</persName>  <persName ref="https://orcid.org/0000-0002-8126-6294">Francesca Frontini</persName>  <persName ref="https://orcid.org/0000-0002-2953-8619">Simonetta Montemagni</persName> @@ -2951,39 +2960,39 @@  <resp xml:lang="en">Cleaning, normalisation and conversion to ParlaMint TEI XML</resp> </respStmt> ... -
Content model
+
Content model
 <content>
  <elementRef key="persName" minOccurs="1"
   maxOccurs="unbounded"/>
  <elementRef key="resp" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
-element respStmt { tei_persName+, tei_resp+ }

Appendix A.1.85 <revisionDesc>

<revisionDesc> (revision description) summarizes the revision history for a file [2.6. The Revision Description 2.1.1. The TEI Header and Its Components]
Moduleheader — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
Contained by
header: teiHeader
May contain
header: change
Note

If present on this element, the status attribute should indicate the current status of the document. The same attribute may appear on any <change> to record the status at the time of that change. Conventionally <change> elements should be given in reverse date order, with the most recent change at the start of the list.

Example
<revisionDesc> +
Schema Declaration
+element respStmt { tei_persName+, tei_resp+ }

Appendix A.1.85 <revisionDesc>

<revisionDesc> (revision description) summarizes the revision history for a file [2.6. The Revision Description 2.1.1. The TEI Header and Its Components]
Moduleheader — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
Contained by
header: teiHeader
May contain
header: change
Note

If present on this element, the status attribute should indicate the current status of the document. The same attribute may appear on any <change> to record the status at the time of that change. Conventionally <change> elements should be given in reverse date order, with the most recent change at the start of the list.

Example
<revisionDesc>  <change when="2021-06-11">   <name>Tomaž Erjavec</name>: Finalized encoding.</change>  <change when="2021-05-28">   <name>Tomaž Erjavec</name>: Built corpus.</change> -</revisionDesc>
Content model
+</revisionDesc>
Content model
 <content>
  <elementRef key="change" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
-element revisionDesc { tei_att.global.attribute.xmllang, tei_change+ }

Appendix A.1.86 <roleName>

<roleName> (role name) contains a name component which indicates that the referent has a particular role or position in society, such as an official title or rank. [13.2.1. Personal Names]
Modulenamesdates — Formal specification
Attributes
xml:lang
StatusOptional
Datatypeteidata.language
Member of
Contained by
namesdates: affiliation persName
May containCharacter data only
Note

A <roleName> may be distinguished from an <addName> by virtue of the fact that, like a title, it typically exists independently of its holder.

Example
<persName> +
Schema Declaration
+element revisionDesc { tei_att.global.attribute.xmllang, tei_change+ }

Appendix A.1.86 <roleName>

<roleName> (role name) contains a name component which indicates that the referent has a particular role or position in society, such as an official title or rank. [13.2.1. Personal Names]
Modulenamesdates — Formal specification
Attributes
xml:lang
StatusOptional
Datatypeteidata.language
Member of
Contained by
namesdates: affiliation persName
May containCharacter data only
Note

A <roleName> may be distinguished from an <addName> by virtue of the fact that, like a title, it typically exists independently of its holder.

Example
<persName>  <surname>Murgel</surname>  <forename>Jasna</forename>  <roleName>dr.</roleName> -</persName>
Example
<affiliation role="ministerref="#GOV" +</persName>
Example
<affiliation role="ministerref="#GOV"  from="2020-08-01">  <roleName xml:lang="sl">Minister za obrambo</roleName>  <roleName xml:lang="en">Minister of Defence</roleName> -</affiliation>
Content model
+</affiliation>
Content model
 <content>
  <textNode/>
 </content>
-    
Schema Declaration
-element roleName { attribute xml:lang { text }?, text }

Appendix A.1.87 <s>

<s> (s-unit) contains a sentence-like division of a text. [17.1. Linguistic Segment Categories 8.4.1. Segmentation]
Moduleanalysis — Formal specification
Attributesatt.global (xml:base, xml:space, @xml:id, @n, @xml:lang) att.global.linking (synch, next, prev, @corresp)
Member of
Contained by
linking: seg
May contain
analysis: pc w
linking: linkGrp
Note

The <s> element may be used to mark orthographic sentences, or any other segmentation of a text, provided that the segmentation is end-to-end, complete, and non-nesting. For segmentation which is partial or recursive, the <seg> should be used instead.

The type attribute may be used to indicate the type of segmentation intended, according to any convenient typology.

Example
<s xml:id="ParlaMint-GB_2017-10-30-lords.seg4.1"> +
Schema Declaration
+element roleName { attribute xml:lang { text }?, text }

Appendix A.1.87 <s>

<s> (s-unit) contains a sentence-like division of a text. [17.1. Linguistic Segment Categories 8.4.1. Segmentation]
Moduleanalysis — Formal specification
Attributesatt.global (xml:base, xml:space, @xml:id, @n, @xml:lang) att.global.linking (synch, next, prev, @corresp)
Member of
Contained by
linking: seg
May contain
analysis: pc w
linking: linkGrp
Note

The <s> element may be used to mark orthographic sentences, or any other segmentation of a text, provided that the segmentation is end-to-end, complete, and non-nesting. For segmentation which is partial or recursive, the <seg> should be used instead.

The type attribute may be used to indicate the type of segmentation intended, according to any convenient typology.
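
The non-nesting requirement from the note above can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the function name `has_nested_s` and both sample fragments are made up for illustration, and the check only looks for an <s> that has another <s> among its descendants.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"


def has_nested_s(xml_text):
    """Return True if any <s> element contains another <s> element."""
    root = ET.fromstring(xml_text)
    for outer in root.iter(TEI + "s"):
        # iter() also yields the element itself, so skip it explicitly.
        if any(inner is not outer for inner in outer.iter(TEI + "s")):
            return True
    return False


# Hypothetical fragments: the first is flat, the second nests <s> in <s>.
flat_seg = '<seg xmlns="http://www.tei-c.org/ns/1.0"><s><w>I</w><pc>.</pc></s></seg>'
nested_seg = '<seg xmlns="http://www.tei-c.org/ns/1.0"><s><s><w>I</w></s></s></seg>'

print(has_nested_s(flat_seg))
print(has_nested_s(nested_seg))
```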

Example
<s xml:id="ParlaMint-GB_2017-10-30-lords.seg4.1">  <w lemma="I"   msd="UPosTag=PRON|Case=Nom|Number=Sing|Person=1|PronType=Prspos="PRP">I</w>  <w lemma="support" @@ -2995,7 +3004,7 @@  <pc msd="UPosTag=PUNCTpos=".">.</pc> </s>
Schematron
<sch:report test="tei:s">You may not nest one s element within - another: use seg instead</sch:report>
Content model
+ another: use seg instead</sch:report>
Content model
 <content>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
@@ -3015,7 +3024,7 @@
  <elementRef key="linkGrp" minOccurs="0"
   maxOccurs="1"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element s
 {
    tei_att.global.attribute.xmlid,
@@ -3037,10 +3046,10 @@
     | tei_pb
    )+,
    tei_linkGrp?
-}

Appendix A.1.88 <seg>

<seg> (arbitrary segment) represents any segmentation of text below the ‘chunk’ level. [16.3. Blocks, Segments, and Anchors 6.2. Components of the Verse Line 7.2.5. Speech Contents]
Modulelinking — Formal specification
Attributesatt.global (xml:base, xml:space, @xml:id, @n, @xml:lang) att.global.linking (synch, next, prev, @corresp)
Member of
Contained by
spoken: u
May contain
analysis: s
core: gap note pb
character data
Note

The <seg> element may be used at the encoder's discretion to mark any segments of the text of interest for processing. One use of the element is to mark text features for which no appropriate markup is otherwise defined. Another use is to provide an identifier for some segment which is to be pointed at by some other element—i.e. to provide a target, or a part of a target, for a <ptr> or other similar element.

Example
<u who="#DavidPriorana="#regular"> +}

Appendix A.1.88 <seg>

<seg> (arbitrary segment) represents any segmentation of text below the ‘chunk’ level. [16.3. Blocks, Segments, and Anchors 6.2. Components of the Verse Line 7.2.5. Speech Contents]
Modulelinking — Formal specification
Attributesatt.global (xml:base, xml:space, @xml:id, @n, @xml:lang) att.global.linking (synch, next, prev, @corresp)
Member of
Contained by
spoken: u
May contain
analysis: s
core: gap note pb
character data
Note

The <seg> element may be used at the encoder's discretion to mark any segments of the text of interest for processing. One use of the element is to mark text features for which no appropriate markup is otherwise defined. Another use is to provide an identifier for some segment which is to be pointed at by some other element—i.e. to provide a target, or a part of a target, for a <ptr> or other similar element.

Example
<u who="#DavidPriorana="#regular">  <seg>I ask that the draft Regulations laid before the House on 5 December be approved.</seg>  <seg>The relevant document is the 20th Report from the Legislation Committee.</seg> -</u>
Content model
+</u>
Content model
 <content>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
@@ -3057,7 +3066,7 @@
   </alternate>
  </alternate>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element seg
 {
    tei_att.global.attribute.xmlid,
@@ -3073,55 +3082,55 @@
     | tei_pb
     | ( text | tei_s )*
    )+
-}

Appendix A.1.89 <segmentation>

<segmentation> (segmentation) describes the principles according to which the text has been segmented, for example into sentences, tone-units, graphemic strata, etc. [2.3.3. The Editorial Practices Declaration 15.3.2. Declarable Elements]
Moduleheader — Formal specification
Contained by
May contain
core: p
Example
<editorialDecl> +}

Appendix A.1.89 <segmentation>

<segmentation> (segmentation) describes the principles according to which the text has been segmented, for example into sentences, tone-units, graphemic strata, etc. [2.3.3. The Editorial Practices Declaration 15.3.2. Declarable Elements]
Moduleheader — Formal specification
Contained by
May contain
core: p
Example
<editorialDecl>  <segmentation>   <p xml:lang="en">The texts are segmented into utterances (speeches) and segments (corresponding to paragraphs in the source transcription).</p>  </segmentation> -</editorialDecl>
Content model
+</editorialDecl>
Content model
 <content>
  <elementRef key="p" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
-element segmentation { tei_p+ }

Appendix A.1.90 <setting>

<setting> describes one particular setting in which a language interaction takes place. [15.2.3. The Setting Description]
Modulecorpus — Formal specification
Contained by
corpus: settingDesc
May contain
core: date name
Note

If the who attribute is not supplied, the setting is assumed to be that of all participants in the language interaction.

Example
<setting> +
Schema Declaration
+element segmentation { tei_p+ }

Appendix A.1.90 <setting>

<setting> describes one particular setting in which a language interaction takes place. [15.2.3. The Setting Description]
Modulecorpus — Formal specification
Contained by
corpus: settingDesc
May contain
core: date name
Note

If the who attribute is not supplied, the setting is assumed to be that of all participants in the language interaction.

Example
<setting>  <name type="place">Commons Chamber</name>  <name type="place">Westminster</name>  <name type="city">London</name>  <name type="countrykey="GB">U.K.</name>  <date when="2019-02-18">February 18th, 2019</date> -</setting>
Content model
+</setting>
Content model
 <content>
  <elementRef key="name" minOccurs="1"
   maxOccurs="unbounded"/>
  <elementRef key="date"/>
 </content>
-    
Schema Declaration
-element setting { tei_name+, tei_date }

Appendix A.1.91 <settingDesc>

<settingDesc> (setting description) describes the setting or settings within which a language interaction takes place, or other places otherwise referred to in a text, edition, or metadata. [15.2. Contextual Information 2.4. The Profile Description]
Modulecorpus — Formal specification
Contained by
header: profileDesc
May contain
corpus: setting
Note

May contain a prose description organized as paragraphs, or a series of <setting> elements. If used to record not settings of language interactions, but other places mentioned in the text, then <place> optionally grouped by <listPlace> inside <standOff> should be preferred.

Example
<settingDesc> +
Schema Declaration
+element setting { tei_name+, tei_date }

Appendix A.1.91 <settingDesc>

<settingDesc> (setting description) describes the setting or settings within which a language interaction takes place, or other places otherwise referred to in a text, edition, or metadata. [15.2. Contextual Information 2.4. The Profile Description]
Modulecorpus — Formal specification
Contained by
header: profileDesc
May contain
corpus: setting
Note

May contain a prose description organized as paragraphs, or a series of <setting> elements. If used to record not settings of language interactions, but other places mentioned in the text, then <place> optionally grouped by <listPlace> inside <standOff> should be preferred.

Example
<settingDesc>  <setting>   <name type="address">Trg sv. Marka 6</name>   <name type="city">Zagreb</name>   <name type="countrykey="HR">Croatia</name>   <date from="2016-11-15to="2020-05-18">15.11.2016 - 18.5.2020</date>  </setting> -</settingDesc>
Content model
+</settingDesc>
Content model
 <content>
  <elementRef key="setting" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
-element settingDesc { tei_setting+ }

Appendix A.1.92 <sex>

<sex> (sex) specifies the sex of a person. [13.3.2.1. Personal Characteristics]
Modulenamesdates — Formal specification
Attributes
value
StatusRequired
Legal values are:
M
F
U
O
N
Contained by
namesdates: person
May containEmpty element
Note

As with other culturally-constructed traits such as age and gender, the way in which this concept is described in different cultural contexts varies. The normalizing attributes are provided only as an optional means of simplifying that variety for purposes of interoperability or project-internal taxonomies for consistency, and should not be used where that is inappropriate or unhelpful. The content of the element may be used to describe the intended concept in more detail.

Example
<sex value="M"/>
Content model
+    
Schema Declaration
+element settingDesc { tei_setting+ }

Appendix A.1.92 <sex>

<sex> (sex) specifies the sex of a person. [13.3.2.1. Personal Characteristics]
Modulenamesdates — Formal specification
Attributes
value
StatusRequired
Legal values are:
M
F
U
O
N
Contained by
namesdates: person
May containEmpty element
Note

As with other culturally-constructed traits such as age and gender, the way in which this concept is described in different cultural contexts varies. The normalizing attributes are provided only as an optional means of simplifying that variety for purposes of interoperability or project-internal taxonomies for consistency, and should not be used where that is inappropriate or unhelpful. The content of the element may be used to describe the intended concept in more detail.

Example
<sex value="M"/>
Content model
 <content>
  <empty/>
 </content>
-    
Schema Declaration
-element sex { attribute value { "M" | "F" | "U" | "O" | "N" }, empty }

Appendix A.1.93 <sourceDesc>

<sourceDesc> (source description) describes the source(s) from which an electronic text was derived or generated, typically a bibliographic description in the case of a digitized text, or a phrase such as "born digital" for a text which has no previous existence. [2.2.7. The Source Description]
Moduleheader — Formal specification
Contained by
header: fileDesc
May contain
core: bibl
ExampleThe source description <sourceDesc> of the corpus root encodes the original digital source of the ParlaMint corpus:
<sourceDesc> +
Schema Declaration
+element sex { attribute value { "M" | "F" | "U" | "O" | "N" }, empty }

Appendix A.1.93 <sourceDesc>

<sourceDesc> (source description) describes the source(s) from which an electronic text was derived or generated, typically a bibliographic description in the case of a digitized text, or a phrase such as "born digital" for a text which has no previous existence. [2.2.7. The Source Description]
Moduleheader — Formal specification
Contained by
header: fileDesc
May contain
core: bibl
ExampleThe source description <sourceDesc> of the corpus root encodes the original digital source of the ParlaMint corpus:
<sourceDesc>  <bibl>   <title type="mainxml:lang="sl">Zapisi sej Državnega zbora Republike Slovenije</title>   <title type="mainxml:lang="en">Minutes of the National Assembly of the Republic of Slovenia</title>   <idno type="URI">https://www.dz-rs.si</idno>   <date from="2014-08-01to="2020-07-16">1.8.2014 - 16.7.2020</date>  </bibl> -</sourceDesc>
ExampleFor corpus components the source description is very similar to the one for the corpus root, except it reflects information of the exact meeting. Furthermore, if the audio or video of the meeting is available, this information can also be given:
<sourceDesc> +</sourceDesc>
ExampleFor corpus components the source description is very similar to the one for the corpus root, except it reflects information of the exact meeting. Furthermore, if the audio or video of the meeting is available, this information can also be given:
<sourceDesc>  <bibl>   <title type="mainxml:lang="cs">Parlament České republiky, Poslanecká sněmovna</title>   <title type="mainxml:lang="en">Parliament of the Czech Republic, Chamber of Deputies</title> @@ -3136,20 +3145,20 @@     url="2013ps/audio/2016/04/13/2016041308580912.mp3"/>   </recording>  </recordingStmt> -</sourceDesc>
Content model
+</sourceDesc>
Content model
 <content>
  <elementRef key="bibl" minOccurs="1"
   maxOccurs="unbounded"/>
  <elementRef key="recordingStmt"
   minOccurs="0" maxOccurs="1"/>
 </content>
-    
Schema Declaration
-element sourceDesc { tei_bibl+, tei_recordingStmt? }

Appendix A.1.94 <state>

<state> (state) defines additional metadata on a political party or parliamentary group, e.g. its political orientation. [13.3.1. Basic Principles 13.3.2.1. Personal Characteristics]
Modulenamesdates — Formal specification
Attributesatt.global (xml:id, xml:lang, xml:base, xml:space, @n) att.global.source (@source) att.datable.w3c (when, notBefore, notAfter, @from, @to)
ana(analysis) indicates one or more elements containing interpretations of the element on which the ana attribute appears.
Derived fromatt.global.analytic
StatusOptional
Datatypeteidata.pointer
type
StatusOptional
Legal values are:
politicalOrientation
encoder
Wikipedia
CHES
Member of
Contained by
namesdates: org
May contain
core: label note
Note

Where there is confusion between <trait> and <state> the more general purpose element <state> should be used even for unchanging characteristics. If you wish to distinguish between characteristics that are generally perceived to be time-bound states and those assumed to be fixed traits, then <trait> is available for the more static of these. The <state> element encodes characteristics which are sometimes assumed to change, often at specific times or over a date range, whereas the <trait> elements are used to record characteristics, such as eye-colour, which are less subject to change. Traits are typically, but not necessarily, independent of the volition or action of the holder.

Example
<state type="politicalOrientation" - subtype="unknownana="#orientation.L"/>
Example
<state type="politicalOrientation" +
Schema Declaration
+element sourceDesc { tei_bibl+, tei_recordingStmt? }

Appendix A.1.94 <state>

<state> (state) defines additional metadata on a political party or parliamentary group, e.g. its political orientation. [13.3.1. Basic Principles 13.3.2.1. Personal Characteristics]
Modulenamesdates — Formal specification
Attributesatt.global (xml:id, xml:lang, xml:base, xml:space, @n) att.global.source (@source) att.datable.w3c (when, notBefore, notAfter, @from, @to)
ana(analysis) indicates one or more elements containing interpretations of the element on which the ana attribute appears.
Derived fromatt.global.analytic
StatusOptional
Datatypeteidata.pointer
type
StatusOptional
Legal values are:
politicalOrientation
encoder
Wikipedia
CHES
variable
value
Member of
Contained by
namesdates: org
May contain
core: label note
Note

Where there is confusion between <trait> and <state> the more general purpose element <state> should be used even for unchanging characteristics. If you wish to distinguish between characteristics that are generally perceived to be time-bound states and those assumed to be fixed traits, then <trait> is available for the more static of these. The <state> element encodes characteristics which are sometimes assumed to change, often at specific times or over a date range, whereas the <trait> elements are used to record characteristics, such as eye-colour, which are less subject to change. Traits are typically, but not necessarily, independent of the volition or action of the holder.
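
The closed list of legal values for the type attribute, including the "variable" and "value" entries added in this revision, can be mirrored in a small Python check. This is a sketch for illustration only: the names `STATE_TYPES` and `state_type_is_legal` are assumptions, the sample elements are hypothetical, and the attribute is treated as optional, as its status above states.

```python
import xml.etree.ElementTree as ET

# Legal values of state/@type as listed in this section.
STATE_TYPES = {
    "politicalOrientation",
    "encoder",
    "Wikipedia",
    "CHES",
    "variable",
    "value",
}


def state_type_is_legal(xml_text):
    """Return True if @type is absent or is one of the legal values."""
    state = ET.fromstring(xml_text)
    type_value = state.get("type")
    return type_value is None or type_value in STATE_TYPES


print(state_type_is_legal('<state type="variable"/>'))
print(state_type_is_legal('<state type="score"/>'))
```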

Example
<state type="politicalOrientation" + subtype="unknownana="#orientation.L"/>
Example
<state type="politicalOrientation"  subtype="Wikipedia"  source="https://en.wikipedia.org/wiki/Christian_Democratic_and_Flemishana="#orientation.CCR">  <note xml:lang="en">CD&amp;V, CDV, CVP (until 2001)</note> -</state>
Content model
+</state>
Content model
 <content>
  <alternate minOccurs="1" maxOccurs="1">
   <elementRef key="label"/>
@@ -3157,7 +3166,7 @@
    maxOccurs="unbounded"/>
  </alternate>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element state
 {
    tei_att.global.attribute.n,
@@ -3165,17 +3174,25 @@
    tei_att.datable.w3c.attribute.from,
    tei_att.datable.w3c.attribute.to,
    attribute ana { text }?,
-   attribute type { "politicalOrientation" | "encoder" | "Wikipedia" | "CHES" }?,
+   attribute type
+   {
+      "politicalOrientation"
+    | "encoder"
+    | "Wikipedia"
+    | "CHES"
+    | "variable"
+    | "value"
+   }?,
    ( tei_label | tei_note* )
-}

Appendix A.1.95 <surname>

<surname> (surname) contains a family (inherited) name, as opposed to a given, baptismal, or nick name. [13.2.1. Personal Names]
Modulenamesdates — Formal specification
Member of
Contained by
namesdates: persName
May containCharacter data only
Example
<persName> +}

Appendix A.1.95 <surname>

<surname> (surname) contains a family (inherited) name, as opposed to a given, baptismal, or nick name. [13.2.1. Personal Names]
Modulenamesdates — Formal specification
Member of
Contained by
namesdates: persName
May containCharacter data only
Example
<persName>  <surname>Accetto</surname>  <forename>Matej</forename> -</persName>
Content model
+</persName>
Content model
 <content>
  <textNode/>
 </content>
-    
Schema Declaration
-element surname { text }

Appendix A.1.96 <tagUsage>

<tagUsage> (element usage) documents the usage of a specific element within a specified document. [2.3.4. The Tagging Declaration]
Moduleheader — Formal specification
Attributes
gi(generic identifier) specifies the name (generic identifier) of the element indicated by the tag, within the namespace indicated by the parent <namespace> element. Counts for the <text> element and all of its descendants have to be included.
StatusRequired
Datatypeteidata.name
occursspecifies the number of occurrences of this element within the text.
StatusRequired
Datatypeteidata.count
Contained by
header: namespace
May containEmpty element
Example
<tagsDecl> +
Schema Declaration
+element surname { text }

Appendix A.1.96 <tagUsage>

<tagUsage> (element usage) documents the usage of a specific element within a specified document. [2.3.4. The Tagging Declaration]
Moduleheader — Formal specification
Attributes
gi(generic identifier) specifies the name (generic identifier) of the element indicated by the tag, within the namespace indicated by the parent <namespace> element. Counts for the <text> element and all of its descendants have to be included.
StatusRequired
Datatypeteidata.name
occursspecifies the number of occurrences of this element within the text.
StatusRequired
Datatypeteidata.count
Contained by
header: namespace
May containEmpty element
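
The values for gi and occurs can be derived by counting elements in the data part of a document. The sketch below shows one way to do this in Python. It is not the script used by the ParlaMint project: the function name `tag_usage` and the sample document are assumptions, and it assumes <text> elements are not nested inside one another (nested ones would be counted twice).

```python
from collections import Counter
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def tag_usage(xml_text):
    """Count TEI elements inside <text>, including <text> itself."""
    root = ET.fromstring(xml_text)
    prefix = "{" + TEI_NS + "}"
    counts = Counter()
    for text in root.iter(prefix + "text"):
        for element in text.iter():
            # Only count elements in the TEI namespace; strip the namespace part.
            if element.tag.startswith(prefix):
                counts[element.tag[len(prefix):]] += 1
    return counts


# Hypothetical miniature document; the header content is not counted.
sample_doc = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0">'
    "<teiHeader><title>x</title></teiHeader>"
    "<text><body><div><u><seg>a</seg><seg>b</seg></u></div></body></text>"
    "</TEI>"
)

for gi, occurs in sorted(tag_usage(sample_doc).items()):
    print(f'<tagUsage gi="{gi}" occurs="{occurs}"/>')
```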
Example
<tagsDecl>  <namespace name="http://www.tei-c.org/ns/1.0">   <tagUsage gi="textoccurs="414"/>   <tagUsage gi="bodyoccurs="414"/> @@ -3190,12 +3207,12 @@   <tagUsage gi="kinesicoccurs="560"/>   <tagUsage gi="descoccurs="10234"/>  </namespace> -</tagsDecl>
Content model
+</tagsDecl>
Content model
 <content>
  <empty/>
 </content>
-    
Schema Declaration
-element tagUsage { attribute gi { text }, attribute occurs { text }, empty }

Appendix A.1.97 <tagsDecl>

<tagsDecl> (tagging declaration) provides detailed information about the tagging applied to a document. [2.3.4. The Tagging Declaration 2.3. The Encoding Description]
Moduleheader — Formal specification
Contained by
header: encodingDesc
May contain
header: namespace
ExampleThe tags declaration, <tagsDecl> of the corpus root gives the count of all the XML tags used in the data part (so, not in the TEI header) of the corpus (for the corpus root) or in an individual component of the corpus.
<encodingDesc> ... +
Schema Declaration
+element tagUsage { attribute gi { text }, attribute occurs { text }, empty }
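
The counting rule behind gi and occurs can be sketched in Python. This is only an illustration and not part of the ParlaMint tooling: the names `tag_usage`, `tag_usage_elements` and `SAMPLE` are made up here, and the sketch assumes that the <text> element and each of its descendants in the TEI namespace is counted once per occurrence, while the TEI header is ignored.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = "http://www.tei-c.org/ns/1.0"


def tag_usage(tei_xml):
    """Count <text> and all of its TEI-namespace descendants by local name."""
    root = ET.fromstring(tei_xml)
    counts = Counter()
    for text in root.iter(f"{{{TEI_NS}}}text"):
        # Element.iter() yields the element itself, then every descendant.
        for el in text.iter():
            if el.tag.startswith(f"{{{TEI_NS}}}"):
                counts[el.tag.split("}", 1)[1]] += 1
    return counts


def tag_usage_elements(counts):
    """Serialise the counts as <tagUsage> elements, sorted by element name."""
    return [f'<tagUsage gi="{gi}" occurs="{n}"/>' for gi, n in sorted(counts.items())]


# Hypothetical miniature component file, used only for this illustration.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc/></teiHeader>
  <text><body><div><u><seg>One.</seg><seg>Two.</seg></u></div></body></text>
</TEI>"""

for line in tag_usage_elements(tag_usage(SAMPLE)):
    print(line)
```

Elements of the header (here <teiHeader> and <fileDesc>) do not appear in the result, which matches the statement that only the data part of the corpus is counted.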

Appendix A.1.97 <tagsDecl>

<tagsDecl> (tagging declaration) provides detailed information about the tagging applied to a document. [2.3.4. The Tagging Declaration 2.3. The Encoding Description]
Moduleheader — Formal specification
Contained by
header: encodingDesc
May contain
header: namespace
ExampleThe tags declaration <tagsDecl> gives the count of all the XML tags used in the data part (so, not in the TEI header) of the whole corpus (in the corpus root) or of an individual component of the corpus.
<encodingDesc> ... <tagsDecl>   <namespace name="http://www.tei-c.org/ns/1.0">    <tagUsage gi="text" occurs="414"/> @@ -3204,12 +3221,12 @@      ...   </namespace>  </tagsDecl> -</encodingDesc>
Content model
+</encodingDesc>
Content model
 <content>
  <elementRef key="namespace"/>
 </content>
-    
Schema Declaration
-element tagsDecl { tei_namespace }

Appendix A.1.98 <taxonomy>

<taxonomy> (taxonomy) defines a typology explicitly by a structured taxonomy. [2.3.7. The Classification Declaration]
Moduleheader — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
xml:id(identifier) provides a unique identifier for the element bearing the attribute.
Derived fromatt.global
StatusRequired
DatatypeID
Contained by
header: classDecl
May contain
core: desc
header: category
Note

Nested taxonomies are common in many fields, so the <taxonomy> element can be nested.

Example
<taxonomy xml:id="subcorpus"> +
Schema Declaration
+element tagsDecl { tei_namespace }

Appendix A.1.98 <taxonomy>

<taxonomy> (taxonomy) defines a typology explicitly by a structured taxonomy. [2.3.7. The Classification Declaration]
Moduleheader — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
xml:id(identifier) provides a unique identifier for the element bearing the attribute.
Derived fromatt.global
StatusRequired
DatatypeID
Contained by
header: classDecl
May contain
core: desc
header: category
Note

Nested taxonomies are common in many fields, so the <taxonomy> element can be nested.

Example
<taxonomy xml:id="subcorpus">  <desc xml:lang="sl">   <term>Podkorpusi</term>  </desc> @@ -3228,7 +3245,7 @@   <catDesc xml:lang="en">    <term>COVID</term>: COVID subcorpus, from 2020-01-31 onwards</catDesc>  </category> -</taxonomy>
Example
<taxonomy xml:id="parla.legislature"> +</taxonomy>
Example
<taxonomy xml:id="parla.legislature">  <desc xml:lang="it">   <term>Legislatura</term>  </desc> @@ -3267,21 +3284,21 @@  role="parliament" xml:id="LEG">  <orgName full="yes" xml:lang="it">Senato della Repubblica Italiana</orgName>  <orgName full="yes" xml:lang="it">Senate of the Republic of Italy</orgName> -</org>
Content model
+</org>
Content model
 <content>
  <elementRef key="desc" minOccurs="1"
   maxOccurs="unbounded"/>
  <elementRef key="category" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element taxonomy
 {
    tei_att.global.attribute.xmllang,
    attribute xml:id { text },
    tei_desc+,
    tei_category+
-}

Appendix A.1.99 <teiCorpus>

<teiCorpus> (TEI corpus) contains one whole corpus, stored in the corpus root file comprising the corpus header and XInclude references to corpus component files, each containing a <TEI> element. [4. Default Text Structure 15.1. Varieties of Composite Text]
Modulecore — Formal specification
Attributesatt.global.linking (synch, next, prev, @corresp)
xml:id
StatusRequired
DatatypeID
xml:lang
StatusRequired
Datatypeteidata.language
Contained by
May contain
derived-module-parlamint: include
header: teiHeader
textstructure: TEI
Note

Should contain one TEI header for the corpus, and a series of <TEI> elements, one for each text.

ExampleGeneral structure of a ParlaMint corpus root:
<teiCorpus xml:lang="en" +}

Appendix A.1.99 <teiCorpus>

<teiCorpus> (TEI corpus) contains one whole corpus, stored in the corpus root file comprising the corpus header and XInclude references to corpus component files, each containing a <TEI> element. [4. Default Text Structure 15.1. Varieties of Composite Text]
Modulecore — Formal specification
Attributesatt.global.linking (synch, next, prev, @corresp)
xml:id
StatusRequired
DatatypeID
xml:lang
StatusRequired
Datatypeteidata.language
Contained by
May contain
derived-module-parlamint: include
header: teiHeader
textstructure: TEI
Note

Should contain one TEI header for the corpus, and a series of <TEI> elements, one for each text.

ExampleGeneral structure of a ParlaMint corpus root:
<teiCorpus xml:lang="en"  xml:id="ParlaMint-GB" xmlns="http://www.tei-c.org/ns/1.0">  <teiHeader> ...TEI header of the corpus...  </teiHeader> @@ -3291,7 +3308,7 @@ href="2015/ParlaMint-GB_2015-01-06-commons.xml"/> ... -</teiCorpus>
Content model
+</teiCorpus>
Content model
 <content>
  <elementRef key="teiHeader"/>
  <alternate minOccurs="1"
@@ -3300,7 +3317,7 @@
   <elementRef key="include"/>
  </alternate>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element teiCorpus
 {
    tei_att.global.linking.attribute.corresp,
@@ -3308,12 +3325,12 @@
    attribute xml:lang { text },
    tei_teiHeader,
    ( tei_TEI | tei_include )+
-}
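
The content model above (one <teiHeader> followed by any mix of <TEI> and <include> children) can be illustrated with a short Python sketch that lists the component files referenced from a corpus root. It is a sketch under stated assumptions: the function name `component_hrefs` and the sample root are made up, the second href in the sample is invented, and the XInclude namespace is the one declared in the schema files.

```python
import xml.etree.ElementTree as ET

XI_NS = "http://www.w3.org/2001/XInclude"


def component_hrefs(corpus_root_xml):
    """Return the href of every XInclude that is a direct child of <teiCorpus>."""
    root = ET.fromstring(corpus_root_xml)
    # findall() with a plain tag name matches direct children only, so any
    # XIncludes nested deeper (for example inside the header) are skipped.
    return [inc.get("href") for inc in root.findall(f"{{{XI_NS}}}include")]


# Hypothetical corpus root; the "lords" file name is invented for illustration.
SAMPLE_ROOT = """<teiCorpus xmlns="http://www.tei-c.org/ns/1.0"
    xmlns:xi="http://www.w3.org/2001/XInclude"
    xml:lang="en" xml:id="ParlaMint-GB">
  <teiHeader/>
  <xi:include href="2015/ParlaMint-GB_2015-01-06-commons.xml"/>
  <xi:include href="2015/ParlaMint-GB_2015-01-06-lords.xml"/>
</teiCorpus>"""

for href in component_hrefs(SAMPLE_ROOT):
    print(href)
```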

Appendix A.1.100 <teiHeader>

<teiHeader> (TEI header) supplies descriptive and declarative metadata associated with a digital resource or set of resources. [2.1.1. The TEI Header and Its Components 15.1. Varieties of Composite Text]
Moduleheader — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
Contained by
core: teiCorpus
textstructure: TEI
May contain
Note

One of the few elements unconditionally required in any TEI document.

ExampleBasic structure of the <teiHeader>:
<teiHeader> +}

Appendix A.1.100 <teiHeader>

<teiHeader> (TEI header) supplies descriptive and declarative metadata associated with a digital resource or set of resources. [2.1.1. The TEI Header and Its Components 15.1. Varieties of Composite Text]
Moduleheader — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
Contained by
core: teiCorpus
textstructure: TEI
May contain
Note

One of the few elements unconditionally required in any TEI document.

ExampleBasic structure of the <teiHeader>:
<teiHeader>  <fileDesc>...</fileDesc>  <encodingDesc>...</encodingDesc>  <profileDesc>...</profileDesc>  <revisionDesc>...</revisionDesc> -</teiHeader>
ExampleExample of a ParlaMint corpus component <teiHeader>:
<teiHeader> +</teiHeader>
ExampleExample of a ParlaMint corpus component <teiHeader>:
<teiHeader>  <fileDesc>   <titleStmt>    <title type="main" xml:lang="lv">Latvijas parlamenta corpus ParlaMint-LV, 12. Saeima, 2014-11-04 [ParlaMint]</title> @@ -3381,7 +3398,7 @@    </setting>   </settingDesc>  </profileDesc> -</teiHeader>
Content model
+</teiHeader>
Content model
 <content>
  <elementRef key="fileDesc"/>
  <elementRef key="encodingDesc"/>
@@ -3389,7 +3406,7 @@
  <elementRef key="revisionDesc"
   minOccurs="0" maxOccurs="1"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element teiHeader
 {
    tei_att.global.attribute.xmllang,
@@ -3397,7 +3414,7 @@
    tei_encodingDesc,
    tei_profileDesc,
    tei_revisionDesc?
-}

Appendix A.1.101 <term>

<term> (term) contains a single-word, multi-word, or symbolic designation which is regarded as a technical term. [3.4.1. Terms and Glosses]
Modulecore — Formal specification
Member of
Contained by
core: desc
header: catDesc
namesdates: persName
May containCharacter data only
Note

When this element appears within an <index> element, it is understood to supply the form under which an index entry is to be made for that location. Elsewhere, it is understood simply to indicate that its content is to be regarded as a technical or specialised term. It may be associated with a <gloss> element by means of its ref attribute; alternatively a <gloss> element may point to a <term> element by means of its target attribute.

In formal terminological work, there is frequently discussion over whether terms must be atomic or may include multi-word lexical items, symbolic designations, or phraseological units. The <term> element may be used to mark any of these. No position is taken on the philosophical issue of what a term can be; the looser definition simply allows the <term> element to be used by practitioners of any persuasion.

As with other members of the att.canonical class, instances of this element occurring in a text may be associated with a canonical definition, either by means of a URI (using the ref attribute), or by means of some system-specific code value (using the key attribute). Because the mutually exclusive target and cRef attributes overlap with the function of the ref attribute, they are deprecated and may be removed at a subsequent release.

Example<term> is used inside taxonomies to name the taxonomy and its categories:
<taxonomy xml:id="subcorpus"> +}

Appendix A.1.101 <term>

<term> (term) contains a single-word, multi-word, or symbolic designation which is regarded as a technical term. [3.4.1. Terms and Glosses]
Modulecore — Formal specification
Member of
Contained by
core: desc
header: catDesc
namesdates: persName
May containCharacter data only
Note

When this element appears within an <index> element, it is understood to supply the form under which an index entry is to be made for that location. Elsewhere, it is understood simply to indicate that its content is to be regarded as a technical or specialised term. It may be associated with a <gloss> element by means of its ref attribute; alternatively a <gloss> element may point to a <term> element by means of its target attribute.

In formal terminological work, there is frequently discussion over whether terms must be atomic or may include multi-word lexical items, symbolic designations, or phraseological units. The <term> element may be used to mark any of these. No position is taken on the philosophical issue of what a term can be; the looser definition simply allows the <term> element to be used by practitioners of any persuasion.

As with other members of the att.canonical class, instances of this element occurring in a text may be associated with a canonical definition, either by means of a URI (using the ref attribute), or by means of some system-specific code value (using the key attribute). Because the mutually exclusive target and cRef attributes overlap with the function of the ref attribute, they are deprecated and may be removed at a subsequent release.

Example<term> is used inside taxonomies to name the taxonomy and its categories:
<taxonomy xml:id="subcorpus">  <desc xml:lang="sl">   <term>Podkorpusi</term>  </desc> @@ -3412,7 +3429,7 @@  </category> ... -</taxonomy>
Example
<catDesc xml:lang="en"> +</taxonomy>
Example
<catDesc xml:lang="en">  <term>acl</term>: Clausal modifier of noun (adjectival clause) </catDesc> <catDesc xml:lang="en"> @@ -3420,22 +3437,22 @@ </catDesc> <catDesc xml:lang="en">  <term>punct</term>: Punctuation -</catDesc>
Content model
+</catDesc>
Content model
 <content>
  <textNode/>
 </content>
-    
Schema Declaration
-element term { text }

Appendix A.1.102 <text>

<text> (text) contains a single text of any kind, whether unitary or composite, for example a poem or drama, a collection of essays, a novel, a dictionary, or a corpus sample. [4. Default Text Structure 15.1. Varieties of Composite Text]
Moduletextstructure — Formal specification
Attributesatt.global (xml:base, xml:space, @xml:id, @n, @xml:lang) att.global.analytic (@ana) att.global.source (@source)
Contained by
textstructure: TEI
May contain
textstructure: body
Note

This element should not be used to represent a text which is inserted at an arbitrary point within the structure of another, for example as in an embedded or quoted narrative; the <floatingText> is provided for this purpose.

Example
<text ana="#reference"> +
Schema Declaration
+element term { text }

Appendix A.1.102 <text>

<text> (text) contains a single text of any kind, whether unitary or composite, for example a poem or drama, a collection of essays, a novel, a dictionary, or a corpus sample. [4. Default Text Structure 15.1. Varieties of Composite Text]
Moduletextstructure — Formal specification
Attributesatt.global (xml:base, xml:space, @xml:id, @n, @xml:lang) att.global.analytic (@ana) att.global.source (@source)
Contained by
textstructure: TEI
May contain
textstructure: body
Note

This element should not be used to represent a text which is inserted at an arbitrary point within the structure of another, for example as in an embedded or quoted narrative; the <floatingText> is provided for this purpose.

Example
<text ana="#reference">  <body>   <div type="debateSection">...</div>   <div type="debateSection">...</div>    ...  </body> -</text>
Content model
+</text>
Content model
 <content>
  <elementRef key="body"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element text
 {
    tei_att.global.attribute.xmlid,
@@ -3444,17 +3461,17 @@
    tei_att.global.analytic.attribute.ana,
    tei_att.global.source.attribute.source,
    tei_body
-}

Appendix A.1.103 <textClass>

<textClass> (text classification) groups information which describes the nature or topic of a text in terms of a standard classification scheme, thesaurus, etc. [2.4.3. The Text Classification]
Moduleheader — Formal specification
Contained by
header: profileDesc
May contain
header: catRef
Example
<textClass> +}

Appendix A.1.103 <textClass>

<textClass> (text classification) groups information which describes the nature or topic of a text in terms of a standard classification scheme, thesaurus, etc. [2.4.3. The Text Classification]
Moduleheader — Formal specification
Contained by
header: profileDesc
May contain
header: catRef
Example
<textClass>  <catRef scheme="#parla.legislature"   target="#parla.bi #parla.lower #parla.upper"/> -</textClass>
Content model
+</textClass>
Content model
 <content>
  <elementRef key="catRef"/>
 </content>
-    
Schema Declaration
-element textClass { tei_catRef }

Appendix A.1.104 <time>

<time> (time) contains a phrase defining a time of day in any format. [3.6.4. Dates and Times]
Modulecore — Formal specification
Attributesatt.typed (@type, @subtype) att.global (n, xml:base, xml:space, @xml:id, @xml:lang) att.global.analytic (@ana) att.datable.w3c (notBefore, notAfter, @when, @from, @to)
Member of
Contained by
analysis: s
core: note unit
May contain
analysis: pc w
character data
ExampleA note giving the time when e.g. the session started:
<note type="time"> +
Schema Declaration
+element textClass { tei_catRef }

Appendix A.1.104 <time>

<time> (time) contains a phrase defining a time of day in any format. [3.6.4. Dates and Times]
Modulecore — Formal specification
Attributesatt.typed (@type, @subtype) att.global (n, xml:base, xml:space, @xml:id, @xml:lang) att.global.analytic (@ana) att.datable.w3c (notBefore, notAfter, @when, @from, @to)
Member of
Contained by
analysis: s
core: note unit
May contain
analysis: pc w
character data
ExampleA note giving the time when e.g. the session started:
<note type="time">  <time when="2016-04-13T09:10:00">(9.10 hodin)</time> -</note>
Content model
+</note>
Content model
 <content>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
@@ -3463,7 +3480,7 @@
   <textNode/>
  </alternate>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element time
 {
    tei_att.global.attribute.xmlid,
@@ -3474,23 +3491,23 @@
    tei_att.datable.w3c.attribute.to,
    tei_att.typed.attributes,
    ( tei_w | tei_pc | text )+
-}

Appendix A.1.105 <title>

<title> (title) contains a title for any kind of work. [3.12.2.2. Titles, Authors, and Editors 2.2.1. The Title Statement 2.2.5. The Series Statement]
Modulecore — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
type
StatusRecommended
Legal values are:
main
sub
Note

Attribute is required in <titleStmt> context.

Member of
Contained by
core: bibl
header: titleStmt
May containCharacter data only
Note

The attributes key and ref, inherited from the class att.canonical may be used to indicate the canonical form for the title; the former, by supplying (for example) the identifier of a record in some external library system; the latter by pointing to an XML element somewhere containing the canonical form of the title.

ExampleThe <title> element as used in the <titleStmt> of the corpus root <teiHeader>:
<title type="main" xml:lang="cs">Český parlamentní korpus ParlaMint-CZ [ParlaMint]</title> +}

Appendix A.1.105 <title>

<title> (title) contains a title for any kind of work. [3.12.2.2. Titles, Authors, and Editors 2.2.1. The Title Statement 2.2.5. The Series Statement]
Modulecore — Formal specification
Attributesatt.global (xml:id, n, xml:base, xml:space, @xml:lang)
type
StatusRecommended
Legal values are:
main
sub
Note

Attribute is required in <titleStmt> context.

Member of
Contained by
core: bibl
header: titleStmt
May containCharacter data only
Note

The attributes key and ref, inherited from the class att.canonical may be used to indicate the canonical form for the title; the former, by supplying (for example) the identifier of a record in some external library system; the latter by pointing to an XML element somewhere containing the canonical form of the title.

ExampleThe <title> element as used in the <titleStmt> of the corpus root <teiHeader>:
<title type="main" xml:lang="cs">Český parlamentní korpus ParlaMint-CZ [ParlaMint]</title> <title type="main" xml:lang="en">Czech parliamentary corpus ParlaMint-CZ [ParlaMint]</title> <title type="sub" xml:lang="cs">Parlament České republiky, Poslanecká sněmovna</title> -<title type="sub" xml:lang="en">Parliament of the Czech Republic, Chamber of Deputies</title>
ExampleThe <title> element as used in the <titleStmt> of the corpus component <teiHeader>:
<title type="main" xml:lang="cs">Český parlamentní korpus ParlaMint-CZ, 2013-11-25 ps2013-001-01-000-000 [ParlaMint]</title> +<title type="sub" xml:lang="en">Parliament of the Czech Republic, Chamber of Deputies</title>
ExampleThe <title> element as used in the <titleStmt> of the corpus component <teiHeader>:
<title type="main" xml:lang="cs">Český parlamentní korpus ParlaMint-CZ, 2013-11-25 ps2013-001-01-000-000 [ParlaMint]</title> <title type="main" xml:lang="en">Czech parliamentary corpus ParlaMint-CZ, 2013-11-25 ps2013-001-01-000-000 [ParlaMint]</title> <title type="sub" xml:lang="cs">Parlament České republiky, Poslanecká sněmovna, 2013-11-25, Začátek schůze Poslanecké sněmovny 25. listopadu 2013 ve 14.05 hodin Přítomno: 199 poslanců</title> -<title type="sub" xml:lang="en">Parliament of the Czech Republic, Chamber of Deputies, 2013-11-25</title>
Content model
+<title type="subxml:lang="en">Parliament of the Czech Republic, Chamber of Deputies, 2013-11-25</title>
Content model
 <content>
  <textNode/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element title
 {
    tei_att.global.attribute.xmllang,
    attribute type { "main" | "sub" }?,
    text
-}

Appendix A.1.106 <titleStmt>

<titleStmt> (title statement) groups information about the title of a work and those responsible for its content. [2.2.1. The Title Statement 2.2. The File Description]
Moduleheader — Formal specification
Contained by
header: fileDesc
May contain
ExampleThe <titleStmt> element gives the title of the corpus root or component, along with the specification of the particular session(s) of the parliament contained, the persons responsible for compiling the corpus and the funder(s) of the project:
<titleStmt> +}

Appendix A.1.106 <titleStmt>

<titleStmt> (title statement) groups information about the title of a work and those responsible for its content. [2.2.1. The Title Statement 2.2. The File Description]
Moduleheader — Formal specification
Contained by
header: fileDesc
May contain
ExampleThe <titleStmt> element gives the title of the corpus root or component, along with the specification of the particular session(s) of the parliament contained, the persons responsible for compiling the corpus and the funder(s) of the project:
<titleStmt>  <title type="main">Slovenski parlamentarni korpus ParlaMint-SI [ParlaMint]</title>  <title type="main" xml:lang="en">Slovenian parliamentary corpus ParlaMint-SI [ParlaMint]</title>  <title type="sub">Zapisi sej Državnega zbora Republike Slovenije, 7. in 8. mandat (2014 - 2020)</title> @@ -3513,7 +3530,7 @@   <orgName>Slovenska raziskovalna infrastruktura CLARIN.SI</orgName>   <orgName xml:lang="en">The Slovenian research infrastructure CLARIN.SI</orgName>  </funder> -</titleStmt>
Content model
+</titleStmt>
Content model
 <content>
  <elementRef key="title" minOccurs="1"
   maxOccurs="unbounded"/>
@@ -3524,11 +3541,11 @@
  <elementRef key="funder" minOccurs="0"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
-element titleStmt { tei_title+, tei_meeting+, tei_respStmt*, tei_funder* }

Appendix A.1.107 <u>

<u> (utterance) contains a stretch of speech usually preceded and followed by silence or by a change of speaker. [8.3.1. Utterances]
Modulespoken — Formal specification
Attributesatt.global (xml:base, xml:space, @xml:id, @n, @xml:lang) att.global.linking (synch, @corresp, @next, @prev) att.global.analytic (@ana) att.global.source (@source) att.ascribed (@who)
Member of
Contained by
textstructure: div
May contain
core: gap note pb
linking: seg
Note

Prose and a mixture of speech elements

Although individual transcriptions may consistently use <u> elements for turns or other units, and although in most cases a <u> will be delimited by pause or change of speaker, <u> is not required to represent a turn or any communicative event, nor to be bounded by pauses or change of speaker. At a minimum, a <u> is some phonetic production by a given speaker.

ExampleThe element <u> marks up a speech, as illustrated below:
<u who="#DavidPrior" ana="#regular"> +
Schema Declaration
+element titleStmt { tei_title+, tei_meeting+, tei_respStmt*, tei_funder* }

Appendix A.1.107 <u>

<u> (utterance) contains a stretch of speech usually preceded and followed by silence or by a change of speaker. [8.3.1. Utterances]
Modulespoken — Formal specification
Attributesatt.global (xml:base, xml:space, @xml:id, @n, @xml:lang) att.global.linking (synch, @corresp, @next, @prev) att.global.analytic (@ana) att.global.source (@source) att.ascribed (@who)
Member of
Contained by
textstructure: div
May contain
core: gap note pb
linking: seg
Note

Prose and a mixture of speech elements

Although individual transcriptions may consistently use <u> elements for turns or other units, and although in most cases a <u> will be delimited by pause or change of speaker, <u> is not required to represent a turn or any communicative event, nor to be bounded by pauses or change of speaker. At a minimum, a <u> is some phonetic production by a given speaker.

ExampleThe element <u> marks up a speech, as illustrated below:
<u who="#DavidPrior" ana="#regular">  <seg>I ask that the draft Regulations laid before the House on 5 December be approved.</seg>  <seg>The relevant document is the 20th Report from the Legislation Committee.</seg> -</u>
Content model
+</u>
Content model
 <content>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
@@ -3541,7 +3558,7 @@
   <elementRef key="seg"/>
  </alternate>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element u
 {
    tei_att.global.attribute.xmlid,
@@ -3562,7 +3579,7 @@
     | tei_pb
     | tei_seg
    )+
-}

Appendix A.1.108 <unit>

<unit> contains a symbol, a word or a phrase referring to a unit of measurement in any kind of formal or informal system. [3.6.3. Numbers and Measures]
Modulecore — Formal specification
Attributesatt.global (xml:base, xml:space, @xml:id, @n, @xml:lang) att.global.analytic (@ana)
Member of
Contained by
core: unit
May contain
ExampleThe element can be used for fine-grained Named Entities which include units:
<num ana="ne:nc" +}

Appendix A.1.108 <unit>

<unit> contains a symbol, a word or a phrase referring to a unit of measurement in any kind of formal or informal system. [3.6.3. Numbers and Measures]
Modulecore — Formal specification
Attributesatt.global (xml:base, xml:space, @xml:id, @n, @xml:lang) att.global.analytic (@ana)
Member of
Contained by
core: unit
May contain
ExampleThe element can be used for fine-grained Named Entities which include units:
<num ana="ne:nc"  xml:id="ParlaMint-CZ_2013-12-06-ps2013-003-01-001-001.ne53">  <w xml:id="ParlaMint-CZ_2013-12-06-ps2013-003-01-001-001.u2.p10.s1.w9"   lemma="3" @@ -3576,7 +3593,7 @@  <w xml:id="ParlaMint-CZ_2013-12-06-ps2013-003-01-001-001.u2.p10.s1.w11"   lemma=""   msd="UPosTag=NOUN|Gender=Fem|Polarity=Pos" join="right"></w> -</unit>
Content model
+</unit>
Content model
 <content>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
@@ -3596,7 +3613,7 @@
   <elementRef key="vocal"/>
  </alternate>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element unit
 {
    tei_att.global.attribute.xmlid,
@@ -3619,14 +3636,14 @@
     | tei_incident
     | tei_vocal
    )+
-}

Appendix A.1.109 <vocal>

<vocal> (vocal) marks any vocalized but not necessarily lexical phenomenon, for example voiced pauses, non-lexical backchannels, etc. [8.3.3. Vocal, Kinesic, Incident]
Modulespoken — Formal specification
Attributesatt.global (xml:base, xml:space, @xml:id, @n, @xml:lang) att.ascribed (@who) att.typed (type, @subtype)
type
StatusRecommended
Legal values are:
greeting
question
clarification
speaking
interruption
exclamat
laughter
shouting
murmuring
noise
signal
Member of
Contained by
analysis: s
core: unit
linking: seg
spoken: u
textstructure: div
May contain
core: desc
Example
<vocal type="interruption"> +}

Appendix A.1.109 <vocal>

<vocal> (vocal) marks any vocalized but not necessarily lexical phenomenon, for example voiced pauses, non-lexical backchannels, etc. [8.3.3. Vocal, Kinesic, Incident]
Modulespoken — Formal specification
Attributesatt.global (xml:base, xml:space, @xml:id, @n, @xml:lang) att.ascribed (@who) att.typed (type, @subtype)
type
StatusRecommended
Legal values are:
greeting
question
clarification
speaking
interruption
exclamat
laughter
shouting
murmuring
noise
signal
Member of
Contained by
analysis: s
core: unit
linking: seg
spoken: u
textstructure: div
May contain
core: desc
Example
<vocal type="interruption">  <desc>Interruption from the chair: Your time is up.</desc> -</vocal>
Content model
+</vocal>
Content model
 <content>
  <elementRef key="desc" minOccurs="1"
   maxOccurs="unbounded"/>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element vocal
 {
    tei_att.global.attribute.xmlid,
@@ -3649,7 +3666,7 @@
     | "signal"
    }?,
    tei_desc+
-}

Appendix A.1.110 <w>

<w> (word) represents a grammatical (not necessarily orthographic) word. [17.1. Linguistic Segment Categories 17.4.2. Lightweight Linguistic Annotation]
Moduleanalysis — Formal specification
Attributesatt.linguistic (@lemma, @pos, @msd, @join) (att.lexicographic.normalized (@norm)) att.global (n, xml:base, xml:space, @xml:id, @xml:lang) att.global.analytic (@ana)
Member of
Contained by
analysis: s w
May contain
analysis: w
character data
Example
<s xml:id="ParlaMint-GB_2017-10-30-lords.seg4.1"> +}

Appendix A.1.110 <w>

<w> (word) represents a grammatical (not necessarily orthographic) word. [17.1. Linguistic Segment Categories 17.4.2. Lightweight Linguistic Annotation]
Moduleanalysis — Formal specification
Attributesatt.linguistic (@lemma, @pos, @msd, @join) (att.lexicographic.normalized (@norm)) att.global (n, xml:base, xml:space, @xml:id, @xml:lang) att.global.analytic (@ana)
Member of
Contained by
analysis: s w
May contain
analysis: w
character data
Example
<s xml:id="ParlaMint-GB_2017-10-30-lords.seg4.1">  <w lemma="I"   msd="UPosTag=PRON|Case=Nom|Number=Sing|Person=1|PronType=Prs" pos="PRP">I</w>  <w lemma="support" @@ -3659,12 +3676,12 @@  <w lemma="amendment"   msd="UPosTag=NOUN|Number=Sing" pos="NN" join="right">amendment</w>  <pc msd="UPosTag=PUNCT" pos=".">.</pc> -</s>
ExampleCertain frameworks, in particular the Universal Dependencies, allow for tokens to be decomposed into several words, and it is these syntactic words, and not tokens, that are further annotated. For example, Czech has the word ‘abyste’ which is in UD decomposed into two syntactic words, ‘aby’ and ‘byste’, which can be encoded in the <w> element:
<w>abyste +</s>
ExampleCertain frameworks, in particular the Universal Dependencies, allow for tokens to be decomposed into several words, and it is these syntactic words, and not tokens, that are further annotated. For example, Czech has the word ‘abyste’ which is in UD decomposed into two syntactic words, ‘aby’ and ‘byste’, which can be encoded in the <w> element:
<w>abyste <w norm="aby" lemma="aby"   msd="UPosTag=SCONJ"/>  <w norm="byste" lemma="být"   msd="UPosTag=AUX|Mood=Cnd|Number=Plur|Person=2|VerbForm=Fin"/> -</w>
Content model
+</w>
Content model
 <content>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
@@ -3672,7 +3689,7 @@
   <elementRef key="w"/>
  </alternate>
 </content>
-    
Schema Declaration
+    
Schema Declaration
 element w
 {
    tei_att.global.attribute.xmlid,
@@ -3680,7 +3697,7 @@
    tei_att.global.analytic.attribute.ana,
    tei_att.linguistic.attributes,
    ( text | tei_w )+
-}
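
The nesting allowed by `( text | tei_w )+` can be read back with a few lines of Python. The sketch below is illustrative only: the helper name `syntactic_words` is made up, the TEI namespace is left out to keep the sample short, and it assumes that a token with nested <w> children carries the syntactic words in their norm and lemma attributes, as in the 'abyste' example.

```python
import xml.etree.ElementTree as ET


def syntactic_words(w):
    """Return (form, lemma) pairs for one token-level <w> element."""
    inner = w.findall("w")
    if not inner:
        # Ordinary token: the surface form is the element's text content.
        return [((w.text or "").strip(), w.get("lemma"))]
    # Decomposed token: each nested <w> is one syntactic word.
    return [(child.get("norm"), child.get("lemma")) for child in inner]


token = ET.fromstring(
    '<w>abyste'
    '<w norm="aby" lemma="aby" msd="UPosTag=SCONJ"/>'
    '<w norm="byste" lemma="být" '
    'msd="UPosTag=AUX|Mood=Cnd|Number=Plur|Person=2|VerbForm=Fin"/>'
    '</w>'
)
print(syntactic_words(token))  # [('aby', 'aby'), ('byste', 'být')]
```

For an undecomposed token such as `<w lemma="support">support</w>` the same helper returns a single pair taken from the text content and the lemma attribute.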

Appendix A.2 Model classes

Appendix A.2.1 model.addressLike

model.addressLike groups elements used to represent a postal or email address. [1. The TEI Infrastructure]
Moduletei — Formal specification
Used by
Membersaffiliation email

Appendix A.2.2 model.attributable

model.attributable groups elements that contain a word or phrase that can be attributed to a source. [3.3.3. Quotation 4.3.2. Floating Texts]
Moduletei — Formal specification
Used by
Membersmodel.quoteLike

Appendix A.2.3 model.biblLike

model.biblLike groups elements containing a bibliographic description. [3.12. Bibliographic Citations and References]
Moduletei — Formal specification
Used by
Membersbibl

Appendix A.2.4 model.dateLike

model.dateLike groups elements containing temporal expressions. [3.6.4. Dates and Times 13.4. Dates]
Moduletei — Formal specification
Used by
Membersdate time

Appendix A.2.5 model.divPart

model.divPart groups paragraph-level elements appearing directly within divisions. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Membersmodel.divPart.spoken[u] model.lLike model.pLike[p]
Note

Note that this element class does not include members of the model.inter class, which can appear either within or between paragraph-level items.

Appendix A.2.6 model.divPart.spoken

model.divPart.spoken groups elements structurally analogous to paragraphs within spoken texts. [8.1. General Considerations and Overview]
Modulespoken — Formal specification
Used by
Membersu
Note

Spoken texts may be structured in many ways; elements in this class are typically larger units such as turns or utterances.

Appendix A.2.7 model.emphLike

model.emphLike groups phrase-level elements which are typographically distinct and to which a specific function can be attributed. [3.3. Highlighting and Quotation]
Moduletei — Formal specification
Used by
Membersterm title

Appendix A.2.8 model.global

Appendix A.2.9 model.global.edit

model.global.edit groups globally available elements which perform a specifically editorial function. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Membersgap

Appendix A.2.10 model.global.meta

model.global.meta groups globally available elements which describe the status of other elements. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Memberslink linkGrp
Note

Elements in this class are typically used to hold groups of links or of abstract interpretations, or to provide indications of certainty etc. It may be convenient to localize all metadata elements, for example to contain them within the same division as the elements that they relate to; or to locate them all to a division of their own. They may however appear at any point in a TEI text.

Appendix A.2.11 model.global.spoken

model.global.spoken groups elements which may appear globally within spoken texts. [8.1. General Considerations and Overview]
Modulespoken — Formal specification
Used by
Membersincident kinesic vocal
Note

This class groups elements which can appear anywhere within transcribed speech.

Appendix A.2.12 model.graphicLike

model.graphicLike groups elements containing images, formulae, and similar objects. [3.10. Graphics and Other Non-textual Components]
Moduletei — Formal specification
Used by
Membersgraphic media

Appendix A.2.13 model.highlighted

model.highlighted groups phrase-level elements which are typographically distinct. [3.3. Highlighting and Quotation]
Moduletei — Formal specification
Used by
Membersmodel.emphLike[term title] model.hiLike

Appendix A.2.14 model.inter

model.inter groups elements which can appear either within or between paragraph-like elements. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Membersmodel.attributable[model.quoteLike] model.biblLike[bibl] model.egLike model.labelLike[desc label] model.listLike[listEvent listOrg listPerson listRelation] model.oddDecl model.stageLike

Appendix A.2.15 model.labelLike

model.labelLike groups elements used to gloss or explain other parts of a document.
Moduletei — Formal specification
Used by
Membersdesc label

Appendix A.2.16 model.limitedPhrase

model.limitedPhrase groups phrase-level elements excluding those elements primarily intended for transcription of existing sources. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Membersmodel.emphLike[term title] model.hiLike model.pPart.data[model.addressLike[affiliation email] model.dateLike[date time] model.measureLike[measure num unit] model.nameLike[model.nameLike.agent[name orgName persName] model.offsetLike model.persNamePart[addName forename nameLink roleName surname] model.placeStateLike[model.placeNamePart[placeName] state] idno]] model.pPart.editorial model.pPart.msdesc model.phrase.xml model.ptrLike[ref]

Appendix A.2.17 model.listLike

model.listLike groups list-like elements. [3.8. Lists]
Moduletei — Formal specification
Used by
MemberslistEvent listOrg listPerson listRelation

Appendix A.2.18 model.measureLike

model.measureLike groups elements which denote a number, a quantity, a measurement, or similar piece of text that conveys some numerical meaning. [3.6.3. Numbers and Measures]
Moduletei — Formal specification
Used by
Membersmeasure num unit

Appendix A.2.19 model.milestoneLike

model.milestoneLike groups milestone-style elements used to represent reference systems. [1.3. The TEI Class System 3.11.3. Milestone Elements]
Moduletei — Formal specification
Used by
Memberspb

Appendix A.2.20 model.nameLike

model.nameLike groups elements which name or refer to a person, place, or organization.
Moduletei — Formal specification
Used by
Membersmodel.nameLike.agent[name orgName persName] model.offsetLike model.persNamePart[addName forename nameLink roleName surname] model.placeStateLike[model.placeNamePart[placeName] state] idno
Note

A superset of the naming elements that may appear in datelines, addresses, statements of responsibility, etc.

Appendix A.2.21 model.nameLike.agent

model.nameLike.agent groups elements which contain names of individuals or corporate bodies. [3.6. Names, Numbers, Dates, Abbreviations, and Addresses]
Moduletei — Formal specification
Used by
Membersname orgName persName
Note

This class is used in the content model of elements which reference names of people or organizations.

Appendix A.2.22 model.noteLike

model.noteLike groups globally-available note-like elements. [3.9. Notes, Annotation, and Indexing]
Moduletei — Formal specification
Used by
Membersnote

Appendix A.2.23 model.pLike

model.pLike groups paragraph-like elements.
Moduletei — Formal specification
Used by
Membersp

Appendix A.2.25 model.pPart.edit

model.pPart.edit groups phrase-level elements for simple editorial correction and transcription. [3.5. Simple Editorial Changes]
Moduletei — Formal specification
Used by
Membersmodel.pPart.editorial model.pPart.transcriptional

Appendix A.2.26 model.paraPart

Appendix A.2.27 model.persNamePart

model.persNamePart groups elements which form part of a personal name. [13.2.1. Personal Names]
Modulenamesdates — Formal specification
Used by
MembersaddName forename nameLink roleName surname

Appendix A.2.28 model.phrase

model.phrase groups elements which can occur at the level of individual words or phrases. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Membersmodel.graphicLike[graphic media] model.highlighted[model.emphLike[term title] model.hiLike] model.lPart model.pPart.data[model.addressLike[affiliation email] model.dateLike[date time] model.measureLike[measure num unit] model.nameLike[model.nameLike.agent[name orgName persName] model.offsetLike model.persNamePart[addName forename nameLink roleName surname] model.placeStateLike[model.placeNamePart[placeName] state] idno]] model.pPart.edit[model.pPart.editorial model.pPart.transcriptional] model.pPart.msdesc model.phrase.xml model.ptrLike[ref] model.segLike[pc s seg w] model.specDescLike
Note

This class of elements can occur within paragraphs, list items, lines of verse, etc.

Appendix A.2.29 model.placeNamePart

model.placeNamePart groups elements which form part of a place name. [13.2.3. Place Names]
Moduletei — Formal specification
Used by
MembersplaceName

Appendix A.2.30 model.placeStateLike

model.placeStateLike groups elements which describe changing states of a place.
Moduletei — Formal specification
Used by
Membersmodel.placeNamePart[placeName] state

Appendix A.2.31 model.ptrLike

model.ptrLike groups elements used for purposes of location and reference. [3.7. Simple Links and Cross-References]
Moduletei — Formal specification
Used by
Membersref

Appendix A.2.32 model.segLike

model.segLike groups elements used for arbitrary segmentation. [16.3. Blocks, Segments, and Anchors 17.1. Linguistic Segment Categories]
Moduletei — Formal specification
Used by
Memberspc s seg w
Note

The principles on which segmentation is carried out, and any special codes or attribute values used, should be defined explicitly in the <segmentation> element of the <encodingDesc> within the associated TEI header.

Appendix A.3 Attribute classes

Appendix A.3.1 att.ascribed

att.ascribed provides attributes for elements representing speech or action that can be ascribed to a specific individual. [3.3.3. Quotation 8.3. Elements Unique to Spoken Texts]
Moduletei — Formal specification
Membersatt.ascribed.directed[kinesic u vocal] change incident setting
Attributes
whoindicates the person, or group of people, to whom the element content is ascribed.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
In the following example from Hamlet, speeches (<sp>) in the body of the play are linked to <castItem> elements in the <castList> using the who attribute.
<castItem type="role"> +}

Appendix A.2 Model classes

Appendix A.2.1 model.addressLike

model.addressLike groups elements used to represent a postal or email address. [1. The TEI Infrastructure]
Moduletei — Formal specification
Used by
Membersaffiliation email

Appendix A.2.2 model.attributable

model.attributable groups elements that contain a word or phrase that can be attributed to a source. [3.3.3. Quotation 4.3.2. Floating Texts]
Moduletei — Formal specification
Used by
Membersmodel.quoteLike

Appendix A.2.3 model.biblLike

model.biblLike groups elements containing a bibliographic description. [3.12. Bibliographic Citations and References]
Moduletei — Formal specification
Used by
Membersbibl

Appendix A.2.4 model.dateLike

model.dateLike groups elements containing temporal expressions. [3.6.4. Dates and Times 13.4. Dates]
Moduletei — Formal specification
Used by
Membersdate time

Appendix A.2.5 model.divPart

model.divPart groups paragraph-level elements appearing directly within divisions. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Membersmodel.divPart.spoken[u] model.lLike model.pLike[p]
Note

Note that this element class does not include members of the model.inter class, which can appear either within or between paragraph-level items.

Appendix A.2.6 model.divPart.spoken

model.divPart.spoken groups elements structurally analogous to paragraphs within spoken texts. [8.1. General Considerations and Overview]
Modulespoken — Formal specification
Used by
Membersu
Note

Spoken texts may be structured in many ways; elements in this class are typically larger units such as turns or utterances.

Appendix A.2.7 model.emphLike

model.emphLike groups phrase-level elements which are typographically distinct and to which a specific function can be attributed. [3.3. Highlighting and Quotation]
Moduletei — Formal specification
Used by
Membersterm title

Appendix A.2.8 model.global

Appendix A.2.9 model.global.edit

model.global.edit groups globally available elements which perform a specifically editorial function. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Membersgap

Appendix A.2.10 model.global.meta

model.global.meta groups globally available elements which describe the status of other elements. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Memberslink linkGrp
Note

Elements in this class are typically used to hold groups of links or of abstract interpretations, or to provide indications of certainty etc. It may be convenient to localize all metadata elements, for example to contain them within the same division as the elements that they relate to; or to locate them all to a division of their own. They may however appear at any point in a TEI text.

Appendix A.2.11 model.global.spoken

model.global.spoken groups elements which may appear globally within spoken texts. [8.1. General Considerations and Overview]
Modulespoken — Formal specification
Used by
Membersincident kinesic vocal
Note

This class groups elements which can appear anywhere within transcribed speech.

Appendix A.2.12 model.graphicLike

model.graphicLike groups elements containing images, formulae, and similar objects. [3.10. Graphics and Other Non-textual Components]
Moduletei — Formal specification
Used by
Membersgraphic media

Appendix A.2.13 model.highlighted

model.highlighted groups phrase-level elements which are typographically distinct. [3.3. Highlighting and Quotation]
Moduletei — Formal specification
Used by
Membersmodel.emphLike[term title] model.hiLike

Appendix A.2.14 model.inter

model.inter groups elements which can appear either within or between paragraph-like elements. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Membersmodel.attributable[model.quoteLike] model.biblLike[bibl] model.egLike model.labelLike[desc label] model.listLike[listEvent listOrg listPerson listRelation] model.oddDecl model.stageLike

Appendix A.2.15 model.labelLike

model.labelLike groups elements used to gloss or explain other parts of a document.
Moduletei — Formal specification
Used by
Membersdesc label

Appendix A.2.16 model.limitedPhrase

model.limitedPhrase groups phrase-level elements excluding those elements primarily intended for transcription of existing sources. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Membersmodel.emphLike[term title] model.hiLike model.pPart.data[model.addressLike[affiliation email] model.dateLike[date time] model.measureLike[measure num unit] model.nameLike[model.nameLike.agent[name orgName persName] model.offsetLike model.persNamePart[addName forename nameLink roleName surname] model.placeStateLike[model.placeNamePart[placeName] state] idno]] model.pPart.editorial model.pPart.msdesc model.phrase.xml model.ptrLike[ref]

Appendix A.2.17 model.listLike

model.listLike groups list-like elements. [3.8. Lists]
Moduletei — Formal specification
Used by
MemberslistEvent listOrg listPerson listRelation

Appendix A.2.18 model.measureLike

model.measureLike groups elements which denote a number, a quantity, a measurement, or similar piece of text that conveys some numerical meaning. [3.6.3. Numbers and Measures]
Moduletei — Formal specification
Used by
Membersmeasure num unit

Appendix A.2.19 model.milestoneLike

model.milestoneLike groups milestone-style elements used to represent reference systems. [1.3. The TEI Class System 3.11.3. Milestone Elements]
Moduletei — Formal specification
Used by
Memberspb

Appendix A.2.20 model.nameLike

model.nameLike groups elements which name or refer to a person, place, or organization.
Moduletei — Formal specification
Used by
Membersmodel.nameLike.agent[name orgName persName] model.offsetLike model.persNamePart[addName forename nameLink roleName surname] model.placeStateLike[model.placeNamePart[placeName] state] idno
Note

A superset of the naming elements that may appear in datelines, addresses, statements of responsibility, etc.

Appendix A.2.21 model.nameLike.agent

model.nameLike.agent groups elements which contain names of individuals or corporate bodies. [3.6. Names, Numbers, Dates, Abbreviations, and Addresses]
Moduletei — Formal specification
Used by
Membersname orgName persName
Note

This class is used in the content model of elements which reference names of people or organizations.

Appendix A.2.22 model.noteLike

model.noteLike groups globally-available note-like elements. [3.9. Notes, Annotation, and Indexing]
Moduletei — Formal specification
Used by
Membersnote

Appendix A.2.23 model.pLike

model.pLike groups paragraph-like elements.
Moduletei — Formal specification
Used by
Membersp

Appendix A.2.25 model.pPart.edit

model.pPart.edit groups phrase-level elements for simple editorial correction and transcription. [3.5. Simple Editorial Changes]
Moduletei — Formal specification
Used by
Membersmodel.pPart.editorial model.pPart.transcriptional

Appendix A.2.26 model.paraPart

Appendix A.2.27 model.persNamePart

model.persNamePart groups elements which form part of a personal name. [13.2.1. Personal Names]
Modulenamesdates — Formal specification
Used by
MembersaddName forename nameLink roleName surname

Appendix A.2.28 model.phrase

model.phrase groups elements which can occur at the level of individual words or phrases. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Membersmodel.graphicLike[graphic media] model.highlighted[model.emphLike[term title] model.hiLike] model.lPart model.pPart.data[model.addressLike[affiliation email] model.dateLike[date time] model.measureLike[measure num unit] model.nameLike[model.nameLike.agent[name orgName persName] model.offsetLike model.persNamePart[addName forename nameLink roleName surname] model.placeStateLike[model.placeNamePart[placeName] state] idno]] model.pPart.edit[model.pPart.editorial model.pPart.transcriptional] model.pPart.msdesc model.phrase.xml model.ptrLike[ref] model.segLike[pc s seg w] model.specDescLike
Note

This class of elements can occur within paragraphs, list items, lines of verse, etc.

Appendix A.2.29 model.placeNamePart

model.placeNamePart groups elements which form part of a place name. [13.2.3. Place Names]
Moduletei — Formal specification
Used by
MembersplaceName

Appendix A.2.30 model.placeStateLike

model.placeStateLike groups elements which describe changing states of a place.
Moduletei — Formal specification
Used by
Membersmodel.placeNamePart[placeName] state

Appendix A.2.31 model.ptrLike

model.ptrLike groups elements used for purposes of location and reference. [3.7. Simple Links and Cross-References]
Moduletei — Formal specification
Used by
Membersref

Appendix A.2.32 model.segLike

model.segLike groups elements used for arbitrary segmentation. [16.3. Blocks, Segments, and Anchors 17.1. Linguistic Segment Categories]
Moduletei — Formal specification
Used by
Memberspc s seg w
Note

The principles on which segmentation is carried out, and any special codes or attribute values used, should be defined explicitly in the <segmentation> element of the <encodingDesc> within the associated TEI header.

Appendix A.3 Attribute classes

Appendix A.3.1 att.ascribed

att.ascribed provides attributes for elements representing speech or action that can be ascribed to a specific individual. [3.3.3. Quotation 8.3. Elements Unique to Spoken Texts]
Moduletei — Formal specification
Membersatt.ascribed.directed[kinesic u vocal] change incident setting
Attributes
whoindicates the person, or group of people, to whom the element content is ascribed.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
In the following example from Hamlet, speeches (<sp>) in the body of the play are linked to <castItem> elements in the <castList> using the who attribute.
<castItem type="role">  <role xml:id="Barnardo">Bernardo</role> </castItem> <castItem type="role"> @@ -3695,14 +3712,14 @@ <sp who="#Francisco">  <speaker>Francisco</speaker>  <l n="2">Nay, answer me: stand, and unfold yourself.</l> -</sp>
Note

For transcribed speech, this will typically identify a participant or participant group; in other contexts, it will point to any identified <person> element.

Appendix A.3.2 att.canonical

att.canonical provides attributes that can be used to associate a representation such as a name or title with canonical information about the object being named or referenced. [13.1.1. Linking Names and Their Referents]
Moduletei — Formal specification
Membersatt.naming[att.personal[addName forename name orgName persName placeName roleName surname] affiliation birth death education event occupation pubPlace state] catDesc date funder meeting publisher relation resp respStmt term time title
Attributes
keyprovides an externally-defined means of identifying the entity (or entities) being named, using a coded value of some kind.
StatusOptional
Datatypeteidata.text
<author> +</sp>
Note

For transcribed speech, this will typically identify a participant or participant group; in other contexts, it will point to any identified <person> element.

Appendix A.3.2 att.canonical

att.canonical provides attributes that can be used to associate a representation such as a name or title with canonical information about the object being named or referenced. [13.1.1. Linking Names and Their Referents]
Moduletei — Formal specification
Membersatt.naming[att.personal[addName forename name orgName persName placeName roleName surname] affiliation birth death education event occupation pubPlace state] catDesc date funder meeting publisher relation resp respStmt term time title
Attributes
keyprovides an externally-defined means of identifying the entity (or entities) being named, using a coded value of some kind.
StatusOptional
Datatypeteidata.text
<author>  <name key="name 427308"   type="organisation">[New Zealand Parliament, Legislative Council]</name> -</author>
<author> +</author>
<author>  <name key="Hugo, Victor (1802-1885)"   ref="http://www.idref.fr/026927608">Victor Hugo</name> -</author>
Note

The value may be a unique identifier from a database, or any other externally-defined string identifying the referent.

No particular syntax is proposed for the values of the key attribute, since its form will depend entirely on practice within a given project. For the same reason, this attribute is not recommended in data interchange, since there is no way of ensuring that the values used by one project are distinct from those used by another. In such a situation, a preferable approach for magic tokens which follows standard practice on the Web is to use a ref attribute whose value is a tag URI as defined in RFC 4151.

ref(reference) provides an explicit means of locating a full definition or identity for the entity being named by means of one or more URIs.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
<name ref="http://viaf.org/viaf/109557338" - type="person">Seamus Heaney</name>
Note

The value must point directly to one or more XML elements or other resources by means of one or more URIs, separated by whitespace. If more than one is supplied the implication is that the name identifies several distinct entities.

Appendix A.3.3 att.datable.custom

att.datable.custom provides attributes for normalization of elements that contain datable events to a custom dating system (i.e. other than the Gregorian used by W3 and ISO). [13.4. Dates]
Modulenamesdates — Formal specification
Membersatt.datable[affiliation application birth change date death education event funder idno licence meeting name occupation orgName persName placeName relation resp sex state time title]
Attributes
when-customsupplies the value of a date or time in some custom standard form.
StatusOptional
Datatype1–∞ occurrences of teidata.word separated by whitespace
The following are examples of custom date or time formats that are not valid ISO or W3C format normalizations, normalized to a different dating system
<p>Alhazen died in Cairo on the +</author>
Note

The value may be a unique identifier from a database, or any other externally-defined string identifying the referent.

No particular syntax is proposed for the values of the key attribute, since its form will depend entirely on practice within a given project. For the same reason, this attribute is not recommended in data interchange, since there is no way of ensuring that the values used by one project are distinct from those used by another. In such a situation, a preferable approach for magic tokens which follows standard practice on the Web is to use a ref attribute whose value is a tag URI as defined in RFC 4151.
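
A minimal sketch of the tag URI approach mentioned above, assuming a project wants to turn a local key value into a value usable in ref. The helper name key_to_tag_uri, the authority example.org, and the date 2023 are illustrative assumptions, not part of the TEI or ParlaMint specifications; only the general tag:authority,date:specific shape comes from RFC 4151.

```python
import re
from urllib.parse import quote


def key_to_tag_uri(authority: str, date: str, key: str) -> str:
    """Build an RFC 4151 style tag URI from a project-local key (hypothetical helper).

    authority: a domain name or email address controlled by the project (assumed).
    date:      a date on which the authority was held, as YYYY, YYYY-MM or YYYY-MM-DD.
    key:       the project-local identifier, e.g. the value of a key attribute.
    """
    if not re.fullmatch(r"\d{4}(-\d{2}(-\d{2})?)?", date):
        raise ValueError(f"date must be YYYY, YYYY-MM or YYYY-MM-DD, got {date!r}")
    # Percent-encode the key so that spaces and punctuation are safe inside a URI.
    return f"tag:{authority},{date}:{quote(key, safe='')}"


if __name__ == "__main__":
    # The key value is taken from the example above; authority and date are made up.
    print(key_to_tag_uri("example.org", "2023", "name 427308"))
```

Because the authority and date are chosen by the project itself, two projects that happen to use the same key string still end up with distinct identifiers.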

ref(reference) provides an explicit means of locating a full definition or identity for the entity being named by means of one or more URIs.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
<name ref="http://viaf.org/viaf/109557338" + type="person">Seamus Heaney</name>
Note

The value must point directly to one or more XML elements or other resources by means of one or more URIs, separated by whitespace. If more than one is supplied the implication is that the name identifies several distinct entities.
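
As a rough illustration of the note above, a consumer of the data could split a ref value on whitespace to obtain the individual pointers. This is a sketch only: the function name split_ref and the second pointer in the usage example are assumptions made for illustration, and no URI validation is attempted.

```python
from typing import List


def split_ref(value: str) -> List[str]:
    """Split a ref attribute value into its individual pointers (hypothetical helper).

    The value is one or more URIs separated by whitespace, so str.split()
    with no argument is enough: it collapses runs of spaces, tabs and newlines.
    """
    pointers = value.split()
    if not pointers:
        raise ValueError("ref must contain at least one pointer")
    return pointers


if __name__ == "__main__":
    # A single pointer, as in the Seamus Heaney example above.
    print(split_ref("http://viaf.org/viaf/109557338"))
    # Two pointers: per the note, the name is then taken to identify two entities.
    print(split_ref("http://viaf.org/viaf/109557338   #person.local"))
```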

Appendix A.3.3 att.datable.custom

att.datable.custom provides attributes for normalization of elements that contain datable events to a custom dating system (i.e. other than the Gregorian used by W3 and ISO). [13.4. Dates]
Modulenamesdates — Formal specification
Membersatt.datable[affiliation application birth change date death education event funder idno licence meeting name occupation orgName persName placeName relation resp sex state time title]
Attributes
when-customsupplies the value of a date or time in some custom standard form.
StatusOptional
Datatype1–∞ occurrences of teidata.word separated by whitespace
The following are examples of custom date or time formats that are not valid ISO or W3C format normalizations, normalized to a different dating system
<p>Alhazen died in Cairo on the <date when="1040-03-06"   when-custom="431-06-12"> 12th day of Jumada t-Tania, 430 AH  </date>.</p> @@ -3713,31 +3730,31 @@ (<date when-custom="Thutmose_III:23">23rd year of reign of Thutmose III</date>).</p> <p>Esidorus bixit in pace annos LXX plus minus sub <date when-custom="Ind:4-10-11">die XI mensis Octobris indictione IIII</date> -</p>
Not all custom date formulations will have Gregorian equivalents. The when-custom attribute and other custom dating attributes are not constrained to a datatype by the TEI, but individual projects are recommended to regularize and document their dating formats.
notBefore-customspecifies the earliest possible date for the event in some custom standard form.
StatusOptional
Datatype1–∞ occurrences of teidata.word separated by whitespace
notAfter-customspecifies the latest possible date for the event in some custom standard form.
StatusOptional
Datatype1–∞ occurrences of teidata.word separated by whitespace
from-customindicates the starting point of the period in some custom standard form.
StatusOptional
Datatype1–∞ occurrences of teidata.word separated by whitespace
<event xml:id="FIRE1" +</p>
Not all custom date formulations will have Gregorian equivalents. The when-custom attribute and other custom dating attributes are not constrained to a datatype by the TEI, but individual projects are recommended to regularize and document their dating formats.
notBefore-customspecifies the earliest possible date for the event in some custom standard form.
StatusOptional
Datatype1–∞ occurrences of teidata.word separated by whitespace
notAfter-customspecifies the latest possible date for the event in some custom standard form.
StatusOptional
Datatype1–∞ occurrences of teidata.word separated by whitespace
from-customindicates the starting point of the period in some custom standard form.
StatusOptional
Datatype1–∞ occurrences of teidata.word separated by whitespace
<event xml:id="FIRE1"  datingMethod="#julian"  from-custom="1666-09-02"  to-custom="1666-09-05">  <head>The Great Fire of London</head>  <p>The Great Fire of London burned through a large part    of the city of London.</p> -</event>
to-customindicates the ending point of the period in some custom standard form.
StatusOptional
Datatype1–∞ occurrences of teidata.word separated by whitespace
datingPointsupplies a pointer to some location defining a named point in time with reference to which the datable item is understood to have occurred
StatusOptional
Datatypeteidata.pointer
datingMethodsupplies a pointer to a <calendar> element or other means of interpreting the values of the custom dating attributes.
StatusOptional
Datatypeteidata.pointer
Contayning the Originall, Antiquity, Increaſe, Moderne +</event>
to-customindicates the ending point of the period in some custom standard form.
StatusOptional
Datatype1–∞ occurrences of teidata.word separated by whitespace
datingPointsupplies a pointer to some location defining a named point in time with reference to which the datable item is understood to have occurred
StatusOptional
Datatypeteidata.pointer
datingMethodsupplies a pointer to a <calendar> element or other means of interpreting the values of the custom dating attributes.
StatusOptional
Datatypeteidata.pointer
Contayning the Originall, Antiquity, Increaſe, Moderne eſtate, and deſcription of that Citie, written in the yeare <date when-custom="1598"  calendar="#julian"  datingMethod="#julian">1598</date>. by Iohn Stow - Citizen of London.
In this example, the calendar attribute points to a <calendar> element for the Julian calendar, specifying that the text content of the <date> element is a Julian date, and the datingMethod attribute also points to the Julian calendar to indicate that the content of the when-custom attribute value is Julian too.
<date when="1382-06-28" + Citizen of London.
In this example, the calendar attribute points to a <calendar> element for the Julian calendar, specifying that the text content of the <date> element is a Julian date, and the datingMethod attribute also points to the Julian calendar to indicate that the content of the when-custom attribute value is Julian too.
<date when="1382-06-28"  when-custom="6890-06-20"  datingMethod="#creationOfWorld"> μηνὶ Ἰουνίου εἰς <num>κ</num> ἔτους <num>ςωϞ</num> -</date>
In this example, a date is given in a Mediaeval text measured ‘from the creation of the world’, which is normalized (in when) to the Gregorian date, but is also normalized (in when-custom) to a machine-actionable, numeric version of the date from the Creation.
Note

Note that the datingMethod attribute (unlike calendar defined in att.datable) defines the calendar or dating system to which the date described by the parent element is normalized (i.e. in the when-custom or other X-custom attributes), not the calendar of the original date in the element.

Appendix A.3.4 att.datable.iso

att.datable.iso provides attributes for normalization of elements that contain datable events using the ISO 8601:2004 standard. [3.6.4. Dates and Times 13.4. Dates]
Modulenamesdates — Formal specification
Membersatt.datable[affiliation application birth change date death education event funder idno licence meeting name occupation orgName persName placeName relation resp sex state time title]
Attributes
when-isosupplies the value of a date or time in a standard form.
StatusOptional
Datatypeteidata.temporal.iso
The following are examples of ISO date, time, and date & time formats that are not valid W3C format normalizations.
<date when-iso="1996-09-24T07:25+00">Sept. 24th, 1996 at 3:25 in the morning</date> +</date>
In this example, a date is given in a Mediaeval text measured ‘from the creation of the world’, which is normalized (in when) to the Gregorian date, but is also normalized (in when-custom) to a machine-actionable, numeric version of the date from the Creation.
Note

Note that the datingMethod attribute (unlike calendar defined in att.datable) defines the calendar or dating system to which the date described by the parent element is normalized (i.e. in the when-custom or other X-custom attributes), not the calendar of the original date in the element.

Appendix A.3.4 att.datable.iso

att.datable.iso provides attributes for normalization of elements that contain datable events using the ISO 8601:2004 standard. [3.6.4. Dates and Times 13.4. Dates]
Modulenamesdates — Formal specification
Membersatt.datable[affiliation application birth change date death education event funder idno licence meeting name occupation orgName persName placeName relation resp sex state time title]
Attributes
when-isosupplies the value of a date or time in a standard form.
StatusOptional
Datatypeteidata.temporal.iso
The following are examples of ISO date, time, and date & time formats that are not valid W3C format normalizations.
<date when-iso="1996-09-24T07:25+00">Sept. 24th, 1996 at 3:25 in the morning</date> <date when-iso="1996-09-24T03:25-04">Sept. 24th, 1996 at 3:25 in the morning</date> <time when-iso="1999-01-04T20:42-05">4 Jan 1999 at 8:42 pm</time> <time when-iso="1999-W01-1T20,70-05">4 Jan 1999 at 8:42 pm</time> <date when-iso="2006-05-18T10:03">a few minutes after ten in the morning on Thu 18 May</date> <time when-iso="03:00">3 A.M.</time> <time when-iso="14">around two</time> -<time when-iso="15,5">half past three</time>
All of the examples of the when attribute in the att.datable.w3c class are also valid with respect to this attribute.
He likes to be punctual. I said <q> +<time when-iso="15,5">half past three</time>
All of the examples of the when attribute in the att.datable.w3c class are also valid with respect to this attribute.
He likes to be punctual. I said <q>  <time when-iso="12">around noon</time> -</q>, and he showed up at <time when-iso="12:00:00">12 O'clock</time> on the dot.
The second occurrence of <time> could have been encoded with the when attribute, as 12:00:00 is a valid time with respect to the W3C XML Schema Part 2: Datatypes Second Edition specification. The first occurrence could not.
notBefore-isospecifies the earliest possible date for the event in standard form, e.g. yyyy-mm-dd.
StatusOptional
Datatypeteidata.temporal.iso
notAfter-isospecifies the latest possible date for the event in standard form, e.g. yyyy-mm-dd.
StatusOptional
Datatypeteidata.temporal.iso
from-isoindicates the starting point of the period in standard form.
StatusOptional
Datatypeteidata.temporal.iso
to-isoindicates the ending point of the period in standard form.
StatusOptional
Datatypeteidata.temporal.iso
Note

The value of these attributes should be a normalized representation of the date, time, or combined date & time intended, in any of the standard formats specified by ISO 8601:2004, using the Gregorian calendar.

If both when-iso and dur-iso are specified, the values should be interpreted as indicating a span of time by its starting time (or date) and duration. That is,
<date when-iso="2007-06-01" dur-iso="P8D"/>
indicates the same time period as
<date when-iso="2007-06-01/P8D"/>
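
The equivalence of the two forms can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a full ISO 8601 parser: the helper names are hypothetical, only calendar dates (yyyy-mm-dd) are accepted as the start, and only whole-day (PnD) or whole-week (PnW) durations are handled.

```python
import re
from datetime import date, timedelta
from typing import Tuple


def parse_simple_duration(dur_iso: str) -> timedelta:
    """Parse a PnD or PnW duration (a small subset of ISO 8601 durations)."""
    match = re.fullmatch(r"P(?:(\d+)W|(\d+)D)", dur_iso)
    if match is None:
        raise ValueError(f"unsupported duration in this sketch: {dur_iso!r}")
    weeks, days = match.groups()
    return timedelta(weeks=int(weeks)) if weeks else timedelta(days=int(days))


def span(when_iso: str, dur_iso: str) -> Tuple[date, date]:
    """Return (start, end) for a start date plus a duration."""
    start = date.fromisoformat(when_iso)
    return start, start + parse_simple_duration(dur_iso)


def span_from_interval(interval: str) -> Tuple[date, date]:
    """Return (start, end) for the combined 'start/duration' notation."""
    when_iso, dur_iso = interval.split("/", 1)
    return span(when_iso, dur_iso)


if __name__ == "__main__":
    # Both notations from the note above describe the same period.
    print(span("2007-06-01", "P8D"))
    print(span_from_interval("2007-06-01/P8D"))
```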

In providing a ‘regularized’ form, no claim is made that the form in the source text is incorrect; the regularized form is simply that chosen as the main form for purposes of unifying variant forms under a single heading.

Appendix A.3.5 att.datable.w3c

att.datable.w3c provides attributes for normalization of elements that contain datable events conforming to the W3C XML Schema Part 2: Datatypes Second Edition. [3.6.4. Dates and Times 13.4. Dates]
Moduletei — Formal specification
Membersatt.datable[affiliation application birth change date death education event funder idno licence meeting name occupation orgName persName placeName relation resp sex state time title]
Attributes
whensupplies the value of the date or time in a standard form, e.g. yyyy-mm-dd.
StatusOptional
Datatypeteidata.temporal.w3c
Examples of W3C date, time, and date & time formats.
<p> +</q>, and he showed up at <time when-iso="12:00:00">12 O'clock</time> on the dot.
The second occurrence of <time> could have been encoded with the when attribute, as 12:00:00 is a valid time with respect to the W3C XML Schema Part 2: Datatypes Second Edition specification. The first occurrence could not.
notBefore-isospecifies the earliest possible date for the event in standard form, e.g. yyyy-mm-dd.
StatusOptional
Datatypeteidata.temporal.iso
notAfter-isospecifies the latest possible date for the event in standard form, e.g. yyyy-mm-dd.
StatusOptional
Datatypeteidata.temporal.iso
from-isoindicates the starting point of the period in standard form.
StatusOptional
Datatypeteidata.temporal.iso
to-isoindicates the ending point of the period in standard form.
StatusOptional
Datatypeteidata.temporal.iso
Note

The value of these attributes should be a normalized representation of the date, time, or combined date & time intended, in any of the standard formats specified by ISO 8601:2004, using the Gregorian calendar.

If both when-iso and dur-iso are specified, the values should be interpreted as indicating a span of time by its starting time (or date) and duration. That is,
<date when-iso="2007-06-01" dur-iso="P8D"/>
indicates the same time period as
<date when-iso="2007-06-01/P8D"/>
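The equivalence stated above can be checked with a small sketch. The helper name `span_from_start_and_duration` is hypothetical and not part of TEI or ParlaMint tooling; it assumes a plain yyyy-mm-dd start and a duration made only of weeks and days (such as P8D), which is far less than ISO 8601 allows.

```python
import re
from datetime import date, timedelta
from typing import Tuple

# Assumption: only week/day durations such as "P8D" or "P2W" are handled.
_DURATION = re.compile(r"P(?:(\d+)W)?(?:(\d+)D)?")


def span_from_start_and_duration(when_iso: str, dur_iso: str) -> Tuple[date, date]:
    """Return (start, end) for a start date and a week/day duration."""
    match = _DURATION.fullmatch(dur_iso)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {dur_iso!r}")
    weeks = int(match.group(1) or 0)
    days = int(match.group(2) or 0)
    start = date.fromisoformat(when_iso)
    return start, start + timedelta(weeks=weeks, days=days)


def span_from_interval(value: str) -> Tuple[date, date]:
    """Handle the combined 'start/duration' form, e.g. '2007-06-01/P8D'."""
    start, duration = value.split("/", 1)
    return span_from_start_and_duration(start, duration)
```

Under these assumptions both encodings resolve to the same pair of dates, which is the point the note makes.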

In providing a ‘regularized’ form, no claim is made that the form in the source text is incorrect; the regularized form is simply that chosen as the main form for purposes of unifying variant forms under a single heading.

Appendix A.3.5 att.datable.w3c

att.datable.w3c provides attributes for normalization of elements that contain datable events conforming to the W3C XML Schema Part 2: Datatypes Second Edition. [3.6.4. Dates and Times 13.4. Dates]
Moduletei — Formal specification
Membersatt.datable[affiliation application birth change date death education event funder idno licence meeting name occupation orgName persName placeName relation resp sex state time title]
Attributes
whensupplies the value of the date or time in a standard form, e.g. yyyy-mm-dd.
StatusOptional
Datatypeteidata.temporal.w3c
Examples of W3C date, time, and date & time formats.
<p>  <date when="1945-10-24">24 Oct 45</date>  <date when="1996-09-24T07:25:00Z">September 24th, 1996 at 3:25 in the morning</date>  <time when="1999-01-04T20:42:00-05:00">Jan 4 1999 at 8 pm</time> @@ -3749,11 +3766,11 @@  <date when="2006">MMVI</date>  <date when="0056">AD 56</date>  <date when="-0056">56 BC</date> -</p>
This list begins in +</p>
This list begins in the year 1632, more precisely on Trinity Sunday, i.e. the Sunday after Pentecost, in that year the <date calendar="#julian" - when="1632-06-06">27th of May (old style)</date>.
<opener>when="1632-06-06">27th of May (old style)</date>.
<opener>  <dateline>   <placeName>Dorchester, Village,</placeName>   <date when="1828-03-02">March 2d. 1828.</date> @@ -3772,26 +3789,26 @@ <sch:rule context="tei:*[@to]"> <sch:report test="@notAfter"  role="nonfatal">The @to and @notAfter attributes cannot be used together.</sch:report> -</sch:rule>
Example
<date from="1863-05-28" to="1863-06-01">28 May through 1 June 1863</date>
Note

The value of these attributes should be a normalized representation of the date, time, or combined date & time intended, in any of the standard formats specified by XML Schema Part 2: Datatypes Second Edition, using the Gregorian calendar.

The most commonly-encountered format for the date portion of a temporal attribute is yyyy-mm-dd, but yyyy, --mm, ---dd, yyyy-mm, or --mm-dd may also be used. For the time part, the form hh:mm:ss is used.

Note that this format does not currently permit use of the value 0000 to represent the year 1 BCE; instead the value -0001 should be used.

Appendix A.3.6 att.datcat

att.datcat provides attributes that are used to align XML elements or attributes with the appropriate Data Categories (DCs) defined by an external taxonomy, in this way establishing the identity of information containers and values, and providing means of interpreting them. [9.5.2. Lexical View 18.3. Other Atomic Feature Values]
Moduletei — Formal specification
Membersatt.segLike[pc s seg w] tagUsage
Attributes
datcatprovides a pointer to a definition of, and/or general information about, (a) an information container (element or attribute) or (b) a value of an information container (element content or attribute value), by referencing an external taxonomy or ontology. If valueDatcat is present in the immediate context, this attribute takes on role (a), while valueDatcat performs role (b).
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
valueDatcatprovides a definition of, and/or general information about a value of an information container (element content or attribute value), by reference to an external taxonomy or ontology. Used especially where a contrast with datcat is needed.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
targetDatcatprovides a definition of, and/or general information about, information structure of an object referenced or modeled by the containing element, by reference to an external taxonomy or ontology. This attribute has the characteristics of the datcat attribute, except that it addresses not its containing element, but an object that is being referenced or modeled by its containing element.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
ExampleThe example below presents the TEI encoding of the name-value pair <part of speech, common noun>, where the name (key) ‘part of speech’ is abbreviated as ‘POS’, and the value, ‘common noun’ is symbolized by ‘NN’. The entire name-value pair is encoded by means of the element <f>. In TEI XML, that element acts as the container, labeled with the name attribute. Its contents may be complex or simple. In the case at hand, the content is the symbol ‘NN’.The datcat attribute relates the feature name (i.e., the key) to the data category ‘part of speech’, while the attribute valueDatcat relates the feature value to the data category common noun. Both these data categories should be defined in an external and preferably open reference taxonomy or ontology.
<fs> +</sch:rule>
Example
<date from="1863-05-28" to="1863-06-01">28 May through 1 June 1863</date>
Note

The value of these attributes should be a normalized representation of the date, time, or combined date & time intended, in any of the standard formats specified by XML Schema Part 2: Datatypes Second Edition, using the Gregorian calendar.

The most commonly-encountered format for the date portion of a temporal attribute is yyyy-mm-dd, but yyyy, --mm, ---dd, yyyy-mm, or --mm-dd may also be used. For the time part, the form hh:mm:ss is used.

Note that this format does not currently permit use of the value 0000 to represent the year 1 BCE; instead the value -0001 should be used.
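As a minimal illustration of the year rule above, the following sketch formats a BCE year for use in a when attribute. The function name `w3c_year_for_bce` is hypothetical; the mapping (1 BCE becomes -0001, 56 BC becomes -0056) follows the note and the `when="-0056"` example in this section.

```python
def w3c_year_for_bce(year_bce: int) -> str:
    """Format a BCE year as a W3C-style year, e.g. 1 BCE -> '-0001'.

    Assumption: there is no year 0000 in this format, so the numeric part
    is simply the BCE year, zero-padded to at least four digits.
    """
    if year_bce < 1:
        raise ValueError("BCE years start at 1")
    return f"-{year_bce:04d}"
```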

Appendix A.3.6 att.datcat

att.datcat provides attributes that are used to align XML elements or attributes with the appropriate Data Categories (DCs) defined by an external taxonomy, in this way establishing the identity of information containers and values, and providing means of interpreting them. [9.5.2. Lexical View 18.3. Other Atomic Feature Values]
Moduletei — Formal specification
Membersatt.segLike[pc s seg w] tagUsage
Attributes
datcatprovides a pointer to a definition of, and/or general information about, (a) an information container (element or attribute) or (b) a value of an information container (element content or attribute value), by referencing an external taxonomy or ontology. If valueDatcat is present in the immediate context, this attribute takes on role (a), while valueDatcat performs role (b).
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
valueDatcatprovides a definition of, and/or general information about a value of an information container (element content or attribute value), by reference to an external taxonomy or ontology. Used especially where a contrast with datcat is needed.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
targetDatcatprovides a definition of, and/or general information about, information structure of an object referenced or modeled by the containing element, by reference to an external taxonomy or ontology. This attribute has the characteristics of the datcat attribute, except that it addresses not its containing element, but an object that is being referenced or modeled by its containing element.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
ExampleThe example below presents the TEI encoding of the name-value pair <part of speech, common noun>, where the name (key) ‘part of speech’ is abbreviated as ‘POS’, and the value, ‘common noun’ is symbolized by ‘NN’. The entire name-value pair is encoded by means of the element <f>. In TEI XML, that element acts as the container, labeled with the name attribute. Its contents may be complex or simple. In the case at hand, the content is the symbol ‘NN’.The datcat attribute relates the feature name (i.e., the key) to the data category ‘part of speech’, while the attribute valueDatcat relates the feature value to the data category common noun. Both these data categories should be defined in an external and preferably open reference taxonomy or ontology.
<fs>  <f name="POS"   datcat="http://hdl.handle.net/11459/CCR_C-396_5a972b93-2294-ab5c-a541-7c344c5f26c3">   <symbol valueDatcat="http://hdl.handle.net/11459/CCR_C-1256_7ec6083c-23d4-224d-6f94-eecbe6861545"    value="NN"/>  </f> <!-- ... --> -</fs>
‘NN’ is the symbol for common noun used e.g. in the CLAWS-7 tagset defined by the University Centre for Computer Corpus Research on Language at the University of Lancaster. The very same data category used for tagging an early version of the British National Corpus, and coming from the BNC Basic (C5) tagset, uses the symbol ‘NN0’ (rather than ‘NN’). Making these values semantically interoperable would be extremely difficult without a human expert if they were not anchored in a single point of an established reference taxonomy of morphosyntactic data categories. In the case at hand, the string http://hdl.handle.net/11459/CCR_C-1256_7ec6083c-23d4-224d-6f94-eecbe6861545 is both a persistent identifier of the data category in question, as well as a pointer to a shared definition of common noun.While the symbols ‘NN’, ‘NN0’, and many others (often coming from languages other than English) are implicitly members of the container category ‘part of speech’, it is sometimes useful not to rely on such an implicit relationship but rather use an explicit identifier for that data category, to distinguish it from other morphosyntactic data categories, such as gender, tense, etc. For that purpose, the above example uses the datcat attribute to reference a definition of part of speech. The reference taxonomy in this example is the CLARIN Concept Registry.If the feature structure markup exemplified above is to be repeated many times in a single document, it is much more efficient to gather the persistent identifiers in a single place and to only reference them, implicitly or directly, from feature structure markup. The following example is much more concise than the one above and relies on the concepts of feature structure declaration and feature value library, discussed in chapter [[undefined FS]].
<fs> +</fs>
‘NN’ is the symbol for common noun used e.g. in the CLAWS-7 tagset defined by the University Centre for Computer Corpus Research on Language at the University of Lancaster. The very same data category used for tagging an early version of the British National Corpus, and coming from the BNC Basic (C5) tagset, uses the symbol ‘NN0’ (rather than ‘NN’). Making these values semantically interoperable would be extremely difficult without a human expert if they were not anchored in a single point of an established reference taxonomy of morphosyntactic data categories. In the case at hand, the string http://hdl.handle.net/11459/CCR_C-1256_7ec6083c-23d4-224d-6f94-eecbe6861545 is both a persistent identifier of the data category in question, as well as a pointer to a shared definition of common noun.While the symbols ‘NN’, ‘NN0’, and many others (often coming from languages other than English) are implicitly members of the container category ‘part of speech’, it is sometimes useful not to rely on such an implicit relationship but rather use an explicit identifier for that data category, to distinguish it from other morphosyntactic data categories, such as gender, tense, etc. For that purpose, the above example uses the datcat attribute to reference a definition of part of speech. The reference taxonomy in this example is the CLARIN Concept Registry.If the feature structure markup exemplified above is to be repeated many times in a single document, it is much more efficient to gather the persistent identifiers in a single place and to only reference them, implicitly or directly, from feature structure markup. The following example is much more concise than the one above and relies on the concepts of feature structure declaration and feature value library, discussed in chapter [[undefined FS]].
<fs>  <f name="POS" fVal="#commonNoun"/> <!-- ... --> -</fs>
The assumption here is that the relevant feature values are collected in a place that the annotation document in question has access to — preferably, a single document per linguistic resource, for example an <fsdDecl> that is XIncluded as a sibling of <text> or a child of <encodingDesc>; a <taxonomy> available resource-wide (e.g., in a shared header) is also an option.The example below presents an <fvLib> element that collects the relevant feature values (most of them omitted). At the same time, this example shows one way of encoding a tagset, i.e., an established inventory of values of (in the case at hand) morphosyntactic categories.
<fvLib n="POS values"> +</fs>
The assumption here is that the relevant feature values are collected in a place that the annotation document in question has access to — preferably, a single document per linguistic resource, for example an <fsdDecl> that is XIncluded as a sibling of <text> or a child of <encodingDesc>; a <taxonomy> available resource-wide (e.g., in a shared header) is also an option.The example below presents an <fvLib> element that collects the relevant feature values (most of them omitted). At the same time, this example shows one way of encoding a tagset, i.e., an established inventory of values of (in the case at hand) morphosyntactic categories.
<fvLib n="POS values">  <symbol xml:id="commonNoun" value="NN"   datcat="http://hdl.handle.net/11459/CCR_C-396_5a972b93-2294-ab5c-a541-7c344c5f26c3"/>  <symbol xml:id="properNoun" value="NP"   datcat="http://hdl.handle.net/11459/CCR_C-1371_fbebd9ec-a7f4-9a36-d6e9-88ee16b944ae"/> <!-- ... --> -</fvLib>
Note that these Guidelines do not prescribe a specific choice between datcat and valueDatcat in such cases. The former is the generic way of referencing a data category, whereas the latter is more specific, in that it references a data category that represents a value. The choice between them comes into play where a single element — or a tight element complex, such as the <f>/<symbol> complex illustrated above — make it necessary or useful to distinguish between the container data category and its value.
ExampleIn the context of dictionaries designed with semantic interoperability in mind, the following example ensures that the <pos> element is interpreted as the same information container as in the case of the example of <f name="POS"> above.
<gramGrp> +</fvLib>
Note that these Guidelines do not prescribe a specific choice between datcat and valueDatcat in such cases. The former is the generic way of referencing a data category, whereas the latter is more specific, in that it references a data category that represents a value. The choice between them comes into play where a single element — or a tight element complex, such as the <f>/<symbol> complex illustrated above — make it necessary or useful to distinguish between the container data category and its value.
ExampleIn the context of dictionaries designed with semantic interoperability in mind, the following example ensures that the <pos> element is interpreted as the same information container as in the case of the example of <f name="POS"> above.
<gramGrp>  <pos datcat="http://hdl.handle.net/11459/CCR_C-396_5a972b93-2294-ab5c-a541-7c344c5f26c3"   valueDatcat="http://hdl.handle.net/11459/CCR_C-1256_7ec6083c-23d4-224d-6f94-eecbe6861545">NN</pos> -</gramGrp>
Efficiency of this type of interoperable markup demands that the references to the particular data categories should best be provided in a single place within the dictionary (or a single place within the project), rather than being repeated inside every entry. For the container elements, this can be achieved at the level of <tagUsage>, although here, the valueDatcat attribute should be used, because it is not the <tagUsage> element that is associated with the relevant data category, but rather the element <pos> (or <case>, etc.) that is described by <tagUsage>:
<tagsDecl partial="true"> +</gramGrp>
Efficiency of this type of interoperable markup demands that the references to the particular data categories should best be provided in a single place within the dictionary (or a single place within the project), rather than being repeated inside every entry. For the container elements, this can be achieved at the level of <tagUsage>, although here, the valueDatcat attribute should be used, because it is not the <tagUsage> element that is associated with the relevant data category, but rather the element <pos> (or <case>, etc.) that is described by <tagUsage>:
<tagsDecl partial="true"> <!-- ... -->  <namespace name="http://www.tei-c.org/ns/1.0">   <tagUsage gi="pos" @@ -3800,7 +3817,7 @@    targetDatcat="http://hdl.handle.net/11459/CCR_C-1840_9f4e319c-f233-6c90-9117-7270e215f039">Contains information about the grammatical case that the described form is inflected for.</tagUsage> <!-- ... -->  </namespace> -</tagsDecl>
Another possibility is to shorten the URIs by means of the <prefixDef> mechanism, as illustrated below:
<listPrefixDef> +</tagsDecl>
Another possibility is to shorten the URIs by means of the <prefixDef> mechanism, as illustrated below:
<listPrefixDef>  <prefixDef ident="ccr" matchPattern="pos"   replacementPattern="http://hdl.handle.net/11459/CCR_C-396_5a972b93-2294-ab5c-a541-7c344c5f26c3"/>  <prefixDef ident="ccr" matchPattern="adj" @@ -3817,7 +3834,7 @@    valueDatcat="ccr:adj">adj</pos>  </gramGrp> <!--...--> -</entry>
This mechanism creates implications that are not always wanted, among others, in the case at hand, suggesting that the identifiers ‘pos’ and ‘adj’ belong to a namespace associated with the CLARIN Concept Repository (CCR), whereas that is solely a shorthand mechanism whose scope is the current resource. Documenting this clearly in the header of the dictionary is therefore advised. Yet another possibility is to associate the information about the relationship between a TEI markup element and the data category that it is intended to model already at the level of modeling the dictionary resource, that is, at the level of the ODD, in an <equiv> element that is a child of <elementSpec> or <attDef>.
ExampleThe targetDatcat attribute is designed to be used in, e.g., feature structure declarations, and is analogous to the targetLang attribute of the att.pointing class, in that it describes the object that is being referenced, rather than the referencing object.
<fDecl name="POS" +</entry>
This mechanism creates implications that are not always wanted, among others, in the case at hand, suggesting that the identifiers ‘pos’ and ‘adj’ belong to a namespace associated with the CLARIN Concept Repository (CCR), whereas that is solely a shorthand mechanism whose scope is the current resource. Documenting this clearly in the header of the dictionary is therefore advised. Yet another possibility is to associate the information about the relationship between a TEI markup element and the data category that it is intended to model already at the level of modeling the dictionary resource, that is, at the level of the ODD, in an <equiv> element that is a child of <elementSpec> or <attDef>.
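How a processor might resolve the ccr: shorthand can be sketched as follows. This is an illustrative sketch, not a TEI reference implementation: the table `PREFIX_DEFS` and the function `expand_pointer` are hypothetical, only the ‘pos’ definition shown in full above is included, and the patterns are assumed to contain no capture groups.

```python
import re

# Hypothetical in-memory copy of <prefixDef> entries:
# (ident, matchPattern, replacementPattern).
PREFIX_DEFS = [
    ("ccr", "pos",
     "http://hdl.handle.net/11459/CCR_C-396_5a972b93-2294-ab5c-a541-7c344c5f26c3"),
]


def expand_pointer(pointer: str) -> str:
    """Expand 'prefix:local' using PREFIX_DEFS; return the input if nothing matches."""
    prefix, sep, local = pointer.partition(":")
    if not sep:
        return pointer
    for ident, match_pattern, replacement in PREFIX_DEFS:
        # Assumption: the pattern must match the whole local part.
        if ident == prefix and re.fullmatch(match_pattern, local):
            return replacement
    return pointer
```

A pointer whose prefix has no matching definition is returned unchanged, which reflects that the shorthand is scoped to the definitions available in the current resource.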
ExampleThe targetDatcat attribute is designed to be used in, e.g., feature structure declarations, and is analogous to the targetLang attribute of the att.pointing class, in that it describes the object that is being referenced, rather than the referencing object.
<fDecl name="POS"  targetDatcat="http://hdl.handle.net/11459/CCR_C-396_5a972b93-2294-ab5c-a541-7c344c5f26c3">  <fDescr>part of speech (morphosyntactic category)</fDescr>  <vRange> @@ -3829,11 +3846,11 @@ <!-- ... -->   </vAlt>  </vRange> -</fDecl>
Above, the <fDecl> uses targetDatcat, because if it were to use datcat, it would be asserting that it is an instance of the container data category part of speech, whereas it is not — it models a container (<f>) that encodes a part of speech. Note also that it is the <f> that is modeled above, not its values, which are used as direct references to data categories; hence the use of datcat in the <symbol> element.
Note

The TEI Abstract Model can be expressed as a hierarchy of attribute-value matrices (AVMs) of various types and of various levels of complexity, nested or grouped in various ways. At the most abstract level, an AVM consists of an information container and the value (contents) of that container.

A simple example of an XML serialization of such structures is, on the one hand, the opening and closing tags that delimit and name the container, and, on the other, the content enclosed by the two tags that constitutes the value. An analogous example is an attribute name and the value of that attribute.

In a TEI XML example of two equivalent serializations expressing the name-value pair <part-of-speech,common-noun>, namely <pos>commonNoun</pos> and pos="common-noun", one would classify the element <pos> and the attribute pos as containers (mapping onto the first member of the relevant name-value pair), while the character data content of <pos> or the value of pos would be seen as mapping onto the second member of the pair.

The att.datcat class provides means of addressing the containers and their values, while at the same time providing a way to interpret them in the context of external taxonomies or ontologies. Aligning e.g. both the <pos> element and the pos attribute with the same value of an external reference point (i.e., an entry in an agreed taxonomy) affirms the identity of the concept serialised by both the element container and the attribute container, and optionally provides a definition of that concept (in the case at hand, the concept part of speech).

The value of the att.datcat attributes should be a PID (persistent identifier) that points to a specific — and, ideally, shared — taxonomy or ontology. Among the resources that can, to a lesser or greater extent, be used as inventories of (more or less) standardized linguistic categories are the GOLD ontology, CLARIN CCR, OLiA, or TermWeb's DatCatInfo, and also the Universal Dependencies inventory, on the assumption that its URIs are going to persist. It is imaginable that a project may choose to address a local taxonomy store instead, but this risks losing the advantage of interchangeability with other projects.

Historically, datcat and valueDatcat originate from the (now obsolete) ISO 12620:2009 standard, describing the data model and procedures for a Data Category Registry (DCR). The current version of that standard, ISO 12620-1, does not standardize the serialization of pointers, merely mentioning the TEI att.datcat as an example.

Note that no constraint prevents the occurrence of a combination of att.datcat attributes: the <fDecl> element, which is a natural bearer of the targetDatcat attribute, is an instance of a specific modeling element, and, in principle, could be semantically fixed by an appropriate reference taxonomy of modeling devices.

Appendix A.3.7 att.declarable

att.declarable provides attributes for those elements in the TEI header which may be independently selected by means of the special purpose decls attribute. [15.3. Associating Contextual Information with a Text]
Moduletei — Formal specification
Membersavailability bibl correction editorialDecl equipment equipment hyphenation langUsage listEvent listOrg listPerson normalization particDesc projectDesc quotation recording segmentation settingDesc sourceDesc textClass
Attributes
defaultindicates whether or not this element is selected by default when its parent is selected.
StatusOptional
Datatypeteidata.truthValue
Legal values are:
true
This element is selected if its parent is selected
false
This element can only be selected explicitly, unless it is the only one of its kind, in which case it is selected if its parent is selected.[Default]
Note

The rules governing the association of declarable elements with individual parts of a TEI text are fully defined in chapter 15.3. Associating Contextual Information with a Text. Only one element of a particular type may have a default attribute with a value of true.

Appendix A.3.8 att.duration

att.duration provides attributes for normalization of elements that contain datable events.
Modulespoken — Formal specification
Membersatt.timed[gap incident kinesic media u vocal] date recording time
Attributesatt.duration.w3c (@dur) att.duration.iso (@dur-iso)
Note

This ‘superclass’ provides attributes that can be used to provide normalized values of temporal information. By default, the attributes from the att.duration.w3c class are provided. If the module for names & dates is loaded, this class also provides attributes from the att.duration.iso class. In general, the possible values of attributes restricted to the W3C datatypes form a subset of those values available via the ISO 8601 standard. However, the greater expressiveness of the ISO datatypes is rarely needed, and there exists much greater software support for the W3C datatypes.

Appendix A.3.9 att.duration.iso

att.duration.iso provides attributes for recording normalized temporal durations. [3.6.4. Dates and Times 13.4. Dates]
Moduletei — Formal specification
Membersatt.duration[att.timed[gap incident kinesic media u vocal] date recording time]
Attributes
dur-iso(duration) indicates the length of this element in time.
StatusOptional
Datatypeteidata.duration.iso
Note

If both when and dur or dur-iso are specified, the values should be interpreted as indicating a span of time by its starting time (or date) and duration. In order to represent a time range by a duration and its ending time the when-iso attribute must be used.

In providing a ‘regularized’ form, no claim is made that the form in the source text is incorrect; the regularized form is simply that chosen as the main form for purposes of unifying variant forms under a single heading.

Appendix A.3.10 att.duration.w3c

att.duration.w3c provides attributes for recording normalized temporal durations. [3.6.4. Dates and Times 13.4. Dates]
Moduletei — Formal specification
Membersatt.duration[att.timed[gap incident kinesic media u vocal] date recording time]
Attributes
dur(duration) indicates the length of this element in time.
StatusOptional
Datatypeteidata.duration.w3c
Note

If both when and dur are specified, the values should be interpreted as indicating a span of time by its starting time (or date) and duration. In order to represent a time range by a duration and its ending time the when-iso attribute must be used.

In providing a ‘regularized’ form, no claim is made that the form in the source text is incorrect; the regularized form is simply that chosen as the main form for purposes of unifying variant forms under a single heading.

Appendix A.3.11 att.fragmentable

att.fragmentable provides attributes for representing fragmentation of a structural element, typically as a consequence of some overlapping hierarchy.
Moduletei — Formal specification
Membersatt.divLike[div] att.segLike[pc s seg w] p
Attributes
partspecifies whether or not its parent element is fragmented in some way, typically by some other overlapping structure: for example a speech which is divided between two or more verse stanzas, a paragraph which is split across a page division, a verse line which is divided between two speakers.
StatusOptional
Datatypeteidata.enumerated
Legal values are:
Y
(yes) the element is fragmented in some (unspecified) respect
N
(no) the element is not fragmented, or no claim is made as to its completeness[Default]
I
(initial) this is the initial part of a fragmented element
M
(medial) this is a medial part of a fragmented element
F
(final) this is the final part of a fragmented element
Note

The values I, M, or F should be used only where it is clear how the element may be reconstituted.

Appendix A.3.12 att.global

att.global provides attributes common to all elements in the TEI encoding scheme. [1.3.1.1. Global Attributes]
Moduletei — Formal specification
MembersTEI addName affiliation appInfo application availability bibl birth body catDesc catRef category change classDecl correction date death desc div edition editionStmt editorialDecl education email encodingDesc equipment equipment event extent figure fileDesc forename funder gap graphic head hyphenation idno incident kinesic label langUsage language licence link linkGrp listEvent listOrg listPerson listPrefixDef listRelation measure media meeting name nameLink namespace normalization note num occupation org orgName p particDesc pb pc persName person placeName prefixDef profileDesc projectDesc pubPlace publicationStmt publisher quotation recording recordingStmt ref relation resp respStmt revisionDesc roleName s seg segmentation setting settingDesc sex sourceDesc state surname tagUsage tagsDecl taxonomy teiCorpus teiHeader term text textClass time title titleStmt u unit vocal w
Attributesatt.global.rendition (@rend, @style, @rendition) att.global.linking (@corresp, @synch, @next, @prev) att.global.analytic (@ana) att.global.responsibility (@resp) att.global.source (@source)
xml:id(identifier) provides a unique identifier for the element bearing the attribute.
StatusOptional
DatatypeID
Note

The xml:id attribute may be used to specify a canonical reference for an element; see section 3.11. Reference Systems.

n(number) gives a number (or other label) for an element, which is not necessarily unique within the document.
StatusOptional
Datatypeteidata.text
Note

The value of this attribute is always understood to be a single token, even if it contains space or other punctuation characters, and need not be composed of numbers only. It is typically used to specify the numbering of chapters, sections, list items, etc.; it may also be used in the specification of a standard reference system for the text.

xml:lang(language) indicates the language of the element content using a ‘tag’ generated according to BCP 47.
StatusOptional
Datatypeteidata.language
<p> … The consequences of +</fDecl>
Above, the <fDecl> uses targetDatcat, because if it were to use datcat, it would be asserting that it is an instance of the container data category part of speech, whereas it is not — it models a container (<f>) that encodes a part of speech. Note also that it is the <f> that is modeled above, not its values, which are used as direct references to data categories; hence the use of datcat in the <symbol> element.
Note

The TEI Abstract Model can be expressed as a hierarchy of attribute-value matrices (AVMs) of various types and of various levels of complexity, nested or grouped in various ways. At the most abstract level, an AVM consists of an information container and the value (contents) of that container.

A simple example of an XML serialization of such structures is, on the one hand, the opening and closing tags that delimit and name the container, and, on the other, the content enclosed by the two tags that constitutes the value. An analogous example is an attribute name and the value of that attribute.

In a TEI XML example of two equivalent serializations expressing the name-value pair <part-of-speech,common-noun>, namely <pos>commonNoun</pos> and pos="common-noun", one would classify the element <pos> and the attribute pos as containers (mapping onto the first member of the relevant name-value pair), while the character data content of <pos> or the value of pos would be seen as mapping onto the second member of the pair.
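
The equivalence of the two serializations can be illustrated with a minimal Python sketch. It is only an illustration: the wrapper element <w> and the helper name name_value_pairs are assumptions made for this example and are not part of the schema or of any ParlaMint script.

```python
import xml.etree.ElementTree as ET

def name_value_pairs(xml_text):
    """Collect (container, value) pairs from attributes and from leaf elements."""
    root = ET.fromstring(xml_text)
    pairs = []
    for el in root.iter():
        # Attribute serialization: the attribute name is the container.
        for name, value in el.attrib.items():
            pairs.append((name, value))
        # Element serialization: a leaf element's tag is the container.
        if len(el) == 0 and el.text and el.text.strip():
            pairs.append((el.tag, el.text.strip()))
    return pairs

# Hypothetical wrapper <w> around the two serializations from the text.
as_element = name_value_pairs("<w><pos>commonNoun</pos></w>")
as_attribute = name_value_pairs('<w pos="common-noun"/>')
```

Both calls yield a single pair whose first member is pos; only the spelling of the value differs, which is the gap that a shared datcat reference is meant to close.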

The att.datcat class provides means of addressing the containers and their values, while at the same time providing a way to interpret them in the context of external taxonomies or ontologies. Aligning e.g. both the <pos> element and the pos attribute with the same value of an external reference point (i.e., an entry in an agreed taxonomy) affirms the identity of the concept serialised by both the element container and the attribute container, and optionally provides a definition of that concept (in the case at hand, the concept part of speech).

The value of the att.datcat attributes should be a PID (persistent identifier) that points to a specific — and, ideally, shared — taxonomy or ontology. Among the resources that can, to a lesser or greater extent, be used as inventories of (more or less) standardized linguistic categories are the GOLD ontology, CLARIN CCR, OLiA, or TermWeb's DatCatInfo, and also the Universal Dependencies inventory, on the assumption that its URIs are going to persist. It is imaginable that a project may choose to address a local taxonomy store instead, but this risks losing the advantage of interchangeability with other projects.

Historically, datcat and valueDatcat originate from the (now obsolete) ISO 12620:2009 standard, describing the data model and procedures for a Data Category Registry (DCR). The current version of that standard, ISO 12620-1, does not standardize the serialization of pointers, merely mentioning the TEI att.datcat as an example.

Note that no constraint prevents the occurrence of a combination of att.datcat attributes: the <fDecl> element, which is a natural bearer of the targetDatcat attribute, is an instance of a specific modeling element, and, in principle, could be semantically fixed by an appropriate reference taxonomy of modeling devices.

Appendix A.3.7 att.declarable

att.declarable provides attributes for those elements in the TEI header which may be independently selected by means of the special purpose decls attribute. [15.3. Associating Contextual Information with a Text]
Moduletei — Formal specification
Membersavailability bibl correction editorialDecl equipment equipment hyphenation langUsage listEvent listOrg listPerson normalization particDesc projectDesc quotation recording segmentation settingDesc sourceDesc textClass
Attributes
defaultindicates whether or not this element is selected by default when its parent is selected.
StatusOptional
Datatypeteidata.truthValue
Legal values are:
true
This element is selected if its parent is selected
false
This element can only be selected explicitly, unless it is the only one of its kind, in which case it is selected if its parent is selected.[Default]
Note

The rules governing the association of declarable elements with individual parts of a TEI text are fully defined in chapter 15.3. Associating Contextual Information with a Text. Only one element of a particular type may have a default attribute with a value of true.
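
The selection rule described by the legal values above can be sketched in Python as follows. This is a simplified reading, not the full procedure of chapter 15.3: it assumes elements without a namespace, and the function name default_selection and the n labels are hypothetical.

```python
import xml.etree.ElementTree as ET

def default_selection(parent):
    """Return the children selected implicitly when their parent is selected."""
    groups = {}
    for child in parent:
        groups.setdefault(child.tag, []).append(child)
    selected = []
    for siblings in groups.values():
        if len(siblings) == 1:
            # The only element of its kind is selected with its parent.
            selected.append(siblings[0])
            continue
        flagged = [el for el in siblings if el.get("default") == "true"]
        if len(flagged) > 1:
            raise ValueError("only one element of a kind may have default='true'")
        selected.extend(flagged)
    return selected

# Hypothetical header fragment, written without the TEI namespace for brevity.
header = ET.fromstring(
    '<editorialDecl>'
    '<normalization n="a"/>'
    '<normalization n="b" default="true"/>'
    '<hyphenation n="h"/>'
    '</editorialDecl>'
)
chosen = [el.get("n") for el in default_selection(header)]
```

Here chosen holds the labels b (flagged as default) and h (the only <hyphenation>), while a would have to be selected explicitly.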

Appendix A.3.8 att.duration

att.duration provides attributes for normalization of elements that contain datable events.
Modulespoken — Formal specification
Membersatt.timed[gap incident kinesic media u vocal] date recording time
Attributesatt.duration.w3c (@dur) att.duration.iso (@dur-iso)
Note

This ‘superclass’ provides attributes that can be used to provide normalized values of temporal information. By default, the attributes from the att.duration.w3c class are provided. If the module for names & dates is loaded, this class also provides attributes from the att.duration.iso class. In general, the possible values of attributes restricted to the W3C datatypes form a subset of those values available via the ISO 8601 standard. However, the greater expressiveness of the ISO datatypes is rarely needed, and there exists much greater software support for the W3C datatypes.

Appendix A.3.9 att.duration.iso

att.duration.iso provides attributes for recording normalized temporal durations. [3.6.4. Dates and Times 13.4. Dates]
Moduletei — Formal specification
Membersatt.duration[att.timed[gap incident kinesic media u vocal] date recording time]
Attributes
dur-iso(duration) indicates the length of this element in time.
StatusOptional
Datatypeteidata.duration.iso
Note

If both when and dur or dur-iso are specified, the values should be interpreted as indicating a span of time by its starting time (or date) and duration. In order to represent a time range by a duration and its ending time the when-iso attribute must be used.

In providing a ‘regularized’ form, no claim is made that the form in the source text is incorrect; the regularized form is simply that chosen as the main form for purposes of unifying variant forms under a single heading.
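
As a rough illustration of how a starting time and a duration combine into a span, the following Python sketch adds a duration to a when value. It assumes a deliberately small subset of the duration syntax (days, hours, minutes and seconds only; no years, months or weeks), and the names parse_duration and span are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Deliberately loose pattern for the subset PnDTnHnMnS.
_DURATION = re.compile(
    r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?)?$"
)

def parse_duration(text):
    """Convert a duration such as 'PT1H30M' into a timedelta."""
    match = _DURATION.match(text)
    if match is None or text in ("P", "PT"):
        raise ValueError(f"unsupported duration: {text!r}")
    days, hours, minutes, seconds = match.groups()
    return timedelta(
        days=int(days or 0),
        hours=int(hours or 0),
        minutes=int(minutes or 0),
        seconds=float(seconds or 0),
    )

def span(when, dur):
    """Interpret when + dur as (start, end) of a span of time."""
    start = datetime.fromisoformat(when)
    return start, start + parse_duration(dur)

start, end = span("2023-08-22T13:53:48", "PT1H30M")
```

A production implementation would need a complete duration parser; this sketch only shows the interpretation of the attribute pair.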

Appendix A.3.10 att.duration.w3c

att.duration.w3c provides attributes for recording normalized temporal durations. [3.6.4. Dates and Times 13.4. Dates]
Moduletei — Formal specification
Membersatt.duration[att.timed[gap incident kinesic media u vocal] date recording time]
Attributes
dur(duration) indicates the length of this element in time.
StatusOptional
Datatypeteidata.duration.w3c
Note

If both when and dur are specified, the values should be interpreted as indicating a span of time by its starting time (or date) and duration. In order to represent a time range by a duration and its ending time the when-iso attribute must be used.

In providing a ‘regularized’ form, no claim is made that the form in the source text is incorrect; the regularized form is simply that chosen as the main form for purposes of unifying variant forms under a single heading.

Appendix A.3.11 att.fragmentable

att.fragmentable provides attributes for representing fragmentation of a structural element, typically as a consequence of some overlapping hierarchy.
Moduletei — Formal specification
Membersatt.divLike[div] att.segLike[pc s seg w] p
Attributes
partspecifies whether or not its parent element is fragmented in some way, typically by some other overlapping structure: for example a speech which is divided between two or more verse stanzas, a paragraph which is split across a page division, a verse line which is divided between two speakers.
StatusOptional
Datatypeteidata.enumerated
Legal values are:
Y
(yes) the element is fragmented in some (unspecified) respect
N
(no) the element is not fragmented, or no claim is made as to its completeness[Default]
I
(initial) this is the initial part of a fragmented element
M
(medial) this is a medial part of a fragmented element
F
(final) this is the final part of a fragmented element
Note

The values I, M, or F should be used only where it is clear how the element may be reconstituted.
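
One way to reconstitute fragmented elements from the I, M and F values is sketched below in Python. It assumes the fragments of one element appear in document order without interleaving, which the attribute itself does not guarantee; the function name reconstitute and the sample text are hypothetical.

```python
import xml.etree.ElementTree as ET

def reconstitute(elements):
    """Join I/M/F fragments into whole texts; pass other elements through."""
    whole, pending = [], None
    for el in elements:
        part = el.get("part", "N")  # N is the default value
        text = "".join(el.itertext()).strip()
        if part == "I":
            pending = [text]
        elif part == "M" and pending is not None:
            pending.append(text)
        elif part == "F" and pending is not None:
            pending.append(text)
            whole.append(" ".join(pending))
            pending = None
        else:
            # N, Y, or a fragment without a preceding I: keep as a unit.
            whole.append(text)
    return whole

# Hypothetical sample, written without the TEI namespace for brevity.
body = ET.fromstring(
    '<body>'
    '<p part="I">Honourable members,</p>'
    '<p part="F">we resume.</p>'
    '<p>Thank you.</p>'
    '</body>'
)
paragraphs = reconstitute(list(body))
```

The first two <p> elements are merged into one paragraph, and the third, which carries no part attribute, is kept as it is.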

Appendix A.3.12 att.global

att.global provides attributes common to all elements in the TEI encoding scheme. [1.3.1.1. Global Attributes]
Moduletei — Formal specification
MembersTEI addName affiliation appInfo application availability bibl birth body catDesc catRef category change classDecl correction date death desc div edition editionStmt editorialDecl education email encodingDesc equipment equipment event extent figure fileDesc forename funder gap graphic head hyphenation idno incident kinesic label langUsage language licence link linkGrp listEvent listOrg listPerson listPrefixDef listRelation measure media meeting name nameLink namespace normalization note num occupation org orgName p particDesc pb pc persName person placeName prefixDef profileDesc projectDesc pubPlace publicationStmt publisher quotation recording recordingStmt ref relation resp respStmt revisionDesc roleName s seg segmentation setting settingDesc sex sourceDesc state surname tagUsage tagsDecl taxonomy teiCorpus teiHeader term text textClass time title titleStmt u unit vocal w
Attributesatt.global.rendition (@rend, @style, @rendition) att.global.linking (@corresp, @synch, @next, @prev) att.global.analytic (@ana) att.global.responsibility (@resp) att.global.source (@source)
xml:id(identifier) provides a unique identifier for the element bearing the attribute.
StatusOptional
DatatypeID
Note

The xml:id attribute may be used to specify a canonical reference for an element; see section 3.11. Reference Systems.

n(number) gives a number (or other label) for an element, which is not necessarily unique within the document.
StatusOptional
Datatypeteidata.text
Note

The value of this attribute is always understood to be a single token, even if it contains space or other punctuation characters, and need not be composed of numbers only. It is typically used to specify the numbering of chapters, sections, list items, etc.; it may also be used in the specification of a standard reference system for the text.

xml:lang(language) indicates the language of the element content using a ‘tag’ generated according to BCP 47.
StatusOptional
Datatypeteidata.language
<p> … The consequences of this rapid depopulation were the loss of the last <foreign xml:lang="rap">ariki</foreign> or chief (Routledge 1920:205,210) and their connections to - ancestral territorial organization.</p>
Note

The xml:lang value will be inherited from the immediately enclosing element, or from its parent, and so on up the document hierarchy. It is generally good practice to specify xml:lang at the highest appropriate level, noticing that a different default may be needed for the <teiHeader> from that needed for the associated resource element or elements, and that a single TEI document may contain texts in many languages.

Only attributes with free text values (rare in these guidelines) will be in the scope of xml:lang.

The authoritative list of registered language subtags is maintained by IANA and is available at http://www.iana.org/assignments/language-subtag-registry. For a good general overview of the construction of language tags, see https://www.w3.org/International/articles/language-tags/, and for a practical step-by-step guide, see https://www.w3.org/International/questions/qa-choosing-language-tags.en.php.

The value used must conform with BCP 47. If the value is a private use code (i.e., starts with x- or contains -x-), a <language> element with a matching value for its ident attribute should be supplied in the TEI header to document this value. Such documentation may also optionally be supplied for non-private-use codes, though these must remain consistent with their IETF (Internet Engineering Task Force) definitions.

xml:baseprovides a base URI reference with which applications can resolve relative URI references into absolute URI references.
StatusOptional
Datatypeteidata.pointer
<div type="bibl"> + ancestral territorial organization.</p>
Note

The xml:lang value will be inherited from the immediately enclosing element, or from its parent, and so on up the document hierarchy. It is generally good practice to specify xml:lang at the highest appropriate level, noticing that a different default may be needed for the <teiHeader> from that needed for the associated resource element or elements, and that a single TEI document may contain texts in many languages.

Only attributes with free text values (rare in these guidelines) will be in the scope of xml:lang.

The authoritative list of registered language subtags is maintained by IANA and is available at http://www.iana.org/assignments/language-subtag-registry. For a good general overview of the construction of language tags, see https://www.w3.org/International/articles/language-tags/, and for a practical step-by-step guide, see https://www.w3.org/International/questions/qa-choosing-language-tags.en.php.

The value used must conform with BCP 47. If the value is a private use code (i.e., starts with x- or contains -x-), a <language> element with a matching value for its ident attribute should be supplied in the TEI header to document this value. Such documentation may also optionally be supplied for non-private-use codes, though these must remain consistent with their IETF (Internet Engineering Task Force) definitions.
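
The inheritance of xml:lang described in this note can be made concrete with a short Python sketch that computes the effective language of every element. The sample paragraph is shortened from the example above; the <hi> element, the xml:lang="en" on <p>, and the function name effective_lang are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def effective_lang(root):
    """Map each element to its own xml:lang or the nearest ancestor's."""
    result = {}

    def walk(el, inherited):
        lang = el.get(XML_LANG, inherited)
        result[el] = lang
        for child in el:
            walk(child, lang)

    walk(root, None)
    return result

# Hypothetical fragment, written without the TEI namespace for brevity.
sample = ET.fromstring(
    '<p xml:lang="en">the loss of the last '
    '<foreign xml:lang="rap">ariki</foreign> or <hi>chief</hi></p>'
)
langs = {el.tag: lang for el, lang in effective_lang(sample).items()}
```

The <foreign> element overrides the inherited value with rap, while <hi> carries no xml:lang of its own and therefore inherits en from <p>.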

xml:baseprovides a base URI reference with which applications can resolve relative URI references into absolute URI references.
StatusOptional
Datatypeteidata.pointer
<div type="bibl">  <head>Selections from <title level="m">The Collected Letters of Robert Southey. Part 1: 1791-1797</title>  </head>  <listBibl xml:base="https://romantic-circles.org/sites/default/files/imported/editions/southey_letters/XML/"> @@ -3854,7 +3871,7 @@    </ref>   </bibl>  </listBibl> -</div>
xml:spacesignals an intention about how white space should be managed by applications.
StatusOptional
Datatypeteidata.enumerated
Legal values are:
default
signals that the application's default white-space processing modes are acceptable
preserve
indicates the intent that applications preserve all white space
Note

The XML specification provides further guidance on the use of this attribute. Note that many parsers may not handle xml:space correctly.

Appendix A.3.13 att.global.analytic

att.global.analytic provides additional global attributes for associating specific analyses or interpretations with appropriate portions of a text. [17.2. Global Attributes for Simple Analyses 17.3. Spans and Interpretations]
Moduleanalysis — Formal specification
Membersatt.global[TEI addName affiliation appInfo application availability bibl birth body catDesc catRef category change classDecl correction date death desc div edition editionStmt editorialDecl education email encodingDesc equipment equipment event extent figure fileDesc forename funder gap graphic head hyphenation idno incident kinesic label langUsage language licence link linkGrp listEvent listOrg listPerson listPrefixDef listRelation measure media meeting name nameLink namespace normalization note num occupation org orgName p particDesc pb pc persName person placeName prefixDef profileDesc projectDesc pubPlace publicationStmt publisher quotation recording recordingStmt ref relation resp respStmt revisionDesc roleName s seg segmentation setting settingDesc sex sourceDesc state surname tagUsage tagsDecl taxonomy teiCorpus teiHeader term text textClass time title titleStmt u unit vocal w]
Attributes
ana(analysis) indicates one or more elements containing interpretations of the element on which the ana attribute appears.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
Note

When multiple values are given, they may reflect either multiple divergent interpretations of an ambiguous text, or multiple mutually consistent interpretations of the same passage in different contexts.

Appendix A.3.14 att.global.linking

att.global.linking provides a set of attributes for hypertextual linking. [16. Linking, Segmentation, and Alignment]
Modulelinking — Formal specification
Membersatt.global[TEI addName affiliation appInfo application availability bibl birth body catDesc catRef category change classDecl correction date death desc div edition editionStmt editorialDecl education email encodingDesc equipment equipment event extent figure fileDesc forename funder gap graphic head hyphenation idno incident kinesic label langUsage language licence link linkGrp listEvent listOrg listPerson listPrefixDef listRelation measure media meeting name nameLink namespace normalization note num occupation org orgName p particDesc pb pc persName person placeName prefixDef profileDesc projectDesc pubPlace publicationStmt publisher quotation recording recordingStmt ref relation resp respStmt revisionDesc roleName s seg segmentation setting settingDesc sex sourceDesc state surname tagUsage tagsDecl taxonomy teiCorpus teiHeader term text textClass time title titleStmt u unit vocal w]
Attributes
corresp(corresponds) points to elements that correspond to the current element in some way.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
<group> +</div>
xml:spacesignals an intention about how white space should be managed by applications.
StatusOptional
Datatypeteidata.enumerated
Legal values are:
default
signals that the application's default white-space processing modes are acceptable
preserve
indicates the intent that applications preserve all white space
Note

The XML specification provides further guidance on the use of this attribute. Note that many parsers may not handle xml:space correctly.

Appendix A.3.13 att.global.analytic

att.global.analytic provides additional global attributes for associating specific analyses or interpretations with appropriate portions of a text. [17.2. Global Attributes for Simple Analyses 17.3. Spans and Interpretations]
Moduleanalysis — Formal specification
Membersatt.global[TEI addName affiliation appInfo application availability bibl birth body catDesc catRef category change classDecl correction date death desc div edition editionStmt editorialDecl education email encodingDesc equipment equipment event extent figure fileDesc forename funder gap graphic head hyphenation idno incident kinesic label langUsage language licence link linkGrp listEvent listOrg listPerson listPrefixDef listRelation measure media meeting name nameLink namespace normalization note num occupation org orgName p particDesc pb pc persName person placeName prefixDef profileDesc projectDesc pubPlace publicationStmt publisher quotation recording recordingStmt ref relation resp respStmt revisionDesc roleName s seg segmentation setting settingDesc sex sourceDesc state surname tagUsage tagsDecl taxonomy teiCorpus teiHeader term text textClass time title titleStmt u unit vocal w]
Attributes
ana(analysis) indicates one or more elements containing interpretations of the element on which the ana attribute appears.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
Note

When multiple values are given, they may reflect either multiple divergent interpretations of an ambiguous text, or multiple mutually consistent interpretations of the same passage in different contexts.

Appendix A.3.14 att.global.linking

att.global.linking provides a set of attributes for hypertextual linking. [16. Linking, Segmentation, and Alignment]
Modulelinking — Formal specification
Membersatt.global[TEI addName affiliation appInfo application availability bibl birth body catDesc catRef category change classDecl correction date death desc div edition editionStmt editorialDecl education email encodingDesc equipment equipment event extent figure fileDesc forename funder gap graphic head hyphenation idno incident kinesic label langUsage language licence link linkGrp listEvent listOrg listPerson listPrefixDef listRelation measure media meeting name nameLink namespace normalization note num occupation org orgName p particDesc pb pc persName person placeName prefixDef profileDesc projectDesc pubPlace publicationStmt publisher quotation recording recordingStmt ref relation resp respStmt revisionDesc roleName s seg segmentation setting settingDesc sex sourceDesc state surname tagUsage tagsDecl taxonomy teiCorpus teiHeader term text textClass time title titleStmt u unit vocal w]
Attributes
corresp(corresponds) points to elements that correspond to the current element in some way.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
<group>  <text xml:id="t1-g1-t1"   xml:lang="mi">   <body xml:id="t1-g1-t1-body1"> @@ -3874,7 +3891,7 @@    </div>   </body>  </text> -</group>
In this example a <group> contains two <text>s, each containing the same document in a different language. The correspondence is indicated using corresp. The language is indicated using xml:lang, whose value is inherited; both the tag with the corresp and the tag pointed to by the corresp inherit the value from their immediate parent.
+</group>
In this example a <group> contains two <text>s, each containing the same document in a different language. The correspondence is indicated using corresp. The language is indicated using xml:lang, whose value is inherited; both the tag with the corresp and the tag pointed to by the corresp inherit the value from their immediate parent.
<!-- In a placeography called "places.xml" --><place xml:id="LOND1"  corresp="people.xml#LOND2 people.xml#GENI1">  <placeName>London</placeName> @@ -3896,15 +3913,15 @@      allegorical character in mayoral shows.   </p>  </note> -</person>
In this example, a <place> element containing information about the city of London is linked with two <person> elements in a literary personography. This correspondence represents a slightly looser relationship than the one in the preceding example; there is no sense in which an allegorical character could be substituted for the physical city, or vice versa, but there is obviously a correspondence between them.
synch(synchronous) points to elements that are synchronous with the current element.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
nextpoints to the next element of a virtual aggregate of which the current element is part.
StatusOptional
Datatypeteidata.pointer
Note

It is recommended that the element indicated be of the same type as the element bearing this attribute.

prev(previous) points to the previous element of a virtual aggregate of which the current element is part.
StatusOptional
Datatypeteidata.pointer
Note

It is recommended that the element indicated be of the same type as the element bearing this attribute.

Appendix A.3.15 att.global.rendition

att.global.rendition provides rendering attributes common to all elements in the TEI encoding scheme. [1.3.1.1.3. Rendition Indicators]
Moduletei — Formal specification
Membersatt.global[TEI addName affiliation appInfo application availability bibl birth body catDesc catRef category change classDecl correction date death desc div edition editionStmt editorialDecl education email encodingDesc equipment equipment event extent figure fileDesc forename funder gap graphic head hyphenation idno incident kinesic label langUsage language licence link linkGrp listEvent listOrg listPerson listPrefixDef listRelation measure media meeting name nameLink namespace normalization note num occupation org orgName p particDesc pb pc persName person placeName prefixDef profileDesc projectDesc pubPlace publicationStmt publisher quotation recording recordingStmt ref relation resp respStmt revisionDesc roleName s seg segmentation setting settingDesc sex sourceDesc state surname tagUsage tagsDecl taxonomy teiCorpus teiHeader term text textClass time title titleStmt u unit vocal w]
Attributes
rend(rendition) indicates how the element in question was rendered or presented in the source text.
StatusOptional
Datatype1–∞ occurrences of teidata.word separated by whitespace
<head rend="align(center) case(allcaps)"> +</person>
In this example, a <place> element containing information about the city of London is linked with two <person> elements in a literary personography. This correspondence represents a slightly looser relationship than the one in the preceding example; there is no sense in which an allegorical character could be substituted for the physical city, or vice versa, but there is obviously a correspondence between them.
synch(synchronous) points to elements that are synchronous with the current element.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
nextpoints to the next element of a virtual aggregate of which the current element is part.
StatusOptional
Datatypeteidata.pointer
Note

It is recommended that the element indicated be of the same type as the element bearing this attribute.

prev(previous) points to the previous element of a virtual aggregate of which the current element is part.
StatusOptional
Datatypeteidata.pointer
Note

It is recommended that the element indicated be of the same type as the element bearing this attribute.

Appendix A.3.15 att.global.rendition

att.global.rendition provides rendering attributes common to all elements in the TEI encoding scheme. [1.3.1.1.3. Rendition Indicators]
Moduletei — Formal specification
Membersatt.global[TEI addName affiliation appInfo application availability bibl birth body catDesc catRef category change classDecl correction date death desc div edition editionStmt editorialDecl education email encodingDesc equipment equipment event extent figure fileDesc forename funder gap graphic head hyphenation idno incident kinesic label langUsage language licence link linkGrp listEvent listOrg listPerson listPrefixDef listRelation measure media meeting name nameLink namespace normalization note num occupation org orgName p particDesc pb pc persName person placeName prefixDef profileDesc projectDesc pubPlace publicationStmt publisher quotation recording recordingStmt ref relation resp respStmt revisionDesc roleName s seg segmentation setting settingDesc sex sourceDesc state surname tagUsage tagsDecl taxonomy teiCorpus teiHeader term text textClass time title titleStmt u unit vocal w]
Attributes
rend(rendition) indicates how the element in question was rendered or presented in the source text.
StatusOptional
Datatype1–∞ occurrences of teidata.word separated by whitespace
<head rend="align(center) case(allcaps)">  <lb/>To The <lb/>Duchesse <lb/>of <lb/>Newcastle, <lb/>On Her <lb/>  <hi rend="case(mixed)">New Blazing-World</hi>. -</head>
Note

These Guidelines make no binding recommendations for the values of the rend attribute; the characteristics of visual presentation vary too much from text to text and the decision to record or ignore individual characteristics varies too much from project to project. Some potentially useful conventions are noted from time to time at appropriate points in the Guidelines. The values of the rend attribute are a set of sequence-indeterminate individual tokens separated by whitespace.

stylecontains an expression in some formal style definition language which defines the rendering or presentation used for this element in the source text
StatusOptional
Datatypeteidata.text
<head style="text-align: center; font-variant: small-caps"> +</head>
Note

These Guidelines make no binding recommendations for the values of the rend attribute; the characteristics of visual presentation vary too much from text to text and the decision to record or ignore individual characteristics varies too much from project to project. Some potentially useful conventions are noted from time to time at appropriate points in the Guidelines. The values of the rend attribute are a set of sequence-indeterminate individual tokens separated by whitespace.

stylecontains an expression in some formal style definition language which defines the rendering or presentation used for this element in the source text
StatusOptional
Datatypeteidata.text
<head style="text-align: center; font-variant: small-caps">  <lb/>To The <lb/>Duchesse <lb/>of <lb/>Newcastle, <lb/>On Her <lb/>  <hi style="font-variant: normal">New Blazing-World</hi>. -</head>
Note

Unlike the attribute values of rend, which use whitespace as a separator, the style attribute may contain whitespace. This attribute is intended for recording inline stylistic information concerning the source, not any particular output.

The formal language in which values for this attribute are expressed may be specified using the <styleDefDecl> element in the TEI header.

If style and rendition are both present on an element, then style overrides or complements rendition. style should not be used in conjunction with rend, because the latter does not employ a formal style definition language.

renditionpoints to a description of the rendering or presentation used for this element in the source text.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
<head rendition="#ac #sc"> +</head>
Note

Unlike the attribute values of rend, which use whitespace as a separator, the style attribute may contain whitespace. This attribute is intended for recording inline stylistic information concerning the source, not any particular output.

The formal language in which values for this attribute are expressed may be specified using the <styleDefDecl> element in the TEI header.

If style and rendition are both present on an element, then style overrides or complements rendition. style should not be used in conjunction with rend, because the latter does not employ a formal style definition language.

renditionpoints to a description of the rendering or presentation used for this element in the source text.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
<head rendition="#ac #sc">  <lb/>To The <lb/>Duchesse <lb/>of <lb/>Newcastle, <lb/>On Her <lb/>  <hi rendition="#normal">New Blazing-World</hi>. @@ -3915,11 +3932,11 @@ <rendition xml:id="normal"  scheme="css">font-variant: normal</rendition> <rendition xml:id="ac" - scheme="css">text-align: center</rendition>
Note

The rendition attribute is used in a very similar way to the class attribute defined for XHTML but with the important distinction that its function is to describe the appearance of the source text, not necessarily to determine how that text should be presented on screen or paper.

If rendition is used to refer to a style definition in a formal language like CSS, it is recommended that it not be used in conjunction with rend. Where both rendition and rend are supplied, the latter is understood to override or complement the former.

Each URI provided should indicate a <rendition> element defining the intended rendition in terms of some appropriate style language, as indicated by the scheme attribute.

Appendix A.3.16 att.global.responsibility

att.global.responsibility provides attributes indicating the agent responsible for some aspect of the text, the markup or something asserted by the markup, and the degree of certainty associated with it. [1.3.1.1.4. Sources, certainty, and responsibility 3.5. Simple Editorial Changes 11.3.2.2. Hand, Responsibility, and Certainty Attributes 17.3. Spans and Interpretations 13.1.1. Linking Names and Their Referents]
Moduletei — Formal specification
Membersatt.global[TEI addName affiliation appInfo application availability bibl birth body catDesc catRef category change classDecl correction date death desc div edition editionStmt editorialDecl education email encodingDesc equipment equipment event extent figure fileDesc forename funder gap graphic head hyphenation idno incident kinesic label langUsage language licence link linkGrp listEvent listOrg listPerson listPrefixDef listRelation measure media meeting name nameLink namespace normalization note num occupation org orgName p particDesc pb pc persName person placeName prefixDef profileDesc projectDesc pubPlace publicationStmt publisher quotation recording recordingStmt ref relation resp respStmt revisionDesc roleName s seg segmentation setting settingDesc sex sourceDesc state surname tagUsage tagsDecl taxonomy teiCorpus teiHeader term text textClass time title titleStmt u unit vocal w]
Attributes
resp(responsible party) indicates the agency responsible for the intervention or interpretation, for example an editor or transcriber.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
Note

To reduce the ambiguity of a resp pointing directly to a person or organization, we recommend that resp be used to point not to an agent (<person> or <org>) but to a <respStmt>, <author>, <editor> or similar element which clarifies the exact role played by the agent. Pointing to multiple <respStmt>s allows the encoder to specify clearly each of the roles played in part of a TEI file (creating, transcribing, encoding, editing, proofing etc.).

Example
Blessed are the + scheme="css">text-align: center</rendition>
Note

The rendition attribute is used in a very similar way to the class attribute defined for XHTML but with the important distinction that its function is to describe the appearance of the source text, not necessarily to determine how that text should be presented on screen or paper.

If rendition is used to refer to a style definition in a formal language like CSS, it is recommended that it not be used in conjunction with rend. Where both rendition and rend are supplied, the latter is understood to override or complement the former.

Each URI provided should indicate a <rendition> element defining the intended rendition in terms of some appropriate style language, as indicated by the scheme attribute.

Appendix A.3.16 att.global.responsibility

att.global.responsibility provides attributes indicating the agent responsible for some aspect of the text, the markup or something asserted by the markup, and the degree of certainty associated with it. [1.3.1.1.4. Sources, certainty, and responsibility 3.5. Simple Editorial Changes 11.3.2.2. Hand, Responsibility, and Certainty Attributes 17.3. Spans and Interpretations 13.1.1. Linking Names and Their Referents]
Moduletei — Formal specification
Membersatt.global[TEI addName affiliation appInfo application availability bibl birth body catDesc catRef category change classDecl correction date death desc div edition editionStmt editorialDecl education email encodingDesc equipment equipment event extent figure fileDesc forename funder gap graphic head hyphenation idno incident kinesic label langUsage language licence link linkGrp listEvent listOrg listPerson listPrefixDef listRelation measure media meeting name nameLink namespace normalization note num occupation org orgName p particDesc pb pc persName person placeName prefixDef profileDesc projectDesc pubPlace publicationStmt publisher quotation recording recordingStmt ref relation resp respStmt revisionDesc roleName s seg segmentation setting settingDesc sex sourceDesc state surname tagUsage tagsDecl taxonomy teiCorpus teiHeader term text textClass time title titleStmt u unit vocal w]
Attributes
resp(responsible party) indicates the agency responsible for the intervention or interpretation, for example an editor or transcriber.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
Note

To reduce the ambiguity of a resp pointing directly to a person or organization, we recommend that resp be used to point not to an agent (<person> or <org>) but to a <respStmt>, <author>, <editor> or similar element which clarifies the exact role played by the agent. Pointing to multiple <respStmt>s allows the encoder to specify clearly each of the roles played in part of a TEI file (creating, transcribing, encoding, editing, proofing etc.).

Example
Blessed are the <choice>  <sic>cheesemakers</sic>  <corr resp="#editor" cert="high">peacemakers</corr> -</choice>: for they shall be called the children of God.
Example
+</choice>: for they shall be called the children of God.
Example
<!-- in the <text> ... --><lg> <!-- ... -->  <l>Punkes, Panders, baſe extortionizing @@ -3944,11 +3961,11 @@ <sch:value-of select="name(.)"/>), the @source attribute should have only 1 value. (This one has <sch:value-of select="count($srcs)"/>.) </sch:report> -</sch:rule>
Note

The source attribute points to an external source. When used on an element describing a schema component (<classRef>, <dataRef>, <elementRef>, <macroRef>, <moduleRef>, or <schemaSpec>), it identifies the source from which declarations for the components should be obtained.

On other elements it provides a pointer to the bibliographical source from which a quotation or citation is drawn.

In either case, the location may be provided using any form of URI, for example an absolute URI, a relative URI, a private scheme URI of the form tei:x.y.z, where x.y.z indicates the version number, e.g. tei:4.3.2 for TEI P5 release 4.3.2 or (as a special case) tei:current for whatever is the latest release, or a private scheme URI that is expanded to an absolute URI as documented in a <prefixDef>.

When used on elements describing schema components, source should have only one value; when used on other elements multiple values are permitted.

Example
<p> +</sch:rule>
Note

The source attribute points to an external source. When used on an element describing a schema component (<classRef>, <dataRef>, <elementRef>, <macroRef>, <moduleRef>, or <schemaSpec>), it identifies the source from which declarations for the components should be obtained.

On other elements it provides a pointer to the bibliographical source from which a quotation or citation is drawn.

In either case, the location may be provided using any form of URI, for example an absolute URI, a relative URI, a private scheme URI of the form tei:x.y.z, where x.y.z indicates the version number, e.g. tei:4.3.2 for TEI P5 release 4.3.2 or (as a special case) tei:current for whatever is the latest release, or a private scheme URI that is expanded to an absolute URI as documented in a <prefixDef>.

When used on elements describing schema components, source should have only one value; when used on other elements multiple values are permitted.

Example
<p> <!-- ... --> As Willard McCarty (<bibl xml:id="mcc_2012">2012, p.2</bibl>) tells us, <quote source="#mcc_2012">‘Collaboration’ is a problematic and should be a contested    term.</quote> <!-- ... --> -</p>
Example
<p> +</p>
Example
<p> <!-- ... -->  <quote source="#chicago_15_ed">Grammatical theories are in flux, and the more we learn, the    less we seem to know.</quote> @@ -3960,36 +3977,36 @@ <edition>15th edition</edition>. <pubPlace>Chicago</pubPlace>: <publisher>University of    Chicago Press</publisher> (<date>2003</date>), <biblScope unit="page">p.147</biblScope>. -</bibl>
Example
<elementRef key="p" source="tei:2.0.1"/>
Include in the schema an element named <p> available from the TEI P5 2.0.1 release.
Example
<schemaSpec ident="myODD" +</bibl>
Example
<elementRef key="p" source="tei:2.0.1"/>
Include in the schema an element named <p> available from the TEI P5 2.0.1 release.
Example
<schemaSpec ident="myODD"  source="mycompiledODD.xml"> <!-- further declarations specifying the components required --> -</schemaSpec>
Create a schema using components taken from the file mycompiledODD.xml.

Appendix A.3.18 att.internetMedia

att.internetMedia provides attributes for specifying the type of a computer resource using a standard taxonomy.
Moduletei — Formal specification
Membersatt.media[graphic media] ref
Attributes
mimeType(MIME media type) specifies the applicable multimedia internet mail extension (MIME) media type
StatusOptional
Datatype1–∞ occurrences of teidata.word separated by whitespace
ExampleIn this example mimeType is used to indicate that the URL points to a TEI XML file encoded in UTF-8.
<ref mimeType="application/tei+xml; charset=UTF-8" - target="https://raw.githubusercontent.com/TEIC/TEI/dev/P5/Source/guidelines-en.xml"/>
Note

This attribute class provides an attribute for describing a computer resource, typically available over the internet, using a value taken from a standard taxonomy. At present only a single taxonomy is supported, the Multipurpose Internet Mail Extensions (MIME) Media Type system. This typology of media types is defined by the Internet Engineering Task Force in RFC 2046. The list of types is maintained by the Internet Assigned Numbers Authority (IANA). The mimeType attribute must have a value taken from this list.

Appendix A.3.19 att.lexicographic.normalized

att.lexicographic.normalized provides attributes for usage within word-level elements in the analysis module and within lexicographic microstructure in the dictionaries module.
Moduleanalysis — Formal specification
Membersatt.linguistic[pc w]
Attributes
norm(normalized) provides the normalized/standardized form of information present in the source text in a non-normalized form
StatusOptional
Datatypeteidata.text
Normalization of part-of-speech information within a dictionary entry.
<gramGrp> +</schemaSpec>
Create a schema using components taken from the file mycompiledODD.xml.

Appendix A.3.18 att.internetMedia

att.internetMedia provides attributes for specifying the type of a computer resource using a standard taxonomy.
Moduletei — Formal specification
Membersatt.media[graphic media] ref
Attributes
mimeType(MIME media type) specifies the applicable multimedia internet mail extension (MIME) media type
StatusOptional
Datatype1–∞ occurrences of teidata.word separated by whitespace
ExampleIn this example mimeType is used to indicate that the URL points to a TEI XML file encoded in UTF-8.
<ref mimeType="application/tei+xml; charset=UTF-8" + target="https://raw.githubusercontent.com/TEIC/TEI/dev/P5/Source/guidelines-en.xml"/>
Note

This attribute class provides an attribute for describing a computer resource, typically available over the internet, using a value taken from a standard taxonomy. At present only a single taxonomy is supported, the Multipurpose Internet Mail Extensions (MIME) Media Type system. This typology of media types is defined by the Internet Engineering Task Force in RFC 2046. The list of types is maintained by the Internet Assigned Numbers Authority (IANA). The mimeType attribute must have a value taken from this list.

Appendix A.3.19 att.lexicographic.normalized

att.lexicographic.normalized provides attributes for usage within word-level elements in the analysis module and within lexicographic microstructure in the dictionaries module.
Moduleanalysis — Formal specification
Membersatt.linguistic[pc w]
Attributes
norm(normalized) provides the normalized/standardized form of information present in the source text in a non-normalized form
StatusOptional
Datatypeteidata.text
Normalization of part-of-speech information within a dictionary entry.
<gramGrp>  <pos norm="noun">n</pos> -</gramGrp>
Normalization of a source form in a tokenized historical corpus.
<s> +</gramGrp>
Normalization of a source form in a tokenized historical corpus.
<s>  <w>for</w>  <w norm="virtue's">vertues</w>  <w>sake</w> -</s>
<s> +</s>
<s>  <w norm="persuasion">perswasion</w>  <w>of</w>  <w norm="Unity">Vnitie</w> -</s>
Example of normalization from Aviso. Relation oder Zeitung. Wolfenbüttel, 1609. In: Deutsches Textarchiv.
<s> +</s>
Example of normalization from Aviso. Relation oder Zeitung. Wolfenbüttel, 1609. In: Deutsches Textarchiv.
<s>  <w norm="freiwillig">freywillig</w>  <pc norm=","   join="left">/</pc>  <w norm="unbedrängt">vnbedraͤngt</w>  <w norm="und">vnd</w>  <w norm="unverhindert">vnuerhindert</w> -</s>
<w norm="Teil">Theyll</w>
<w norm="Freude">Frewde</w>
Note

It needs to be stressed that the two attributes in this class are meant for strictly lexicographic and linguistic uses, and not for editorial interventions. For the latter, the mechanism based on <choice>, <orig>, and <reg> needs to be employed.

Appendix A.3.20 att.linguistic

att.linguistic provides a set of attributes concerning linguistic features of tokens, for usage within token-level elements, specifically <w> and <pc> in the analysis module. [17.4.2. Lightweight Linguistic Annotation]
Moduleanalysis — Formal specification
Memberspc w
Attributesatt.lexicographic.normalized (@norm)
lemmaprovides a lemma (base form) for the word, typically uninflected and serving both as an identifier (e.g. in dictionary contexts, as a headword), and as a basis for potential inflections.
StatusOptional
Datatypeteidata.text
<w lemma="wife">wives</w>
<w lemma="Arznei">Artzeneyen</w>
pos(part of speech) indicates the part of speech assigned to a token (i.e. information on whether it is a noun, adjective, or verb), usually according to some official reference vocabulary (e.g. for German: STTS, for English: CLAWS, for Polish: NKJP, etc.).
StatusOptional
Datatypeteidata.text
The German sentence ‘Wir fahren in den Urlaub.’ tagged with the Stuttgart-Tuebingen-Tagset (STTS).
<s> +</s>
<w norm="Teil">Theyll</w>
<w norm="Freude">Frewde</w>
Note

It needs to be stressed that the two attributes in this class are meant for strictly lexicographic and linguistic uses, and not for editorial interventions. For the latter, the mechanism based on <choice>, <orig>, and <reg> needs to be employed.
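
As an illustration of how norm might be consumed, the following minimal sketch reads a tokenized fragment and returns the normalized form of each token, falling back to the source form where norm is absent. The function name `normalized_tokens` is hypothetical, and the sample deliberately omits the TEI namespace to keep the sketch short.

```python
import xml.etree.ElementTree as ET


def normalized_tokens(xml_fragment):
    """Return the norm value of each <w>/<pc>, or its text if norm is absent."""
    root = ET.fromstring(xml_fragment)
    return [
        el.get("norm", el.text or "")
        for el in root.iter()
        if el.tag in ("w", "pc")  # assumption: un-namespaced sample elements
    ]


sample = '<s><w>for</w><w norm="virtue\'s">vertues</w><w>sake</w></s>'
print(normalized_tokens(sample))
```

Real ParlaMint or TEI files declare the TEI namespace, so the tag comparison would need to be adapted accordingly.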

Appendix A.3.20 att.linguistic

att.linguistic provides a set of attributes concerning linguistic features of tokens, for usage within token-level elements, specifically <w> and <pc> in the analysis module. [17.4.2. Lightweight Linguistic Annotation]
Moduleanalysis — Formal specification
Memberspc w
Attributesatt.lexicographic.normalized (@norm)
lemmaprovides a lemma (base form) for the word, typically uninflected and serving both as an identifier (e.g. in dictionary contexts, as a headword), and as a basis for potential inflections.
StatusOptional
Datatypeteidata.text
<w lemma="wife">wives</w>
<w lemma="Arznei">Artzeneyen</w>
pos(part of speech) indicates the part of speech assigned to a token (i.e. information on whether it is a noun, adjective, or verb), usually according to some official reference vocabulary (e.g. for German: STTS, for English: CLAWS, for Polish: NKJP, etc.).
StatusOptional
Datatypeteidata.text
The German sentence ‘Wir fahren in den Urlaub.’ tagged with the Stuttgart-Tuebingen-Tagset (STTS).
<s>  <w pos="PPER">Wir</w>  <w pos="VVFIN">fahren</w>  <w pos="APPR">in</w>  <w pos="ART">den</w>  <w pos="NN">Urlaub</w>  <w pos="$.">.</w> -</s>
The English sentence ‘We're going to Brazil.’ tagged with the CLAWS-5 tagset, arranged inline (with significant whitespace).
<p><w pos="PNP">We</w><w pos="VBB">'re</w> <w pos="VVG">going</w> <w pos="PRP">to</w> <w pos="NP0">Brazil</w><pc pos="PUN">.</pc></p> -        
The English sentence ‘We're going on vacation to Brazil for a month!’ tagged with the CLAWS-7 tagset and arranged sequentially.
<p> +</s>
The English sentence ‘We're going to Brazil.’ tagged with the CLAWS-5 tagset, arranged inline (with significant whitespace).
<p><w pos="PNP">We</w><w pos="VBB">'re</w> <w pos="VVG">going</w> <w pos="PRP">to</w> <w pos="NP0">Brazil</w><pc pos="PUN">.</pc></p> +        
The English sentence ‘We're going on vacation to Brazil for a month!’ tagged with the CLAWS-7 tagset and arranged sequentially.
<p>  <w pos="PPIS2">We</w>  <w pos="VBR">'re</w>  <w pos="VVG">going</w> @@ -4001,7 +4018,7 @@  <w pos="AT1">a</w>  <w pos="NNT1">month</w>  <pc pos="!">!</pc> -</p>
msd(morphosyntactic description) supplies morphosyntactic information for a token, usually according to some official reference vocabulary (e.g. for German: STTS-large tagset; for a feature description system designed as (pragmatically) universal, see Universal Features).
StatusOptional
Datatypeteidata.text
<ab> +</p>
msd(morphosyntactic description) supplies morphosyntactic information for a token, usually according to some official reference vocabulary (e.g. for German: STTS-large tagset; for a feature description system designed as (pragmatically) universal, see Universal Features).
StatusOptional
Datatypeteidata.text
<ab>  <w pos="PPER"   msd="1.Pl.*.Nom">Wir</w>  <w pos="VVFIN" @@ -4014,7 +4031,7 @@   msd="Masc.Akk.Sg">Urlaub</w>  <pc pos="$."   msd="--">.</pc> -</ab>
joinwhen present, provides information on whether the token in question is adjacent to another, and if so, on which side.
StatusOptional
Datatypeteidata.text
Legal values are:
no
(the token is not adjacent to another)
left
(there is no whitespace on the left side of the token)
right
(there is no whitespace on the right side of the token)
both
(there is no whitespace on either side of the token)
overlap
(the token overlaps with another; other devices (specifying the extent and the area of overlap) are needed to more precisely locate this token in the character stream)
The example below assumes that the lack of whitespace is marked redundantly, by using the appropriate values of join.
<s> +</ab>
joinwhen present, provides information on whether the token in question is adjacent to another, and if so, on which side.
StatusOptional
Datatypeteidata.text
Legal values are:
no
(the token is not adjacent to another)
left
(there is no whitespace on the left side of the token)
right
(there is no whitespace on the right side of the token)
both
(there is no whitespace on either side of the token)
overlap
(the token overlaps with another; other devices (specifying the extent and the area of overlap) are needed to more precisely locate this token in the character stream)
The example below assumes that the lack of whitespace is marked redundantly, by using the appropriate values of join.
<s>  <pc join="right">"</pc>  <w join="left">Friends</w>  <w>will</w> @@ -4022,7 +4039,7 @@  <w join="right">friends</w>  <pc join="both">.</pc>  <pc join="left">"</pc> -</s>
Note that a project may make a decision to only indicate lack of whitespace in one direction, or do that non-redundantly. The existing proposal is the broadest possible, on the assumption that we adopt the "streamable view", where all the information on the current element needs to be represented locally.
The English sentence ‘We're going on vacation.’ tagged with the CLAWS-5 tagset, arranged sequentially, tagged on the assumption that only the lack of the preceding whitespace is indicated.
<p> +</s>
Note that a project may make a decision to only indicate lack of whitespace in one direction, or do that non-redundantly. The existing proposal is the broadest possible, on the assumption that we adopt the "streamable view", where all the information on the current element needs to be represented locally.
The English sentence ‘We're going on vacation.’ tagged with the CLAWS-5 tagset, arranged sequentially, tagged on the assumption that only the lack of the preceding whitespace is indicated.
<p>  <w pos="PNP">We</w>  <w pos="VBB"   join="left">'re</w> @@ -4031,15 +4048,15 @@  <w pos="NN1">vacation</w>  <pc pos="PUN"   join="left">.</pc> -</p>
Note

The definition of this attribute is adapted from ISO MAF (Morpho-syntactic Annotation Framework), ISO 24611:2012.

Note

These attributes make it possible to encode simple language corpora and to add a layer of linguistic information to any tokenized resource. See section 17.4.2. Lightweight Linguistic Annotation for discussion.

Appendix A.3.21 att.naming

att.naming provides attributes common to elements which refer to named persons, places, organizations etc. [3.6.1. Referring Strings 13.3.6. Names and Nyms]
Moduletei — Formal specification
Membersatt.personal[addName forename name orgName persName placeName roleName surname] affiliation birth death education event occupation pubPlace state
Attributesatt.canonical (@key, @ref)
rolemay be used to specify further information about the entity referenced by this name in the form of a set of whitespace-separated values, for example the occupation of a person, or the status of a place.
StatusOptional
Datatype1–∞ occurrences of teidata.enumerated separated by whitespace

Appendix A.3.22 att.personal

att.personal (attributes for components of names usually, but not necessarily, personal names) common attributes for those elements which form part of a name usually, but not necessarily, a personal name. [13.2.1. Personal Names]
Moduletei — Formal specification
MembersaddName forename name orgName persName placeName roleName surname
Attributesatt.naming (@role) (att.canonical (@key, @ref))
fullindicates whether the name component is given in full, as an abbreviation or simply as an initial.
StatusOptional
Datatypeteidata.enumerated
Legal values are:
yes
(yes) the name component is spelled out in full.[Default]
abb
(abbreviated) the name component is given in an abbreviated form.
init
(initial letter) the name component is indicated only by one initial.

Appendix A.3.23 att.pointing

att.pointing provides a set of attributes used by all elements which point to other elements by means of one or more URI references. [1.3.1.1.2. Language Indicators 3.7. Simple Links and Cross-References]
Moduletei — Formal specification
Membersatt.pointing.group[linkGrp] catRef licence link note ref term
Attributes
targetspecifies the destination of the reference by supplying one or more URI References
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
Note

One or more syntactically valid URI references, separated by whitespace. Because whitespace is used to separate URIs, no whitespace is permitted inside a single URI. If a whitespace character is required in a URI, it should be escaped with the normal mechanism, e.g. TEI%20Consortium.

Appendix A.3.24 att.ranging

att.ranging provides attributes for describing numerical ranges.
Moduletei — Formal specification
Membersatt.dimensions[birth date death gap state time] measure num
Attributes
atLeastgives a minimum estimated value for the approximate measurement.
StatusOptional
Datatypeteidata.numeric
atMostgives a maximum estimated value for the approximate measurement.
StatusOptional
Datatypeteidata.numeric
minwhere the measurement summarizes more than one observation or a range, supplies the minimum value observed.
StatusOptional
Datatypeteidata.numeric
maxwhere the measurement summarizes more than one observation or a range, supplies the maximum value observed.
StatusOptional
Datatypeteidata.numeric
confidencespecifies the degree of statistical confidence (between zero and one) that a value falls within the range specified by min and max, or the proportion of observed values that fall within that range.
StatusOptional
Datatypeteidata.probability
Example
The MS. was lost in transmission by mail from <del rend="overstrike"> +</p>
Note

The definition of this attribute is adapted from ISO MAF (Morpho-syntactic Annotation Framework), ISO 24611:2012.
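
The join values described above can be used to rebuild running text from a token sequence. The sketch below is one possible reading, not a normative algorithm: it assumes that a space is written between two tokens unless the first is marked right or both, or the second is marked left or both. The values no and overlap are treated as "not glued" here, which is a simplification. The function name `detokenize` and the token list are hypothetical.

```python
def detokenize(tokens):
    """Join (text, join) pairs into a string, honouring the join attribute."""
    parts = []
    for i, (text, join) in enumerate(tokens):
        if i > 0:
            prev_join = tokens[i - 1][1]
            # No whitespace if either neighbour says the boundary is glued.
            glued = prev_join in ("right", "both") or join in ("left", "both")
            if not glued:
                parts.append(" ")
        parts.append(text)
    return "".join(parts)


tokens = [
    ('"', "right"),
    ("Friends", "left"),
    ("will", None),
    ("be", None),
    ("friends", "right"),
    (".", "both"),
    ('"', "left"),
]
print(detokenize(tokens))
```

Because either neighbour can signal the missing whitespace, the same function also works for projects that mark only one direction (for example only join="left").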

Note

These attributes make it possible to encode simple language corpora and to add a layer of linguistic information to any tokenized resource. See section 17.4.2. Lightweight Linguistic Annotation for discussion.

Appendix A.3.21 att.naming

att.naming provides attributes common to elements which refer to named persons, places, organizations etc. [3.6.1. Referring Strings 13.3.6. Names and Nyms]
Moduletei — Formal specification
Membersatt.personal[addName forename name orgName persName placeName roleName surname] affiliation birth death education event occupation pubPlace state
Attributesatt.canonical (@key, @ref)
rolemay be used to specify further information about the entity referenced by this name in the form of a set of whitespace-separated values, for example the occupation of a person, or the status of a place.
StatusOptional
Datatype1–∞ occurrences of teidata.enumerated separated by whitespace

Appendix A.3.22 att.personal

att.personal (attributes for components of names usually, but not necessarily, personal names) common attributes for those elements which form part of a name usually, but not necessarily, a personal name. [13.2.1. Personal Names]
Moduletei — Formal specification
MembersaddName forename name orgName persName placeName roleName surname
Attributesatt.naming (@role) (att.canonical (@key, @ref))
fullindicates whether the name component is given in full, as an abbreviation or simply as an initial.
StatusOptional
Datatypeteidata.enumerated
Legal values are:
yes
(yes) the name component is spelled out in full.[Default]
abb
(abbreviated) the name component is given in an abbreviated form.
init
(initial letter) the name component is indicated only by one initial.

Appendix A.3.23 att.pointing

att.pointing provides a set of attributes used by all elements which point to other elements by means of one or more URI references. [1.3.1.1.2. Language Indicators 3.7. Simple Links and Cross-References]
Moduletei — Formal specification
Membersatt.pointing.group[linkGrp] catRef licence link note ref term
Attributes
targetspecifies the destination of the reference by supplying one or more URI References
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
Note

One or more syntactically valid URI references, separated by whitespace. Because whitespace is used to separate URIs, no whitespace is permitted inside a single URI. If a whitespace character is required in a URI, it should be escaped with the normal mechanism, e.g. TEI%20Consortium.
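
A minimal sketch of the whitespace convention for target follows. The helper names `parse_target` and `build_target` are hypothetical; the sketch only escapes literal spaces and does not perform full URI validation.

```python
def parse_target(value):
    """Split a target attribute value into its individual URI references."""
    return value.split()


def build_target(uris):
    """Serialize URI references, escaping any space inside a single URI."""
    return " ".join(uri.replace(" ", "%20") for uri in uris)


value = build_target(["#p1", "docs/TEI Consortium.xml"])
print(value)
print(parse_target(value))
```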

Appendix A.3.24 att.ranging

att.ranging provides attributes for describing numerical ranges.
Moduletei — Formal specification
Membersatt.dimensions[birth date death gap state time] measure num
Attributes
atLeastgives a minimum estimated value for the approximate measurement.
StatusOptional
Datatypeteidata.numeric
atMostgives a maximum estimated value for the approximate measurement.
StatusOptional
Datatypeteidata.numeric
minwhere the measurement summarizes more than one observation or a range, supplies the minimum value observed.
StatusOptional
Datatypeteidata.numeric
maxwhere the measurement summarizes more than one observation or a range, supplies the maximum value observed.
StatusOptional
Datatypeteidata.numeric
confidencespecifies the degree of statistical confidence (between zero and one) that a value falls within the range specified by min and max, or the proportion of observed values that fall within that range.
StatusOptional
Datatypeteidata.probability
Example
The MS. was lost in transmission by mail from <del rend="overstrike">  <gap reason="illegible"   extent="one or two letters" atLeast="1" atMost="2" unit="chars"/> </del> Philadelphia to the Graphic office, New York. -
Example
Americares has been supporting the health sector in Eastern +
Example
Americares has been supporting the health sector in Eastern Europe since 1986, and since 1992 has provided <measure atLeast="120000000" unit="USD"  commodity="currency">more than $120m</measure> in aid to Ukrainians. -

Appendix A.3.25 att.resourced

att.resourced provides attributes by which a resource (such as an externally held media file) may be located.
Moduletei — Formal specification
Membersgraphic media
Attributes
url(uniform resource locator) specifies the URL from which the media concerned may be obtained.
StatusRequired
Datatypeteidata.pointer

Appendix A.3.26 att.typed

att.typed provides attributes that can be used to classify or subclassify elements in any way. [1.3.1. Attribute Classes 17.1.1. Words and Above 3.6.1. Referring Strings 3.7. Simple Links and Cross-References 3.6.5. Abbreviations and Their Expansions 3.13.1. Core Tags for Verse 7.2.5. Speech Contents 4.1.1. Un-numbered Divisions 4.1.2. Numbered Divisions 4.2.1. Headings and Trailers 4.4. Virtual Divisions 13.3.2.3. Personal Relationships 11.3.1.1. Core Elements for Transcriptional Work 16.1.1. Pointers and Links 16.3. Blocks, Segments, and Anchors 12.2. Linking the Apparatus to the Text 22.5.1.2. Defining Content Models: RELAX NG 8.3. Elements Unique to Spoken Texts 23.3.1.3. Modification of Attribute and Attribute Value Lists]
Moduletei — Formal specification
Membersatt.pointing.group[linkGrp] TEI addName affiliation application bibl birth change date death desc div education event figure forename graphic head idno incident kinesic label link listEvent listOrg listPerson listRelation measure media name nameLink note num occupation org orgName pb pc persName placeName recording ref relation roleName s seg sex state surname teiCorpus term text time title unit vocal w
Attributes
typecharacterizes the element in some sense, using any convenient classification scheme or typology.
StatusOptional
Datatypeteidata.enumerated
<div type="verse"> +

Appendix A.3.25 att.resourced

att.resourced provides attributes by which a resource (such as an externally held media file) may be located.
Moduletei — Formal specification
Membersgraphic media
Attributes
url(uniform resource locator) specifies the URL from which the media concerned may be obtained.
StatusRequired
Datatypeteidata.pointer

Appendix A.3.26 att.typed

att.typed provides attributes that can be used to classify or subclassify elements in any way. [1.3.1. Attribute Classes 17.1.1. Words and Above 3.6.1. Referring Strings 3.7. Simple Links and Cross-References 3.6.5. Abbreviations and Their Expansions 3.13.1. Core Tags for Verse 7.2.5. Speech Contents 4.1.1. Un-numbered Divisions 4.1.2. Numbered Divisions 4.2.1. Headings and Trailers 4.4. Virtual Divisions 13.3.2.3. Personal Relationships 11.3.1.1. Core Elements for Transcriptional Work 16.1.1. Pointers and Links 16.3. Blocks, Segments, and Anchors 12.2. Linking the Apparatus to the Text 22.5.1.2. Defining Content Models: RELAX NG 8.3. Elements Unique to Spoken Texts 23.3.1.3. Modification of Attribute and Attribute Value Lists]
Moduletei — Formal specification
Membersatt.pointing.group[linkGrp] TEI addName affiliation application bibl birth change date death desc div education event figure forename graphic head idno incident kinesic label link listEvent listOrg listPerson listRelation measure media name nameLink note num occupation org orgName pb pc persName placeName recording ref relation roleName s seg sex state surname teiCorpus term text time title unit vocal w
Attributes
typecharacterizes the element in some sense, using any convenient classification scheme or typology.
StatusOptional
Datatypeteidata.enumerated
<div type="verse">  <head>Night in Tarras</head>  <lg type="stanza">   <l>At evening tramping on the hot white road</l> @@ -4052,7 +4069,7 @@ </div>
Note

The type attribute is present on a number of elements, not all of which are members of att.typed, usually because these elements restrict the possible values for the attribute in a specific way.

subtype(subtype) provides a sub-categorization of the element, if needed
StatusOptional
Datatypeteidata.enumerated
Note

The subtype attribute may be used to provide any sub-classification for the element additional to that provided by its type attribute.

Schematron
<sch:rule context="tei:*[@subtype]"> <sch:assert test="@type">The <sch:name/> element should not be categorized in detail with @subtype unless also categorized in general with @type</sch:assert> -</sch:rule>
Note

When appropriate, values from an established typology should be used. Alternatively a typology may be defined in the associated TEI header. If values are to be taken from a project-specific list, this should be defined using the <valList> element in the project-specific schema description, as described in 23.3.1.3. Modification of Attribute and Attribute Value Lists .

Appendix A.4 Datatypes

Appendix A.4.1 teidata.certainty

teidata.certainty defines the range of attribute values expressing a degree of certainty.
Moduletei — Formal specification
Used by
Content model
+</sch:rule>
Note

When appropriate, values from an established typology should be used. Alternatively a typology may be defined in the associated TEI header. If values are to be taken from a project-specific list, this should be defined using the <valList> element in the project-specific schema description, as described in 23.3.1.3. Modification of Attribute and Attribute Value Lists .
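
The Schematron constraint on subtype shown above can be approximated outside a Schematron processor. The sketch below is an assumption-laden equivalent using only the Python standard library: it lists TEI-namespaced elements that carry subtype without type. The function name `subtype_without_type` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def subtype_without_type(xml_text):
    """Return local names of TEI elements that have @subtype but no @type."""
    root = ET.fromstring(xml_text)
    prefix = "{%s}" % TEI_NS
    return [
        el.tag[len(prefix):]
        for el in root.iter()
        if el.tag.startswith(prefix)
        and "subtype" in el.attrib
        and "type" not in el.attrib
    ]


sample = (
    '<div xmlns="http://www.tei-c.org/ns/1.0" type="verse">'
    '<lg subtype="stanza"/>'
    '<lg type="stanza" subtype="quatrain"/>'
    "</div>"
)
print(subtype_without_type(sample))
```

This is only a convenience check; the Schematron rule in the schema remains the authoritative statement of the constraint.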

Appendix A.4 Datatypes

Appendix A.4.1 teidata.certainty

teidata.certainty defines the range of attribute values expressing a degree of certainty.
Moduletei — Formal specification
Used by
Content model
 <content>
  <valList type="closed">
   <valItem ident="high"/>
@@ -4061,29 +4078,29 @@
   <valItem ident="unknown"/>
  </valList>
 </content>
-    
Declaration
-tei_teidata.certainty = "high" | "medium" | "low" | "unknown"
Note

Certainty may be expressed by one of the predefined symbolic values high, medium, or low. The value unknown should be used in cases where the encoder does not wish to assert an opinion about the matter.

Appendix A.4.2 teidata.count

teidata.count defines the range of attribute values used for a non-negative integer value used as a count.
Module tei — Formal specification
Used by
Element:
Content model
+    
Declaration
+tei_teidata.certainty = "high" | "medium" | "low" | "unknown"
Note

Certainty may be expressed by one of the predefined symbolic values high, medium, or low. The value unknown should be used in cases where the encoder does not wish to assert an opinion about the matter.

Appendix A.4.2 teidata.count

teidata.count defines the range of attribute values used for a non-negative integer value used as a count.
Module tei — Formal specification
Used by
Element:
Content model
 <content>
  <dataRef name="nonNegativeInteger"/>
 </content>
-    
Declaration
-tei_teidata.count = xsd:nonNegativeInteger
Note

Any positive integer value or zero is permitted

Appendix A.4.3 teidata.duration.iso

teidata.duration.iso defines the range of attribute values available for representation of a duration in time using ISO 8601 standard formats
Module tei — Formal specification
Used by
Content model
+    
Declaration
+tei_teidata.count = xsd:nonNegativeInteger
Note

Any positive integer value or zero is permitted

Appendix A.4.3 teidata.duration.iso

teidata.duration.iso defines the range of attribute values available for representation of a duration in time using ISO 8601 standard formats
Module tei — Formal specification
Used by
Content model
 <content>
  <dataRef name="token"
   restriction="[0-9.,DHMPRSTWYZ/:+\-]+"/>
 </content>
-    
Declaration
-tei_teidata.duration.iso = token { pattern = "[0-9.,DHMPRSTWYZ/:+\-]+" }
Example
<time dur-iso="PT0,75H">three-quarters of an hour</time>
Example
<date dur-iso="P1,5D">a day and a half</date>
Example
<date dur-iso="P14D">a fortnight</date>
Example
<time dur-iso="PT0.02S">20 ms</time>
Note

A duration is expressed as a sequence of number-letter pairs, preceded by the letter P; the letter gives the unit and may be Y (year), M (month), D (day), H (hour), M (minute), or S (second), in that order. The numbers are all unsigned integers, except for the last, which may have a decimal component (using either . or , as the decimal point; the latter is preferred). If any number is 0, then that number-letter pair may be omitted. If any of the H (hour), M (minute), or S (second) number-letter pairs are present, then the separator T must precede the first ‘time’ number-letter pair.

For complete details, see ISO 8601 Data elements and interchange formats — Information interchange — Representation of dates and times.

Appendix A.4.4 teidata.duration.w3c

teidata.duration.w3c defines the range of attribute values available for representation of a duration in time using W3C datatypes.
Module tei — Formal specification
Used by
Content model
+    
Declaration
+tei_teidata.duration.iso = token { pattern = "[0-9.,DHMPRSTWYZ/:+\-]+" }
Example
<time dur-iso="PT0,75H">three-quarters of an hour</time>
Example
<date dur-iso="P1,5D">a day and a half</date>
Example
<date dur-iso="P14D">a fortnight</date>
Example
<time dur-iso="PT0.02S">20 ms</time>
Note

A duration is expressed as a sequence of number-letter pairs, preceded by the letter P; the letter gives the unit and may be Y (year), M (month), D (day), H (hour), M (minute), or S (second), in that order. The numbers are all unsigned integers, except for the last, which may have a decimal component (using either . or , as the decimal point; the latter is preferred). If any number is 0, then that number-letter pair may be omitted. If any of the H (hour), M (minute), or S (second) number-letter pairs are present, then the separator T must precede the first ‘time’ number-letter pair.

For complete details, see ISO 8601 Data elements and interchange formats — Information interchange — Representation of dates and times.
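
As a rough illustration (not part of the schema), the lexical pattern from the teidata.duration.iso declaration can be tried out in Python. The helper name `is_duration_iso` is an assumption made for this sketch. The pattern only restricts the character set, so it also accepts strings that are not well-formed ISO 8601 durations.

```python
import re

# Character-set pattern copied from the teidata.duration.iso declaration.
DURATION_ISO = re.compile(r"[0-9.,DHMPRSTWYZ/:+\-]+")


def is_duration_iso(value: str) -> bool:
    """Hypothetical helper: True if value matches the lexical pattern."""
    return DURATION_ISO.fullmatch(value) is not None


# The four dur-iso values used in the examples of this datatype.
examples = ["PT0,75H", "P1,5D", "P14D", "PT0.02S"]
results = {value: is_duration_iso(value) for value in examples}
```

A stricter check would need a real ISO 8601 parser; this sketch only mirrors what the schema itself enforces.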

Appendix A.4.4 teidata.duration.w3c

teidata.duration.w3c defines the range of attribute values available for representation of a duration in time using W3C datatypes.
Module tei — Formal specification
Used by
Content model
 <content>
  <dataRef name="duration"/>
 </content>
-    
Declaration
-tei_teidata.duration.w3c = xsd:duration
Example
<time dur="PT45M">forty-five minutes</time>
Example
<date dur="P1DT12H">a day and a half</date>
Example
<date dur="P7D">a week</date>
Example
<time dur="PT0.02S">20 ms</time>
Note

A duration is expressed as a sequence of number-letter pairs, preceded by the letter P; the letter gives the unit and may be Y (year), M (month), D (day), H (hour), M (minute), or S (second), in that order. The numbers are all unsigned integers, except for the S number, which may have a decimal component (using . as the decimal point). If any number is 0, then that number-letter pair may be omitted. If any of the H (hour), M (minute), or S (second) number-letter pairs are present, then the separator T must precede the first ‘time’ number-letter pair.

For complete details, see the W3C specification.

Appendix A.4.5 teidata.enumerated

teidata.enumerated defines the range of attribute values expressed as a single XML name taken from a list of documented possibilities.
Module tei — Formal specification
Used by
Element:
Content model
+    
Declaration
+tei_teidata.duration.w3c = xsd:duration
Example
<time dur="PT45M">forty-five minutes</time>
Example
<date dur="P1DT12H">a day and a half</date>
Example
<date dur="P7D">a week</date>
Example
<time dur="PT0.02S">20 ms</time>
Note

A duration is expressed as a sequence of number-letter pairs, preceded by the letter P; the letter gives the unit and may be Y (year), M (month), D (day), H (hour), M (minute), or S (second), in that order. The numbers are all unsigned integers, except for the S number, which may have a decimal component (using . as the decimal point). If any number is 0, then that number-letter pair may be omitted. If any of the H (hour), M (minute), or S (second) number-letter pairs are present, then the separator T must precede the first ‘time’ number-letter pair.

For complete details, see the W3C specification.

Appendix A.4.5 teidata.enumerated

teidata.enumerated defines the range of attribute values expressed as a single XML name taken from a list of documented possibilities.
Module tei — Formal specification
Used by
Element:
Content model
 <content>
  <dataRef key="teidata.word"/>
 </content>
-    
Declaration
-tei_teidata.enumerated = teidata.word
Note

Attributes using this datatype must contain a single ‘word’ which contains only letters, digits, punctuation characters, or symbols: thus it cannot include whitespace.

Typically, the list of documented possibilities will be provided (or exemplified) by a value list in the associated attribute specification, expressed with a <valList> element.

Appendix A.4.6 teidata.language

teidata.language defines the range of attribute values used to identify a particular combination of human language and writing system. [6.1. Language Identification]
Module tei — Formal specification
Used by
Element:
Content model
+    
Declaration
+tei_teidata.enumerated = teidata.word
Note

Attributes using this datatype must contain a single ‘word’ which contains only letters, digits, punctuation characters, or symbols: thus it cannot include whitespace.

Typically, the list of documented possibilities will be provided (or exemplified) by a value list in the associated attribute specification, expressed with a <valList> element.

Appendix A.4.6 teidata.language

teidata.language defines the range of attribute values used to identify a particular combination of human language and writing system. [6.1. Language Identification]
Module tei — Formal specification
Used by
Element:
Content model
 <content>
  <alternate>
   <dataRef name="language"/>
@@ -4092,13 +4109,13 @@
   </valList>
  </alternate>
 </content>
-    
Declaration
-tei_teidata.language = xsd:language | ( "" )
Note

The values for this attribute are language ‘tags’ as defined in BCP 47. Currently BCP 47 comprises RFC 5646 and RFC 4647; over time, other IETF documents may succeed these as the best current practice.

A ‘language tag’, per BCP 47, is assembled from a sequence of components or subtags separated by the hyphen character (-, U+002D). The tag is made of the following subtags, in the following order. Every subtag except the first is optional. If present, each occurs only once, except the fourth and fifth components (variant and extension), which are repeatable.

language
The IANA-registered code for the language. This is almost always the same as the ISO 639 2-letter language code if there is one. The list of available registered language subtags can be found at http://www.iana.org/assignments/language-subtag-registry. It is recommended that this code be written in lower case.
script
The ISO 15924 code for the script. These codes consist of 4 letters, and it is recommended they be written with an initial capital, the other three letters in lower case. The canonical list of codes is maintained by the Unicode Consortium, and is available at http://unicode.org/iso15924/iso15924-codes.html. The IETF recommends this code be omitted unless it is necessary to make a distinction you need.
region
Either an ISO 3166 country code or a UN M.49 region code that is registered with IANA (not all such codes are registered, e.g. UN codes for economic groupings or codes for countries for which there is already an ISO 3166 2-letter code are not registered). The former consist of 2 letters, and it is recommended they be written in upper case; the list of codes can be searched or browsed at https://www.iso.org/obp/ui/#search/code/. The latter consist of 3 digits; the list of codes can be found at http://unstats.un.org/unsd/methods/m49/m49.htm.
variant
An IANA-registered variation. These codes ‘are used to indicate additional, well-recognized variations that define a language or its dialects that are not covered by other available subtags’.
extension
An extension has the format of a single letter followed by a hyphen followed by additional subtags. These exist to allow for future extension to BCP 47, but as of this writing no such extensions are in use.
private use
An extension that uses the initial subtag of the single letter x (i.e., starts with x-) has no meaning except as negotiated among the parties involved. These should be used with great care, since they interfere with the interoperability that use of RFC 4646 is intended to promote. In order for a document that makes use of these subtags to be TEI-conformant, a corresponding <language> element must be present in the TEI header.

There are two exceptions to the above format. First, there are language tags in the IANA registry that do not match the above syntax, but are present because they have been ‘grandfathered’ from previous specifications.

Second, an entire language tag can consist of only a private use subtag. These tags start with x-, and do not need to follow any further rules established by the IETF and endorsed by these Guidelines. Like all language tags that make use of private use subtags, the language in question must be documented in a corresponding <language> element in the TEI header.

Examples include

sn
Shona
zh-TW
Taiwanese
zh-Hant-HK
Chinese written in traditional script as used in Hong Kong
en-SL
English as spoken in Sierra Leone
pl
Polish
es-MX
Spanish as spoken in Mexico
es-419
Spanish as spoken in Latin America

The W3C Internationalization Activity has published a useful introduction to BCP 47, Language tags in HTML and XML.

Appendix A.4.7 teidata.name

teidata.name defines the range of attribute values expressed as an XML Name.
Module tei — Formal specification
Used by
Element:
Content model
+    
Declaration
+tei_teidata.language = xsd:language | ( "" )
Note

The values for this attribute are language ‘tags’ as defined in BCP 47. Currently BCP 47 comprises RFC 5646 and RFC 4647; over time, other IETF documents may succeed these as the best current practice.

A ‘language tag’, per BCP 47, is assembled from a sequence of components or subtags separated by the hyphen character (-, U+002D). The tag is made of the following subtags, in the following order. Every subtag except the first is optional. If present, each occurs only once, except the fourth and fifth components (variant and extension), which are repeatable.

language
The IANA-registered code for the language. This is almost always the same as the ISO 639 2-letter language code if there is one. The list of available registered language subtags can be found at http://www.iana.org/assignments/language-subtag-registry. It is recommended that this code be written in lower case.
script
The ISO 15924 code for the script. These codes consist of 4 letters, and it is recommended they be written with an initial capital, the other three letters in lower case. The canonical list of codes is maintained by the Unicode Consortium, and is available at http://unicode.org/iso15924/iso15924-codes.html. The IETF recommends this code be omitted unless it is necessary to make a distinction you need.
region
Either an ISO 3166 country code or a UN M.49 region code that is registered with IANA (not all such codes are registered, e.g. UN codes for economic groupings or codes for countries for which there is already an ISO 3166 2-letter code are not registered). The former consist of 2 letters, and it is recommended they be written in upper case; the list of codes can be searched or browsed at https://www.iso.org/obp/ui/#search/code/. The latter consist of 3 digits; the list of codes can be found at http://unstats.un.org/unsd/methods/m49/m49.htm.
variant
An IANA-registered variation. These codes ‘are used to indicate additional, well-recognized variations that define a language or its dialects that are not covered by other available subtags’.
extension
An extension has the format of a single letter followed by a hyphen followed by additional subtags. These exist to allow for future extension to BCP 47, but as of this writing no such extensions are in use.
private use
An extension that uses the initial subtag of the single letter x (i.e., starts with x-) has no meaning except as negotiated among the parties involved. These should be used with great care, since they interfere with the interoperability that use of RFC 4646 is intended to promote. In order for a document that makes use of these subtags to be TEI-conformant, a corresponding <language> element must be present in the TEI header.

There are two exceptions to the above format. First, there are language tags in the IANA registry that do not match the above syntax, but are present because they have been ‘grandfathered’ from previous specifications.

Second, an entire language tag can consist of only a private use subtag. These tags start with x-, and do not need to follow any further rules established by the IETF and endorsed by these Guidelines. Like all language tags that make use of private use subtags, the language in question must be documented in a corresponding <language> element in the TEI header.

Examples include

sn
Shona
zh-TW
Taiwanese
zh-Hant-HK
Chinese written in traditional script as used in Hong Kong
en-SL
English as spoken in Sierra Leone
pl
Polish
es-MX
Spanish as spoken in Mexico
es-419
Spanish as spoken in Latin America

The W3C Internationalization Activity has published a useful introduction to BCP 47, Language tags in HTML and XML.

Appendix A.4.7 teidata.name

teidata.name defines the range of attribute values expressed as an XML Name.
Module tei — Formal specification
Used by
Element:
Content model
 <content>
  <dataRef name="Name"/>
 </content>
-    
Declaration
-tei_teidata.name = xsd:Name
Note

Attributes using this datatype must contain a single word which follows the rules defining a legal XML name (see https://www.w3.org/TR/REC-xml/#dt-name): for example they cannot include whitespace or begin with digits.

Appendix A.4.8 teidata.numeric

teidata.numeric defines the range of attribute values used for numeric values.
Module tei — Formal specification
Used by
Element:
Content model
+    
Declaration
+tei_teidata.name = xsd:Name
Note

Attributes using this datatype must contain a single word which follows the rules defining a legal XML name (see https://www.w3.org/TR/REC-xml/#dt-name): for example they cannot include whitespace or begin with digits.

Appendix A.4.8 teidata.numeric

teidata.numeric defines the range of attribute values used for numeric values.
Module tei — Formal specification
Used by
Element:
Content model
 <content>
  <alternate>
   <dataRef name="double"/>
@@ -4107,60 +4124,60 @@
   <dataRef name="decimal"/>
  </alternate>
 </content>
-    
Declaration
+    
Declaration
 tei_teidata.numeric =
-   xsd:double | token { pattern = "(\-?[\d]+/\-?[\d]+)" } | xsd:decimal
Note

Any numeric value, represented as a decimal number, in floating point format, or as a ratio.

To represent a floating point number, expressed in scientific notation, ‘E notation’, a variant of ‘exponential notation’, may be used. In this format, the value is expressed as two numbers separated by the letter E. The first number, the significand (sometimes called the mantissa) is given in decimal format, while the second is an integer. The value is obtained by multiplying the mantissa by 10 the number of times indicated by the integer. Thus the value represented in decimal notation as 1000.0 might be represented in scientific notation as 1E3.

A value expressed as a ratio is represented by two integer values separated by a solidus (/) character. Thus, the value represented in decimal notation as 0.5 might be represented as a ratio by the string 1/2.

Appendix A.4.9 teidata.outputMeasurement

teidata.outputMeasurement defines a range of values for use in specifying the size of an object that is intended for display.
Module tei — Formal specification
Used by
Content model
+   xsd:double | token { pattern = "(\-?[\d]+/\-?[\d]+)" } | xsd:decimal
Note

Any numeric value, represented as a decimal number, in floating point format, or as a ratio.

To represent a floating point number, expressed in scientific notation, ‘E notation’, a variant of ‘exponential notation’, may be used. In this format, the value is expressed as two numbers separated by the letter E. The first number, the significand (sometimes called the mantissa) is given in decimal format, while the second is an integer. The value is obtained by multiplying the mantissa by 10 the number of times indicated by the integer. Thus the value represented in decimal notation as 1000.0 might be represented in scientific notation as 1E3.

A value expressed as a ratio is represented by two integer values separated by a solidus (/) character. Thus, the value represented in decimal notation as 0.5 might be represented as a ratio by the string 1/2.
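
The three lexical forms of teidata.numeric (decimal, E notation, ratio) can be sketched in Python as follows. The helper `parse_numeric` is hypothetical and only meant to show how the forms relate. It relies on Python's `float()`, whose accepted spellings differ slightly from xsd:double (for example for infinity and NaN).

```python
import re

# Ratio pattern copied from the teidata.numeric declaration.
RATIO = re.compile(r"(\-?[\d]+/\-?[\d]+)")


def parse_numeric(value: str) -> float:
    """Hypothetical helper: convert a teidata.numeric string to a float."""
    if RATIO.fullmatch(value):
        # Two integers separated by a solidus; either may carry a minus sign.
        numerator, denominator = value.split("/")
        return int(numerator) / int(denominator)
    # Decimal and E-notation forms are both accepted by float().
    return float(value)


as_decimal = parse_numeric("1000.0")
as_e_notation = parse_numeric("1E3")
as_ratio = parse_numeric("1/2")
```

A ratio with a zero denominator matches the pattern but raises `ZeroDivisionError` in this sketch; the schema does not rule it out.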

Appendix A.4.9 teidata.outputMeasurement

teidata.outputMeasurement defines a range of values for use in specifying the size of an object that is intended for display.
Module tei — Formal specification
Used by
Content model
 <content>
  <dataRef name="token"
   restriction="[\-+]?\d+(\.\d+)?(%|cm|mm|in|pt|pc|px|em|ex|ch|rem|vw|vh|vmin|vmax)"/>
 </content>
-    
Declaration
+    
Declaration
 tei_teidata.outputMeasurement =
    token
    {
       pattern = "[\-+]?\d+(\.\d+)?(%|cm|mm|in|pt|pc|px|em|ex|ch|rem|vw|vh|vmin|vmax)"
-   }
Example
<figure> + }
Example
<figure>  <head>The TEI Logo</head>  <figDesc>Stylized yellow angle brackets with the letters <mentioned>TEI</mentioned> in    between and <mentioned>text encoding initiative</mentioned> underneath, all on a white    background.</figDesc>  <graphic height="600px" width="600px"   url="http://www.tei-c.org/logos/TEI-600.jpg"/> -</figure>
Note

These values map directly onto the values used by XSL-FO and CSS. For definitions of the units see those specifications; at the time of this writing the most complete list is in the CSS3 working draft.

Appendix A.4.10 teidata.pattern

teidata.pattern defines attribute values which are expressed as a regular expression.
Module tei — Formal specification
Used by
Element:
Content model
+</figure>
Note

These values map directly onto the values used by XSL-FO and CSS. For definitions of the units see those specifications; at the time of this writing the most complete list is in the CSS3 working draft.
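
For illustration only, a value of this datatype can be split into its number and unit in Python. The function `split_measurement` and the added capture groups are assumptions of this sketch; the unit list is the one given in the declaration.

```python
import re
from typing import Optional, Tuple

# Same pattern as the teidata.outputMeasurement declaration, with capture
# groups added (an assumption of this sketch) to separate number and unit.
MEASUREMENT = re.compile(
    r"([\-+]?\d+(?:\.\d+)?)"
    r"(%|cm|mm|in|pt|pc|px|em|ex|ch|rem|vw|vh|vmin|vmax)"
)


def split_measurement(value: str) -> Optional[Tuple[float, str]]:
    """Hypothetical helper: return (number, unit), or None if value is invalid."""
    match = MEASUREMENT.fullmatch(value)
    if match is None:
        return None
    return float(match.group(1)), match.group(2)


logo_height = split_measurement("600px")
missing_unit = split_measurement("600")
```

The unit is mandatory and no whitespace is allowed between number and unit, so both "600" and "600 px" are rejected.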

Appendix A.4.10 teidata.pattern

teidata.pattern defines attribute values which are expressed as a regular expression.
Module tei — Formal specification
Used by
Element:
Content model
 <content>
  <dataRef name="token"/>
 </content>
-    
Declaration
-tei_teidata.pattern = token
Note
A regular expression, often called a pattern, is an expression that describes a set of strings. They are usually used to give a concise description of a set, without having to list all elements. For example, the set containing the three strings Handel, Händel, and Haendel can be described by the pattern H(ä|ae?)ndel (or alternatively, it is said that the pattern H(ä|ae?)ndel matches each of the three strings)
Wikipedia

This TEI datatype is mapped to the XSD token datatype, and may therefore contain any string of characters. However, it is recommended that the value used conform to the particular flavour of regular expression syntax supported by XSD Schema.

Appendix A.4.11 teidata.pointer

teidata.pointer defines the range of attribute values used to provide a single URI, absolute or relative, pointing to some other resource, either within the current document or elsewhere.
Module tei — Formal specification
Used by
Element:
Content model
+    
Declaration
+tei_teidata.pattern = token
Note
A regular expression, often called a pattern, is an expression that describes a set of strings. They are usually used to give a concise description of a set, without having to list all elements. For example, the set containing the three strings Handel, Händel, and Haendel can be described by the pattern H(ä|ae?)ndel (or alternatively, it is said that the pattern H(ä|ae?)ndel matches each of the three strings)
Wikipedia

This TEI datatype is mapped to the XSD token datatype, and may therefore contain any string of characters. However, it is recommended that the value used conform to the particular flavour of regular expression syntax supported by XSD Schema.

Appendix A.4.11 teidata.pointer

teidata.pointer defines the range of attribute values used to provide a single URI, absolute or relative, pointing to some other resource, either within the current document or elsewhere.
Module tei — Formal specification
Used by
Element:
Content model
 <content>
  <dataRef restriction="\S+" name="anyURI"/>
 </content>
-    
Declaration
-tei_teidata.pointer = xsd:anyURI { pattern = "\S+" }
Note

The range of syntactically valid values is defined by RFC 3986 Uniform Resource Identifier (URI): Generic Syntax. Note that the values themselves are encoded using RFC 3987 Internationalized Resource Identifiers (IRIs) mapping to URIs. For example, https://secure.wikimedia.org/wikipedia/en/wiki/% is encoded as https://secure.wikimedia.org/wikipedia/en/wiki/%25 while http://موقع.وزارة-الاتصالات.مصر/ is encoded as http://xn--4gbrim.xn----rmckbbajlc6dj7bxne2c.xn--wgbh1c/

Appendix A.4.12 teidata.prefix

teidata.prefix defines a range of values that may function as a URI scheme name.
Module tei — Formal specification
Used by
Element:
Content model
+    
Declaration
+tei_teidata.pointer = xsd:anyURI { pattern = "\S+" }
Note

The range of syntactically valid values is defined by RFC 3986 Uniform Resource Identifier (URI): Generic Syntax. Note that the values themselves are encoded using RFC 3987 Internationalized Resource Identifiers (IRIs) mapping to URIs. For example, https://secure.wikimedia.org/wikipedia/en/wiki/% is encoded as https://secure.wikimedia.org/wikipedia/en/wiki/%25 while http://موقع.وزارة-الاتصالات.مصر/ is encoded as http://xn--4gbrim.xn----rmckbbajlc6dj7bxne2c.xn--wgbh1c/

Appendix A.4.12 teidata.prefix

teidata.prefix defines a range of values that may function as a URI scheme name.
Module tei — Formal specification
Used by
Element:
Content model
 <content>
  <dataRef name="token"
   restriction="[a-z][a-z0-9\+\.\-]*"/>
 </content>
-    
Declaration
-tei_teidata.prefix = token { pattern = "[a-z][a-z0-9\+\.\-]*" }
Note

This datatype is used to constrain a string of characters to one that can be used as a URI scheme name according to RFC 3986, section 3.1. Thus only the 26 lowercase letters a–z, the 10 digits 0–9, the plus sign, the period, and the hyphen are permitted, and the value must start with a letter.

Appendix A.4.13 teidata.probCert

teidata.probCert defines a range of attribute values which can be expressed either as a numeric probability or as a coded certainty value.
Module tei — Formal specification
Used by
Content model
+    
Declaration
+tei_teidata.prefix = token { pattern = "[a-z][a-z0-9\+\.\-]*" }
Note

This datatype is used to constrain a string of characters to one that can be used as a URI scheme name according to RFC 3986, section 3.1. Thus only the 26 lowercase letters a–z, the 10 digits 0–9, the plus sign, the period, and the hyphen are permitted, and the value must start with a letter.
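
A minimal Python sketch of the rule just described, using the pattern from the declaration. The helper name `is_prefix` and the sample values are illustrative assumptions, not taken from the schema.

```python
import re

# Pattern copied from the teidata.prefix declaration.
PREFIX = re.compile(r"[a-z][a-z0-9\+\.\-]*")


def is_prefix(value: str) -> bool:
    """Hypothetical helper: True if value could serve as a URI scheme name."""
    return PREFIX.fullmatch(value) is not None


# Sample values chosen for this sketch only.
checks = {name: is_prefix(name) for name in ["http", "svn+ssh", "HTTP", "1abc"]}
```

Uppercase letters and a leading digit are both rejected, which matches the restriction to lowercase a–z and the requirement to start with a letter.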

Appendix A.4.13 teidata.probCert

teidata.probCert defines a range of attribute values which can be expressed either as a numeric probability or as a coded certainty value.
Module tei — Formal specification
Used by
Content model
 <content>
  <alternate>
   <dataRef key="teidata.probability"/>
   <dataRef key="teidata.certainty"/>
  </alternate>
 </content>
-    
Declaration
-tei_teidata.probCert = teidata.probability | teidata.certainty

Appendix A.4.14 teidata.probability

teidata.probability defines the range of attribute values expressing a probability.
Module tei — Formal specification
Used by
Content model
+    
Declaration
+tei_teidata.probCert = teidata.probability | teidata.certainty

Appendix A.4.14 teidata.probability

teidata.probability defines the range of attribute values expressing a probability.
Module tei — Formal specification
Used by
Content model
 <content>
  <dataRef name="double"/>
 </content>
-    
Declaration
-tei_teidata.probability = xsd:double
Note

Probability is expressed as a real number between 0 and 1; 0 representing certainly false and 1 representing certainly true.

Appendix A.4.15 teidata.replacement

teidata.replacement defines attribute values which contain a replacement template.
Module tei — Formal specification
Used by
Element:
Content model
+    
Declaration
+tei_teidata.probability = xsd:double
Note

Probability is expressed as a real number between 0 and 1; 0 representing certainly false and 1 representing certainly true.
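
The declaration maps this datatype to xsd:double, so the 0 to 1 range stated in the note is not enforced by the schema itself. A consumer that wants the range checked could do something like the following Python sketch; the helper `is_probability` is hypothetical.

```python
def is_probability(value: str) -> bool:
    """Hypothetical helper: True if value parses as a number in [0, 1]."""
    try:
        p = float(value)
    except ValueError:
        return False
    # NaN fails both comparisons, so it is rejected as well.
    return 0.0 <= p <= 1.0


in_range = is_probability("0.75")
out_of_range = is_probability("1.5")
```

Here "1.5" would pass schema validation as a double but is rejected by the range check.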

Appendix A.4.15 teidata.replacement

teidata.replacement defines attribute values which contain a replacement template.
Module tei — Formal specification
Used by
Element:
Content model
 <content>
  <textNode/>
 </content>
-    
Declaration
-tei_teidata.replacement = text

Appendix A.4.16 teidata.temporal.iso

teidata.temporal.iso defines the range of attribute values expressing a temporal expression such as a date, a time, or a combination of them, that conform to the international standard Data elements and interchange formats – Information interchange – Representation of dates and times.
Module tei — Formal specification
Used by
Content model
+    
Declaration
+tei_teidata.replacement = text

Appendix A.4.16 teidata.temporal.iso

teidata.temporal.iso defines the range of attribute values expressing a temporal expression such as a date, a time, or a combination of them, that conform to the international standard Data elements and interchange formats – Information interchange – Representation of dates and times.
Module tei — Formal specification
Used by
Content model
 <content>
  <alternate>
   <dataRef name="date"/>
@@ -4175,7 +4192,7 @@
    restriction="[0-9.,DHMPRSTWYZ/:+\-]+"/>
  </alternate>
 </content>
-    
Declaration
+    
Declaration
 tei_teidata.temporal.iso =
    xsd:date
  | xsd:gYear
@@ -4185,7 +4202,7 @@
  | xsd:gMonthDay
  | xsd:time
  | xsd:dateTime
- | token { pattern = "[0-9.,DHMPRSTWYZ/:+\-]+" }
Note

If it is likely that the value used is to be compared with another, then a time zone indicator should always be included, and only the dateTime representation should be used.

For all representations for which ISO 8601:2004 describes both a basic and an extended format, these Guidelines recommend use of the extended format.

Appendix A.4.17 teidata.temporal.w3c

teidata.temporal.w3c defines the range of attribute values expressing a temporal expression such as a date, a time, or a combination of them, that conform to the W3C XML Schema Part 2: Datatypes Second Edition specification.
Module tei — Formal specification
Used by
Element:
Content model
+ | token { pattern = "[0-9.,DHMPRSTWYZ/:+\-]+" }
Note

If it is likely that the value used is to be compared with another, then a time zone indicator should always be included, and only the dateTime representation should be used.

For all representations for which ISO 8601:2004 describes both a basic and an extended format, these Guidelines recommend use of the extended format.

Appendix A.4.17 teidata.temporal.w3c

teidata.temporal.w3c defines the range of attribute values expressing a temporal expression such as a date, a time, or a combination of them, that conform to the W3C XML Schema Part 2: Datatypes Second Edition specification.
Module tei — Formal specification
Used by
Element:
Content model
 <content>
  <alternate>
   <dataRef name="date"/>
@@ -4198,7 +4215,7 @@
   <dataRef name="dateTime"/>
  </alternate>
 </content>
-    
Declaration
+    
Declaration
 tei_teidata.temporal.w3c =
    xsd:date
  | xsd:gYear
@@ -4207,30 +4224,30 @@
  | xsd:gYearMonth
  | xsd:gMonthDay
  | xsd:time
- | xsd:dateTime
Note

If it is likely that the value used is to be compared with another, then a time zone indicator should always be included, and only the dateTime representation should be used.

Appendix A.4.18 teidata.text

teidata.text defines the range of attribute values used to express some kind of identifying string as a single sequence of Unicode characters possibly including whitespace.
Module tei — Formal specification
Used by
Element:
Content model
+ | xsd:dateTime
Note

If it is likely that the value used is to be compared with another, then a time zone indicator should always be included, and only the dateTime representation should be used.

Appendix A.4.18 teidata.text

teidata.text defines the range of attribute values used to express some kind of identifying string as a single sequence of Unicode characters possibly including whitespace.
Module tei — Formal specification
Used by
Element:
Content model
 <content>
  <dataRef name="string"/>
 </content>
-    
Declaration
-tei_teidata.text = string
Note

Attributes using this datatype must contain a single ‘token’ in which whitespace and other punctuation characters are permitted.

Appendix A.4.19 teidata.truthValue

teidata.truthValue defines the range of attribute values used to express a truth value.
Module tei — Formal specification
Used by
Content model
+    
Declaration
+tei_teidata.text = string
Note

Attributes using this datatype must contain a single ‘token’ in which whitespace and other punctuation characters are permitted.

Appendix A.4.19 teidata.truthValue

teidata.truthValue defines the range of attribute values used to express a truth value.
Module tei — Formal specification
Used by
Content model
 <content>
  <dataRef name="boolean"/>
 </content>
-    
Declaration
-tei_teidata.truthValue = xsd:boolean
Note

The possible values of this datatype are 1 or true, or 0 or false.

This datatype applies only for cases where uncertainty is inappropriate; if the attribute concerned may have a value other than true or false, e.g. unknown, or inapplicable, it should have the extended version of this datatype: teidata.xTruthValue.

Appendix A.4.20 teidata.versionNumber

teidata.versionNumber defines the range of attribute values used for version numbers.
Moduletei — Formal specification
Used by
Element:
Content model
+    
Declaration
+tei_teidata.truthValue = xsd:boolean
Note

The possible values of this datatype are 1 or true, or 0 or false.

This datatype applies only for cases where uncertainty is inappropriate; if the attribute concerned may have a value other than true or false, e.g. unknown, or inapplicable, it should have the extended version of this datatype: teidata.xTruthValue.

Appendix A.4.20 teidata.versionNumber

teidata.versionNumber defines the range of attribute values used for version numbers.
Moduletei — Formal specification
Used by
Element:
Content model
 <content>
  <dataRef name="token"
   restriction="[\d]+[a-z]*[\d]*(\.[\d]+[a-z]*[\d]*){0,3}"/>
 </content>
-    
Declaration
+    
Declaration
 tei_teidata.versionNumber =
-   token { pattern = "[\d]+[a-z]*[\d]*(\.[\d]+[a-z]*[\d]*){0,3}" }

Appendix A.4.21 teidata.word

teidata.word defines the range of attribute values expressed as a single word or token.
Moduletei — Formal specification
Used by
teidata.enumeratedElement:
Content model
+   token { pattern = "[\d]+[a-z]*[\d]*(\.[\d]+[a-z]*[\d]*){0,3}" }

Appendix A.4.21 teidata.word

teidata.word defines the range of attribute values expressed as a single word or token.
Moduletei — Formal specification
Used by
teidata.enumeratedElement:
Content model
 <content>
  <dataRef name="token"
   restriction="[^\p{C}\p{Z}]+"/>
 </content>
-    
Declaration
-tei_teidata.word = token { pattern = "[^\p{C}\p{Z}]+" }
Note

Attributes using this datatype must contain a single ‘word’ which contains only letters, digits, punctuation characters, or symbols: thus it cannot include whitespace.

Appendix A.4.22 teidata.xTruthValue

teidata.xTruthValue (extended truth value) defines the range of attribute values used to express a truth value which may be unknown.
Moduletei — Formal specification
Used by
Content model
+    
Declaration
+tei_teidata.word = token { pattern = "[^\p{C}\p{Z}]+" }
Note

Attributes using this datatype must contain a single ‘word’ which contains only letters, digits, punctuation characters, or symbols: thus it cannot include whitespace.

Appendix A.4.22 teidata.xTruthValue

teidata.xTruthValue (extended truth value) defines the range of attribute values used to express a truth value which may be unknown.
Moduletei — Formal specification
Used by
Content model
 <content>
  <alternate>
   <dataRef name="boolean"/>
@@ -4240,15 +4257,15 @@
   </valList>
  </alternate>
 </content>
-    
Declaration
-tei_teidata.xTruthValue = xsd:boolean | ( "unknown" | "inapplicable" )
Note

In cases where uncertainty is inappropriate, use the datatype teidata.truthValue.

Appendix A.4.23 teidata.xpath

teidata.xpath defines attribute values which contain an XPath expression.
Moduletei — Formal specification
Used by
Content model
+    
Declaration
+tei_teidata.xTruthValue = xsd:boolean | ( "unknown" | "inapplicable" )
Note

In cases where uncertainty is inappropriate, use the datatype teidata.truthValue.

Appendix A.4.23 teidata.xpath

teidata.xpath defines attribute values which contain an XPath expression.
Moduletei — Formal specification
Used by
Content model
 <content>
  <textNode/>
 </content>
-    
Declaration
-tei_teidata.xpath = text
Note

Any XPath expression using the syntax defined in 6.2..

When writing programs that evaluate XPath expressions, programmers should be mindful of the possibility of malicious code injection attacks. For further information about XPath injection attacks, see the article at OWASP.

Notes
1
Note that this is an illustrative example, i.e. a valid ParlaMint corpus would also need certain attributes to be defined on the illustrated elements. This holds for all the examples in this section.
2
Note that parliaments also have unaffiliated (or independent) MPs, who can either belong to a special ‘unaffiliated’ parliamentary group or not belong to any parliamentary group. In the former case, an ‘unaffiliated’ parliamentaryGroup organisation must be created, and such MPs are affiliated with it as members. In the latter case, they are simply not affiliated with any parliamentary group.
3
The typical situation is that the organisation somebody is affiliated with is specified as an organisation, using the <org> element (cf. the Section on Organisations), but if this is not the case, using <orgName> directly in the <affiliation> is an alternative encoding.
4
Note that, in general, the utterance can also be split in the middle of a sentence, which brings with it problems for automatic linguistic processing, as, ideally, the parts should be first joined, and only then processed.
5
These are typically tagsets developed and used for specific languages and can be found in the XPOS column of CoNLL-U files, which is the native format for UD treebanks.
6
Note that the example is rendered in three lines; however, the correct encoding in the corpus is actually on a single line, without any spaces between the elements, as otherwise the newline and indenting spaces would be part of the word ‘abyste’.
Tomaž Erjavec, tomaz.erjavec@ijs.si, Matyáš Kopp, kopp@ufal.mff.cuni.cz and Andrej Pančur, andrej.pancur@inz.si. Date: 2023-08-20
Notes
1
Note that this is an illustrative example, i.e. a valid ParlaMint corpus would also need certain attributes to be defined on the illustrated elements. This holds for all the examples in this section.
2
Note that parliaments also have unaffiliated (or independent) MPs, who can either belong to a special ‘unaffiliated’ parliamentary group or not belong to any parliamentary group. In the former case, an ‘unaffiliated’ parliamentaryGroup organisation must be created, and such MPs are affiliated with it as members. In the latter case, they are simply not affiliated with any parliamentary group.
3
The typical situation is that the organisation somebody is affiliated with is specified as an organisation, using the <org> element (cf. the Section on Organisations), but if this is not the case, using <orgName> directly in the <affiliation> is an alternative encoding.
4
Note that, in general, the utterance can also be split in the middle of a sentence, which brings with it problems for automatic linguistic processing, as, ideally, the parts should be first joined, and only then processed.
5
These are typically tagsets developed and used for specific languages and can be found in the XPOS column of CoNLL-U files, which is the native format for UD treebanks.
6
Note that the example is rendered in three lines; however, the correct encoding in the corpus is actually on a single line, without any spaces between the elements, as otherwise the newline and indenting spaces would be part of the word ‘abyste’.
Tomaž Erjavec, tomaz.erjavec@ijs.si, Matyáš Kopp, kopp@ufal.mff.cuni.cz and Andrej Pančur, andrej.pancur@inz.si. Date: 2023-08-22
\ No newline at end of file