diff --git a/data/JTEI/14_2021-23/jtei-barabuccietal-196-source.xml b/data/JTEI/14_2021-23/jtei-barabuccietal-196-source.xml index 11f9314..d23b98a 100644 --- a/data/JTEI/14_2021-23/jtei-barabuccietal-196-source.xml +++ b/data/JTEI/14_2021-23/jtei-barabuccietal-196-source.xml @@ -113,11 +113,12 @@

This paper describes how we dealt with the encoding and transformation of the punctuation in the Early New High German edition of Marco Polo’s travel account. Technically, we implemented a set of general rules (as XSLT - templates) plus various exceptions (as descriptive instructions in XML attributes), - and applied them in an automated fashion (using XProc pipelines). In addition to - this, we discuss the philological foundation of this method and, contextually, we - address the topic of the transformation of a single original source into different + xml:id="R1" target="#xslt"/>XSLT templates) + plus various exceptions (as descriptive instructions in XML attributes), and applied + them in an automated fashion (using XProc pipelines). In addition to this, we + discuss the philological foundation of this method and, contextually, we address the + topic of the transformation of a single original source into different transcriptions: from a highly diplomatic edition to an interpretative one, going through a spectrum of intermediate levels of normalization. We also reflect on the separation between transcription and analysis, as well as on the role of the editor @@ -146,8 +147,9 @@

These issues quickly arose while editing:

the master TEI file became too big and its structure too complex, thus too hard - to navigate and maintain, even when using advanced XML editors such as Oxygen - XML; + to navigate and maintain, even when using advanced XML editors such as Oxygen XML; normalizing punctuation revealed itself as a complex task that required profound changes to the structure of the edited text. @@ -187,9 +189,10 @@

This paper provides an overview of our approach (section 3) and shows how our approach addresses both issues. In particular we show how normalizing punctuation represents a dramatic step beyond the more classical normalization of words. The - current implementation of our approach, based on XProc and XSLT, is also - presented in section 3.

+ current implementation of our approach, based on XProc and XSLT, is also presented in section 3.

Moving to an editorial workflow with such a level of automation requires a reevaluation of the role of the editor, from wordsmith to formalizer of rules (and exceptions). In section 4 we discuss how our approach fits the recent @@ -257,15 +260,17 @@ (revealing, opening up to the text) in its nature, as some specific aspects of the text are presented in such a way that the reader is granted more informed access to them.

-

The edition will be published online using a specifically tailored version of EVT - (Edition Visualization TechnologyA light-weight, - open source tool specifically designed to create digital editions from - XML-encoded texts (Rosselli Del Turco et al. 2013).) and will - present, on the one hand, each witness in its continuum from facsimile to multiple - levels of normalization and, on the other hand, the three main witnesses in - synopsis. From each module of the edition and from each of the texts composing the - editorial project it will be possible to access a twofold commentary: specific +

The edition will be published online using a specifically tailored version of EVT (Edition Visualization TechnologyA + light-weight, open source tool specifically designed to create digital + editions from XML-encoded texts + (Rosselli Del Turco et al. 2013).) and + will present, on the one hand, each witness in its continuum from facsimile to + multiple levels of normalization and, on the other hand, the three main witnesses + in synopsis. From each module of the edition and from each of the texts composing + the editorial project it will be possible to access a twofold commentary: specific notes referring to the named entities and the realia appearing in the text, and philological notes referring either to all of the three witnesses or to one witness in particular.

@@ -784,7 +789,8 @@ Multiple editions will be generated automatically from the master TEI file, with no manual intervention on the resulting files. The generated edition files will conform to the TEI subset understood by - EVT. + EVT.

Some of these desiderata clash with each other. For instance, the desire to directly edit the XML file makes it hard and error-prone to keep in a single @@ -848,8 +854,9 @@ level="m">Digital Vercelli Book, by Roberto Rosselli Del Turco (n.d.): here two levels of edition are offered, a diplomatic and a more interpretative one. The - user can compare the two editions visualizing them synoptically in the EVT - software used for the edition.

+ user can compare the two editions visualizing them synoptically in the EVT software used for the edition.

@@ -878,11 +885,12 @@ the edition files, despite being the main concrete output of the editorial project, are ephemeral and never modified directly. -

The implementation consists of a series of XSLT transformations, each - representing and implementing a single rule, coordinated by three different XProc - pipelines, one for each level of edition. The source code is available at .

+

The implementation consists of a series of XSLT transformations, each + representing and implementing a single rule, coordinated by three different XProc pipelines, one for each level of edition. The source code is available + at .

This methodology contrasts with the established editorial practice of mingling transcription, normalization, and critical amendments. Instead of just performing the desired normalization steps while transcribing and keeping track of them in an @@ -934,14 +942,14 @@ system that does not allow for this interaction to happen is not able to deal properly with normalization in general and punctuation in particular.

Each rule is implemented as a small and self-contained XSLT + xml:id="R8" target="#xslt"/>XSLT transformation. At the time of writing, the ENHG Marco Polo project comprises about a hundred rules, grouped in twenty macro categories. On average, the core of each rule is implemented in less than three lines of XSLT.

+ xml:id="R9" target="#xslt"/>XSLT.

To give the readers an impression of the simplicity of the rule implementation, we - show here the main parts of the XSLT that implement one of the example + show here the main parts of the XSLT that implement one of the example rules described above.

Example: Rule to Join Words Split at the End of a Line @@ -949,8 +957,8 @@ used to mark that a word has been split at the end of a line. In the diplomatic rendition we want to preserve this word division and the forced line break, while in other renditions we want to reconstruct the complete word.

-

The XSLT excerpt in The XSLT excerpt in Example 3 shows how split words are joined when a middle double oblique hyphen is found. The joining is performed in a lossless way: all information present in the original witness is preserved. This is possible @@ -993,8 +1001,8 @@ - XSLT implementation of the rule Join + XSLT implementation of the rule Join words split with a double oblique hyphen.

The rule in Example 3 is @@ -1085,13 +1093,16 @@ gray cogs indicate steps shared by all pipelines. The cogs with patterns identify level-specific steps. -

Each pipeline is implemented as an XProc pipeline. All the pipelines are simple - linear flows (i.e., the output of a rule is the input for the next rule). From a - methodological point of view, the XProc pipeline is a record of all the operations - that the scholar performs on the transcription. The creation of an edition level - is equivalent to replaying this record. Example 6 shows an excerpt of the XProc pipeline used to generate the - semidiplomatic edition.

+

Each pipeline is implemented as an XProc pipeline. All the + pipelines are simple linear flows (i.e., the output of a rule is the input for the + next rule). From a methodological point of view, the XProc + pipeline is a record of all the operations that the scholar performs on the + transcription. The creation of an edition level is equivalent to replaying this + record. Example 6 shows an excerpt + of the XProc pipeline used to generate the semidiplomatic edition.

It is important to note that pipelines comprise three kinds of steps:

infrastructural steps: for example, the tokenize step that @@ -1138,17 +1149,19 @@ - Excerpt of the XProc pipeline used to generate the - semi-diplomatic edition. Steps marked A are steps that implement rules; the - step marked B takes care of exceptions. + Excerpt of the XProc pipeline used to + generate the semi-diplomatic edition. Steps marked A are steps that implement + rules; the step marked B takes care of exceptions. -

The fact that the editorial workflows for all the editions are formalized in XProc - pipelines makes it possible, for instance, to compare these pipelines and see in - detail (and with utmost precision) how they differ and what is, in this project, - the difference between the processes needed to establish a diplomatic, a - semi-diplomatic or an interpretative edition. Breaking down the traditional - analogue processes into unambiguous discrete steps can contribute to the scholarly - debate on edition typology.

+

The fact that the editorial workflows for all the editions are formalized in XProc pipelines makes it possible, for instance, to compare these + pipelines and see in detail (and with utmost precision) how they differ and what + is, in this project, the difference between the processes needed to establish a + diplomatic, a semi-diplomatic or an interpretative edition. Breaking down the + traditional analogue processes into unambiguous discrete steps can contribute to + the scholarly debate on edition typology.

@@ -1191,19 +1204,21 @@ like to experiment with creating declarative rule generators. Many rules are repetitive in their nature (for example, the normalization of single characters) and it should be possible to express them in a declarative fashion. These abstract - rules would then be translated into XSLT transformations. Another aspect we + rules would then be translated into XSLT transformations. Another aspect we would like to reflect on is how the transformation process directed by the pipelines influences the various levels of abstraction of the document being transformed, drawing parallels with stratified document models such as CMV+P (Barabucci 2019). A final thing we would like to test - is the replacement of the XProc pipelines with pure XSLT pipelines - (Birnbaum 2017). Replacing XProc - with XSLT pipelines would reduce the number of technologies that other - scholars have to be familiar with in order to understand the editorial process in its - entirety.

+ is the replacement of the XProc pipelines with pure XSLT pipelines + (Birnbaum 2017). Replacing XProc with XSLT pipelines would reduce the number of + technologies that other scholars have to be familiar with in order to understand the + editorial process in its entirety.

Another future development that we envision is the deconstruction of the visualization of the edition into a series of small, explicit steps, taking place one after the other, just like their counterparts in the pipelines: one click would show @@ -1288,8 +1303,8 @@ Polo. Prima edizione integrale. Firenze: Leo S. Olschki. Birnbaum, David J. 2017. - Patterns and Antipatterns in <ptr type="software" - xml:id="XSLT" target="#XSLT"/><rs type="soft.name" ref="#XSLT">XSLT</rs> + <title level="a">Patterns and Antipatterns in <ptr type="software" xml:id="R23" + target="#xslt"/><rs type="soft.name" ref="#R23">XSLT</rs> Micropipelining. In Proceedings of Balisage: The Markup Conference 2017. Balisage Series on Markup Technologies @@ -1458,10 +1473,12 @@ The Digital Vercelli Book. Beta version. Accessed October 22, 2021. . - Rosselli Del Turco, Roberto, et al. + Rosselli Del Turco, + Roberto, et al. 2013. Edition Visualization Technology. - Accessed April 19, 2021.. + Accessed April 19, 2021.. Stella, Francesco, ed. 2020. Corpus Rhythmorum Musicum. Last modified July 28, 2020. + + XProc + + An XML Pipeline Language +