diff --git a/data/JTEI/14_2021-23/jtei-barabuccietal-196-source.xml b/data/JTEI/14_2021-23/jtei-barabuccietal-196-source.xml
index 11f9314..d23b98a 100644
--- a/data/JTEI/14_2021-23/jtei-barabuccietal-196-source.xml
+++ b/data/JTEI/14_2021-23/jtei-barabuccietal-196-source.xml
@@ -113,11 +113,12 @@
This paper describes how we dealt with the encoding and transformation of the
punctuation in the Early New High German edition of Marco Polo’s travel account.
Technically, we implemented a set of general rules (as XSLT
- templates) plus various exceptions (as descriptive instructions in XML attributes),
- and applied them in an automated fashion (using XProc pipelines). In addition to
- this, we discuss the philological foundation of this method and, contextually, we
- address the topic of the transformation of a single original source into different
+ xml:id="R1" target="#xslt"/>XSLT templates)
+ plus various exceptions (as descriptive instructions in XML attributes), and applied
+ them in an automated fashion (using XProc pipelines). In addition to this, we
+ discuss the philological foundation of this method and, contextually, we address the
+ topic of the transformation of a single original source into different
transcriptions: from a highly diplomatic edition to an interpretative one, going
through a spectrum of intermediate levels of normalization. We also reflect on the
separation between transcription and analysis, as well as on the role of the editor
@@ -146,8 +147,9 @@
These issues quickly arose while editing:
- the master TEI file became too big and its structure too complex, thus too hard
- to navigate and maintain, even when using advanced XML editors such as Oxygen
- XML;
+ to navigate and maintain, even when using advanced XML editors such as Oxygen XML;
- normalizing punctuation revealed itself as a complex task that required
profound changes to the structure of the edited text.
@@ -187,9 +189,10 @@
This paper provides an overview of our approach (section 3) and shows how our
approach addresses both issues. In particular we show how normalizing punctuation
represents a dramatic step beyond the more classical normalization of words. The
- current implementation of our approach, based on XProc and XSLT, is also
- presented in section 3.
+ current implementation of our approach, based on XProc and XSLT, is also presented in section 3.
Moving to an editorial workflow with such a level of automation requires a
reevaluation of the role of the editor, from wordsmith to formalizer of rules (and
exceptions). In section 4 we discuss how our approach fits the recent
@@ -257,15 +260,17 @@
(revealing, opening up to the text) in its nature, as some specific aspects of the
text are presented in such a way that the reader is granted more informed access
to them.
- The edition will be published online using a specifically tailored version of EVT
- (Edition Visualization TechnologyA light-weight,
- open source tool specifically designed to create digital editions from
- XML-encoded texts
([Rosselli Del Turco et al. 2013](#delturco2013)).) and will
- present, on the one hand, each witness in its continuum from facsimile to multiple
- levels of normalization and, on the other hand, the three main witnesses in
- synopsis. From each module of the edition and from each of the texts composing the
- editorial project it will be possible to access a twofold commentary: specific
+
The edition will be published online using a specifically tailored version of EVT (Edition Visualization TechnologyA
+ light-weight, open source tool specifically designed to create digital
+ editions from XML-encoded texts
+ ([Rosselli Del Turco et al. 2013](#delturco2013)).) and
+ will present, on the one hand, each witness in its continuum from facsimile to
+ multiple levels of normalization and, on the other hand, the three main witnesses
+ in synopsis. From each module of the edition and from each of the texts composing
+ the editorial project it will be possible to access a twofold commentary: specific
notes referring to the named entities and the realia appearing in the text, and philological notes referring
either to all of the three witnesses or to one witness in particular.
@@ -784,7 +789,8 @@
- Multiple editions will be generated automatically from the master TEI file,
with no manual intervention on the resulting files.
- The generated edition files will conform to the TEI subset understood by
- EVT.
+ EVT.
Some of these desiderata clash with each other. For instance, the desire to
directly edit the XML file makes it hard and error-prone to keep in a single
@@ -848,8 +854,9 @@
level="m">Digital Vercelli Book, by [Roberto Rosselli Del Turco (n.d.)](#delturcond): here two
levels of edition are offered, a diplomatic and a more interpretative one. The
- user can compare the two editions visualizing them synoptically in the EVT
- software used for the edition.
+ user can compare the two editions visualizing them synoptically in the EVT software used for the edition.
@@ -878,11 +885,12 @@
- the edition files, despite being the main concrete output of the editorial
project, are ephemeral and never modified directly.
- The implementation consists of a series of XSLT transformations, each
- representing and implementing a single rule, coordinated by three different XProc
- pipelines, one for each level of edition. The source code is available at .
+ The implementation consists of a series of XSLT transformations, each
+ representing and implementing a single rule, coordinated by three different XProc pipelines, one for each level of edition. The source code is available
+ at .
This methodology contrasts with the established editorial practice of mingling
transcription, normalization, and critical amendments. Instead of just performing the
desired normalization steps while transcribing and keeping track of them in an
@@ -934,14 +942,14 @@
system that does not allow for this interaction to happen is not able to deal
properly with normalization in general and punctuation in particular.
Each rule is implemented as a small and self-contained XSLT
+ xml:id="R8" target="#xslt"/>XSLT
transformation. At the time of writing, the ENHG Marco Polo project comprises
about a hundred rules, grouped in twenty macro categories. On average, the core of
each rule is implemented in less than three lines of XSLT.
+ xml:id="R9" target="#xslt"/>XSLT.
To give the readers an impression of the simplicity of the rule implementation, we
- show here the main parts of the XSLT that implement one of the example
+ show here the main parts of the XSLT that implement one of the example
rules described above.
Example: Rule to Join Words Split at the End of a Line
@@ -949,8 +957,8 @@
used to mark that a word has been split at the end of a line. In the diplomatic
rendition we want to preserve this word division and the forced line break,
while in other renditions we want to reconstruct the complete word.
-
The XSLT excerpt in [Example 3](#example3) shows how split words are joined when a middle double
oblique hyphen is found. The joining is performed in a lossless way: all
information present in the original witness is preserved. This is possible
@@ -993,8 +1001,8 @@
- XSLT implementation of the rule Join
+ XSLT implementation of the rule Join
words split with a double oblique hyphen
.
The rule in [Example 3](#example3) is
@@ -1085,13 +1093,16 @@
gray cogs indicate steps shared by all pipelines. The cogs with patterns
identify level-specific steps.
-
Each pipeline is implemented as an XProc pipeline. All the pipelines are simple
- linear flows (i.e., the output of a rule is the input for the next rule). From a
- methodological point of view, the XProc pipeline is a record of all the operations
- that the scholar performs on the transcription. The creation of an edition level
- is equivalent to replaying this record. [Example 6](#example6) shows an excerpt of the XProc pipeline used to generate the
- semidiplomatic edition.
+ Each pipeline is implemented as an XProc pipeline. All the
+ pipelines are simple linear flows (i.e., the output of a rule is the input for the
+ next rule). From a methodological point of view, the XProc
+ pipeline is a record of all the operations that the scholar performs on the
+ transcription. The creation of an edition level is equivalent to replaying this
+ record. [Example 6](#example6) shows an excerpt
+ of the XProc pipeline used to generate the semidiplomatic edition.
It is important to note that pipelines comprise three kinds of steps:
- infrastructural steps: for example, the tokenize step that
@@ -1138,17 +1149,19 @@
- Excerpt of the XProc pipeline used to generate the
- semi-diplomatic edition. Steps marked A are steps that implement rules; the
- step marked B takes care of exceptions.
+ Excerpt of the XProc pipeline used to
+ generate the semi-diplomatic edition. Steps marked A are steps that implement
+ rules; the step marked B takes care of exceptions.
-
The fact that the editorial workflows for all the editions are formalized in XProc
- pipelines makes it possible, for instance, to compare these pipelines and see in
- detail (and with utmost precision) how they differ and what is, in this project,
- the difference between the processes needed to establish a diplomatic, a
- semi-diplomatic or an interpretative edition. Breaking down the traditional
- analogue processes into unambiguous discrete steps can contribute to the scholarly
- debate on edition typology.
+ The fact that the editorial workflows for all the editions are formalized in XProc pipelines makes it possible, for instance, to compare these
+ pipelines and see in detail (and with utmost precision) how they differ and what
+ is, in this project, the difference between the processes needed to establish a
+ diplomatic, a semi-diplomatic or an interpretative edition. Breaking down the
+ traditional analogue processes into unambiguous discrete steps can contribute to
+ the scholarly debate on edition typology.
@@ -1191,19 +1204,21 @@
like to experiment with creating
declarative rule generators. Many rules
are repetitive in their nature (for example, the normalization of single characters)
and it should be possible to express them in a declarative fashion. These abstract
- rules would then be translated into
XSLT transformations. Another aspect we
+ rules would then be translated into
XSLT transformations. Another aspect we
would like to reflect on is how the transformation process directed by the pipelines
influences the various levels of abstraction of the document being transformed,
drawing parallels with stratified document models such as CMV+P (
[Barabucci 2019](#barabucci2019)). A final thing we would like to test
- is the replacement of the XProc pipelines with pure
XSLT pipelines
- (
[Birnbaum 2017](#birnbaum2017)). Replacing XProc
- with
XSLT pipelines would reduce the number of technologies that other
- scholars have to be familiar with in order to understand the editorial process in its
- entirety.
+ is the replacement of the
XProc pipelines with pure
XSLT pipelines
+ (
[Birnbaum 2017](#birnbaum2017)). Replacing
XProc with
XSLT pipelines would reduce the number of
+ technologies that other scholars have to be familiar with in order to understand the
+ editorial process in its entirety.
Another future development that we envision is the deconstruction of the
visualization of the edition into a series of small, explicit steps, taking place one
after the other, just like their counterparts in the pipelines: one click would show
@@ -1288,8 +1303,8 @@
Polo. Prima edizione integrale. Firenze:
Leo S. Olschki.
Birnbaum, David J. 2017.
- Patterns and Antipatterns in XSLT
+ Patterns and Antipatterns in XSLT
Micropipelining. In Proceedings of Balisage: The
Markup Conference 2017. Balisage Series on Markup
Technologies
@@ -1458,10 +1473,12 @@
The Digital Vercelli Book. Beta
version. Accessed October 22, 2021. .
- Rosselli Del Turco, Roberto, et al.
+ Rosselli Del Turco,
+ Roberto, et al.
2013. Edition Visualization Technology.
- Accessed April 19, 2021..
+ Accessed April 19, 2021..
Stella, Francesco, ed. 2020.
Corpus Rhythmorum Musicum. Last modified July
28, 2020.
+ -
+ XProc
+
+ [An XML Pipeline Language]
+
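
The text touched by this diff describes each edition level as a simple linear flow in which the output of one rule is the input of the next, so that producing an edition amounts to replaying a recorded list of rules. The following is a minimal Python sketch of that idea only, not the project's implementation, which uses XSLT transformations coordinated by XProc pipelines over TEI XML. All names (`join_split_words`, `normalize_long_s`, `run_pipeline`, the pipeline lists) are hypothetical, the use of U+2E17 as the split marker is an assumption, and the long-s rule is an invented stand-in for a single-character normalization. Unlike the lossless XSLT rule described in the article, this sketch works on plain strings and discards the original line break.

```python
import re
from functools import reduce
from typing import Callable, List

# A rule takes a transcription and returns a transformed transcription.
Rule = Callable[[str], str]

# Assumption: the double oblique hyphen is encoded as U+2E17 and is
# immediately followed by the line break that splits the word.
SPLIT_MARK = "\u2e17"


def join_split_words(text: str) -> str:
    """Join words split at a line end (lossy plain-string analogue)."""
    return re.sub(re.escape(SPLIT_MARK) + r"\n\s*", "", text)


def normalize_long_s(text: str) -> str:
    """Hypothetical single-character normalization: long s to round s."""
    return text.replace("\u017f", "s")


def run_pipeline(text: str, rules: List[Rule]) -> str:
    """Apply rules in order; each rule's output feeds the next rule."""
    return reduce(lambda acc, rule: rule(acc), rules, text)


# Hypothetical pipelines: one ordered list of rules per edition level.
DIPLOMATIC: List[Rule] = []
INTERPRETATIVE: List[Rule] = [join_split_words, normalize_long_s]


def pipeline_difference(a: List[Rule], b: List[Rule]) -> List[str]:
    """Names of rules present in pipeline b but absent from pipeline a."""
    return [rule.__name__ for rule in b if rule not in a]
```

Because each pipeline in this sketch is plain data (an ordered list of rules), two edition levels can be compared step by step, for example with `pipeline_difference(DIPLOMATIC, INTERPRETATIVE)`, which mirrors the point that formalized workflows make the differences between edition types explicit.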