merge
fernandaalvaf committed Feb 14, 2024
2 parents 2b04666 + ffccb38 commit ba9b707
Showing 10 changed files with 582 additions and 384 deletions.
97 changes: 53 additions & 44 deletions data/JTEI/14_2021-23/jtei-bleeker-et-al-199-source.xml
@@ -377,29 +377,30 @@
nonlinear objects, modeled as a GODDAG data structure (<ref target="#huit2003"
type="bibl">Huitfeldt and Sperberg-McQueen 2003</ref>). In GODDAG, all
children of the markup nodes are typically ordered, but TexMECS provides a
notation to mark certain markup nodes as unordered. The GODDAG processor
ignores the default linear order of these elements’ children, and therefore
TexMECS supports the representation of nonlinear structures. No known working
implementation of TexMECS, however, is currently available. At first glance,
EARMARK (Extremely Annotated RDF Markup) also seems to support the option to
represent nonlinearity: with EARMARK, users can express different linear
structures using RDF statements about text fragments, and in this way it is
possible to describe multiple text orders (<ref type="bibl" target="#per2009"
>Peroni and Vitali 2009</ref>, 4.1; <ref type="bibl" target="#iorio2009">Di
Iorio 2009</ref>). However, multi-orderedness is not the same as partial
orderedness: if a text is partially ordered, it means that (part of the) text
has no order. Multi-orderedness always implies a certain order. The EARMARK
specification as described in <ref type="bibl" target="#per2009">Peroni and
Vitali 2009</ref> does not natively support partially ordered text, in the
sense that EARMARK users cannot mark the branching of the text stream. It is
also important to note that EARMARK is a metamarkup language, which means that
users encode their texts not in EARMARK but in an RDF
serialization.<note>Recognizing the challenge of expressing literary texts
as RDF statements, <ref target="#bara2012" type="bibl" xml:id="quoteref8"
>Barabucci et al.</ref> developed the FRETTA approach, which is designed
<quote source="#quoteref8">to express EARMARK annotations in an embedded
syntax such as XML</quote>. It is unclear, however, whether this approach
has been further developed or implemented.</note></p>
notation to mark certain markup nodes as unordered. The <ptr type="software"
xml:id="R5" target="#goddag"/><rs type="soft.name" ref="#R5">GODDAG
processor</rs> ignores the default linear order of these elements’ children,
and therefore TexMECS supports the representation of nonlinear structures. No
known working implementation of TexMECS, however, is currently available. At
first glance, EARMARK (Extremely Annotated RDF Markup) also seems to support
the option to represent nonlinearity: with EARMARK, users can express different
linear structures using RDF statements about text fragments, and in this way it
is possible to describe multiple text orders (<ref type="bibl"
target="#per2009">Peroni and Vitali 2009</ref>, 4.1; <ref type="bibl"
target="#iorio2009">Di Iorio 2009</ref>). However, multi-orderedness is not
the same as partial orderedness: if a text is partially ordered, it means that
(part of the) text has no order. Multi-orderedness always implies a certain
order. The EARMARK specification as described in <ref type="bibl"
target="#per2009">Peroni and Vitali 2009</ref> does not natively support
partially ordered text, in the sense that EARMARK users cannot mark the
branching of the text stream. It is also important to note that EARMARK is a
metamarkup language, which means that users encode their texts not in EARMARK
but in an RDF serialization.<note>Recognizing the challenge of expressing
literary texts as RDF statements, <ref target="#bara2012" type="bibl"
xml:id="quoteref8">Barabucci et al.</ref> developed the FRETTA approach,
which is designed <quote source="#quoteref8">to express EARMARK annotations
in an embedded syntax such as XML</quote>. It is unclear, however,
whether this approach has been further developed or implemented.</note></p>
</div>
<div xml:id="discontinuity2">
<head>Discontinuity</head>
@@ -543,9 +544,11 @@
<p>TAGML may resemble existing markup languages like XML, TexMECS, or LMNL, but
TAGML is more expressive. For instance, in XML all annotation values are of
type string, but TAGML offers data-typing of annotations. These data types are
expressed in UTF-8 and interpreted by the TAGML parser as different data types.
Encoders can distinguish between integer, string, or Boolean values (<ptr
target="#tagml1" type="crossref"/>). <figure xml:id="tagml1">
expressed in UTF-8 and interpreted by the <ptr type="software" xml:id="R6"
target="#tagmlparser"/><rs type="soft.name" ref="#R6">TAGML parser</rs> as
different data types. Encoders can distinguish between integer, string, or
Boolean values (<ptr target="#tagml1" type="crossref"/>). <figure
xml:id="tagml1">
<graphic url="img/tagml1.png" width="1852px" height="70px"/>
<head type="legend">Example of TAGML, featuring different types of
annotation value.</head>
@@ -572,11 +575,17 @@
encoding complex textual features, TAGML is designed to make that modeling
process as natural as possible. The markup language has the same compactness as
XML and is independent of the user environment.<note>TAGML can be edited in any
editor, but the open source text editor Sublime has <ref
target="https://huygensing.github.io/tagml-sublime-syntax/"> a TAGML
syntax highlighting package</ref>, and the <ref
target="https://huygensing.github.io/alexandria/">reference
implementation Alexandria</ref> can be used to parse and validate TAGML
editor, but the open source text editor <ptr type="software" xml:id="R7"
target="#sublime"/><rs type="soft.name" ref="#R7">Sublime</rs> has <ptr
type="software" xml:id="R8" target="#sublimepackage"/><rs type="soft.url"
ref="#R8"><ref
target="https://huygensing.github.io/tagml-sublime-syntax/"> a <rs
type="soft.name" ref="#R8">TAGML syntax highlighting
package</rs></ref></rs>, and the <ptr type="software" xml:id="R9"
target="#alexandria"/><rs type="soft.url" ref="#R9"><ref
target="https://huygensing.github.io/alexandria/">reference
implementation <rs type="soft.name" ref="#R9"
>Alexandria</rs></ref></rs> can be used to parse and validate TAGML
documents and store them as a TAG hypergraph.</note> Following the argument
of <ref type="bibl" target="#sper2008">Sperberg-McQueen and Huitfeldt</ref> and
<ref target="#per2009" type="bibl">Peroni and Vitali</ref>, we did not
@@ -1048,8 +1057,8 @@
retrieve all quotes together. The first would not pose a problem for TEI XML, but
retrieving the disjointed quotations as one (merged) utterance would only be possible
with additional, vocabulary-specific coding. Processing the two <gi>q</gi> elements
as a single <gi>q</gi> requires a set of <ptr type="software" xml:id="XSLT"
target="#XSLT"/><rs type="soft.name" ref="#XSLT">XSLT</rs> instructions that check
as a single <gi>q</gi> requires a set of <ptr type="software" xml:id="R1"
target="#XSLT"/><rs type="soft.name" ref="#R1">XSLT</rs> instructions that check
the values of the <att>xml:id</att> and the <att>next</att> and <att>prev</att>
attributes in order to know which <gi>q</gi> elements should be stitched together. In
TAGML, both scenarios would be equally straightforward. The hypergraph can be queried
@@ -1091,22 +1100,22 @@
<head type="legend">TEI transcription of <ptr target="#discont4" type="crossref"
/></head>
</figure> To process the text of this fragment correctly, one needs to write a rather
complicated set of <ptr type="software" xml:id="XSLT" target="#XSLT"/><rs
type="soft.name" ref="#XSLT">XSLT</rs> instructions. At the very least, these
complicated set of <ptr type="software" xml:id="R2" target="#XSLT"/><rs
type="soft.name" ref="#R2">XSLT</rs> instructions. At the very least, these
instructions need to match the values of the <att>xml:id</att> and <att>prev</att> in
order to process the first part of the deletion, look for the second part of the
deletion, and then concatenate their textual content. At the same time, one has to
prevent the second part from being processed twice (first as the second part of the
deletion, and the second time together with the regular <gi>del</gi> elements). After
some experimenting and consulting several <ptr type="software" xml:id="XSLT"
target="#XSLT"/><rs type="soft.name" ref="#XSLT">XSLT</rs> specialists, we have
come to no less than three different sets of instructions.<note>The authors are
grateful to Peter Boot, Vincent Neyt, and Frederike Neuber for sharing their
expertise and invaluable insights.</note> And considering the ingenuity and
technical expertise of the TEI community, we are quite certain there are even more
ways. In short, it can be a challenging and time-consuming process to write and tweak
vocabulary-specific and schema-aware tools—a daunting task for any TEI XML user who
lacks a certain level of technical expertise. </p>
some experimenting and consulting several <ptr type="software" xml:id="R3"
target="#XSLT"/><rs type="soft.name" ref="#R3">XSLT</rs> specialists, we have come
to no less than three different sets of instructions.<note>The authors are grateful
to Peter Boot, Vincent Neyt, and Frederike Neuber for sharing their expertise and
invaluable insights.</note> And considering the ingenuity and technical expertise
of the TEI community, we are quite certain there are even more ways. In short, it can
be a challenging and time-consuming process to write and tweak vocabulary-specific
and schema-aware tools—a daunting task for any TEI XML user who lacks a certain level
of technical expertise. </p>
</div>
<div xml:id="conclusion">
<head>Conclusion</head>
16 changes: 8 additions & 8 deletions evaluation/csv/citation-types-frequencies.csv
@@ -1,8 +1,8 @@
Citation type,/2015/ abs. frequency (n=27),/2015/ rel. frequency (in %),/2016/ abs. frequency (n=45),/2016/ rel. frequency (in %),/2017/ abs. frequency (n=0),/2017/ rel. frequency (in %),/2018/ abs. frequency (n=74),/2018/ rel. frequency (in %),/2019/ abs. frequency (n=67),/2019/ rel. frequency (in %),/2020/ abs. frequency (n=53),/2020/ rel. frequency (in %),ALL / abs. frequency (n=266),ALL / rel. frequency (in %)
Bib.Soft,1,3.7,0,.0,0,0,11,14.86,3,4.48,1,1.89,16,6.02
Bib.Ref,5,18.52,14,31.11,0,0,11,14.86,17,25.37,15,28.3,62,23.31
Name.Only,19,70.37,31,68.89,0,0,48,64.86,52,77.61,37,69.81,187,70.3
Agent,1,3.7,0,.0,0,0,5,6.76,1,1.49,4,7.55,11,4.14
URL,5,18.52,2,4.44,0,0,22,29.73,13,19.4,10,18.87,52,19.55
PID,0,.0,0,.0,0,0,0,.0,1,1.49,0,.0,1,.38
Ver,1,3.7,0,.0,0,0,6,8.11,2,2.99,0,.0,9,3.38
Citation type, abs. frequency (n=119), rel. frequency (in %),ALL / abs. frequency (n=119),ALL / rel. frequency (in %)
Soft.Bib,1,.84,1,.84
Soft.Bib.Ref,0,.0,0,.0
Soft.Name,119,100.0,119,100.0
Soft.Agent,0,.0,0,.0
Soft.URL,9,7.56,9,7.56
Soft.PID,0,.0,0,.0
Soft.Ver,0,.0,0,.0
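
Note on the percentages: the "rel. frequency (in %)" values in this CSV appear to be each absolute count divided by n, times 100, rounded to two decimals (for example 9 / 119 gives 7.56). The sketch below reproduces three of the new rows under that assumption. The function name `rel_frequency` and the zero-handling for an empty sample (as in the old /2017/ column with n=0) are illustrative choices, not code taken from the repository.

```python
def rel_frequency(abs_count: int, n: int) -> float:
    """Relative frequency in percent, rounded to two decimals.

    Assumption: an empty sample (n == 0) is reported as 0 rather than
    raising an error, mirroring the zeros in the old /2017/ column.
    """
    if n <= 0:
        return 0.0
    return round(abs_count / n * 100, 2)


# Absolute counts copied from three rows of the new CSV (n=119).
N_TOTAL = 119
abs_counts = {"Soft.Bib": 1, "Soft.Name": 119, "Soft.URL": 9}

# Recompute the relative-frequency column for those rows.
table = {label: rel_frequency(count, N_TOTAL) for label, count in abs_counts.items()}

for label, pct in table.items():
    print(f"{label},{abs_counts[label]},{pct}")
```

The recomputed values can be compared with the committed CSV to check that the evaluation output is internally consistent.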