Published by Sharon Nash. Modified over 8 years ago.
1
http://www.text-technology.de/projects/sekimo.html Sekimo
Solutions mentioned by the TEI
- CONCUR: an optional feature of SGML (not XML) that allows multiple hierarchies to be marked up concurrently in the same document
- milestone elements: empty elements that mark the boundaries between elements in a non-nesting structure
- fragmentation of an item: the division of a single element into two or more parts, each of which nests properly within its context
- virtual joins: the re-creation of a virtual element from fragments of text
- redundant encoding: information encoded in multiple forms
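Fragmentation and virtual joins can be illustrated with a minimal sketch. The sample document, the `join` attribute, and the function name `virtual_join` are assumptions made for this illustration; they are not TEI syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one sentence <s> fragmented across two lines <l>.
# The "join" attribute is an illustrative name, not taken from the TEI.
DOC = (
    '<text>'
    '<l><s join="s1">One sentence that</s></l>'
    '<l><s join="s1"> crosses a line.</s></l>'
    '</text>'
)

def virtual_join(root, tag="s", key="join"):
    """Concatenate the text of all fragments that share the same join key."""
    joined = {}
    for frag in root.iter(tag):  # iter() walks the tree in document order
        joined.setdefault(frag.get(key), []).append(frag.text or "")
    return {k: "".join(parts) for k, parts in joined.items()}

root = ET.fromstring(DOC)
sentences = virtual_join(root)
```

Each fragment nests properly inside its line, and the complete sentence only exists after the join has been interpreted by the application.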
2
Problems with milestones
- milestones are empty elements
- milestone elements have no content
consequences:
- no content model restriction can be stated by a document grammar
- standard SGML/XML editors cannot annotate these regions
- SGML/XML parsers cannot ensure proper nesting of the milestone elements
- processing these regions by means of a style sheet is more difficult (XSLT) or impossible (CSS)
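Because the parser sees milestones only as unrelated empty elements, pairing them up is left to the application. A minimal sketch of such a check, assuming hypothetical milestone elements `sStart` and `sEnd` that are matched by an `id` attribute:

```python
import xml.etree.ElementTree as ET

# Hypothetical milestones: <sStart id="..."/> and <sEnd id="..."/>.
DOC = (
    '<text>'
    '<l><sStart id="s1"/>One sentence that </l>'
    '<l>crosses a line.<sEnd id="s1"/> <sStart id="s2"/>Unclosed</l>'
    '</text>'
)

def events(el):
    """Yield elements and text chunks in document order."""
    yield ("open", el)
    if el.text:
        yield ("text", el.text)
    for child in el:
        yield from events(child)
        if child.tail:
            yield ("text", child.tail)

def milestone_ranges(root):
    """Return (completed ranges by id, ids of unmatched milestones)."""
    open_ranges, done, errors = {}, {}, []
    for kind, value in events(root):
        if kind == "text":
            for pieces in open_ranges.values():
                pieces.append(value)
        elif value.tag == "sStart":
            open_ranges[value.get("id")] = []
        elif value.tag == "sEnd":
            rid = value.get("id")
            if rid in open_ranges:
                done[rid] = "".join(open_ranges.pop(rid))
            else:
                errors.append(rid)  # end milestone without a start
    errors.extend(open_ranges)      # start milestone without an end
    return done, errors

ranges, unmatched = milestone_ranges(ET.fromstring(DOC))
```

The document is well-formed XML even though the range `s2` is never closed, so only application code of this kind can detect the error.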
3
CLIX/Horse milestones
Differing types of milestones …
CLIX
Non-XML: s xyz t
Would be: b xyz t
4
Problems with the other TEI solutions
- CONCUR: (de facto) not implemented (and not part of XML)
- fragmentation of an item: results in 'containers' that hold only a part of the text, e.g. a fragmented sentence or paragraph element would not contain an entire sentence or paragraph, as its name implies
- virtual joins: require a separate interpretation of the SGML document
- redundant encoding: results in multiple files
  - the files are not integrated in a larger unit
  - no single unit contains all the information
5
Stand-off annotation
- new layers of annotation are added by building a new tree whose nodes are SGML elements which do not contain textual content, but links to another layer
- in some respects a generalization of the virtual joins (although not mentioned by the TEI), because not only contents of elements are joined, but also ranges between points within the document
- link base:
  - Distinction 1: markup already contained in an annotation layer vs. text content, addressed by character offsets
  - Distinction 2: one (dedicated) layer as the link target vs. (free) interlinking of several layers
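The character-offset variant (Distinction 1) can be sketched as follows. The sample text, the layer names, and the helpers `Span`, `span_of`, and `resolve` are assumptions for illustration only; they do not reproduce the Sekimo data model.

```python
from dataclasses import dataclass

# The primary text carries no markup at all.
TEXT = "Stand-off markup keeps text and annotation apart."

@dataclass(frozen=True)
class Span:
    layer: str   # name of the annotation layer (hypothetical)
    label: str   # what the span is annotated as
    start: int   # character offset, inclusive
    end: int     # character offset, exclusive

def span_of(layer, label, phrase, text=TEXT):
    """Build a span by locating a phrase, to avoid hand-counted offsets."""
    start = text.index(phrase)
    return Span(layer, label, start, start + len(phrase))

def resolve(span, text=TEXT):
    """Follow the link from an annotation back to the primary text."""
    return text[span.start:span.end]

# Two independent layers whose ranges overlap without nesting.
np_span = span_of("syntax", "NP", "Stand-off markup")
focus_span = span_of("discourse", "focus", "markup keeps text")
```

Since each layer stores only offsets, the two overlapping ranges coexist without either having to nest inside the other.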
6
Advantages of stand-off annotation
- Thompson & McKelvie (1997)
  - the source document might be read-only
  - annotation files can be distributed without distributing the source text
- Michael Glass & Barbara Di Eugenio (2002)
  - discontinuous segments of text can be combined in a single annotation
  - independent parallel coders can produce independent annotations
  - different annotation files can contain different layers of information
- Pianta & Bentivogli (2004)
  - elegance and clarity
  - processing conceptually simple
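The point about discontinuous segments can be made concrete with a small sketch. The sample sentence and the names `locate` and `annotation` are invented for this illustration.

```python
# The source text is treated as read-only; the annotation lives elsewhere.
SOURCE = "She picked the book that he had recommended up."

def locate(text, phrase):
    """Return the (start, end) character offsets of the first occurrence."""
    start = text.index(phrase)
    return (start, start + len(phrase))

# One annotation that combines two non-adjacent segments of the text.
annotation = {
    "label": "phrasal-verb",
    "segments": [locate(SOURCE, "picked"), locate(SOURCE, "up")],
}

parts = [SOURCE[start:end] for start, end in annotation["segments"]]
```

A single embedded element could not cover both segments without also enclosing the words between them.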
7
Drawbacks of stand-off annotation
- new layers require a separate interpretation
- the layers, although separate, depend on each other
- the information, although included, is difficult to access using generic methods
  - standard parsing or editing software cannot be employed
  - standard document grammars can only be used for the level that contains both markup and textual data
- linking at a sub-element range is difficult
- the primary layer should be a (primary) level
8
Non-SGML-based markup languages
- some non-SGML-based markup languages have been proposed, e.g. the Multi-Element Code System (MECS) or TexMECS
  - their major extension with respect to SGML and XML is that overlapping ranges are admitted within documents
- in 2002 the Layered Markup and Annotation Language (LMNL) was proposed (Tennison and Piez 2002)
  - LMNL is a markup language which not only allows overlapping elements to be annotated but also connects the element names to corresponding annotation levels
- LMNL solves both problems, but (full) LMNL is not SGML-based
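What "overlapping ranges" means relative to an SGML/XML tree can be sketched with plain offset pairs. This is a generic illustration, not MECS, TexMECS, or LMNL syntax, and the function name `relation` is an assumption.

```python
def relation(a, b):
    """Classify two half-open ranges (start, end) over the same text."""
    (a0, a1), (b0, b1) = a, b
    if a1 <= b0 or b1 <= a0:
        return "disjoint"   # expressible as sibling elements
    if (a0 <= b0 and b1 <= a1) or (b0 <= a0 and a1 <= b1):
        return "nested"     # expressible as parent and child
    return "overlap"        # not expressible as two elements in one tree

siblings = relation((0, 5), (5, 9))
parent_child = relation((0, 10), (2, 5))
crossing = relation((0, 6), (4, 9))
```

Only the first two cases fit a single hierarchy; the third is the case that range-based languages admit directly.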