1 http://www.text-technology.de/projects/sekimo.html Sekimo
Solutions mentioned by the TEI
- CONCUR: an optional feature of SGML (not XML) that allows multiple hierarchies to be marked up concurrently in the same document
- milestone elements: empty elements that mark the boundaries between elements in a non-nesting structure
- fragmentation of an item: the division of a single element into two or more parts, each of which nests properly within its context
- virtual joins: the re-creation of a virtual element from fragments of text
- redundant encoding: information encoded in multiple forms

2 Problems with milestones
- milestones are empty elements
- milestone elements have no content
- consequences:
  - no content model restriction can be stated by a document grammar
  - standard SGML/XML editors cannot annotate these regions
  - SGML/XML parsers cannot ensure proper nesting of the milestone elements
  - processing these regions by means of a style sheet is
    - more difficult (XSLT) or
    - impossible (CSS)
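The point that a parser cannot ensure proper pairing of milestones can be illustrated with a minimal Python sketch. The element names `sStart` and `sEnd`, the attribute `n`, and the sample document are assumptions made up for this illustration; they are not taken from the slides or from any particular encoding scheme. The document below is well-formed XML, so parsing succeeds, and the unclosed region is only found by an extra, hand-written check.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: empty sStart/sEnd milestones mark sentence boundaries
# that cross the <l> (line) hierarchy. Sentence 2 is never closed.
DOC = """<text>
<l><sStart n="1"/>A sentence that begins here</l>
<l>and ends here.<sEnd n="1"/><sStart n="2"/>A second one</l>
<l>that is never closed.</l>
</text>"""


def unpaired_milestones(xml_text, start="sStart", end="sEnd", key="n"):
    """Return (ids opened but never closed, ids closed but never opened)."""
    # Parsing succeeds: to the parser the milestones are just empty elements.
    root = ET.fromstring(xml_text)
    open_ids, stray_ends = [], []
    for elem in root.iter():  # iter() walks the tree in document order
        if elem.tag == start:
            open_ids.append(elem.get(key))
        elif elem.tag == end:
            if elem.get(key) in open_ids:
                open_ids.remove(elem.get(key))
            else:
                stray_ends.append(elem.get(key))
    return open_ids, stray_ends


unclosed, stray = unpaired_milestones(DOC)
print("unclosed:", unclosed, "stray ends:", stray)
```

The check lives entirely outside the document grammar, which is the drawback the slide describes: the pairing constraint has to be enforced by application code.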

3 CLIX/Horse-milestones
- Differing types of milestones …...
- CLIX Non-XML: s xyz t Would be : b xyz t

4 Problems with the other TEI solutions
- CONCUR:
  - (de facto) not implemented (and not part of XML)
- fragmentation of an item:
  - results in 'containers' that hold only part of the text; e.g. a fragmented sentence or paragraph element would not contain an entire sentence or paragraph, as its name implies
- virtual joins:
  - require a separate interpretation of the SGML document
- redundant encoding:
  - results in multiple files
  - the files are not integrated in a larger unit
  - there is no single unit containing all the information
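The fragmentation and virtual-join problems can be sketched together in Python. The element name `seg`, the attribute `sid`, and the sample text are hypothetical choices for this illustration, not the TEI mechanism itself. Each `seg` element nests properly but holds only part of a sentence, and recovering whole sentences takes a separate interpretation step on top of ordinary parsing.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical markup: sentence s1 crosses a line boundary, so it is split
# into two properly nested fragments that share the same sid.
DOC = """<text>
<l><seg sid="s1">The sentence starts in line one</seg></l>
<l><seg sid="s1">and finishes in line two.</seg> <seg sid="s2">Short one.</seg></l>
</text>"""


def virtual_join(xml_text):
    """Rebuild one virtual sentence per sid from its fragments."""
    root = ET.fromstring(xml_text)
    parts = defaultdict(list)
    for seg in root.iter("seg"):  # fragments arrive in document order
        parts[seg.get("sid")].append(seg.text or "")
    return {sid: " ".join(chunks) for sid, chunks in parts.items()}


fragments = len(ET.fromstring(DOC).findall(".//seg"))
sentences = virtual_join(DOC)
print(fragments, "fragments,", len(sentences), "sentences")
```

The parser sees three `seg` containers, none of which is guaranteed to be a complete sentence; only the join step yields the two sentences.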

5 Stand-off annotation
- new layers of annotation are added by building a new tree whose nodes are SGML elements that contain no textual content, only links to another layer
- in some respects a generalization of virtual joins (although not mentioned by the TEI), because
  - not only the contents of elements are joined, but also ranges between points within the document
- link base:
  - Distinction 1: markup already contained in an annotation layer vs. text content, addressed by character offsets
  - Distinction 2: one (dedicated) layer as the link target vs. (free) interlinking of several layers
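The character-offset variant of stand-off annotation can be sketched in a few lines of Python. The `Span` class, the layer names, and the sample text are assumptions introduced for this illustration and do not reproduce the Sekimo data model. The primary text is never modified; each layer only stores offsets into it, so spans from different layers may overlap freely.

```python
from dataclasses import dataclass

# Primary data: plain text that the annotations never touch.
TEXT = "to be or not to be"


@dataclass(frozen=True)
class Span:
    layer: str  # name of the annotation layer (hypothetical)
    label: str
    start: int  # inclusive character offset into TEXT
    end: int    # exclusive character offset into TEXT


def covered_text(span, text=TEXT):
    """Resolve a span against the primary text."""
    return text[span.start:span.end]


def overlap(a, b):
    """Return the shared (start, end) range of two spans, or None."""
    start, end = max(a.start, b.start), min(a.end, b.end)
    return (start, end) if start < end else None


phrase = Span("syntax", "phrase", 0, 8)  # "to be or"
verse = Span("verse", "unit", 6, 18)     # "or not to be"
shared = overlap(phrase, verse)
print(covered_text(phrase), "|", covered_text(verse), "|", shared)
```

Neither span contains the other, so the two could not both be encoded as nested elements in one XML tree; as offset pairs in separate layers they coexist without conflict.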

6 Advantages of stand-off annotation
- Thompson & McKelvie (1997)
  - the source document might be read-only
  - annotation files can be distributed without distributing the source text
- Michael Glass & Barbara Di Eugenio (2002)
  - discontinuous segments of text can be combined in a single annotation
  - independent parallel coders can produce independent annotations
  - different annotation files can contain different layers of information
- Pianta & Bentivogli (2004)
  - elegance and clarity
  - processing is conceptually simple

7 Drawbacks of stand-off annotation
- new layers require a separate interpretation
- the layers, although separate, depend on each other
- the information, although included, is difficult to access using generic methods
- standard parsing or editing software cannot be employed
- standard document grammars can only be used for the level containing both markup and textual data
- linking at a sub-element range is difficult
- the primary layer should be a (primary) level

8 Non-SGML-based Markup Languages
- some non-SGML-based markup languages have been proposed, e.g. the Multi-Element Code System (MECS) or TexMECS
- their major extension with respect to SGML and XML is that overlapping ranges are admitted within documents
- in 2002 the Layered Markup and Annotation Language (LMNL) was proposed (Tennison and Piez 2002)
- LMNL is a markup language that allows not only overlapping elements to be annotated but also element names to be connected to corresponding annotation levels
- LMNL solves both problems, but
  - (full) LMNL is not SGML-based

