Digital Humanities – Stanford University – 2011-06-20
Expressive power of markup languages and graph structures
Yves Marcoux, Université de Montréal, Canada

1 Expressive power of markup languages and graph structures
Yves Marcoux, Université de Montréal, Canada
Michael Sperberg-McQueen, Black Mesa Technologies
Claus Huitfeldt, University of Bergen, Norway

2 Overview of the talk
1. Problem setting
  – Graph representations of XML documents
  – Need for more complex structures
  – Overlap-only TexMECS
2. Main result and consequence
3. Future work

3 Problem setting

4 Graph representations of structured documents

5 XML document = tree
[Figure: tree with nodes top, a, b, c]
Embedding in markup corresponds to child-parent in tree

6 Any tree corresponds to an XML document
[Figure: tree with nodes top, a, b, c]

7 Any tree corresponds to an XML document
[Figure: tree with nodes top, a, b, c, d, e]
Perfect correspondence!
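The correspondence between trees and XML documents can be sketched as a round trip. This is a minimal illustration using Python's standard `xml.etree.ElementTree`; the node labels come from the slide's figure, but the shape of the tree (which node is whose child) is assumed for the example.

```python
import xml.etree.ElementTree as ET

# A tree as nested (label, children) pairs. Labels follow the figure;
# the parent-child arrangement is an assumption made for illustration.
tree = ("top", [("a", []), ("b", [("d", []), ("e", [])]), ("c", [])])

def to_element(node):
    """Build an Element from a (label, children) pair: child-parent becomes embedding."""
    label, children = node
    element = ET.Element(label)
    for child in children:
        element.append(to_element(child))
    return element

def to_tree(element):
    """Read the tree back from the markup: embedding becomes child-parent."""
    return (element.tag, [to_tree(child) for child in element])

# Serialize the tree as XML text, then parse it back into a tree.
xml_text = ET.tostring(to_element(tree), encoding="unicode")
round_trip = to_tree(ET.fromstring(xml_text))
```

The round trip returns the tree it started from, which is the sense in which the correspondence is perfect for element structure (attributes, text and namespaces are left out of this sketch).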

8 Document Object Models
DOMs are essentially graph representations of structured documents
"Patched" for attributes, namespaces, etc.
DOM manipulations = graph modifications
It suffices to make sure that the graph remains a tree
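The condition "the graph remains a tree" can be checked mechanically after a modification. The sketch below is one possible check, not a DOM API: the graph is assumed to be a plain dictionary mapping each node to the ordered list of its children, and `is_tree` is a hypothetical helper name.

```python
def is_tree(children):
    """Return True if the directed graph is a tree.

    `children` maps each node to the ordered list of its child nodes.
    """
    nodes = set(children) | {c for kids in children.values() for c in kids}
    parent_count = {n: 0 for n in nodes}
    for kids in children.values():
        for c in kids:
            parent_count[c] += 1
    roots = [n for n in nodes if parent_count[n] == 0]
    # A tree has exactly one root and no node with more than one parent.
    if len(roots) != 1 or any(count > 1 for count in parent_count.values()):
        return False
    # With one root and at most one parent per node, the graph is a tree
    # exactly when every node is reachable from the root.
    seen, stack = set(), [roots[0]]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, []))
    return seen == nodes

before = {"top": ["a", "b"], "a": ["c"]}
after_bad_edit = {"top": ["a", "b"], "a": ["c"], "b": ["c"]}  # c gets a second parent
```

Here `is_tree(before)` holds, while `is_tree(after_bad_edit)` does not, because the edit gives `c` two parents.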

9 Need for more complex structures

10 Overlap et al.
In real life (outside of XML documents), information is often not purely hierarchical
Classical examples:
  – verse structure vs sentence structure
  – speech structure vs line structure
  – reordering, discontinuity, etc.
In general: multiple structures applied (at least in part) to same contents

11 Example 1
(Peer) Hvorfor bande? (Åse) Tvi, du tør ej! ¶ Alt ihob er tøv og tant! ¶
[Figure: graph with element nodes vers, peer, åse over the text segments "Hvorfor bande?", "Tvi, du tør ej!", "Alt ihob er tøv og tant!"]
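Overlap in this example can be made concrete by treating each element as a range of text segments. The sketch below assumes a particular reading of the slide (¶ ends a verse, and each speaker label starts a speech that runs until the next label); the names `vers1` and `vers2` are introduced here only to tell the two verses apart.

```python
# Text segments from the example, indexed 0..2.
segments = ["Hvorfor bande?", "Tvi, du tør ej!", "Alt ihob er tøv og tant!"]

# Half-open segment ranges [start, end) for each element. This assignment
# is an assumed reading of the slide, not something the slide states.
spans = {
    "vers1": (0, 2),  # "Hvorfor bande? Tvi, du tør ej!"
    "vers2": (2, 3),  # "Alt ihob er tøv og tant!"
    "peer": (0, 1),   # Peer's speech
    "åse": (1, 3),    # Åse's speech
}

def overlaps(x, y):
    """True if the two ranges share content but neither contains the other."""
    (xs, xe), (ys, ye) = x, y
    shares = xs < ye and ys < xe
    x_in_y = ys <= xs and xe <= ye
    y_in_x = xs <= ys and ye <= xe
    return shares and not x_in_y and not y_in_x

# Every unordered pair of elements that overlaps without nesting.
overlapping_pairs = sorted(
    (a, b) for a in spans for b in spans
    if a < b and overlaps(spans[a], spans[b])
)
```

Under this reading, Åse's speech starts inside the first verse and ends after it, so that pair overlaps, while her speech simply contains the second verse. A single XML element hierarchy cannot hold both the first verse and Åse's speech as ordinary elements.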

12 Example 2 (last verse spoken in chorus by Peer & Åse)
[Figure: graph with element nodes vers, peer, åse over the text segments "Hvorfor bande?", "Tvi, du tør ej!", "Alt ihob er tøv og tant!"]

13 Example 3 (last verse spoken in chorus by Peer & Åse)
[Figure: graph with element nodes vers, peer, åse over the text segments "Hvorfor bande?", "Tvi, du tør ej!", "Alt ihob er tøv og tant!"]

14 Example 4

15 OO-TexMECS

16 TexMECS
A particular proposal to address the overlap problem with overlapping markup+
MECS (Huitfeldt )
  – Multi-element code system
TexMECS (Huitfeldt & Sperberg-McQueen 2003)
  – "Trivially extended MECS"
Markup Languages for Complex Documents (MLCD) project

17 Overlap-only TexMECS
TexMECS allows overlapping markup... but also much more:
  – virtual elements, interrupted elements, etc.
OO-TexMECS 101
  – Start-tags:
  – Overlapping elements allowed
  – Natural notion of well-formedness
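One natural notion of well-formedness for overlap-only markup can be sketched as follows. The sketch makes several assumptions that the slide does not confirm: TexMECS-style start-tags `<name|` and end-tags `|name>`, an end-tag being acceptable whenever an element of that name is currently open (not necessarily the most recently opened one), and a hypothetical sample document built from the Peer Gynt lines.

```python
import re
from collections import Counter

# Assumed tag syntax: start-tag "<name|", end-tag "|name>".
TAG = re.compile(r"<(\w+)\||\|(\w+)>")

def is_well_formed(text):
    """Check that every end-tag closes an open element and nothing stays open.

    Unlike XML, the element being closed need not be the innermost open one.
    """
    open_counts = Counter()
    for match in TAG.finditer(text):
        start_name, end_name = match.groups()
        if start_name is not None:
            open_counts[start_name] += 1
        elif open_counts[end_name] == 0:
            return False  # end-tag with no matching open element
        else:
            open_counts[end_name] -= 1
    return not any(open_counts.values())

# Hypothetical markup: the first verse ends while Åse's speech is still open.
sample = (
    "<doc|<vers|<peer|Hvorfor bande?|peer><åse|Tvi, du tør ej!|vers>"
    "<vers|Alt ihob er tøv og tant!|vers>|åse>|doc>"
)
```

`is_well_formed(sample)` accepts the document even though `|vers>` closes an element while `åse` is still open, which is exactly the case an XML parser would reject. How same-named elements that overlap each other should be paired is left open in this sketch.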

18 OO-TexMECS example
(Peer) Hvorfor bande? (Åse) Tvi, du tør ej! ¶ Alt ihob er tøv og tant! ¶ |doc>

19 Earlier result
In 2008 [M2008], we identified a particular class of graphs that we showed to correspond exactly to OO-TexMECS
  – Those graphs are essentially+ the "restricted GODDAGs" (r-GODDAGs) of [SH2004]
  – All trees are r-GODDAGs
  – Some non-trees are r-GODDAGs too
  – So: OO-TexMECS is more expressive than XML

20 Example 1: r-GODDAG? √
[Figure: graph of Example 1]

21 Example 2: r-GODDAG?
[Figure: graph of Example 2]

22 Example 3: r-GODDAG?
[Figure: graph of Example 3]

23 Example 4: r-GODDAG?

24 However…
That kind of result depends on the class of "possible" graphs
Proof used "noDAGs" (node-ordered)
  – Already fairly restricted class (though not as much as r-GODDAGs)
Would we get the same result with a larger universe of discourse…
  – Arbitrary graphs?

25 Example
[Figure: graphs with nodes labelled A, B, a, b and leaves "A", "B", ""]

26 2. Main result and consequence

27 The result (1/4)
Essentially: r-GODDAGs are really the only graphs you can express with OO-TexMECS

28 The result (2/4)
Universe of discourse: CODGs (child-ordered graphs)
  – finite, directed graphs, otherwise unrestricted
  – can have cycles
  – same child multiple times
  – many "roots"
  – can be disconnected
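The freedoms listed for CODGs can be illustrated with a small data structure. This is only a sketch: the dictionary representation, the node labels and the helper names `roots` and `has_cycle` are assumptions made for the example, not definitions from the talk.

```python
# A child-ordered directed graph: each node maps to the ordered list of
# its children. Every node appears as a key, even if it has no children.
codg = {
    "A": ["b", "a", "b"],  # the same child appears twice
    "B": ["a"],
    "a": ["A"],            # closes the cycle A -> a -> A
    "b": [],
    "x": [],               # isolated node: the graph is disconnected
}

def roots(graph):
    """Nodes that are not the child of any node."""
    has_parent = {c for kids in graph.values() for c in kids}
    return sorted(n for n in graph if n not in has_parent)

def has_cycle(graph):
    """Depth-first search; reaching a node still on the current path means a cycle."""
    state = {}  # node -> "open" while on the DFS path, "done" afterwards

    def visit(node):
        if state.get(node) == "open":
            return True
        if state.get(node) == "done":
            return False
        state[node] = "open"
        found = any(visit(child) for child in graph[node])
        state[node] = "done"
        return found

    return any(visit(node) for node in graph)
```

For this graph, `roots(codg)` finds two roots (`B` and `x`) and `has_cycle(codg)` reports a cycle, so it is a legitimate CODG that is neither a tree nor a DAG.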

29 Example 4: CODG? √

30 The result (3/4)
Proof did not carry over
Defining condition for graphs expressible in OO-TexMECS did not carry over
  – completion-acyclic noDAGs
  – vs full-completion-acyclic CODGs
But essentially:
  – completion-acyclic noDAGs =
  – full-completion-acyclic CODGs = r-GODDAGs

31 The result (4/4)
So, essentially, the graphs expressible in OO-TexMECS are:
  – the completion-acyclic noDAGs =
  – the full-completion-acyclic CODGs =
  – the r-GODDAGs
Consequence: if you need more complex structures than r-GODDAGs, you must extend XML with more than overlap+

32 Future work
Optimal verification algorithm for full-completion-acyclicity
Optimal serialization algorithm for full-completion-acyclic CODGs
Graphs with partially ordered children
Other constructs of TexMECS

33 Thank you! Questions?

