An Algorithm for Streaming XPath Processing with Forward and Backward Axes Charles Barton, Philippe Charles, Deepak Goyal, Mukund Raghavachari IBM T. J.


1 An Algorithm for Streaming XPath Processing with Forward and Backward Axes
Charles Barton, Philippe Charles, Deepak Goyal, Mukund Raghavachari (IBM T. J. Watson Research Center)
Marcus Fontoura, Vanja Josifovski (IBM Almaden Research Center)
Published at ICDE 2003
Presented by Amir Bar-or, Technion

2 Overview
Background Information
–Evolution of query processing
–XML processing
Example Document Used
Concepts
–X-tree
–X-dag
XAOS
–Algorithm: Filtering Events, Building Matching-Structures, Emitting Output
–Walk through
Experimental results

3 The evolution of query processing
Classical
–Update model: transactional; low to medium update rate; disk-resident data
–Query model: transactional; instant; accurate; static optimizations; index
Publish-subscribe
–Update model: transactional; low to medium update rate; disk-resident data
–Query model: transactional or non-transactional; continuous; accurate; static optimizations; index

4 The evolution of query processing
Streaming
–Update model: non-transactional; high update rate; data is too big to be stored efficiently on disk
–Query model: non-transactional; continuous; approximated; dynamic optimizations; limited buffering
The close relatives of streaming algorithms are the one-pass algorithms.

5 XML processing
DOM approach
–Build in-core representations
–Process as needed through a standard API
–Disadvantages:
   Scalability: cannot process large documents
   Locality: multiple traversals
   Algorithm inefficiencies: APIs perform unnecessary traversals
SAX approach
–Use a streaming, event-based API for on-the-fly parsing of XML
–Disadvantages:
   Programmability: low-level event handling
   Lack of support for XPath (especially with parent/ancestor axes)
Diagram (DOM pipeline): XML parser -> Build DOM tree -> Process DOM tree (XPath, XQuery, ...)
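To make the SAX side of this comparison concrete, here is a minimal sketch using Python's standard `xml.sax` module. The `LevelTracker` class and the tiny input document are illustrative assumptions, not part of the presentation. The handler sees each element once, in document order, and keeps only a depth counter, which is why "go back to the parent or an ancestor" is awkward in this model.

```python
import xml.sax


class LevelTracker(xml.sax.ContentHandler):
    """Hypothetical handler: records (name, id, level) per start event."""

    def __init__(self):
        super().__init__()
        self.level = 0      # depth of the currently open element
        self.next_id = 1    # ids are assigned in document order
        self.seen = []

    def startElement(self, name, attrs):
        self.level += 1
        self.seen.append((name, self.next_id, self.level))
        self.next_id += 1

    def endElement(self, name):
        # The element is closed; nothing about it is retained by default.
        self.level -= 1


handler = LevelTracker()
xml.sax.parseString(b"<X><Y><Z/><U/></Y></X>", handler)
print(handler.seen)
```

No tree is ever built here: memory use depends on nesting depth and on whatever the handler chooses to buffer, not on document size.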

6 XAOS Approach
XAOS (pronounced "chaos"): an acronym for XML Analysis, Optimization, and Stuff.
Diagram labels: XML Doc, XML Parser, Parsing events (SAX, DOM, custom), XPath Expression, Specialized XPath processor (Filter, Match), Results

7 Background Information
Restricted XPath set:
–location path: / step
–predicate: [ ]
–nodetest
–axis specifier: ancestor, parent, child, descendant

8 Example document
Figure: document tree; each node is labeled Nodename (id, level). Nodes in id order:
Root (0,0), X (1,1), Y (2,2), Z (3,3), V (4,4), V (5,4), W (6,4), W (7,5), U (8,3), Y (9,2), Z (10,3), W (11,4)
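The tree shape is lost in the transcript, but it can be recovered from the (id, level) labels under one assumption: that ids are assigned in document (pre-order) order, so that a node's parent is the most recent node one level up. The sketch below rebuilds the parent links and replays the document as start/end events. The names `NODES`, `parents_from_levels`, and `events` are illustrative only.

```python
# (name, id, level) triples from the example slide, in id order.
NODES = [("Root", 0, 0), ("X", 1, 1), ("Y", 2, 2), ("Z", 3, 3),
         ("V", 4, 4), ("V", 5, 4), ("W", 6, 4), ("W", 7, 5),
         ("U", 8, 3), ("Y", 9, 2), ("Z", 10, 3), ("W", 11, 4)]


def parents_from_levels(nodes):
    """Map each id to its parent id, assuming ids follow document order."""
    parent = {}
    open_ids = []  # open_ids[k] is the id of the open element at level k
    for _name, node_id, level in nodes:
        del open_ids[level:]  # close elements at this level or deeper
        parent[node_id] = open_ids[-1] if open_ids else None
        open_ids.append(node_id)
    return parent


def events(nodes):
    """Yield ('start' | 'end', name, id, level) as a streaming parser would."""
    stack = []
    for node in nodes:
        while len(stack) > node[2]:
            yield ("end",) + stack.pop()
        stack.append(node)
        yield ("start",) + node
    while stack:
        yield ("end",) + stack.pop()


PARENT = parents_from_levels(NODES)
print(PARENT)
print(list(events(NODES))[:4])
```

Under this assumption, U (8,3) is a child of Y (2,2), and W (7,5) is a child of W (6,4).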

9 X-Tree
An XPath expression is transformed into a rooted tree, the X-tree
Vertices of an X-tree are called X-nodes
Nodetests in the expression are translated into X-nodes
Each X-node other than Root has a unique incoming edge, labeled with the specified axis
One X-node is marked as the 'Output X-node'
Example: /descendant::Y[child::U]/descendant::W[ancestor::Z/child::V]
Figure: X-tree with edges Root -descendant-> Y, Y -child-> U, Y -descendant-> W, W -ancestor-> Z, Z -child-> V
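The transformation can be sketched as a small recursive-descent parser for the restricted XPath subset (axis::nodetest steps, "/" separators, "[ ]" predicates). This is an illustrative sketch, not the authors' implementation. The function names and the (parent, axis, child) edge encoding are assumptions, and the output X-node is taken to be the last step of the main path.

```python
import re

AXES = {"ancestor", "parent", "child", "descendant"}
TOKEN = re.compile(r"\s*(::|[/\[\]]|[A-Za-z_][\w.-]*)")


def tokenize(expr):
    expr = expr.strip()
    pos, tokens = 0, []
    while pos < len(expr):
        m = TOKEN.match(expr, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_xtree(expr):
    """Return (labels, edges, output); X-node 0 is always Root."""
    tokens = tokenize(expr) + ["<end>"]  # sentinel avoids bounds checks
    labels, edges = ["Root"], []
    pos = 0

    def expect(tok):
        nonlocal pos
        if tokens[pos] != tok:
            raise ValueError(f"expected {tok!r}, found {tokens[pos]!r}")
        pos += 1

    def path(context):
        """Parse step ('/' step)* from context; return the last X-node."""
        nonlocal pos
        while True:
            axis = tokens[pos]
            if axis not in AXES:
                raise ValueError(f"unsupported axis {axis!r}")
            pos += 1
            expect("::")
            labels.append(tokens[pos])  # the nodetest becomes an X-node
            pos += 1
            node = len(labels) - 1
            edges.append((context, axis, node))
            while tokens[pos] == "[":   # predicates hang off this X-node
                pos += 1
                path(node)
                expect("]")
            context = node
            if tokens[pos] != "/":
                return context
            pos += 1

    expect("/")
    output = path(0)
    expect("<end>")
    return labels, edges, output


labels, edges, output = parse_xtree(
    "/descendant::Y[child::U]/descendant::W[ancestor::Z/child::V]")
for src, axis, dst in edges:
    print(f"{labels[src]} -{axis}-> {labels[dst]}")
print("output X-node:", labels[output])
```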

10 X-Dag
The X-dag is generated from the X-tree by rewriting each reverse axis as a forward axis:
Reverse the edge direction
–Ancestor -> Descendant
–Parent -> Child
Handle orphan nodes (X-nodes left without an incoming edge)
–Add a descendant edge from Root to each orphan node
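The two rewriting rules can be sketched directly on an edge list. The encoding below, (source, axis, target) triples over X-node indices with Root at index 0, is an assumption made for illustration. The X-tree edges are those of the running example expression /descendant::Y[child::U]/descendant::W[ancestor::Z/child::V].

```python
LABELS = ["Root", "Y", "U", "W", "Z", "V"]
XTREE = [(0, "descendant", 1), (1, "child", 2), (1, "descendant", 3),
         (3, "ancestor", 4), (4, "child", 5)]

# Each reverse axis and the forward axis that replaces it.
REVERSE = {"ancestor": "descendant", "parent": "child"}


def to_xdag(labels, tree_edges):
    dag = []
    for src, axis, dst in tree_edges:
        if axis in REVERSE:
            # Flip the edge so it points forward in document order.
            dag.append((dst, REVERSE[axis], src))
        else:
            dag.append((src, axis, dst))
    # Flipping can leave an X-node with no incoming edge (an orphan);
    # connect each orphan to Root with a descendant edge.
    has_incoming = {dst for _src, _axis, dst in dag}
    for node in range(1, len(labels)):
        if node not in has_incoming:
            dag.append((0, "descendant", node))
    return dag


XDAG = to_xdag(LABELS, XTREE)
for src, axis, dst in XDAG:
    print(f"{LABELS[src]} -{axis}-> {LABELS[dst]}")
```

In the example, flipping W -ancestor-> Z leaves Z without an incoming edge, so Root -descendant-> Z is added, and W ends up with two incoming edges (from Y and from Z), which is why the result is a dag rather than a tree.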

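11 /descendant::Y[child::U]/descendant::W[ancestor::Z/child::V]
Figure: the X-tree and the X-dag for this expression
X-tree edges: Root -descendant-> Y, Y -child-> U, Y -descendant-> W, W -ancestor-> Z, Z -child-> V
X-dag edges: Root -descendant-> Y, Root -descendant-> Z, Y -child-> U, Y -descendant-> W, Z -descendant-> W, Z -child-> V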

12 Matching
A matching for an X-tree X is a partial mapping from the X-nodes to the elements of a document D where
–All mapped vertices satisfy the node test
–The edge between two mapped vertices describes the relationship between the mapped elements in the document
A total matching exists if all the nodes of the X-tree are mapped.
It is easy to show that an element e is in the result of evaluating an XPath expression iff there is a total matching for the corresponding X-tree that maps the output X-node to e. The same holds for an X-dag.
A total matching at an X-tree node v is composed of total matchings at each of the children of v. This is not true for an X-dag node.
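The definition can be checked by brute force on the example document. The sketch below is not the streaming XAOS algorithm: it holds the whole document in memory and only illustrates what a total matching is. The parent links assume that the ids on the example slide are in document order. All names are illustrative.

```python
# Example document: element id -> parent id, and id -> element name.
PARENT = {0: None, 1: 0, 2: 1, 3: 2, 4: 3, 5: 3,
          6: 3, 7: 6, 8: 2, 9: 1, 10: 9, 11: 10}
NAME = {0: "Root", 1: "X", 2: "Y", 3: "Z", 4: "V", 5: "V",
        6: "W", 7: "W", 8: "U", 9: "Y", 10: "Z", 11: "W"}

# X-tree for /descendant::Y[child::U]/descendant::W[ancestor::Z/child::V]
LABELS = ["Root", "Y", "U", "W", "Z", "V"]
CHILDREN = {0: [("descendant", 1)], 1: [("child", 2), ("descendant", 3)],
            3: [("ancestor", 4)], 4: [("child", 5)]}
OUTPUT = 3  # the W X-node


def ancestors(e):
    out = []
    while PARENT[e] is not None:
        e = PARENT[e]
        out.append(e)
    return out


def related(axis, e, f):
    """True if element f is reached from element e along the axis."""
    if axis == "child":
        return PARENT[f] == e
    if axis == "parent":
        return PARENT[e] == f
    if axis == "descendant":
        return e in ancestors(f)
    if axis == "ancestor":
        return f in ancestors(e)
    raise ValueError(axis)


def subtree_matches(v):
    """Elements at which the X-tree rooted at v has a total matching:
    the node test holds and every child of v matches a related element."""
    kids = [(axis, subtree_matches(c)) for axis, c in CHILDREN.get(v, [])]
    return {e for e in NAME if NAME[e] == LABELS[v]
            and all(any(related(axis, e, f) for f in m) for axis, m in kids)}


def evaluate():
    """Ids of elements that some total matching assigns to the output X-node."""
    reach = {0: subtree_matches(0)}
    todo = [0]
    while todo:
        v = todo.pop()
        for axis, c in CHILDREN.get(v, []):
            reach[c] = {f for f in subtree_matches(c)
                        if any(related(axis, e, f) for e in reach[v])}
            todo.append(c)
    return sorted(reach[OUTPUT])


print(evaluate())
```

The bottom-up step in `subtree_matches` relies on the property stated above for X-trees: a total matching at v is composed of total matchings at each child of v. The same shortcut would not be valid on an X-dag, where an X-node such as W has two parents whose choices must agree.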

13 /descendant::Y[child::U]/descendant::W[ancestor::Z/child::V]
Figure: the X-tree (with the ancestor edge from W to Z) and the X-dag (with descendant edges from Root to Z and from Z to W) for this expression

14 XAOS properties
Streaming
–Update model: non-transactional; high update rate; data is too big to be stored efficiently on disk
–Query model: non-transactional; continuous; approximated; dynamic optimizations; limited buffering

