G. Gottlob, C. Koch & R. Pichler TU Wien, Vienna, Austria Elias Politarhos Advanced Databases M.Sc. in Information Systems Athens University of Economics.


1 G. Gottlob, C. Koch & R. Pichler (TU Wien, Vienna, Austria). Elias Politarhos, Advanced Databases, M.Sc. in Information Systems, Athens University of Economics & Business. XML Path Language: Efficient Algorithms for Processing XPath Queries.

2 XPath Queries. Presentation Outline: XPath Overview; XPath Engines Efficiency; Query Evaluation Algorithms; MINCONTEXT Algorithm; Linear-time fragments of XPath; Linear-space fragments of XPath; Conclusions.

3 XPath Overview. Proposed by the W3C; selects nodes from XML (markup language) document trees. Importance: an XML query language and the core of XML technologies such as XSLT (eXtensible Stylesheet Language Transformations), XPointer and XQuery. Implementation approaches: highly inefficient, taking exponential time.

4 XPath Efficiency. Efficiency of XPath engines (query complexity): XT (J. Clark): exponential; XALAN (Apache Foundation): exponential; Saxon (M. Kay): exponential; Internet Explorer 6 (Microsoft): exponential, with quadratic data complexity for simple paths. Slide labels: XSLT processor, Web browser, XPath engine.
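To make the exponential behaviour concrete, here is a minimal sketch of how a naive evaluator can blow up. It is an illustration only, not code from any of the engines listed above. The two-child document, the query family (child::b followed by repeated parent::a/child::b) and all function names are assumptions chosen for this example. The naive evaluator re-evaluates the rest of the query once per reached node without merging duplicates, while the set-based one merges node sets after every step.

```python
class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def build_doc():
    # Hypothetical document <a><b/><b/></a>; returns the element a.
    a = Node("a")
    Node("b", a)
    Node("b", a)
    return a


def step(node, axis, name):
    # Apply one location step (axis::name) to a single context node.
    if axis == "child":
        return [c for c in node.children if c.name == name]
    if axis == "parent":
        p = node.parent
        return [p] if p is not None and p.name == name else []
    raise ValueError(f"unsupported axis: {axis}")


def naive_count(node, steps):
    # Count step evaluations when the remaining steps are re-run
    # for every reached node, with no duplicate elimination.
    if not steps:
        return 0
    axis, name = steps[0]
    work = 1
    for nxt in step(node, axis, name):
        work += naive_count(nxt, steps[1:])
    return work


def set_based_count(start, steps):
    # Count step evaluations when results are merged into a set
    # after every step, so each node is visited once per step.
    current = {start}
    work = 0
    for axis, name in steps:
        work += len(current)
        nxt = set()
        for n in current:
            nxt.update(step(n, axis, name))
        current = nxt
    return work


def query(k):
    # child::b followed by k repetitions of parent::a/child::b
    return [("child", "b")] + [("parent", "a"), ("child", "b")] * k
```

On this document the naive count roughly doubles with each added parent::a/child::b pair, while the set-based count grows by a constant per pair. For example, comparing `naive_count(build_doc(), query(10))` with `set_based_count(build_doc(), query(10))` shows the gap.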

5 Query Evaluation Algorithms (1/4). Context-value table, bottom-up. Example: an XML document with the query descendant::b/following-sibling::*[position()!=last()]. Polynomial complexity: time O(|D|^5 · |Q|^2), space O(|D|^4 · |Q|^2). Stores values in a “data pool”, so nothing is recalculated.

6 Query Evaluation Algorithms (2/4). Bottom-up algorithm. The bottom-up semantics function gives the semantics of expressions; it is used here to find the semantics of the query. First, find the semantics of every leaf and add them to R. Then calculate expressions from the semantics already in R until the root of the query has been reached.

7 Query Evaluation Algorithms (3/4). Consider the document. Let dom = {r, a, b_1, b_2, b_3, b_4}, where b_1, …, b_4 denote the children of a in document order. We want to evaluate the XPath query Q = descendant::b/following-sibling::*[position()!=last()] over the input context. Using the parse tree of Q, the algorithm calculates context-value tables for the leaves E_1, E_3, E_5 and E_6. From E_5 and E_6 it calculates E_4, and from E_3 and E_4 it calculates E_2. Finally, the result (Q) is calculated from E_2 and E_1.
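The content of the final table for Q can be checked with a small sketch. This is not the bottom-up algorithm itself: it evaluates Q directly for each context node and only shows what a context-value table for the root expression would hold, simplified to the context node alone. The tree encoding, the `#root` name for r, and all function names are assumptions made for this illustration.

```python
class Node:
    def __init__(self, label, name, parent=None):
        self.label = label      # display label such as "b1"
        self.name = name        # element name such as "b"
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def build_doc():
    # dom = {r, a, b1, b2, b3, b4}; b1..b4 are children of a.
    r = Node("r", "#root")
    a = Node("a", "a", r)
    for i in range(1, 5):
        Node(f"b{i}", "b", a)
    return r


def descendants(node):
    # Yield all descendants in document order.
    for c in node.children:
        yield c
        yield from descendants(c)


def following_siblings(node):
    # Siblings after the node, in document order.
    if node.parent is None:
        return []
    sibs = node.parent.children
    return sibs[sibs.index(node) + 1:]


def eval_q(context):
    # descendant::b/following-sibling::*[position()!=last()]
    # evaluated for a single context node.
    result = []
    for b in descendants(context):
        if b.name != "b":
            continue
        candidates = following_siblings(b)
        size = len(candidates)           # context size for the predicate
        for pos, n in enumerate(candidates, start=1):
            if pos != size and n not in result:
                result.append(n)
    return result


def context_value_table(root):
    # Map each possible context node to the value of Q in that context.
    dom = [root, *descendants(root)]
    return {x.label: [n.label for n in eval_q(x)] for x in dom}
```

Calling `context_value_table(build_doc())` maps the contexts r and a to `["b2", "b3"]` and every b_i to an empty list, because no b has descendants.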

8 Evaluation of XPath Queries (4/4). Bottom-up: quick but not practical, because it computes irrelevant intermediate results. Top-down approach: vector computation, Op <> (,…, ) = . Polynomial complexity: time O(|D|^4 · |Q|^2), space O(|D|^3 · |Q|^2).

9 MINCONTEXT Algorithm (1/6). Context-value table for the Q parse tree: | … N|^2 entries/node; improved by top-down. The result depends on context information. MINCONTEXT algorithm: keep the context information small by restricting the context to the relevant context.

10 MINCONTEXT Algorithm (2/6). Relevant context. Base cases (N is a leaf node): constant | Boolean: Relev(N) = ∅; position | last: Relev(N) = {‘cp’} | {‘cs’}; location step | function: Relev(N) = {‘cn’}. Compound expressions (N is an inner node): location step: Relev(N) = {‘cn’}; others: Relev(N) = …. Context-value table: | … N|^2 entries/node, with every possible context node calculated. MINCONTEXT: results … N, the set of nodes x_j … N from any previous x_i … N. Polynomial complexity: time O(|D|^4 · |Q|^2), space O(|D|^2 · |Q|^2).
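The relevant-context rules can be sketched as a recursive function over the query parse tree. The base cases and the location-step case follow the slide. The rule for other compound expressions is not legible in the transcript, so the sketch assumes it is the union of the children's relevant contexts; that rule, the `Expr` class and the `kind` labels are assumptions of this example.

```python
from dataclasses import dataclass, field


@dataclass
class Expr:
    # kind is one of: "const", "position", "last", "step", "op"
    kind: str
    children: list = field(default_factory=list)


def relev(n: Expr) -> frozenset:
    # Which parts of the context (cn = node, cp = position, cs = size)
    # the value of the expression can depend on.
    if n.kind == "const":
        return frozenset()            # constants ignore the context
    if n.kind == "position":
        return frozenset({"cp"})      # position() needs the context position
    if n.kind == "last":
        return frozenset({"cs"})      # last() needs the context size
    if n.kind == "step":
        return frozenset({"cn"})      # location steps need only the context node
    # ASSUMPTION: any other compound expression depends on whatever
    # its subexpressions depend on (union over the children).
    out = frozenset()
    for child in n.children:
        out |= relev(child)
    return out


# Predicate position() != last() from the running example query.
predicate = Expr("op", [Expr("position"), Expr("last")])
# A location step carrying that predicate still needs only the context node.
filtered_step = Expr("step", [predicate])
```

Under these assumptions `relev(predicate)` contains 'cp' and 'cs', while `relev(filtered_step)` contains only 'cn', which is why tables for location steps can be indexed by the context node alone.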

11 MINCONTEXT Algorithm (3/6). Eval_out: evaluates the input expression if it is a location path. Input: a node N and a node set X ⊆ dom. Output: the set Y of nodes that can be reached via the expression from any context node x ∈ X. Eval_by: takes a node N in the parse tree and a set X of possible context nodes; it does not return a result value. For every node M in the subtree rooted at N, it computes table(M) if expr(M) does not depend on the context position or size. Eval_single: evaluates XPath expressions for a single context. Input: a node N in the parse tree and a context. Output: the result value for this context. This procedure is called after eval_by has been called for the node N.

12 MINCONTEXT Algorithm (4/6): Example.

13 MINCONTEXT Algorithm (5/6): Example.

14 MINCONTEXT Algorithm (6/6): Example MINCONTEXT.

15 Linear-time fragments of XPath (1/2). Core XPath: a fragment of XPath and its clean logical core. Only sets of nodes; no arithmetical operations; no string operations. Set ops ( , , -, ). Time: O(|D| · |Q|).
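One way to see where a bound of O(|D| · |Q|) can come from is to apply each axis to a whole node set in a single pass over the document. The following is a minimal sketch under stated assumptions, not the algorithm from the slides: the document is encoded as a parent array in document order (so every parent index is smaller than its children's indices), only the child and descendant axes are covered, and all names are hypothetical.

```python
def child_axis(parent, xs):
    # All nodes whose parent is in xs: one pass over the document.
    return {i for i, p in enumerate(parent) if p in xs}


def descendant_axis(parent, xs):
    # All proper descendants of nodes in xs. Because nodes are stored
    # in document order, a parent is always processed before its
    # children, so one forward pass is enough.
    reached = [False] * len(parent)
    for i, p in enumerate(parent):
        if p is not None and (p in xs or reached[p]):
            reached[i] = True
    return {i for i, hit in enumerate(reached) if hit}


AXES = {"child": child_axis, "descendant": descendant_axis}


def evaluate(parent, names, xs, steps):
    # Apply each (axis, name test) step to the whole node set at once.
    # Each step costs one pass over the document, so a path with
    # |Q| steps costs on the order of |D| * |Q|.
    for axis, name in steps:
        xs = AXES[axis](parent, xs)
        xs = {i for i in xs if name == "*" or names[i] == name}
    return xs


# Hypothetical encoding of a root r with child a and four b children:
# index 0 = r, 1 = a, 2..5 = b1..b4.
PARENT = [None, 0, 1, 1, 1, 1]
NAMES = ["#root", "a", "b", "b", "b", "b"]
```

For instance, `evaluate(PARENT, NAMES, {0}, [("descendant", "b")])` selects the four b nodes with a single scan of the parent array, regardless of how many context nodes are in the input set.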

16 Linear-time fragments of XPath (2/2). XPatterns: contained in XPath; extends Core XPath. ID as an axis relation: π_1/id(π_2)/π_3 is rewritten as π_1/π_2/id/π_3, where π denotes a location path. Time: O(|D| · |Q|).

17 Linear-space fragments of XPath. Extended Wadler Fragment: a fragment of XPath with no select, count or sum data functions and no expressions of the form “nodeSet RelOp nodeSet”; in expressions id(id(…(string)…)), the string does not depend on the context. OPTMINCONTEXT: space O(|D| · |Q|^2), time O(|D|^2 · |Q|^2).

18 Conclusions. XPath query evaluation algorithms based on context-value tables (bottom-up, top-down, MINCONTEXT, OPTMINCONTEXT) run in polynomial time. There are fragments of XPath with linear complexity. Query evaluation can be further optimized.

19 dilu 2004

