
1 VAMANA (Talk 2) (vǎ - mǎ - nǎ): An Efficient XPath Query Engine Exploiting the MASS Index. Venkatesh Raghavan & Prof. Elke Rundensteiner. DSRG Talk, 1st May 2003.

2 Introduction. Purpose of the talk. Generation of the execution tree. Execution: Running Example 1, Running Example 2. XPath expression execution. Cost estimation. Heuristics and transformation.

3 Running Examples. E.g. 1: //name/parent::person/descendant::watch E.g. 2: //name[text() = “Klemens Pelz”]/parent::person Names shown from the sample data: Klemens Pelz, Hayato Cappelletti.

4 Bigger Picture. MASS (A Multi-Axis Storage Structure for Large XML Documents); VAMANA (XPath Query Engine); XQuery Engine (future development). Diagram labels: XPath Expression, XPath Processor, Execution Tree, MASS Interface, Node Set.

5 How many “ROOT(s)” are there? Root of the document: we call it “Document Root”. Root of the expression (//name/parent::person/descendant::watch): we call it “First Location Step”. Root of the execution tree: we call it “ROOT”.

6 XPath Processor: from XPath expression to execution tree. E.g. 2: //name[text() = “Klemens Pelz”]/parent::person Phase I: Parse Tree. Diagram labels: ROOT; CONTEXT; name //; person Parent; PRED; BIPRED =; OPERAND: text child; OPERAND: LITERAL “Klemens Pelz”.

7 Contd. Phase I: Parse Tree. Diagram labels: ROOT; CONTEXT; name //; person Parent; PRED; BIPRED =; OPERAND: text child; OPERAND: LITERAL “Klemens Pelz”. Phase II: Transformed Parse Tree. Diagram labels: PRED; BIPRED =; OPERAND: text child; OPERAND: LITERAL “Klemens Pelz”.

8 Phase III: Execution Tree Generation. Phase II: Transformed Parse Tree. Diagram labels: ROOT; CONTEXT; name //; person Parent; PRED; BIPRED =; OPERAND: text child; OPERAND: LITERAL “Klemens Pelz”. Phase III: VAMANA Execution Tree. Diagram labels: “person” X: Parent; “name” X: //; BI_PREDICATE “EQ”; “” X: child; “Klemens Pelz”.

9 VAMANA Nodes (VNode). Class diagram labels: Node Base, VRootNode, MassNode, VBinaryPredicateNode, VExistPredicateNode, VJoinNode, VLiteralNode.

10 VNode Structure. Diagram labels: Root Node, child, Context Side, Expression Side.

11 VNode Flow Structure. Data-flow style of querying, as used by most commercial relational database systems. The nodes are arranged so that data “flows” from one node to the next in a producer-consumer fashion. Correctness: each node performs some operation on the data that flows through it, and the result is produced by the last node on the dataflow chain. In short: data flows upwards, control flows downwards. Iterative.

12 Contd. Iterative. VAMANA currently executes nodes iteratively, so no copies of the data are made. Is that a problem? MASS produces nodes in document order, so it is not. However, some expressions involve sibling order; handling these is work in progress.
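The iterative, pull-based flow described on the last two slides can be sketched with Python generators. This is only an illustration under stated assumptions: the `Node` class, the toy document, and the step functions are hypothetical stand-ins, not the VAMANA or MASS API, and no duplicate elimination is attempted. The sketch runs Running Example 1 and records in `trace` which step produced a tuple, so the order of production is visible.

```python
class Node:
    """Toy XML element (hypothetical); stands in for a node fetched from MASS."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

def descendants(node):
    """Yield all descendants of node in document order."""
    for child in node.children:
        yield child
        yield from descendants(child)

trace = []  # records which step produced each tuple, in order

def descendant_step(context, test):
    # Control flows down: this step pulls context nodes from its child step.
    for ctx in context:
        for d in descendants(ctx):
            if d.name == test:
                trace.append("descendant::" + test)
                yield d  # data flows up, one node at a time, without copying

def parent_step(context, test):
    for ctx in context:
        p = ctx.parent
        if p is not None and p.name == test:
            trace.append("parent::" + test)
            yield p

def build_document():
    """Small made-up document: two persons, the first with two watches."""
    root = Node("#document")
    people = Node("people", Node("site", root))
    p1 = Node("person", people)
    Node("name", p1)
    watches = Node("watches", p1)
    Node("watch", watches)
    Node("watch", watches)
    p2 = Node("person", people)
    Node("name", p2)
    return root

def build_pipeline(document_root):
    # //name/parent::person/descendant::watch, first location step at the bottom
    names = descendant_step(iter([document_root]), "name")
    persons = parent_step(names, "person")
    return descendant_step(persons, "watch")

pipeline = build_pipeline(build_document())
first = next(pipeline)            # ask the top of the chain for one node
trace_after_first = list(trace)   # name, then person, then watch
rest = list(pipeline)             # drain the remaining results
```

Asking the top step for a single node causes exactly one `name`, one `person`, and one `watch` to be produced, which is the "control down, data up" behaviour the slide summarises.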

13 Execution Tree. E.g. 1: //name/parent::person/descendant::watch Diagram labels: Root Node; Context Side; “watch” X: AXIS_DESCENDANT; “person” X: AXIS_PARENT; “name” X: //.

14 How Do We Execute? Step 1: Set the context node for the root of the expression. In this example the context of the root of the expression is the root of the document. Step 2: Ask the VAMANA Root Node for nodes. Expression: //name/parent::person/descendant::watch

15 Step 1: Setting the context for the “First Location Step”. Diagram labels: “watch” X: AXIS_DESCENDANT; “person” X: AXIS_PARENT; “name” X: //. Expression: //name/parent::person/descendant::watch

16 Legend: INITIAL, FETCHING, OUT OF NODE. Diagram labels: “watch” X: AXIS_DESCENDANT; “person” X: AXIS_PARENT; “name” X: //. Keys shown: b.i.c.c b.i.c b.i.c.m.c b.i.c.c b.i.c.m.c Expression: //name/parent::person/descendant::watch

17 “watch” X: AXIS_DESCENDANT b.i.c “person” X: AXIS_PARENT b.i.c.c “name” X: // b.i.c b.i.c.m.c b.i.c.c b.i.c.m.e Expression: //name/parent::person/descendant::watch

18 “watch” X: AXIS_DESCENDANT b.i.c b.i.c.m.e “person” X: AXIS_PARENT b.i.c.c b.i.i “name” X: // b.i.i.c b.i.i b.i.i.m.c Expression: //name/parent::person/descendant::watch

19 IO Operation. Keys: a.a, a.b, a.c; a.a.a, a.b.a, a.b.b, a.c.a, a.c.a, a.c.b. Expressions: /z, //y. (Please see handout.)

20 Example 2: //name[text() = “Klemens Pelz”]/parent::person Diagram labels: Context Side; Expression Side; “name” X: //; “person” X: AXIS_PARENT; BI_PREDICATE EQ; “” X: AXIS_CHILD; “Klemens Pelz”.

21 “person” X: AXIS_PARENT; BI_PREDICATE EQ; “name” X: //; “” X: AXIS_CHILD; “Klemens Pelz”. Keys and values shown: b.i.e.c b.i.e.c.b Klemens Pelz b.i.e.c b.i.e b.i.e.c Expression: //name[text() = “Klemens Pelz”]/parent::person
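A minimal sketch of how Example 2 might flow through a binary predicate with a context side and an expression side. Everything here is assumed for illustration: the `Node` class, the `person_id` attribute, the toy document, and the function names are hypothetical and not taken from VAMANA. For each `name` node arriving on the context side, the expression side (`child::text()`) is evaluated against that node and compared with the literal; only matching nodes are passed up to the `parent::person` step.

```python
class Node:
    """Toy XML node (hypothetical); text nodes use the name '#text'."""
    def __init__(self, name, parent=None, text=None):
        self.name = name
        self.parent = parent
        self.text = text
        self.children = []
        if parent is not None:
            parent.children.append(self)

def descendants(node):
    for child in node.children:
        yield child
        yield from descendants(child)

def descendant_step(context, test):
    for ctx in context:
        for d in descendants(ctx):
            if d.name == test:
                yield d

def parent_step(context, test):
    for ctx in context:
        if ctx.parent is not None and ctx.parent.name == test:
            yield ctx.parent

def child_text(node):
    """Expression side: child::text() evaluated against one context node."""
    for child in node.children:
        if child.name == "#text":
            yield child.text

def eq_predicate(context, expression, literal):
    """Binary predicate EQ: keep a context node if any expression value equals the literal."""
    for ctx in context:
        if any(value == literal for value in expression(ctx)):
            yield ctx

def build_document():
    root = Node("#document")
    people = Node("people", root)
    for person_id, full_name in [("person0", "Klemens Pelz"),
                                 ("person1", "Hayato Cappelletti")]:
        person = Node("person", people)
        person.person_id = person_id  # made-up identifier for checking results
        Node("#text", Node("name", person), text=full_name)
    return root

def run_example_2(document_root, literal):
    # //name[text() = literal]/parent::person
    names = descendant_step(iter([document_root]), "name")
    filtered = eq_predicate(names, child_text, literal)
    return list(parent_step(filtered, "person"))

matches = run_example_2(build_document(), "Klemens Pelz")
```

With the two names from the running examples in the toy document, only the person whose name text equals the literal is returned.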

22 Determining Selectivity. Count: the exact number of nodes in the MASS storage structure that match the node test. IN: the number of tuples fetched from the child VNode. OUT: the number of tuples produced by the VNode. I_Tuples: the total number of tuples processed up to that VNode, including the current node. Annotation fields per VNode: NodeType, NodeTest, X, Count, IN, OUT, I_Tuples.

23 Example 1: //name/parent::person/emailaddress First node: NodeType: MASS; NodeTest: name; X: //; Count: 482; IN: 482; OUT: 482. Second node: NodeType: MASS; NodeTest: person; X: AXIS_PARENT; Count: 255; IN: 482; OUT: ?

24 Worst Case Costing. Categorize the axes into three divisions. Division 1: child | descendant | descendant-or-self. Diagram: two annotated VNodes, X and Y. Cases: 1. #X > #Y; 2. #Y > #X. In both cases the worst-case estimate is #X.

25 Contd. Division 2: parent, ancestor, ancestor-or-self, following, following-sibling, preceding, preceding-sibling. Diagram: two annotated VNodes, X and Y. Cases: 1. #X > #Y; 2. #Y > #X. In both cases the worst-case estimate is #Y.

26 Contd. Division 3: self. For example: //*/self::X and Y/self::* Diagram: two annotated VNodes, X and Y. Cases: 1. #X > #Y → #Y; 2. #Y > #X → #X.

27 First node: NodeType: MASS; NodeTest: name; X: //; Count: 482; IN: 482; OUT: 482; I_Tuple: 482. Second node: NodeType: MASS; NodeTest: person; X: AXIS_PARENT; Count: 255; IN: 482; OUT: 482; I_Tuple: 737. Third node: NodeType: MASS; NodeTest: watch; X: AXIS_DESCENDANT; Count: 488; IN: 482; OUT: 488; I_Tuple: 1225.
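The annotated figures above can be reproduced with a short sketch of the worst-case rules. Two points are assumptions read off the numbers rather than stated on the slides: #X is taken to be the Count of the node being costed and #Y its IN, and I_Tuples is taken to accumulate each node's Count. The function and field names are hypothetical.

```python
# Axis divisions as listed on the worst-case costing slides;
# "//" is treated here as belonging to Division 1 (assumption).
DIVISION_1 = {"//", "child", "descendant", "descendant-or-self"}
DIVISION_2 = {"parent", "ancestor", "ancestor-or-self", "following",
              "following-sibling", "preceding", "preceding-sibling"}
DIVISION_3 = {"self"}

def worst_case_out(axis, count, tuples_in):
    """Worst-case OUT for one VNode (assumes #X = Count, #Y = IN)."""
    if axis in DIVISION_1:
        return count                    # Division 1: #X
    if axis in DIVISION_2:
        return tuples_in                # Division 2: #Y
    if axis in DIVISION_3:
        return min(count, tuples_in)    # Division 3: the smaller of the two
    raise ValueError("unknown axis: " + axis)

def annotate(steps):
    """steps: (node_test, axis, count) triples, first location step first."""
    rows = []
    tuples_in = None
    i_tuples = 0
    for node_test, axis, count in steps:
        if tuples_in is None:
            tuples_in = count           # first step: IN equals its own Count
        out = worst_case_out(axis, count, tuples_in)
        i_tuples += count               # assumption: I_Tuples accumulates Count
        rows.append({"NodeTest": node_test, "X": axis, "Count": count,
                     "IN": tuples_in, "OUT": out, "I_Tuples": i_tuples})
        tuples_in = out                 # OUT of this node feeds the next node
    return rows

# //name/parent::person/descendant::watch with the Counts from the slide
rows = annotate([("name", "//", 482),
                 ("person", "parent", 255),
                 ("watch", "descendant", 488)])
```

Under these assumptions the sketch yields OUT values 482, 482, 488 and I_Tuples values 482, 737, 1225, matching the slide.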

28 What about Binary Operators? Cost the expression sides with respect to the child. Operators AND | OR | EQ: assume ALL tuples go out. Arithmetic operators: assume ALL tuples go out, because the outcome cannot be predicted before execution.


30 Heuristics. The higher the ratio, the better the selectivity. Generate a multimap. Each optimizable node can then have the rules that apply to it applied. Ratio = IN/OUT. Scaled Ratio = IN/OUT scaled to the range 0..1.
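A small sketch of the ratio computation, using the IN and OUT figures from the annotated example. The slide does not say how the ratio is scaled or what the multimap is keyed on; min-max scaling and a multimap from scaled ratio to node names are assumptions made here purely for illustration.

```python
from collections import defaultdict

def selectivity_ratio(tuples_in, tuples_out):
    """Ratio = IN / OUT (assumes OUT is positive)."""
    if tuples_out <= 0:
        raise ValueError("OUT must be positive")
    return tuples_in / tuples_out

def scale_ratios(ratios):
    """Min-max scale the ratios to 0..1 (the scaling method is an assumption)."""
    lo, hi = min(ratios.values()), max(ratios.values())
    if hi == lo:
        return {name: 0.0 for name in ratios}  # all equal: nothing to rank
    return {name: (r - lo) / (hi - lo) for name, r in ratios.items()}

def build_multimap(scaled):
    """Group node names by scaled ratio; several nodes may share one key."""
    multimap = defaultdict(list)
    for name, value in scaled.items():
        multimap[value].append(name)
    return dict(multimap)

# (IN, OUT) per node, taken from the annotated example
estimates = {"name": (482, 482), "person": (482, 482), "watch": (482, 488)}
ratios = {name: selectivity_ratio(i, o) for name, (i, o) in estimates.items()}
scaled = scale_ratios(ratios)
multimap = build_multimap(scaled)
```

An optimizer could then visit the multimap keys in descending order and try the applicable rules on each node in turn; that ordering is likewise only one possible reading of the slide.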

31 Transformation Rule 1: Binary predicate with text comparison → Value Index. Before (diagram labels): “name” X: //; “person” X: AXIS_PARENT; BI_PREDICATE EQ; “” X: AXIS_CHILD; “Klemens Pelz”. After (diagram labels): “name” X: //; “Klemens Pelz” X: AXIS_VALUE; “Klemens Pelz”; “name” X: AXIS_PARENT.

32 Transformation Rule 2: MASS Node to Join. Expression: //name/parent::person/descendant::watch Before (diagram labels): Root Node; “name” X: //; “watch” X: AXIS_DESCENDANT; “person” X: AXIS_PARENT. After (diagram labels): “name” X: //; “person” X: AXIS_PARENT; “watch” X: AXIS_DESCENDANT; JOIN X: AXIS_DESCENDANT.

33 “*” Removal. Rule: p/descendant::*/child::n ≡ p/descendant::n where p is a path expression. Need for this rule: nodes with “*” as the node test can spoil the cost estimation.

34 “Axis::self” Removal. Rule: p/descendant::*/self::m ≡ p/descendant::m Rule: p/descendant-or-self::*/self::m ≡ p/descendant-or-self::m Need for this rule: a “self” step in combination with “*” or a node test is not necessary.
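The two self-removal rules can be sketched as a rewrite over a list of location steps. The `(axis, node_test)` tuple representation and the function names are hypothetical choices for this illustration; the sketch applies only the two rules stated on the slide and leaves every other step untouched.

```python
def remove_self_steps(steps):
    """Apply p/descendant::*/self::m => p/descendant::m and the
    descendant-or-self variant to a list of (axis, node_test) steps."""
    result = []
    for axis, test in steps:
        previous = result[-1] if result else None
        if (axis == "self" and previous is not None
                and previous[0] in ("descendant", "descendant-or-self")
                and previous[1] == "*"):
            # Fold the self step's node test into the preceding "*" step.
            result[-1] = (previous[0], test)
        else:
            result.append((axis, test))
    return result

def to_xpath(steps):
    """Render steps in unabbreviated XPath syntax."""
    return "/".join(axis + "::" + test for axis, test in steps)

original = [("child", "p"), ("descendant", "*"), ("self", "m")]
rewritten = remove_self_steps(original)
```

Here `to_xpath(original)` is `child::p/descendant::*/self::m` and `to_xpath(rewritten)` is `child::p/descendant::m`, so the “*” step that complicates cost estimation disappears.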

35 Reverse Axes Rules. Rule: p/descendant::n/parent::m ≡ p/descendant-or-self::m[child::n] Rule: p/descendant::n/m ≡ p/descendant::m[parent::n] Rule: /descendant::m/preceding::n ≡ /descendant::n[following::m] From the paper: Symmetry in XPath by Dan Olteanu, Holger Meuss, Tim Furche, Francois Br

36 Predicate Axis Rules. Rule: p/descendant::*[child::n] ≡ p[descendant::n]/descendant::* Predicate Node to Join.

37 Conclusion. Work is in progress in three main areas: a framework for XPath expression execution, selectivity determination, and transformation rules.


39 References
1. James Clark and Steve DeRose. XML Path Language (XPath), http://www.w3.org/TR/xpath, 2002.
2. S. Boag, D. Chamberlin, Mary F. Fernandez, D. Florescu, J. Robie and J. Siméon. XQuery 1.0: An XML Query Language. W3C Working Draft, http://www.w3.org/TR/xquery/, 2002.
3. Kurt W. Deschler and Elke Rundensteiner. MASS: Multi Axis Storage Structure, 2002. Technical Report in progress.
4. T. Milo and D. Suciu. Index Structures for Path Expressions. In Proceedings of the 7th International Conference on Database Theory, 1999, pages 277-295.
5. Flavio Rizzolo and Alberto Mendelzon. Indexing XML Data with ToXin. WebDB, pages 49-54, Santa Barbara, USA, 2001.
6. Q. Li and B. Moon. Indexing and Querying XML Data for Regular Path Expressions. Proceedings of the 27th International Conference on Very Large Data Bases (VLDB 2001), Rome, Italy, September 2001, pages 361-370.
7. XMark - The XML Benchmark project. http://monetdb.cwi.nl/xml/.

