2005rel-xml-iii: View forests and query composition


1 2005rel-xml-iii1 View forests and query composition
The composition algorithm works for a (large) subset of XQuery, excluding (see paper for details):
  User-defined recursive functions
  Order-dependent features (before, after, …)
XQuery contains many redundant/complex features; every query can be translated (and simplified) into an XQuery Core expression (syntax on p. 20). This translation is normalization.
The first step is type checking (e.g., data is applied only where the element is atomic-valued), then normalization; then the composition algorithm is applied.

2 2005rel-xml-iii2 Normalization (as defined in the XQuery formal semantics):
  Rewriting XML elements/attributes into internal form
  Breaking for/let clauses so that each defines just one variable
  Replacing where e1 return e2 by if (e1) then e2 else ()
  Applying data to element/attribute operands of expressions that require atomic arguments
SilkRoute adds some more:
  Long path expressions are broken into one-step expressions
  Each one-step expression binds a new variable
  … (see paper)

3 2005rel-xml-iii3 The original public query (fig. 6)

4 2005rel-xml-iii4 Upper and bottom parts of the normalized public query (fig. 14, p. 21)

5 2005rel-xml-iii5 In the algorithm, a node is a triple:
  QName
  Forest (of children)
  SQL (an SQL fragment)
A node is constructed by elementNode(QName, F, S) or attributeNode(QName, F, S).
An SQL fragment is a triple (From, Where, Select); each component may be empty.
The function joinSQL (written sqlJoin on later slides) concatenates several fragments into one.
The algorithm is recursive and works top-down on the given query.
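
The two triples above can be sketched in Python as follows. This is only an illustration under stated assumptions: the class and field names (SQLFragment, Node, from_, kind) are invented here, and concatenation is taken to be component-wise, which the slide does not spell out.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class SQLFragment:
    """A (From, Where, Select) triple; each component may be empty."""
    from_: Tuple[str, ...] = ()
    where: Tuple[str, ...] = ()   # conjuncts, implicitly joined by AND
    select: Tuple[str, ...] = ()


def sql_join(*fragments: SQLFragment) -> SQLFragment:
    """Concatenate several fragments into one (assumed: component-wise)."""
    return SQLFragment(
        from_=tuple(x for f in fragments for x in f.from_),
        where=tuple(x for f in fragments for x in f.where),
        select=tuple(x for f in fragments for x in f.select),
    )


@dataclass
class Node:
    """A view-tree node: QName, forest of children, SQL fragment."""
    kind: str                     # "element" or "attribute" (hypothetical field)
    qname: str
    children: List["Node"] = field(default_factory=list)
    sql: SQLFragment = SQLFragment()


def element_node(qname: str, forest: List[Node], sql: SQLFragment) -> Node:
    return Node("element", qname, list(forest), sql)


def attribute_node(qname: str, forest: List[Node], sql: SQLFragment) -> Node:
    return Node("attribute", qname, list(forest), sql)


# A node carrying "From: Clothing c", joined with a selection condition.
tuple_node = element_node("Tuple", [], SQLFragment(from_=("Clothing c",)))
joined = sql_join(tuple_node.sql, SQLFragment(where=("c.on-sale",)))
```

Keeping Where as a list of conjuncts avoids string surgery with "and" when fragments are joined; the text of the final query can be produced at the very end.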

6 2005rel-xml-iii6 We illustrate with a simple example: the relational db has one relation, Clothing:
  price   on-sale   item
  99      true      coat
  38      false     shirt
  45      false     skirt

7 2005rel-xml-iii7 Here is a (simplified) canonical view tree. Omitted SQL fragments are empty. We assume $CV is bound to N1.
  N1
    N1.1  Tuple  From: Clothing c
      N1.1.1  item
        N1.1.1.1  string  Select: c.item
      N1.1.2  on-sale
        N1.1.2.1  bool  Select: c.on-sale
      N1.1.3  price
        N1.1.3.1  int  Select: c.price

8 2005rel-xml-iii8 Here is our query:
  element view {
    for $t in $CV/child::Tuple return
    for $s in $t/child::on-sale return
    if data($s) then
      element product {
        element name {
          for $item in $t/child::item return data($item)
        }
      }
    else ()
  }
Changes: Tuple → product, only the item field is output (projection), and on-sale is used for a selection.

9 2005rel-xml-iii9 Here is the expected view tree of the composition:
  P1  view
    P1.1  product  From: Clothing c  Where: c.on-sale
      P1.1.1  name
        P1.1.1.1  string  Select: c.item
Q: How is the SQL fragment for P1.1 generated?
A: When the binding for $t is found, the SQL fragment From: Clothing c is collected; when the if is processed, the Where: c.on-sale is collected; when element product is encountered, the SQL is output.
The algorithm has three parameters:
  Env: bindings of variables seen so far to view trees/forests
  Expr: the expression to be processed
  S: an SQL fragment (collected on the way down)

10 2005rel-xml-iii10 The algorithm vfca(Env, Expr, S) is recursive, functional, and top-down; it returns a view forest.
Denote by Qi the query with the first i rows deleted.
vfca is initially called with Env0 = {$CV → N1}, Expr0 = Q0, S0 = ().
Processing the 1st line, element view {:
  let vf = vfca(Env0, Q1, ())
  in elementNode(QName, vf, S)
A node P1, with label view and empty SQL, is generated; its children are whatever the recursive call returns.

11 2005rel-xml-iii11 vfca(Env1 (= Env0), Q1, S1 (= S0)):
Processing the 2nd line, for $t in $CV/child::Tuple return:
  let vf = binding obtained for $t (= N1.1),
      Env’ = Env1 + {$t → vf},  // will be changed a bit later
      Expr’ = remainder of Q1 (= Q2),
      S’ = sqlJoin(S1, vf.sql) = From: Clothing c
  in vfca(Env’, Expr’, S’)
In the following, we refer to these arguments as Env2, Q2, S2.

12 2005rel-xml-iii12 vfca(Env2, Q2, S2):
Now, process for $s in $t/child::on-sale return:
  let vf = binding obtained for $s (= N1.1.2),
      Env’ = Env2 + {$s → vf},
      Expr’ = remainder of Q2 (= Q3),
      S’ = sqlJoin(S2, vf.sql) = From: Clothing c
  in vfca(Env’, Expr’, S’)
Note: the SQL fragment has not changed, since N1.1.2 contains an empty fragment.

13 2005rel-xml-iii13 vfca(Env3, Q3, S3 (= From: Clothing c)):
Now, process if data($s) then …
How do we process vfca(Env3, data($s), S3)?
By type checking, we know that $s is bound to a node that contains an atomic-valued expression (bool, in this case). On a db instance, we would obtain true or false.
Given the binding for $s (to N1.1.2), the SQL fragment of the child N1.1.2.1 contains a Select with an atomic-valued expression: Select: c.on-sale
vfca returns this child as a singleton forest (vfca always returns a forest).

14 2005rel-xml-iii14 vfca(Env, var, S) =
  let vf = Env(var)  // the forest/tree var is bound to
  in forestJoin(vf, S)
For $s, this gives N1.1.2, with From: Clothing c.
vfca(Env, data(E), S) =
  let vf = vfca(Env, E, S)
  in vf/child::*
(* returns all nodes, independent of label; we know there is just one)
Note: S = From: Clothing c is ignored in this result.

15 2005rel-xml-iii15 vfca(Env, if E1 then E2 else E3, S) =
  let vn = vfca(Env, E1, S)  // must be a singleton, with bool-valued Select
      sqlTrue.From = vn.SQL.From
      sqlTrue.Where = vn.SQL.Where “and (”, vn.SQL.Select, “)”
      vf2 = vfca(Env, E2, sqlTrue)
      sqlFalse.From = vn.SQL.From
      sqlFalse.Where = vn.SQL.Where “and not (”, vn.SQL.Select, “)”
      vf3 = vfca(Env, E3, sqlFalse)
  in forestJoin((vf2, vf3), S)
forestJoin(vf, S) adds S (using sqlJoin) to each root of vf.

16 2005rel-xml-iii16 In our example, E3 = (), the empty forest, so vf3 = () and (vf2, vf3) = vf2, a singleton forest (a tree).
vf2 is the result of vfca(Env3, Q4’, sqlTrue), where Q4’ is
  element product {
    element name {
      for $item in $t/child::item return data($item)
    }
  }
It returns
  P1.1  Where: c.on-sale
    P1.1.1
      P1.1.1.1  string  Select: c.item
Only the condition c.on-sale (taken from the Select of the test and placed in the Where of sqlTrue) was passed down to this query, but forestJoin (at the end of the if) now adds From: Clothing c to its root.

17 2005rel-xml-iii17 A few more cases:
vfca(Env, element QName {E}, S) =  // an element constructor
  let vf = vfca(Env, E, ())
  in elementNode(QName, vf, S)
Thus, all accumulated SQL is added to the constructed element node (see the previous slide).
Why is the recursive call made with empty SQL?
Attribute construction is similar.
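
The cases seen so far (variable, data, element constructor, for over the child axis, and if) can be put together in a small Python sketch and run on the Clothing example. It is not the paper's implementation. The tuple encoding of expressions, the names SQL, Node, sql_join and forest_join, and the root label "Clothing" are assumptions made for illustration. Where conditions are kept as a list of conjuncts instead of being glued with "and". Renaming, other axes, literals, operators, and let are left out.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SQL:
    frm: tuple = ()
    where: tuple = ()      # conjuncts, implicitly joined by AND
    select: tuple = ()


def sql_join(a, b):
    return SQL(a.frm + b.frm, a.where + b.where, a.select + b.select)


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)
    sql: SQL = SQL()


def forest_join(forest, s):
    # Add s to the root of every tree in the forest.
    return [Node(n.label, n.children, sql_join(s, n.sql)) for n in forest]


def vfca(env, expr, s):
    kind = expr[0]
    if kind == "empty":                      # ()
        return []
    if kind == "var":                        # $v
        return forest_join(env[expr[1]], s)
    if kind == "data":                       # data(E): children of the result
        return [c for n in vfca(env, expr[1], s) for c in n.children]
    if kind == "element":                    # element QName { E }
        _, qname, body = expr
        return [Node(qname, vfca(env, body, SQL()), s)]
    if kind == "for":                        # for $var in $src/child::label return body
        _, var, src, label, body = expr
        out = []
        for parent in env[src]:
            for child in parent.children:
                if child.label == label:
                    # The root's SQL moves into S; the binding keeps the subtree only.
                    bound = Node(child.label, child.children)
                    out += vfca({**env, var: [bound]}, body, sql_join(s, child.sql))
        return out
    if kind == "if":                         # if E1 then E2 else E3
        _, cond, then, other = expr
        (vn,) = vfca(env, cond, s)           # singleton with a bool-valued Select
        (test,) = vn.sql.select
        sql_true = SQL(vn.sql.frm, vn.sql.where + (test,))
        sql_false = SQL(vn.sql.frm, vn.sql.where + (f"not ({test})",))
        return forest_join(vfca(env, then, sql_true) + vfca(env, other, sql_false), s)
    raise ValueError(f"unsupported expression: {kind}")


# Canonical view tree for Clothing (root label is hypothetical).
cv = Node("Clothing", [
    Node("Tuple", [
        Node("item", [Node("string", sql=SQL(select=("c.item",)))]),
        Node("on-sale", [Node("bool", sql=SQL(select=("c.on-sale",)))]),
        Node("price", [Node("int", sql=SQL(select=("c.price",)))]),
    ], SQL(frm=("Clothing c",))),
])

query = (
    "element", "view",
    ("for", "$t", "$CV", "Tuple",
     ("for", "$s", "$t", "on-sale",
      ("if", ("data", ("var", "$s")),
       ("element", "product",
        ("element", "name",
         ("for", "$item", "$t", "item", ("data", ("var", "$item"))))),
       ("empty",)))),
)

(view,) = vfca({"$CV": [cv]}, query, SQL())
product = view.children[0]
```

Under these assumptions, product carries From: Clothing c and Where: c.on-sale, its name child has an empty fragment, and the atomic node below it selects c.item, which is the expected view tree shown earlier.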

18 2005rel-xml-iii18 A few more cases:
vfca(Env, L, S) =  // L is a literal
  let sql.From = S.From
      sql.Where = S.Where
      sql.Select = L “as atomicValue”  // atomicValue is an attribute name in sql.Select; it is ignored
  in atomicNode(sql)
vfca(Env, E1 arithOp|logicalOp E2, S) =
  compute the trees for E1, E2, with SQL = ()
  take their Select components and combine them with arithOp|logicalOp
  create an atomic node with this SQL fragment
  then add S to its root

19 2005rel-xml-iii19 For a for $v2 in $v1/Axis::nodeTest …
We have looked at Axis = child.
If Axis = descendant: we need to collect the SQL components of all nodes between the bindings for $v1 and $v2 (not inclusive).
If Axis = self/ancestor: essentially as in child.
In many cases, we need to rename variables in the tree bound to $v2; here is an example.

20 2005rel-xml-iii20 $view is bound to
  N1  product  From: Clothing c  Where: c.category = “outerwear”
    N1.1  report  From: Problems p  Where: p.pid = c.cid
Query (fragment):
  for $p in $view/self::product return
  for $r1 in $p/child::report return
  for $r2 in $p/child::report return
  if (some comparison of fields of r1, r2 …) …
Without renaming, $r1 and $r2 will be bound to the same SQL fragment, with the same variable p being defined.

21 2005rel-xml-iii21 Renaming: when processing a for $v in …:
  Find the forest/tree bound to $v
  Copy it, renaming all variables defined in its SQL fragments to new variables (not harmful, may be needed)
  Remove the SQL fragment from the root; it is now preserved as a parameter of recursive calls
  Add to the environment a binding of $v to the resulting forest/tree
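
A minimal Python sketch of the renaming step, applied to the report fragment of the previous slide. The dictionary layout, the function name rename_alias, and the numbering scheme for fresh names are assumptions. A real implementation would also have to make sure that a fresh name does not clash with an alias already in use.

```python
import re
from itertools import count

_fresh = count(1)  # hypothetical source of fresh suffixes


def rename_alias(fragment: dict, alias: str) -> dict:
    """Return a copy of the fragment with the tuple variable `alias` renamed."""
    new = f"{alias}{next(_fresh)}"
    # \b keeps "p" from matching inside words such as "Problems" or "pid".
    pattern = re.compile(rf"\b{re.escape(alias)}\b")
    return {part: [pattern.sub(new, text) for text in items]
            for part, items in fragment.items()}


report = {"from": ["Problems p"], "where": ["p.pid = c.cid"], "select": []}
r1 = rename_alias(report, "p")   # copy bound to $r1
r2 = rename_alias(report, "p")   # copy bound to $r2
```

After this, the two copies define different variables (p1 and p2), so a comparison between fields of $r1 and $r2 refers to two distinct scans of Problems.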

22 2005rel-xml-iii22 Processing a let $v = E1 return E2:
vfca(Env, let $v = E1 return E2, S) =
  let vf1 = vfca(Env, E1, ())
  in vfca(Env + {$v → vf1}, E2, S)
(much simpler than for)

23 2005rel-xml-iii23 Execution of view forests
To obtain a query’s result:
  One or more SQL queries are generated from its view forest
  They generate ordered streams that are merged, nested, and tagged (outside the machine)

24 2005rel-xml-iii24 SQL query construction and XML generation:
Recall that each node n has an associated query Cn.
  Add keys: add to Cn’s Select the keys of all relations in its From (needed for sorting results)
  Partition the view tree into a spanning forest; an SQL query is constructed for each tree in this forest
  The schema for a tree t: if the longest node index has k digits, add k attributes L1, …, Lk; together, they represent a node index. The schema also contains all other selected attributes in the tree
  The query for t is an outer join that combines the different paths, sorted on L1, level-1 atts, L2, level-2 atts, …
  Merge the sorted streams, nest, and tag
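
The last step (merge the sorted streams, nest, and tag) can be illustrated with a small Python sketch. Everything concrete in it is an assumption made for illustration: the row layout (parent key, L2 index, value), the sample rows, and the element names product, price, and report. It only shows why streams sorted on the same keys can be combined in a single pass.

```python
from heapq import merge
from itertools import groupby
from xml.sax.saxutils import escape

# Two streams, each already sorted on (c.pid, L2). Rows are made-up sample data.
prices = [(1, 1, "89.10"), (2, 1, "30.40")]
reports = [(1, 2, "torn seam"), (2, 2, "wrong size"), (2, 2, "faded")]

TAGS = {1: "price", 2: "report"}  # hypothetical element names per L2 index


def tag(streams):
    """Merge sorted streams, nest rows under their parent key, and emit XML."""
    merged = merge(*streams, key=lambda row: (row[0], row[1]))
    out = []
    for _pid, rows in groupby(merged, key=lambda row: row[0]):
        children = "".join(
            f"<{TAGS[level]}>{escape(value)}</{TAGS[level]}>"
            for _, level, value in rows
        )
        out.append(f"<product>{children}</product>")
    return "".join(out)


xml = tag([prices, reports])
```

Because every stream is sorted on the same leading keys, the merge never has to buffer more than one row per stream, and each product element can be closed as soon as the key changes.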

25 2005rel-xml-iii25 Illustration by example (compare to fig. 6, p. 3):
  for $c in $CV/Clothing/Tuple return
    for $d in $CV/Discount/Tuple where $d/pid = $c/pid return
      { data($c/price)*data($d/discount) }
    for $p in $CV/Problems/Tuple where $p/pid = $c/pid return
      { $p/comments }
(elements are pointed to by arrows)

26 2005rel-xml-iii26 View tree:
  N1  From: Clothing c
    N1.1  From: Discount d  Where: d.pid = c.pid
      N1.1.1  float  Select: d.discount * c.price
    N1.2  From: Problems p  Where: p.pid = c.pid
      N1.2.1  string  Select: p.comments

27 2005rel-xml-iii27 Adding keys:
  N1  From: Clothing c  Select: c.pid
    N1.1  From: Discount d  Where: d.pid = c.pid  Select: d.pid
      N1.1.1  float  Select: d.discount * c.price
    N1.2  From: Problems p  Where: p.pid = c.pid  Select: p.pid
      N1.2.1  string  Select: p.comments
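
To see what the keys are for, here is a hedged Python sketch that turns the fragments along one root-to-leaf path of this tree (N1, N1.2, N1.2.1) into SQL text and sorts on the added keys. The function name render and the list-based fragment layout are assumptions; the queries actually generated also carry the L1, …, Lk index attributes, which are omitted here.

```python
def render(path):
    """path: (From, Where, Select) triples along one root-to-leaf path, root first."""
    frm = [x for f, _, _ in path for x in f]
    where = [x for _, w, _ in path for x in w]
    select = [x for _, _, s in path for x in s]
    sql = f"SELECT {', '.join(select)} FROM {', '.join(frm)}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql


n1 = (["Clothing c"], [], ["c.pid"])                       # key c.pid added
n1_2 = (["Problems p"], ["p.pid = c.pid"], ["p.pid"])      # key p.pid added
n1_2_1 = ([], [], ["p.comments"])

query = render([n1, n1_2, n1_2_1]) + " ORDER BY c.pid, p.pid"
```

Sorting on the keys groups all rows of one Clothing tuple together, which is what lets the tagger nest the comments under the right parent.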

28 2005rel-xml-iii28 Partition the tree : Atomic nodes always go with their parents, so they are not considered separately

29 2005rel-xml-iii29 Query for partition (a) (the whole tree): Q.

30 2005rel-xml-iii30 Query for partition (b) (right node, report):
  Select 2 as L2, c.pid, p.code, p.comments
  From Clothing c, Problems p
  Where c.pid = d.pid
  Order by c.pid, p.code
Note: this is on p. 35, and it seems the original has a typo.

31 2005rel-xml-iii31 Performance results:
For a large database and a query with a large result (for other queries, all approaches are fine):
  One large query and many small queries are both inferior to a small number (4 to 6) of queries
The paper presents a greedy algorithm for selecting an appropriate partition of the view forest.
As for the rest: go and study it.

