1 XQuery Implementation in a Relational Database System
Shankar Pal, Istvan Cseri, Oliver Seeliger, Michael Rys, Gideon Schaller, Wei Yu, Dragan Tomic, Adrian Baras, Brandon Berg, Denis Churin, Eugene Kogan
SQL Server, Microsoft Corp

2 Overview (VLDB 2005 - Sep 1, S. Pal et al.)
- Background
  - XML Support in SQL Server 2005
  - OrdPath labeling of XML nodes
  - XML indexes – PATH, VALUE, PROPERTY
- Main topic – XQuery compilation
  - Architecture
  - XML operators
  - Mapping XML operators to relational+ ops
- Conclusions

3 Background: XML Support in SQL Server 2005
CREATE TABLE DOCS (ID int PRIMARY KEY, XDOC xml)
- XML stored in an internal, binary form (‘blob’)
- Optionally typed by a collection of XML schemas
  - Used for storage and query optimizations
- 3 of 5 methods on XML data type:
  - query(): returns XML type
  - value(): returns scalar value
  - exist(): checks conditions on XML nodes
- XML indexing
- More information at http://msdn.microsoft.com/xml

4 Background: XQuery embedded in SQL
- Retrieve section titles from wrapped in new elements:
SELECT ID, XDOC.query(' for $s in /BOOK/SECTION return {data($s/TITLE)} ') FROM DOCS

5 Background: XQuery – supported features
- XQuery clauses “for”, “where”, “return” and “order by”
- XPath axes – child, descendant, parent, attribute, self and descendant-or-self
- Functions – numeric, string, Boolean, nodes, context, sequences, aggregate, constructor, data accessor
- SQL Server extension functions to access SQL variable and column data within XQuery
- Numeric operators (+, -, *, div, mod)
- Value comparison operators (eq, ne, lt, gt, le, ge)
- General comparison operators (=, !=, <, <=, >, >=)

6 Background: ORDPATH Label of Nodes [SIGMOD04]
BOOK 1
  @ISBN 1.1
  Section 1.3
    Title 1.3.1
    Figure 1.3.3
  Section 1.5
    Title 1.5.1
    Figure 1.5.3
- node1 precedes node2 in document order ⇔ ORDPATH(node1) < ORDPATH(node2)
- node1 is ancestor of node2 ⇔ ORDPATH(node1) is prefix of ORDPATH(node2)
- Subtree range: ORDPATH(1.3) ≤ id < Descendant_Limit(1.3) = 1.4
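The two ORDPATH properties on this slide can be sketched in a few lines of Python. This is a simplified model, not the SQL Server encoding: labels are treated as plain dotted integers, whereas the real ORDPATH format is a compressed binary string with extra components for later insertions. The function names are made up for illustration.

```python
def parse_ordpath(label):
    """Turn a dotted label such as '1.3.1' into a tuple of ints."""
    return tuple(int(part) for part in label.split("."))

def precedes(a, b):
    """Document order is the lexicographic order of the component tuples."""
    return parse_ordpath(a) < parse_ordpath(b)

def is_ancestor(a, b):
    """a is an ancestor of b when a's label is a proper prefix of b's label."""
    pa, pb = parse_ordpath(a), parse_ordpath(b)
    return len(pa) < len(pb) and pb[:len(pa)] == pa

def descendant_limit(label):
    """Smallest label that follows every descendant: last component plus one."""
    p = parse_ordpath(label)
    return p[:-1] + (p[-1] + 1,)

def in_subtree(node, root):
    """Range test from the slide: ORDPATH(root) <= id < Descendant_Limit(root)."""
    return parse_ordpath(root) <= parse_ordpath(node) < descendant_limit(root)
```

With the labels from the tree above, `in_subtree("1.3.3", "1.3")` holds and `in_subtree("1.5", "1.3")` does not, so a subtree can be fetched with a single range comparison.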

7 Background: Indexing XML column [VLDB 2004]
- Primary XML index on an XML column
  - Creates B+tree on data model content of the XML nodes
  - Adds column Path_ID for the reversed, encoded path from each XML node to root of XML tree
- OrdPath labeling scheme is used for XML nodes
  - Relative order of nodes
  - Document hierarchy

8 Background: XML example
INSERT INTO myTable VALUES (7, ‘ Bad Bugs Tree frogs … ’)

9 Background: Primary XML Index Entries
ID  ORDPATH  TAG          NODETYPE       VALUE         PATH_ID
7   1        1 (Book)     10 (ns:bT)     NULL          #1
7   1.1      2 (ISBN)     2 (xs:string)  '1-55860-…'   #2#1
7   1.3      3 (Section)  11 (ns:sT)     NULL          #3#1
7   1.3.1    4 (Title)    2 (xs:string)  'Bad Bugs'    #4#3#1
7   1.3.3    5 (Figure)   12 (ns:fT)     NULL          #5#3#1
7   1.5      3 (Section)  11 (ns:sT)     NULL          #3#1
7   1.5.1    4 (Title)    2 (xs:string)  'Tree frogs'  #4#3#1
7   1.5.3    5 (Figure)   12 (ns:fT)     NULL          #5#3#1
- Clustering key: ID, ORDPATH
- Encoding of tags & types stored in system meta-data
- Additional details not shown
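The PATH_ID column can be reproduced from the TAG encodings in the table. The sketch below is only an illustration: the `TAG_IDS` mapping copies the numbers shown in the table, while the function name and the way attribute steps are handled (dropping the `@`) are assumptions, not the system's actual encoding routine.

```python
# Tag encodings copied from the TAG column of the table above.
TAG_IDS = {"Book": 1, "ISBN": 2, "Section": 3, "Title": 4, "Figure": 5}

def path_id(path):
    """Encode a root-to-node path as its reversed list of tag ids.

    '/Book/Section/Title' becomes '#4#3#1': the node's own tag comes first
    and the root comes last.
    """
    steps = [step.lstrip("@") for step in path.strip("/").split("/")]
    return "".join("#%d" % TAG_IDS[step] for step in reversed(steps))
```

Because the path is reversed, all nodes with the same trailing steps share a common PATH_ID prefix, so a path whose final steps are known can be matched as a prefix.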

10 Background: Secondary XML indexes
- To speed up different classes of commonly occurring queries
- Statistics created on key columns of the primary and secondary XML indexes
  - Used for cost-based selection of secondary XML indexes

Index     Query class          Key columns
PATH      path-based queries   PATH_ID, VALUE, ID, ORDPATH
VALUE     value-based queries  VALUE, PATH_ID, ID, ORDPATH
PROPERTY  object properties    ID, PATH_ID, VALUE, ORDPATH

11 Background: Handling Types
- If XML column is typed
  - Values are stored in XML blob and XML indexes with appropriate typing
- Untyped XML
  - Values are stored as strings
  - Convert to appropriate types for operations
- SQL typed values stored in primary XML index
  - Most SQL types are compatible with XQuery types (e.g. integer)
  - Value comparisons on XML index columns suffice
  - Some types (e.g. xs:dateTime) are stored in internal format and processed specially

12 XQuery Processing Architecture
- XQuery Compiler:
  - Parses XQuery expression
  - Checks static type correctness
  - Type annotations
  - Applies static optimizations
    - Path collapsing
    - Rewrites using XML schemas
- XML Operator Mapper
  - Recursively traverses XML algebra tree
  - Converts each XmlOp to a relational+ operator sub-tree
  - Mapping depends upon existence of primary XML index
Pipeline: XQuery expression → XQuery Compiler → XML algebra tree (XmlOp ops) → XML Operator Mapper → Relational Operator Tree (relational+ operators) → Relational Query Processor

13 Examples of XML Operators
- XmlOp_Select
  - In: list of items, condition
  - Out: items satisfying condition
- XmlOp_Path
  - In: simple paths, no predicates
  - Opt: path context to collapse paths
  - Out: eligible XML nodes
- XmlOp_Apply
  - In: two item lists
  - Out: one item list
  - Variable binding in “for” expression
- XmlOp_Construct
  - In: sub-nodes for element construction, otherwise value
  - Out: constructed node

14 XML Operator Mapping – Overview
[Figure labels: XML column with PK; XQUERY; REL+ tree; Primary XML Index; PATH Index; VALUE Index; PROPERTY Index; OrdPath. Note: Special handling for SELECT * | XDOC]

15 New operators
- Some produce N rows from M (≠ N) rows
  - XML_Reader – streaming, pull-model XML parser
  - XML_Serializer – to serialize query result as XML
- Some are for efficiency
  - Contains – to evaluate XQuery contains()
  - TextAdd – to evaluate the XQuery function string()
  - Data – to evaluate XQuery data() function
- Some are for specific needs
  - Check – validate XML during insertion or modification

16 XML Operator Mapping
- Following categories:
  - Mapping of XPath expressions
  - Mapping of XQuery expressions
  - Mapping of XQuery built-in functions

17 XPath Expressions
- Two cases:
  - Fully known, forward paths without branching after path collapsing
  - Paths without branching that are not fully known after path collapsing
    - Segments of the path cannot be collapsed or a path is split into multiple segments
    - Occurs most commonly for paths containing wildcard steps, //, self and parent axes
    - Evaluated using LIKE operator on XML index

18 Non-indexed XML, Full Path
- XML_Reader produces subtrees of Node table rows
  - Contains OrdPath
  - No PK or PATH_ID
- XML_Serialize reassembles those rows into XML data type
  - To output result
XML operator tree: XmlOp_Path with PATH = “/BOOK/SECTION”
Rel+ operator tree: XML_Serialize over XML_Reader (XDOC, “/BOOK/SECTION”)

19 Query Execution on XML Blob
SELECT ID, XDOC.query (' /BOOK/SECTION [2] ') FROM DOCS
- XDOC column value in each row parsed at runtime
  - Parser is XmlReader (not DOM)
  - Evaluate simple XPath (without branching) during parsing
  - Rest of processing done in memory using relational operators
- // and * are also pushed into XML_Reader
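The idea of evaluating a simple, non-branching path while the document is being parsed can be imitated with Python's standard pull-style parser. This is an analogy under stated assumptions, not the XML_Reader operator itself: the sample document is made up (the markup of the example on slide 8 is not available), and `stream_path` is a hypothetical helper name.

```python
import io
import xml.etree.ElementTree as ET

# Made-up sample document for illustration only.
DOC = ("<BOOK><SECTION><TITLE>Bad Bugs</TITLE></SECTION>"
       "<SECTION><TITLE>Tree frogs</TITLE></SECTION></BOOK>")

def stream_path(xml_text, path):
    """Return serialized elements matching a simple absolute path.

    The path is checked against a stack of open element names while the
    parser streams through the input, so no separate tree walk is needed.
    """
    target = path.strip("/").split("/")
    stack, matches = [], []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
        else:
            if stack == target:
                matches.append(ET.tostring(elem, encoding="unicode"))
            stack.pop()
    return matches

# The ordinal [2] is applied afterwards, on the rows the reader produced.
sections = stream_path(DOC, "/BOOK/SECTION")
second_section = sections[1:2]
```

The split mirrors the slide: the path step is handled during parsing, while the ordinal predicate is left to ordinary in-memory processing of the produced rows.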

20 Sample query execution using Primary XML Index
ID  ORDPATH  TAG          NODETYPE       VALUE         PATH_ID
7   1        1 (Book)     10 (ns:bT)     NULL          #1
7   1.1      2 (ISBN)     2 (xs:string)  '1-55860-…'   #2#1
7   1.3      3 (Section)  11 (ns:sT)     NULL          #3#1
7   1.3.1    4 (Title)    2 (xs:string)  'Bad Bugs'    #4#3#1
7   1.3.3    5 (Figure)   12 (ns:fT)     NULL          #5#3#1
7   1.5      3 (Section)  11 (ns:sT)     NULL          #3#1
7   1.5.1    4 (Title)    2 (xs:string)  'Tree frogs'  #4#3#1
7   1.5.3    5 (Figure)   12 (ns:fT)     NULL          #5#3#1
- Clustering key: ID, ORDPATH
- /Book/Section → #3#1 (by XML Op Mapper)

21 Indexed XML, Full Path
- XmlOp_Path mapped to SELECT
- GET(PXI) – rows from primary XML index
  - Match PATH_ID
- Not shown: JOIN with base table on PK
Operator tree:
XML_Serialize
  Apply
    Select ($b) over GET(PXI): Path_ID = #SECTION#BOOK
    Assemble Subtree: Select over GET(PXI): $b.OrdP ≤ OrdP < DL($b)
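The plan on this slide can be mimicked over an in-memory copy of the primary XML index rows from slide 9. The sketch is a rough illustration, assuming ORDPATH labels are plain integer tuples and that `DL` is "last component plus one" as in the Descendant_Limit example on slide 6; the function names are hypothetical.

```python
# (ID, ORDPATH, TAG, VALUE, PATH_ID), copied from the sample index entries.
PXI = [
    (7, (1,), "Book", None, "#1"),
    (7, (1, 1), "ISBN", "1-55860-…", "#2#1"),
    (7, (1, 3), "Section", None, "#3#1"),
    (7, (1, 3, 1), "Title", "Bad Bugs", "#4#3#1"),
    (7, (1, 3, 3), "Figure", None, "#5#3#1"),
    (7, (1, 5), "Section", None, "#3#1"),
    (7, (1, 5, 1), "Title", "Tree frogs", "#4#3#1"),
    (7, (1, 5, 3), "Figure", None, "#5#3#1"),
]

def descendant_limit(ordpath):
    """DL($b): the first label after the subtree rooted at $b."""
    return ordpath[:-1] + (ordpath[-1] + 1,)

def select_with_subtrees(pxi, path_id):
    """Outer Select binds $b by PATH_ID; the inner Select assembles its subtree."""
    result = []
    for b in pxi:
        if b[4] != path_id:
            continue
        limit = descendant_limit(b[1])
        subtree = [r for r in pxi if r[0] == b[0] and b[1] <= r[1] < limit]
        result.append(subtree)
    return result

section_subtrees = select_with_subtrees(PXI, "#3#1")
```

Each entry of `section_subtrees` holds the rows that a serializer would turn back into one SECTION element.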

22 XML index – PATH
PATH_ID  VALUE         ID  ORDPATH
#1       NULL          7   1
#2#1     '1-55860-…'   7   1.1
#3#1     NULL          7   1.3
#3#1     NULL          7   1.5
#4#3#1   'Bad Bugs'    7   1.3.1
#4#3#1   'Tree frogs'  7   1.5.1
#5#3#1   NULL          7   1.3.3
#5#3#1   NULL          7   1.5.3
- Speeds up path evaluations
- Example – /Book/Section → #3#1
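Why this key order speeds up path evaluation can be seen with a sorted list and a binary search. The sketch below only imitates a B+tree seek: rows are sorted on (PATH_ID, VALUE, ID, ORDPATH), NULL is mapped to an empty string purely so that Python can compare keys, and `seek_path` is a hypothetical helper.

```python
import bisect

# (PATH_ID, VALUE, ID, ORDPATH), copied from the table above.
ROWS = [
    ("#1", None, 7, (1,)),
    ("#2#1", "1-55860-…", 7, (1, 1)),
    ("#3#1", None, 7, (1, 3)),
    ("#3#1", None, 7, (1, 5)),
    ("#4#3#1", "Bad Bugs", 7, (1, 3, 1)),
    ("#4#3#1", "Tree frogs", 7, (1, 5, 1)),
    ("#5#3#1", None, 7, (1, 3, 3)),
    ("#5#3#1", None, 7, (1, 5, 3)),
]

def sort_key(row):
    path_id, value, doc_id, ordpath = row
    # Assumption for this sketch: NULL sorts as the empty string.
    return (path_id, "" if value is None else value, doc_id, ordpath)

INDEX = sorted(ROWS, key=sort_key)
KEYS = [sort_key(row) for row in INDEX]

def seek_path(path_id):
    """Binary-search to the first entry for path_id, then scan while it matches."""
    i = bisect.bisect_left(KEYS, (path_id,))
    found = []
    while i < len(INDEX) and KEYS[i][0] == path_id:
        found.append(INDEX[i])
        i += 1
    return found
```

All entries for one path are contiguous, so `/Book/Section` (PATH_ID `#3#1`) costs one seek plus a short scan instead of a pass over every node.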

23 Indexed XML, Imprecise Paths
/BOOK/SECTION//TITLE
- Matched using LIKE operator on Path_ID
Operator tree:
XML_Serialize
  Apply
    Select ($s) over GET(PXI): Path_ID LIKE #TITLE%#SECTION#BOOK
    Assemble subtree
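The LIKE pattern on this slide can be tried out with SQLite from the Python standard library. This is a toy illustration: the path identifiers are written with symbolic names as on the slide rather than the encoded form, the list of stored paths is invented, and SQLite merely stands in for the relational engine.

```python
import sqlite3

# Invented reversed paths, using the slide's symbolic notation.
PATH_IDS = [
    "#BOOK",
    "#SECTION#BOOK",
    "#TITLE#BOOK",
    "#TITLE#SECTION#BOOK",
    "#TITLE#SECTION#SECTION#BOOK",
    "#FIGURE#SECTION#BOOK",
]

def match_paths(path_ids, pattern):
    """Return the stored path ids that satisfy a SQL LIKE pattern."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pxi (path_id TEXT)")
    conn.executemany("INSERT INTO pxi VALUES (?)", [(p,) for p in path_ids])
    rows = conn.execute(
        "SELECT path_id FROM pxi WHERE path_id LIKE ? ORDER BY path_id",
        (pattern,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]

# /BOOK/SECTION//TITLE: TITLE first, anything in between, then SECTION and BOOK.
titles = match_paths(PATH_IDS, "#TITLE%#SECTION#BOOK")
```

The `%` wildcard absorbs the unknown steps introduced by `//`, so both a direct child TITLE and one nested under a further SECTION qualify, while `#TITLE#BOOK` does not.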

24 Predicate Evaluation
/BOOK[@ISBN = “12”]
- Search value compared with VALUE column in PXI
- Collapsed path /BOOK/@ISBN
  - Induce index seeks
  - Reduce intermediate result size
- Parent check – Par($b)
  - Using OrdPath
- Value conversion might be needed
Plan operators: XML_Serialize; Apply; Select ($b) over GET(PXI) with Path_ID = #BOOK; Select over GET(PXI) with Path_ID = #@ISBN#BOOK & VALUE = “12” & Par($b); Apply; Assemble subtree
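A small sketch of the predicate plan, under simplifying assumptions: the index rows are invented (two documents with ISBN values "12" and "34"), ORDPATH labels are plain tuples, and the parent of a node is taken to be its label with the last component dropped, which ignores details of the real encoding.

```python
# (ID, ORDPATH, PATH_ID, VALUE); made-up rows for two documents.
PXI = [
    (1, (1,), "#BOOK", None),
    (1, (1, 1), "#@ISBN#BOOK", "12"),
    (2, (1,), "#BOOK", None),
    (2, (1, 1), "#@ISBN#BOOK", "34"),
]

def parent(ordpath):
    """Simplified parent computation: drop the last component."""
    return ordpath[:-1]

def books_with_isbn(pxi, isbn):
    """Evaluate /BOOK[@ISBN = isbn] over the index rows."""
    # Inner Select: the collapsed path /BOOK/@ISBN compared with VALUE.
    hits = {
        (doc_id, parent(ordpath))
        for doc_id, ordpath, path_id, value in pxi
        if path_id == "#@ISBN#BOOK" and value == isbn
    }
    # Outer Select binds $b to BOOK rows; Par($b) keeps those owning a hit.
    return [
        (doc_id, ordpath)
        for doc_id, ordpath, path_id, _ in pxi
        if path_id == "#BOOK" and (doc_id, ordpath) in hits
    ]
```

Collapsing the predicate path into one Path_ID and VALUE comparison is what lets the attribute rows be located directly before they are tied back to their BOOK parent.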

25 Ordinal Predicate
- /BOOK[n]
  - Adds ranking column to the rows for elements
  - Retrieves the nth node
- Special optimizations
  - [1] → TOP 1 ascending
  - [last()] → TOP 1 descending
- Avoids sorting when input is sorted
  - Example – in XML_Serializer
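The general case and the two special cases can be contrasted in a short sketch. It assumes nodes are represented by ORDPATH tuples whose natural ordering is document order; the function names are illustrative only.

```python
def nth(nodes, n):
    """General case: add a rank column in document order, keep rank n."""
    ranked = enumerate(sorted(nodes), start=1)
    return [node for rank, node in ranked if rank == n]

def first(nodes):
    """[1] as TOP 1 ascending: a single minimum, no full ranking needed."""
    return [min(nodes)] if nodes else []

def last(nodes):
    """[last()] as TOP 1 descending: a single maximum."""
    return [max(nodes)] if nodes else []

SECTIONS = [(1, 5), (1, 3)]
```

Here `nth(SECTIONS, 2)` and `last(SECTIONS)` both pick the node labeled 1.5, but only the first has to rank every row.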

26 Error handling
- Static type errors at compilation time
  - Raises static type errors if an expression could fail at runtime due to type safety violation
    - Addition of string to integer
    - Querying non-existent node name in typed XML
    - Non-singleton in “eq”
  - Some can be fixed using explicit cast or ordinal specification
- Dynamic error converted to empty sequence
  - Yields correct result in predicates without negations
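The last bullet can be made concrete with a toy model of casting and general comparison. This is an illustration of the stated behavior, not SQL Server code: sequences are modeled as Python lists, a failed cast returns the empty list, and the sample values are invented.

```python
def cast_int(text):
    """Cast a string to an integer; a failed cast yields the empty sequence."""
    try:
        return [int(text)]
    except ValueError:
        return []

def general_gt(seq, n):
    """General comparison: true if some item in the sequence is greater than n."""
    return any(item > n for item in seq)

VALUES = ["3", "abc", "5"]

# Predicate without negation: the bad value simply drops out.
kept = [v for v in VALUES if general_gt(cast_int(v), 4)]

# Predicate with negation: the bad value now passes the filter.
negated = [v for v in VALUES if not general_gt(cast_int(v), 4)]
```

In `kept` the uncastable value is filtered out, matching what raising an error and discarding the item would give. In `negated` it is retained, which shows why the slide limits the claim to predicates without negations.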

27 “for” Iterator
for $s in /BOOK//SECTION where $s/@num >= 3 return $s/TITLE
- XML op for “for” is XmlOp_Apply
  - Maps to APPLY
  - Binds $s and iterates over
  - Determines its children
- Nested “for” and “for” with multiple bindings turn into nested APPLY
  - Each APPLY binds to a different variable
Plan operators: XML_Serialize; Assemble; Apply; Apply ($s); Select ($s) over GET(PXI) with Path_ID LIKE #SECTION%#BOOK; Exists over Select with Path_ID LIKE #@num#SEC%#BK & VALUE >= 3 & Par($s); Select over GET(PXI) with Path_ID LIKE #TITLE#SECTION%#BOOK & Par($s)
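The role of APPLY as a per-binding iterator can be sketched with generators. Everything below is an assumption made for illustration: the rows are invented, paths are kept as tuples of step names instead of encoded Path_IDs, and the "where" clause is expressed as a second apply rather than the Exists/Select pair shown in the plan.

```python
# (ORDPATH, path as a tuple of step names, VALUE); made-up sample data.
ROWS = [
    ((1,), ("BOOK",), None),
    ((1, 1), ("BOOK", "SECTION"), None),
    ((1, 1, 1), ("BOOK", "SECTION", "@num"), 2),
    ((1, 1, 3), ("BOOK", "SECTION", "TITLE"), "Intro"),
    ((1, 3), ("BOOK", "SECTION"), None),
    ((1, 3, 1), ("BOOK", "SECTION", "@num"), 3),
    ((1, 3, 3), ("BOOK", "SECTION", "TITLE"), "Bad Bugs"),
    ((1, 3, 5), ("BOOK", "SECTION", "SECTION"), None),
    ((1, 3, 5, 1), ("BOOK", "SECTION", "SECTION", "@num"), 4),
    ((1, 3, 5, 3), ("BOOK", "SECTION", "SECTION", "TITLE"), "Tree frogs"),
]

def apply(outer, inner):
    """Relational APPLY: evaluate inner once for every row of outer."""
    for row in outer:
        yield from inner(row)

def children(parent_row, name):
    """Rows whose parent label equals parent_row's label and whose last step is name."""
    return [r for r in ROWS if r[0][:-1] == parent_row[0] and r[1][-1] == name]

def sections():
    """Bindings for $s in /BOOK//SECTION."""
    return [r for r in ROWS
            if r[1][0] == "BOOK" and len(r[1]) > 1 and r[1][-1] == "SECTION"]

def where_num_at_least_3(s):
    """where $s/@num >= 3: pass $s through only if such an attribute exists."""
    return [s] if any(a[2] >= 3 for a in children(s, "@num")) else []

def return_titles(s):
    """return $s/TITLE."""
    return [t[2] for t in children(s, "TITLE")]

def run_query():
    filtered = apply(sections(), where_num_at_least_3)
    return list(apply(filtered, return_titles))
```

A nested "for" would add one more `apply` level, with the inner function free to refer to the outer binding.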

28 XQuery “order by” and “where”
- Order by:
  - Sorts rows based on order-by expression
  - Adds a ranking column to these rows
  - Ranking column converted into OrdPath values
    - Yield the new order of the rows
    - Fits rest of query processing framework
- Where
  - Becomes SELECT on input sequence
  - Filters rows satisfying specified condition
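The "order by" steps (sort, rank, turn the rank into an ordering label) can be sketched as follows. The exact form of the generated OrdPath values is not given on the slide, so using a one-component tuple `(rank,)` is an assumption made here for illustration.

```python
def order_by(rows, key):
    """Sort rows, rank them, and attach the rank as a new one-component label."""
    ranked = sorted(rows, key=key)
    return [((rank,), row) for rank, row in enumerate(ranked, start=1)]

# (title, original ORDPATH), taken from the sample index entries.
TITLES = [("Tree frogs", (1, 5, 1)), ("Bad Bugs", (1, 3, 1))]

relabelled = order_by(TITLES, key=lambda row: row[0])
```

After relabelling, any later operator that orders by the label, such as a serializer, reproduces the "order by" result without knowing the sort expression.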

29 XQuery “return”
- Return nodes sequence in document order
  - Use OrdPath values and XML_Serialize operator
- New element and sequence constructions
  - Merge constructed and existing nodes into a single sequence (SWITCH_UNION)

30 XQuery Functions & Operators
- Built-in functions and operators are mapped to relational functions and operators if possible
  - fn:count() → count()
- Additional support for XQuery types, functions and operators that cannot be mapped directly
  - Intrinsics

31 Optimizations
- Exploiting Ordered Sets
  - Sorting information (OrdPath) made available to further relational operators
  - XML_Serialize is an example
- Using static type information
  - Eliminates CONVERT() in operations
  - Allows range scan on VALUE index
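Why static types enable a range scan can be shown with a sorted list. The sketch assumes invented values and uses Python's `bisect` as a stand-in for an index seek: string-typed values sort lexicographically, so a numeric range needs a conversion per row, while integer-typed values are already in numeric order.

```python
import bisect

UNTYPED = ["9", "10", "100"]   # stored as strings
TYPED = sorted([9, 10, 100])   # stored with their integer type

def scan_untyped(strings, low):
    """Untyped: every value must be converted before it can be compared."""
    return sorted(int(s) for s in strings if int(s) >= low)

def range_scan_typed(sorted_ints, low):
    """Typed: seek to the lower bound, then read the rest of the range."""
    start = bisect.bisect_left(sorted_ints, low)
    return sorted_ints[start:]

string_order = sorted(UNTYPED)
```

`string_order` places "9" after "100", so the string ordering cannot serve a numeric range predicate, whereas the typed list answers it with one seek.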

32 Conclusions
- Built-up infrastructure for query processing framework
  - Other XQuery features (such as “let” and typeswitch) can be implemented
  - Data modification language
- Fits into relational query processing framework
- XQuery features can be implemented using rel+ operators
- Optimizations pose the biggest challenges
- More cost-based optimizations can be done
  - Enhanced costing model (e.g. choice of PXI)
  - Matching materialized views

33 Thank you!

