
SQL Server 2005: Deep Dive On XML And XQuery (DAT405). Michael Rys, Program Manager, SQL Server XML Technologies, Microsoft Corporation.


1 SQL Server 2005: Deep Dive On XML And XQuery (DAT405). Michael Rys, Program Manager, SQL Server XML Technologies, Microsoft Corporation

2 XML And Relational Data Today
[Diagram: XML documents in the file system alongside relational data, with the goal "Query and Combine".]

3 XML Scenarios
- Data Exchange
  - Business-to-business (B2B), business-to-consumer (B2C), application-to-application (A2A)
  - XML is a ubiquitous, extensible, platform-independent transport format
- Document Management
  - XHTML, Office XML documents
- Messaging
  - Simple Object Access Protocol (SOAP), RSS
- Mid-Tier Collaboration
- Ad-hoc modeling of semistructured data
  - Storing objects with sparse or multi-valued properties that do not fit well into traditional relational schemata
→ Transport, Store, and Query XML data

4 XML Or Relational?
Data Characteristics | XML | Relational
Flat Structured Data | |
Hierarchical Structured Data | | Not First Class: PK-FK with cascading delete
Semi-structured Data | | Not First Class
Mark-up Data | | Not First Class: FTS
Order preservation | | Not First Class
Recursion | | (Recursive query)

5 XML And Relational!
Scenarios | XML | Relational
Relational Data Exchange | Use as transport, shred to relational | Storage and query
Document Management | Use as markup, store natively | Provides framework to manage collections and relationships; provides full-text search
Semi-structured Data | Represent semi-structured parts | Represent structured parts
Message audit | Store natively | Used for querying over promoted properties
Object serialization | Store natively | Used for querying over promoted properties

6 SQL Server 2005 XML Architecture
[Diagram. Components shown: XML parser and validation against XML Schemata held in a Schema Collection; the XML data type (binary XML); OpenXML/nodes() producing rowsets from XML, and FOR XML with TYPE directive producing XML from relational rowsets; XQuery and XML-DML; the PRIMARY XML INDEX (node table) with PATH, PROP, and VALUE indexes.]

7 Why XQuery?
- SQL does not understand XML
- XPath 1.0
  - W3C Recommendation
  - Used in SQL Server 2000: SQLXML and OpenXML
  - Navigation, no reshaping
  - Limited knowledge about types
- XSLT
  - W3C Recommendation
  - Data-driven reshaping (uses XPath)
  - MSXML, System.XML
  - Hard to author and optimize for large amounts of data
- No XML data modification language (DML)

8 What Is XQuery?
- Queries and transforms trees
- Functional, declarative query language
- Combines XPath with node construction
- Operates on (XML Schema-)typed and unconstrained XML
- Designed to operate on large amounts of data
  - Optimizable
- Current status: in final Last Call
  - Recommendations in H2 CY2006
  - Full-text and DML extensions will follow later

9 XQuery Introduction

10 Key XQuery Features
- FLWOR: FOR / LET / WHERE / ORDER BY / RETURN
- Includes XPath 2.0 (/doc[@id = 123])
- Element constructors ( {…} )
- Order-preserving operators
  - Input order (FLWR)
  - Document order (XPath, union)
- Statically (or dynamically) typed
  - Strong typing with schema, weak typing without schema
- SQL analogy: FOR ≈ FROM, LET ≈ WITH and SET, WHERE ≈ WHERE, ORDER BY ≈ ORDER BY, RETURN ≈ SELECT
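A minimal sketch of a FLWOR expression with an element constructor, run through the query() method of the XML data type. The document shape (orders/order with id and total attributes) and the result element name big are assumptions made up for illustration. LET is left out because the SQL Server 2005 subset described in this deck does not support it.

```sql
-- Hypothetical sample document; element and attribute names are illustrative only.
DECLARE @doc xml;
SET @doc = N'<orders>
  <order id="1" total="250"/>
  <order id="2" total="90"/>
  <order id="3" total="410"/>
</orders>';

SELECT @doc.query('
  for $o in /orders/order              (: FOR: iterate in input order :)
  where $o/@total > 100                (: WHERE: untyped value compared as a number :)
  order by $o/@id descending           (: ORDER BY: untyped values sort as strings :)
  return <big id="{ data($o/@id) }"/>  (: RETURN with an element constructor :)
') AS big_orders;
```

The XPath steps keep document order, while the FOR clause fixes the iteration order that ORDER BY then rearranges.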

11 XQuery Type System
- 3 classes of item types:
  - Node types: element(), attribute(), comment(), etc.
  - Element content types: xs:anyType, user-defined (e.g., my:CustomerT)
  - Atomic types: built-in and user-defined (e.g., xs:int, my:hatSize)
- XQuery uses XML Schema for content and atomic types
  - "Untyped" data have special types (e.g., xdt:untypedAtomic)
- XML Schema (W3C standard)
  - Rich mechanism for type definitions and validation constraints
  - Can be used to constrain XML documents
  - XML Schema Collections will be used for typing (meta-data)
- Benefits of typed data
  - Guarantees shape of data
  - Provides type-specific semantics
  - Allows storage and query optimizations
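A sketch of how an XML Schema collection might be used to type an XML column. The collection name CustomerSchemas, the table TypedCustomers, and the schema itself are assumptions; the Customer/name/Order/id shape loosely follows the Customer example used later in the deck. GO is the batch separator of client tools such as sqlcmd.

```sql
-- Hypothetical schema collection: Customer with a string name and int order ids.
CREATE XML SCHEMA COLLECTION CustomerSchemas AS N'
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Customer">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="Order" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="id" type="xs:int"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>';
GO

-- The column is typed by the collection; DOCUMENT restricts it to one top-level element.
CREATE TABLE TypedCustomers (
  id   int PRIMARY KEY,
  cdoc xml(DOCUMENT CustomerSchemas)
);
GO

-- Accepted: matches the schema.
INSERT INTO TypedCustomers VALUES
  (1, N'<Customer><name>Janine</name><Order><id>42</id></Order></Customer>');

-- Would be rejected by validation, because "abc" is not a valid xs:int:
-- INSERT INTO TypedCustomers VALUES
--   (2, N'<Customer><name>X</name><Order><id>abc</id></Order></Customer>');
```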

12 Static Typing In XQuery
- Type inference: infers the type of an expression during compilation
- Type check: the inferred type must be a subtype of the expected type
- Benefits:
  - Compile-time type error discovery
  - Guarantees correct type at runtime
  - More efficient execution
- Costs:
  - Sometimes type inference is less precise than the data will be (inferring a list on /a[1]/b, even though there will always be only one b)
  - Requires more explicit casts and "pick first" (/a[1]/b[1])
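A small sketch of the "pick first" cost on untyped XML. The document is a made-up instance of the /a/b shape named on the slide. The commented-out statement is expected to fail at compile time, because value() needs an expression that is statically a singleton.

```sql
DECLARE @x xml;
SET @x = N'<a><b>7</b></a>';

-- Rejected by static typing: /a/b is inferred as a sequence,
-- even though this instance holds exactly one b.
-- SELECT @x.value('/a/b', 'int');

-- Accepted: the positional predicates make the result statically a singleton.
SELECT @x.value('/a[1]/b[1]', 'int') AS b_step_by_step,
       @x.value('(/a/b)[1]',  'int') AS b_first_of_all;
```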

13 XML Data Modification
- XQuery extensions: insert, update, and delete
- XML sub-tree modification:
  - Add or delete XML sub-trees
  - Update values
- Generate consistent state

14 XML-DML: insert, delete, replace value of
[Diagram: a Customer element with a name (xs:string) "Janine" and an Order with id (xs:int) 42. The inserted node shown in the diagram is a notes element; the element literal did not survive in the statements below.]
- insert (target needs to be statically one node)
  - insert … into /Customer
  - insert … as last into /Customer
  - insert … as first into /Customer
  - insert … before /Customer/name
  - insert … after /Customer/name
- delete
  - delete /Customer/Order[id = 42]
- replace value of
  - replace value of (/Customer/name)[1] with "Nils"
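The three XML-DML operations can be sketched against an untyped variable as follows. The notes element and the literal document are assumptions based on the Customer example. Since the variable is untyped, the replace targets the text node of name, whereas the slide's form (/Customer/name)[1] assumes a schema-typed xs:string element.

```sql
DECLARE @c xml;
SET @c = N'<Customer><name>Janine</name><Order><id>42</id></Order></Customer>';

-- insert: the target must be statically a single node, hence (/Customer)[1].
SET @c.modify('insert <notes/> as last into (/Customer)[1]');

-- replace value of: on untyped XML the target is the text node.
SET @c.modify('replace value of (/Customer/name/text())[1] with "Nils"');

-- delete: removes every node selected by the path.
SET @c.modify('delete /Customer/Order[id = 42]');

SELECT @c AS customer_after_dml;
```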

15 XQuery And XML-DML In SQL Server 2005

16 XQuery And XML-DML In SQL Server 2005
- Subset of XQuery implemented
  - Is aligned with July 2004 XQuery working draft
- Added XML Data Modification
- Applies to single XML data type instance
  - Methods on XML data type: query(), value(), exist(), modify(), nodes()
  - Use SQL to iterate over collection of instances (XML-typed column)
- Can refer to relational data
- Takes advantage of Schema-collection information to operate on typed XML data
- Will make use of XML indices for optimization

17 XQuery Methods
- query() creates a new, untyped XML data type instance
- value() extracts an XQuery value into the SQL value and type space
  - Expression has to statically be a singleton
  - String value of the atomized XQuery item is cast to the SQL type
  - SQL type has to be a SQL scalar type (no XML or CLR UDT)
- exist() returns 1 if the XQuery expression returns at least one item, 0 otherwise
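A brief sketch of the three read-only methods side by side. The sample document and the column aliases are illustrative assumptions.

```sql
DECLARE @c xml;
SET @c = N'<Customer><name>Janine</name><Order><id>42</id></Order></Customer>';

SELECT
  @c.query('/Customer/Order')                     AS orders_xml,     -- new untyped XML instance
  @c.value('(/Customer/name)[1]', 'nvarchar(50)') AS customer_name,  -- singleton cast to a SQL scalar
  @c.exist('/Customer/Order[id = 42]')            AS has_order_42;   -- 1 if at least one item, else 0
```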

18 XQuery: nodes()
- Provides OpenXML-like functionality on XML data type column in SQL Server 2005
- Returns a row per selected node
- Each row contains a special XML data type instance that
  - References one of the selected nodes
  - Preserves the original structure and types
  - Can only be used with the XQuery methods (not modify()), count(*), and IS (NOT) NULL
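A sketch of nodes() shredding one XML instance into rows. The document and the alias names o and n are assumptions. Each row's n column only supports the XQuery methods, so values are pulled out with value() and query().

```sql
DECLARE @c xml;
SET @c = N'<Customer>
  <name>Janine</name>
  <Order><id>42</id></Order>
  <Order><id>43</id></Order>
</Customer>';

-- One row per selected Order node; paths are relative to that node.
SELECT o.n.value('(id)[1]', 'int') AS order_id,
       o.n.query('.')              AS order_xml
FROM @c.nodes('/Customer/Order') AS o(n);
```

For an XML column rather than a variable, the same call would typically sit behind CROSS APPLY so that SQL iterates over the rows and nodes() over each instance.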

19 sql:column()/sql:variable()
Map SQL values and types into XQuery values and types in the context of XQuery or XML-DML.
- sql:variable(): accesses a SQL variable/parameter
  declare @value int
  set @value = 42
  select * from T where T.x.exist('/a/b[@id=sql:variable("@value")]') = 1
- sql:column(): accesses another column value
  tables: T([key] int, x xml), S([key] int, val int)
  select * from T join S on T.[key] = S.[key] where T.x.exist('/a/b[@id=sql:column("S.val")]') = 1
- Restrictions in SQL Server 2005: no XML, CLR UDT, datetime, or deprecated text/ntext/image
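The two fragments can be made runnable with a little setup. This is a sketch: the table shapes follow the slide, the sample rows are invented, and key is bracketed because KEY is a reserved word in T-SQL.

```sql
-- Tables as named on the slide; the rows are made-up sample data.
CREATE TABLE T ([key] int PRIMARY KEY, x xml);
CREATE TABLE S ([key] int PRIMARY KEY, val int);

INSERT INTO T VALUES (1, N'<a><b id="42"/></a>');
INSERT INTO T VALUES (2, N'<a><b id="7"/></a>');
INSERT INTO S VALUES (1, 42);
INSERT INTO S VALUES (2, 99);

-- sql:variable(): compare an attribute with a SQL variable.
DECLARE @value int;
SET @value = 42;

SELECT T.[key]
FROM T
WHERE T.x.exist('/a/b[@id = sql:variable("@value")]') = 1;

-- sql:column(): compare an attribute with a column of the joined row.
SELECT T.[key]
FROM T
JOIN S ON T.[key] = S.[key]
WHERE T.x.exist('/a/b[@id = sql:column("S.val")]') = 1;
```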

20 XQuery: modify()
- Used with SET:
  declare @xdoc xml
  set @xdoc.modify('delete /a/b[@id="42"]')
  update T set T.xdoc.modify('insert … into /a') where T.id=1
- Relational row-level concurrency: whole XML instance is locked
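A sketch of modify() inside UPDATE ... SET on a column. The table DocStore, its rows, and the inserted element c are hypothetical. The insert target is written as (/a)[1] so that it is statically a single node on untyped XML.

```sql
CREATE TABLE DocStore (id int PRIMARY KEY, xdoc xml);
INSERT INTO DocStore VALUES (1, N'<a><b id="42"/><b id="43"/></a>');

-- Each statement rewrites the XML instance of the rows selected by WHERE.
UPDATE DocStore
SET xdoc.modify('delete /a/b[@id = "42"]')
WHERE id = 1;

UPDATE DocStore
SET xdoc.modify('insert <c/> as last into (/a)[1]')
WHERE id = 1;

SELECT xdoc FROM DocStore WHERE id = 1;
```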

21 Combined SQL And XQuery/DML Processing
[Diagram of how a statement such as SELECT x.query('…'), y FROM T WHERE … is processed.]
- Static phase
  - SQL side: SQL Parser, Algebrization, Static Typing
  - XQuery side: XQuery Parser, Static Typing, Algebrization, using XML Schema Collection metadata
  - Static optimization of the combined logical and physical operation tree
- Dynamic phase
  - Runtime optimization and execution of the physical op tree, using XML and relational indices

22 XML Indices
- Create XML index on XML column
  CREATE PRIMARY XML INDEX idx_1 ON docs (xDoc)
- Create secondary indexes on tags, values, paths
- Speed up queries
  - Results can be served directly from index
  - SQL's cost-based optimizer will consider index
- Primary and secondary indices will be efficiently maintained during updates
  - Only the subtree that changes will be updated

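23 Example Index Contents
insert into Person values (42, '
<book ISBN="1-55860-438-3">
  <section><title>Bad Bugs</title>Nobody loves bad bugs.</section>
  <section><title>Tree Frogs</title>All right-thinking people <bold>love</bold> tree frogs.</section>
</book>')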

24 Primary XML Index
CREATE PRIMARY XML INDEX PersonIdx ON Person (Pdesc)
Assumes typed data. Columns and values are simplified; see VLDB 2004 paper for details.
PK | XID | TAG ID | Node | Type-ID | VALUE | HID
42 | 1 | 1 (book) | Element | 1 (bookT) | null | #book
42 | 1.1 | 2 (ISBN) | Attribute | 2 (xs:string) | 1-55860-438-3 | #@ISBN#book
42 | 1.3 | 3 (section) | Element | 3 (sectionT) | null | #section#book
42 | 1.3.1 | 4 (title) | Element | 2 (xs:string) | Bad Bugs | #title#section#book
42 | 1.3.3 | -- | Text | -- | Nobody loves bad bugs. | #text()#section#book
42 | 1.5 | 3 (section) | Element | 3 (sectionT) | null | #section#book
42 | 1.5.1 | 4 (title) | Element | 2 (xs:string) | Tree frogs | #title#section#book
42 | 1.5.3 | -- | Text | -- | All right-thinking people | #text()#section#book
42 | 1.5.5 | 7 (bold) | Element | 4 (boldT) | love | #bold#section#book
42 | 1.5.7 | -- | Text | -- | tree frogs | #text()#section#book

25 Architectural Blueprint: Indexing
[Diagram: the XML column (binary XML) in table T(id, x) feeds the primary XML index, whose columns are PK, XID, NID, TID, VALUE, LVALUE, HID, xsinil, …]
- Primary XML Index (1 per XML column)
  - Clustered on Primary Key (of table T), XID
- Non-clustered Secondary Indices (n per primary Index)
  - Value Index
  - Path Index
  - Property Index

26 XQuery Optimizations With XML Indices

27 Take-Away: XML Indices
- PRIMARY XML Index: use when there is a lot of XQuery
- FOR VALUE: useful for queries where values are more selective than paths, such as //*[.="Seattle"]
- FOR PATH: useful for path expressions; avoids joins by mapping paths to hierarchical index (HID) numbers. Example: /person/address/zip
- FOR PROPERTY: useful when the optimizer also chooses another index (e.g., on a relational column, or a full-text index), so the row is already known
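A sketch of the index DDL together with the query shapes each secondary index is aimed at. The table PersonDocs, the index names, and the sample row are assumptions. Whether the optimizer actually picks a given index depends on its cost estimates, so the comments describe intent only. The table needs a clustered primary key before the primary XML index can be created.

```sql
CREATE TABLE PersonDocs (
  id   int NOT NULL PRIMARY KEY CLUSTERED,
  pdoc xml NOT NULL
);

CREATE PRIMARY XML INDEX PersonDocs_pxi ON PersonDocs (pdoc);

-- Secondary XML indexes are built on top of the primary XML index.
CREATE XML INDEX PersonDocs_path  ON PersonDocs (pdoc)
  USING XML INDEX PersonDocs_pxi FOR PATH;
CREATE XML INDEX PersonDocs_value ON PersonDocs (pdoc)
  USING XML INDEX PersonDocs_pxi FOR VALUE;
CREATE XML INDEX PersonDocs_prop  ON PersonDocs (pdoc)
  USING XML INDEX PersonDocs_pxi FOR PROPERTY;

INSERT INTO PersonDocs VALUES
  (1, N'<person><name>Janine</name><address><city>Seattle</city><zip>98052</zip></address></person>');

-- Path-shaped predicate: the kind of query FOR PATH is meant for.
SELECT id FROM PersonDocs
WHERE pdoc.exist('/person/address/zip[. = "98052"]') = 1;

-- Value-shaped predicate with an unknown path: the kind FOR VALUE is meant for.
SELECT id FROM PersonDocs
WHERE pdoc.exist('//*[. = "Seattle"]') = 1;

-- Row already located via the relational key: the kind FOR PROPERTY is meant for.
SELECT pdoc.value('(/person/name)[1]', 'nvarchar(50)') AS person_name
FROM PersonDocs
WHERE id = 1;
```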

28 Appendix: XML INDEX
Some requirements and restrictions:
- The user table must have a clustered index on the primary key
- To modify the primary key definition of a table, all XML indexes on the user table must be dropped first
- Primary XML index cannot be created on a computed XML column

29 Session Summary
- SQL Server 2005 provides XQuery and XML DML on XML datatype
  - XQuery subset based on July 2004 WD
  - Typing provided by XML Schema collections on XML datatype
  - Node-based Data Manipulation Language (DML)
  - Integrates with relational processing
- Optimization:
  - Using extended relational algebra and query optimizer
  - Indexing of XML datatype

30 Community Resources
- At PDC
  - DAT Track lounge: I’ll be there daily
- After PDC
  - MSDN dev center: http://msdn.microsoft.com/SQL/2005
  - XML and Databases whitepapers: http://msdn.microsoft.com/XML/BuildingXML/XMLandDatabase/
  - Online WebCasts: http://msdn.microsoft.com/sql/2005/2005webcasts/
  - Newsgroups & Forum: news:microsoft.public.sqlserver.xml and http://forums.microsoft.com/msdn/ShowForum.aspx?ForumID=89
  - My E-mail: mrys@microsoft.com
  - My Weblog: http://www.sqljunkies.com/weblog/mrys
- Please fill out Session Evaluation

31 © 2005 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.

32 Appendix: XML Abilities
- Self-describing: This is an example.
- Complex data
  - Trees, recursive, graph
  - Structured Data: highly regular, homogeneous structures
  - Semi-structured Data: heterogeneous, sparse data
  - Markup Data: documents/content markup
- Document ordering
- Schema/Type system
  - Schema-less
  - Optional Schema: semi-structured, structured
- Extensible
  - Annotations, multiple schemas (late binding)

33 Appendix: XQuery History and Outlook
- Dec 1998: W3C Workshop on Querying XML
- Sep 1999: Start of W3C XQuery WG
- April 2005: XQuery Data Model, Functions and Operators, Syntax and Semantics in final Last Call
- End 2005 (expected): Last Call Working Draft of XQuery Full-Text language
- End 2005 (expected): First Working Draft for Data Manipulation and Transformation Language
- Q2 of 2006 (expected): XQuery (without Full-Text, DML) in recommendation
- During 2006/7 (expected): Full-Text, DML in recommendation

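34 Appendix: XID ORDPATHs
OrdPath is abstracted from a bit array; see SIGMOD 2004 paper for details.
[Diagram: node tree labelled with ORDPATHs. Root: 1. Second level: 1.1, 1.3, 1.5. Third level: 1.3.1, 1.3.3, 1.5.1, 1.5.3, 1.5.5, 1.5.7.]
- insert … as first into /book (the inserted element, with text "Nils", did not survive extraction)
  - The new nodes are labelled 1.2.1 and 1.2.1.1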

35 Appendix: XQuery Execution Plans
SELECT doc FROM XMLdoc WHERE 1 = doc.exist('declare namespace c = "urn:example/customer"; /c:doc/c:customer/c:order/@year[. > 2000]');
- With Primary XML index specified: estimated cost 0.007 (plan reads the Primary XML Index)
- Without Primary XML index specified: estimated cost 1.008 (plan reads the PK Index on XMLdoc)

36 Appendix: XQuery Subset in SQL Server 2005
- Subset of the standard implemented. For example:
  - FOR, WHERE, ORDER BY, RETURN; no LET
  - No user-defined XQuery functions
  - Subset of XQuery built-in functions
  - No general expressions in path expressions
  - No XQuery validation (use XML datatype validation)
  - No expressions on a constructed node
  - No explicit schema import (only implicit)
  - Dynamic errors are mapped to empty sequence
- Some implementation restrictions:
  - Limited support for xs:dateTime and friends, xs:QName, xsi:nil, list of unions
  - No heterogeneous node and value sequences
  - No XQuery joins across different XML instances (combine instances using FOR XML)
  - Wrap XQuery expression into SQL UDF for CHECK constraints, computed columns
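The last restriction can be sketched as follows: XML data type methods go inside scalar UDFs, and the computed column and CHECK constraint call those UDFs. All object names here (udf_OrderCount, udf_HasName, CustomerDocs, ck_has_name) are hypothetical, and GO is the batch separator of client tools such as sqlcmd.

```sql
-- Scalar UDF wrapping value(): counts Order elements in one instance.
CREATE FUNCTION dbo.udf_OrderCount (@c xml)
RETURNS int
AS
BEGIN
  RETURN @c.value('count(/Customer/Order)', 'int');
END;
GO

-- Scalar UDF wrapping exist(): 1 if the instance has a Customer/name element.
CREATE FUNCTION dbo.udf_HasName (@c xml)
RETURNS bit
AS
BEGIN
  RETURN @c.exist('/Customer/name');
END;
GO

CREATE TABLE CustomerDocs (
  id          int PRIMARY KEY,
  cdoc        xml,
  order_count AS dbo.udf_OrderCount(cdoc),                    -- computed column via UDF
  CONSTRAINT ck_has_name CHECK (dbo.udf_HasName(cdoc) = 1)    -- CHECK constraint via UDF
);
GO

INSERT INTO CustomerDocs (id, cdoc) VALUES
  (1, N'<Customer><name>Janine</name><Order><id>42</id></Order></Customer>');

SELECT id, order_count FROM CustomerDocs;
```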

