May 8, 2007 9:20 a.m. – 10:20 a.m. Platform: DB2 for Linux, UNIX and Windows DB2 9: XML Evolution and Revolution Philip K. Gunning Gunning Technology Solutions,

1 May 8, 2007 9:20 a.m. – 10:20 a.m. Platform: DB2 for Linux, UNIX and Windows DB2 9: XML Evolution and Revolution Philip K. Gunning Gunning Technology Solutions, LLC Session: E05

2 Outline: XML in DB2 LUW before DB2 9. Shredding. CLOBs. XML-only databases: TIMBER, Niagara, Natix. Followed by bliss for several years… XML databases. Fundamental differences with relational databases.

3 Outline: Then IBM shook up the database world with DB2 9. Hybrid data server. Extensible optimizer and DB2 9. Why a native XML data type? pureXML™.

4 Outline: pureXML implementation. pureXML key enablers: SQL/XML, XPath/XDM, XQuery, Developer Workbench, XQuery Builder, Explain facility and Visual Explain.

5 Disclaimer: DB2 9 is a registered trademark of IBM Corp. pureXML is a registered trademark of IBM Corp. DB2 9 sample queries and programs are copyrights of IBM Corp. DB2 for z/OS is a registered trademark of IBM Corp. Developer Workbench and Visual Explain are copyrights of IBM Corp.

6 Shredding: Early implementations of XML support in databases used shredding to map XML into columns of relational tables. Mapping + parsing = overhead. Retrieval of whole document or parts. Entire document replaced if update required. Lack of flexibility.
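
The shredding approach can be illustrated with a minimal sketch. It uses Python's standard `sqlite3` and `xml.etree.ElementTree` modules rather than DB2, and the sample document, table names and column names are hypothetical. It shows where the mapping and parsing overhead comes from: every document must be parsed, and each piece must be mapped to a specific table and column.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are illustrative only.
DOC = """<customer id="1001">
  <name>Ann Example</name>
  <phone type="work">555-0100</phone>
  <phone type="home">555-0101</phone>
</customer>"""


def shred(conn, xml_text):
    """Parse one document and map its pieces onto two relational tables."""
    root = ET.fromstring(xml_text)  # parsing cost is paid for every document
    cust_id = int(root.get("id"))
    # The mapping is hard-coded: a change in the XML layout means changing this code.
    conn.execute("INSERT INTO customer(id, name) VALUES (?, ?)",
                 (cust_id, root.findtext("name")))
    conn.executemany(
        "INSERT INTO phone(customer_id, type, number) VALUES (?, ?, ?)",
        [(cust_id, p.get("type"), p.text) for p in root.findall("phone")])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer(id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE phone(customer_id INTEGER, type TEXT, number TEXT)")
shred(conn, DOC)

# Reassembling the document's content requires a join across the shredded tables.
rows = conn.execute(
    "SELECT c.name, p.type, p.number FROM customer c "
    "JOIN phone p ON p.customer_id = c.id ORDER BY p.type").fetchall()
```

Once shredded, the original document no longer exists as a unit, which is one reason the slide lists lack of flexibility as a drawback.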

7 CLOBs: Stored entire XML document as text. High cost of retrieval. Not buffered. Poor search performance and parsing. Lack of flexibility.

8 Key Factors in IBM Approach: “XML and Relational data coexist and complement each other in enterprise solutions” “A successful XML repository requires much of the same infrastructure that already exists in a RDBMS system” “XML query languages have considerable conceptual and functional overlap with SQL” Source: Beyer et al., "DB2 goes hybrid: Integrating native XML and XQuery with relational data and SQL," IBM Systems Journal, Vol. 45, No. 2, 2006.

9 Revolutionary Approach: DB2 9 pureXML framework. DB2 optimizer was extensible. XML native data type enables XML data to be treated natively. The native XML data type enables better performance (less overhead versus legacy methods) via optimization and XML indexes. Industry schemas supported.

10 Fundamental Differences: DB2 9 native XML data type takes advantage of years of relational database research. 20+ years of optimization advancements. Extensive query rewrite plus new rewrites. Uses underlying optimization and storage components. Same or enhanced APIs.

11 pureXML Framework Implementation: Key enablers. Extensible optimizer. XML and SQL integration: XQuery, XDM, XPath, SQL/XML. Development tooling: Developer Workbench, XQuery Builder. Explain support, including Visual Explain.

12 Hybrid SQL/XQuery Compiler (diagram labels): SQL/XML Parser, XQuery Parser, Semantics Checking, Optimizer Phase, Rewrite Phase, Code Generation, Query Plan, QGMX.

13 DB2 9 Hybrid Data Server Architecture (diagram labels): DB2 Client Application, SQL/XML, XQuery, Relational Interface, XSR/Catalogs, XML Interface, DB2 Engine, DB2 Storage, XML, Relational.

14 Tight Integration

15 XQuery Defined: SQL is the query language for relational databases. XQuery is the query language for XML, as defined by the W3C. Built-in support is provided in DB2 9 by the query compiler and built-in XQuery functions.

16 INPUT FUNCTIONS

17 DB2 9 XML Input: SQL INSERT statement. Input to the XML column must be a well-formed XML document, as defined in the XML specification. Clients send XML documents in textual representation, and DB2 uses a Simple API for XML (SAX) parser for well-formedness (“formness”) validation. If XML data type, serialization performed by DB2 implicitly. XMLPARSE function for non-XML data type.
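
A well-formedness check with a SAX parser can be sketched as follows. This uses Python's standard `xml.sax` module as a stand-in and is not DB2's own parser. The helper name `is_well_formed` is hypothetical.

```python
import xml.sax
from xml.sax.handler import ContentHandler


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as a well-formed XML document."""
    try:
        # A do-nothing handler is enough: only the parser's errors matter here.
        xml.sax.parseString(xml_text.encode("utf-8"), ContentHandler())
        return True
    except xml.sax.SAXParseException:
        return False


ok = is_well_formed("<customer><name>Ann</name></customer>")
bad = is_well_formed("<customer><name>Ann</customer>")  # mismatched end tag
```

Well-formedness only concerns XML syntax. Checking a document against a registered schema is a separate validation step.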

18 DB2 9 Annotated XML Schema Decomposition: Data from XML documents is decomposed into relational and XML columns using annotated XML schema decomposition. Data is stored into columns according to annotations contained in XML schema documents. XML Schema Repository (XSR) registration: schemas are registered with a DB2-supplied stored procedure or via the Command Line Processor.

19 DB2 9 XML Input -- IMPORT: Import utility enhanced to support import of XML documents. Validation optional. Schema must be registered in the DB2 XML Schema Repository (XSR) if validation is performed.

20 OUTPUT FUNCTIONS

21 DB2 9 XML Output Functions: db2-fn:xmlcolumn function. Takes a string literal as input that identifies an XML column and returns an XML sequence that consists of all document nodes in the specified column.

22 DB2 9 XML Output Functions: db2-fn:sqlquery function. Used to restrict input to an XQuery by conditions placed on relational columns in the same or related tables. Returns a single column. Based on an SQL fullselect.
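
The difference between feeding a query every document in a column and feeding it only the documents selected by a relational predicate can be emulated outside DB2. The sketch below uses Python's `sqlite3` and `xml.etree.ElementTree` and does not call the actual db2-fn functions. The table, its columns, and the helper `xml_sequence` are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
# Hypothetical table: two relational columns plus one column holding XML text.
conn.execute("CREATE TABLE customer(cid INTEGER, region TEXT, info TEXT)")
conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", [
    (1, "EAST", "<customerinfo><name>Ann</name></customerinfo>"),
    (2, "WEST", "<customerinfo><name>Bob</name></customerinfo>"),
    (3, "EAST", "<customerinfo><name>Cy</name></customerinfo>"),
])


def xml_sequence(fullselect, params=()):
    """Run a single-column fullselect and return its XML values as parsed documents."""
    cur = conn.execute(fullselect, params)
    if len(cur.description) != 1:
        raise ValueError("fullselect must return exactly one column")
    return [ET.fromstring(row[0]) for row in cur]


# Analogous to taking every document in the column as query input.
all_docs = xml_sequence("SELECT info FROM customer ORDER BY cid")

# Analogous to restricting the input with a condition on a relational column.
east_docs = xml_sequence(
    "SELECT info FROM customer WHERE region = ? ORDER BY cid", ("EAST",))
east_names = [doc.findtext("name") for doc in east_docs]
```

Filtering on the relational column first means fewer documents have to be navigated afterwards.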

23 DB2 9 XML Output -- EXPORT: EXPORT utility supports XML data type. XML data stored separately from exported relational data. Details about exported XML represented in main exported file by an XML data specifier (XDS).

24 XQuery Data Model (XDM): The XQuery Data Model (XDM) is used to define an instance of an XDM sequence. An instance of the XDM is a sequence. A sequence is an ordered collection of zero or more items. An item is either an atomic value or a node. Sequence – 48,, (6,7,8), (48,,(6,7,8)) () (an empty sequence), an XML document, 48
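
The sequence rules above can be modeled in a few lines. One further rule from the W3C data model is that sequences are never nested: combining sequences flattens them, and a single item is the same as a sequence of length one. The sketch below models this with Python lists. The function name `xdm_sequence` is hypothetical, and nodes are not modeled.

```python
def xdm_sequence(*items):
    """Build a flat, ordered sequence; nested sequences are flattened."""
    flat = []
    for item in items:
        if isinstance(item, (list, tuple)):
            # A sequence inside a sequence contributes its items in order.
            flat.extend(xdm_sequence(*item))
        else:
            flat.append(item)
    return flat


empty = xdm_sequence()                # the empty sequence
single = xdm_sequence(48)             # one item, a sequence of length one
combined = xdm_sequence(48, (6, 7, 8))
```

Here `combined` has four items, not two, because the inner sequence is merged into the outer one.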

25 DATABASE DESIGN

26 Relational – XML: Relational is highly structured and is represented by well-defined entities and relationships. XML is hierarchical in form, unstructured, and can be very complex. It is represented in a tree format defined by the XPath W3C standard.

27 Relational vs. XML Database Design. Relational: frequency of updates; design is fixed; max performance required; stays relational; meaning outside hierarchy; specific attributes; large fact and dimension tables; RI required. XML: design changes; flexibility desired; not used relationally downstream; only hierarchical; many attributes and only subset applicable; only subset applicable; small dimensions in STAR schema.

28 XML Indexes. Value indexes: path-specific value indexes on XML columns; elements and attributes used in predicates and cross-document joins. Full-text indexes: indexes can be defined on any native XML column; documents can be fully or partially indexed, which enables just certain parts of documents to be subject to full-text search; text index maintained asynchronously via “lazy” update. Regions indexes: connect documents that span multiple pages; created automatically by DB2.
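
The idea of a path-specific value index can be sketched as a map from the values found at one path to the documents that contain them. This is an in-memory illustration using Python's `xml.etree.ElementTree`, not DB2's index structure. The documents, the path and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical documents keyed by a document id.
DOCS = {
    1: "<order><customer>ann</customer><item price='5'>pen</item></order>",
    2: "<order><customer>bob</customer><item price='9'>ink</item></order>",
    3: "<order><customer>ann</customer><item price='2'>cap</item></order>",
}


def build_value_index(docs, path):
    """Index only the values found at one path; other parts stay unindexed."""
    index = defaultdict(list)
    for doc_id, text in sorted(docs.items()):
        root = ET.fromstring(text)
        for node in root.findall(path):
            index[node.text].append(doc_id)
    return index


customer_index = build_value_index(DOCS, "customer")

# A predicate such as customer = 'ann' becomes a lookup instead of a scan.
ann_orders = customer_index.get("ann", [])
```

Because the index covers a single path, a predicate on another path, such as the item price, would need its own index.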

29 XML Storage: Relational data stored in tables and columns. XML data stored in hierarchical, type-annotated tree format. XML document stored separately, outside of the table. XML Data Specifier (XDS) stored in table describes XML document.

30 XML Storage: Documents must be able to span disk pages. A single text node may be larger than a page. Direct node access: it is not feasible to traverse every node (a document could be several gigabytes). Must support existing isolation levels, logging and recovery mechanisms.

31 XML Storage: DB2 uses a structured, type-annotated tree. Stored in binary representation to avoid repeated parsing and validating of the document. Digital signatures preserved. Each node contains its type information. Type information on the document level enables schema evolution: each document in a column can conform to a different schema or to different versions of an evolving schema.

32 XML Storage: Each node contains pointers to parent and children, which supports efficient navigational queries. Path expressions are evaluated directly on the native format on buffered pages, without copying or transforming the data. Extra information stored with each node: type annotation if validated. Each element node has a set of child slots for associated attributes and ordered children.

33 XML Storage: Child slots have hints within them that give an indication of what the child represents. This enables fast navigation across a context node’s set of children without actually visiting each child node. A child node may be on a different page and require I/O. A unique identifier gives each node logical and physical addressability, and can be used in indexing and query evaluation. Large document trees may not fit on one page and can be split into regions via the regions index.
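
The role of parent pointers and child-slot hints can be illustrated with a small in-memory tree. This is a sketch under stated assumptions, not DB2's on-disk format: the classes `Node` and `ChildSlot`, the `visits` counter, and the choice of the child's name as the hint are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class ChildSlot:
    hint: str      # here: the child's name, kept in the parent's slot
    node: "Node"   # in a real store this could point to another page


@dataclass(eq=False)
class Node:
    name: str
    text: str = ""
    parent: Optional["Node"] = None
    slots: List[ChildSlot] = field(default_factory=list)
    visits: int = 0  # counts how often this node was actually dereferenced

    def add_child(self, child: "Node") -> "Node":
        child.parent = self
        self.slots.append(ChildSlot(child.name, child))
        return child


def children_named(context: Node, name: str) -> List[Node]:
    """Return matching children, deciding from the slot hints alone."""
    matches = []
    for slot in context.slots:
        if slot.hint == name:
            slot.node.visits += 1  # only matching children are visited
            matches.append(slot.node)
    return matches


root = Node("customer")
name_node = root.add_child(Node("name", "Ann"))
work = root.add_child(Node("phone", "555-0100"))
home = root.add_child(Node("phone", "555-0101"))

phones = children_named(root, "phone")
```

After the call, `name_node.visits` is still zero: the non-matching child was skipped using only the hint in its parent's slot, which is the saving that matters when the child would otherwise cost a page read.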

34 BUILDING APPLICATIONS

35 Key DB2 9 XML Enablers: Build with Developer Workbench. Test with Developer Workbench. Deploy and maintain with Developer Workbench. Replaces former Development Center. Migration support for existing documents. Eclipse framework-based tool.

36 Key DB2 9 XML Enablers: Developer Workbench. Separate download at http://www-306.ibm.com/software/data/db2/ad/

37 XML Sample Schema Definition

38 XML-XQuery SP

39 Visual Explain Support

41 XML Schema Definition

42 XPath Example

43 Summary: pureXML™ framework. SQL/XML. XQuery/XPath. XDM and XSR. XML storage and XML indexes. Developer Workbench: build, test, deploy and maintain! Additional features coming in DB2 9 for z/OS.

44 Thanks! Philip K. Gunning, Gunning Technology Solutions, LLC. pgunning@gunningts.com. Session: E5. DB2 9: XML Evolution and Revolution

