Presentation is loading. Please wait.

Presentation is loading. Please wait.

XML Schemas in Oracle XML DB Ravi Murthy Sandeepan Banerjee Oracle Corporation.

Similar presentations


Presentation on theme: "XML Schemas in Oracle XML DB Ravi Murthy Sandeepan Banerjee Oracle Corporation."— Presentation transcript:

1 XML Schemas in Oracle XML DB Ravi Murthy Sandeepan Banerjee Oracle Corporation

2 Talk Overview  Why XML in DB ?  Background – XML Schema – Object-Relational DB  XML Schema Support in Oracle XML DB – XML Storage – XML Query – XML Update  Future directions and Conclusions

3 Why XML in DB ?  Trends in industry – More XML content being generated – More applications dealing with (structured) data and (semi- structured) documents – Need for unified management for all kinds of data – XML Schema language provides strong typing  Many benefits to storing XML in DB – Better Queriability – Optimized Updates – Stronger Validation  Fidelity of XML very important

4 Oracle XML DB Overview  Native XML data type  Supports W3C XML Schema data model  XML/SQL Duality  Support for XML standards – Namespaces, XPath, XSLT, SQL/XML  High performance XML repository – Hierarchical File System Abstraction – Protocols : FTP, HTTP/WebDAV – Access Control (ACL) – Versioning

5 XML Schema  W3C Recommendation  Large number of built-in scalar data types  Simple type definitions – Length, pattern and other constraints  Complex type definitions – sequence / choice / all – Mixed content  Extensible – Derivation by extension and restriction – Substitution Groups – Wildcards  Gradually replacing the traditional DTD  Commonly used as a validation mechanism

6 Object Relational DB Basics  Object types  Collection types  Object References  LOBs XML Schema Construct Object Construct Complex typeObject type Local complex typeEmbedded object type Complex type with maxOccurs > 1 Collection type Derived complex type Subtype

7 XMLType  Native data type for XML  Used to define columns of tables and views, arguments to stored procedures, etc.  XML specific methods and operators for – Querying and extracting XML using XPath – Transforming XML using XSLT – Validating XML using XML Schema  Multiple Storage Options – Unstructured Storage in CLOB – Structured Storage into object-relational rows and columns – Hybrid Storage  Maintains application transparency to physical storage choice

8 XML DB and XML Schema  XML Schema controls all aspects of processing – Storage mappings – In-memory representations – Language Bindings  XML Schema Registration Process – Associates XML Schema with URL – Generates Object types – Creates default tables  XMLType column can be constrained to a global element of registered schema

9 XML Schema Example <schema targetNamespace=“http://www.oracle.com/PO.xsd” xmlns:po=“http://www.oracle.com/PO.xsd” elementFormDefault=”qualified” xmlns="http://www.w3.org/2001/XMLSchema">

10 XML Schema Example (contd)

11 Generated Object Types TYPE "Item_T" (part varchar2(1000), price number); TYPE "Item_COLL" AS VARRAY(1000) OF "Item_T"; TYPE "PurchaseOrderType_T" (ponum number, company varchar2(100), item Item_COLL);

12 Generated Tables TABLE po_tab OF XMLTYPE XMLSCHEMA “http://www.oracle.com/PO.xsd" ELEMENT "PurchaseOrder“ VARRAY(item) STORE AS item_tab;

13 XML Document Example <PurchaseOrder xmlns="http://www.oracle.com/PO.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.oracle.com/PO.xsd http://www.oracle.com/PO.xsd"> 1001 Oracle Corp 9i Doc Set 2550 8i Doc Set 350

14 XML Storage – PO_TAB and LINE_TAB Row IDponumcompany 11001Oracle Corp Parent Row ID Array Indexpartprice 119i Doc Set2250 128i Doc Set350

15 Structured Storage  Attributes and single-valued elements – Stored as columns in single row – SQL data types correspond to XML Schema types – SQL constraints correspond to XML Schema constraints  Multi-valued elements (collections) stored in separate nested tables – One row per item in collection – Nested table row stores parent key – Array Index column stores the position information  Number column – uses full range of floating points  Elements inserted in the middle of the collection get (previous_array_index + next_array_index)/2  Supports multiple levels of nesting – Embedded object types – Embedded collection types with multiple nested tables

16 DOM Fidelity  Structured storage guarantees DOM fidelity – No whitespace fidelity  System binary attribute in object types – SYS_XDBPD$  PD attribute stores non-relational information – Ordering of elements – Comments – Processing Instructions – Namespace declarations – Prefix information – Mixed content – text nodes that are intermixed with elements are stored in the system column

17 XDB Attributes in XML Schema  Mapping details captured as new attributes within the XML Schema – Oracle attributes use namespace : http://xmlns.oracle.com/xdb http://xmlns.oracle.com/xdb  Input XML Schema does not need to be annotated – Default assumptions for all XDB attributes  Input XML Schema can explicitly specify XDB attributes – SQLName : Name of column or type attribute – SQLType : Name of object type – SQLCollType : Name of collection type – MaintainOrder : Permits turning off order maintenance

18 Hybrid Storage of XML  Two ends of storage spectrum – Store entire content in a single LOB – Full shredding of XML data  Hybrid Storage Options – Combination of shredding and LOB storage – xdb:SQLType=“CLOB” applied to – CLOB storage ideal for fragments that are not intended for query or partial updates – Example : XHTML content embedded within resource descriptors (metadata)

19 Querying XMLType  XPath based operators  existsNode – Boolean operator – Checks for existence of node identified by XPath  extract – Extracts a fragment identified by XPath  extractValue – Retrieves the raw value of leaf node identified by XPath  Namespace Aware  ANSI SQL/XML Standards effort

20 Query Rewrite  Automatic rewrite of XPaths during query compilation  Rewritten query directly accesses underlying relational columns  Introduces joins with nested tables  Enables use of indexes

21 Query Rewrite - Example  Original Query SELECT extractValue(value(p), '/PurchaseOrder/PONum') FROM po_tab p WHERE existsNode(value(p), '/PurchaseOrder[Company=Oracle]') = 1;  Rewritten Query SELECT p.ponum FROM po_tab p WHERE p.company = 'Oracle';

22 Query Rewrite Examples XPath ExpressionRewrite Logic Simple XPath expressions  /PurchaseOrder/Company  /PurchaseOrder/@poDate  Traversals of object attributes  1 or more levels Collection traversals  /PurchaseOrder/LineItem/Part  Joins with appropriate nested tables Predicates  [Company=“Oracle”]  Relational predicates List indexes  Lineitem[1]  Operator to access i-th item of collection  Uses index on array_index column

23 Updating XMLType  Updating entire XML document  Partial Update – Uses UpdateXML() operator – XPath identifies element or attribute to be updated – New value for the updated node is specified – Rewritten to directly update underlying column(s)  Similar mechanisms for – Inserting new nodes – Deleting node(s)

24 Update Rewrite - Example  Original Statement UPDATE po_tab p SET value(p) = updatexml(value(p), '/PurchaseOrder/PONum/text()', 9999) WHERE existsNode(value(p), ‘/PurchaseOrder[Company=Oracle]’) = 1;  Rewritten Statement UPDATE po_tab p SET p.ponum = 9999 WHERE p.Company = ‘Oracle’;

25 XMLType Views  Provide an XML view of relational data  Good evolutionary strategy – New XML apps on XML abstraction of existing data  SQL/XML Standard operators used to generate XML  Views can generate schema based XML  Insert / Update / Delete operations via ‘Instead of’ Triggers  Queries over XML views are rewritten to directly access underlying relational columns

26 XMLType View - Example CREATE VIEW po_view of XMLTYPE XMLSCHEMA "po.xsd" ELEMENT "PurchaseOrder" AS SELECT XMLElement("PurchaseOrder", XMLForest(p.ponum "PONum", p.company "Company"), (SELECT XMLAGG( XMLElement("Item", XMLForest(i.part "Part", i.price "Price")) FROM items_rel_tab i WHERE i.po_id = p.id)) FROM po_rel_tab p;

27 Complex XML Schemas  Cyclic Definitions – Object types created with references – Table row contains reference to other rows stored in same or different table – Keys to document and parent rows stored in nested rows  Complex type derivation – Mapped to object type inheritance  Wildcards – Mapped to CLOB attributes

28 Indexing XMLType  Multiple index types  B-Tree and bitmap indexes  Function-based indexes – Create index on specific XPath expressions  Text indexes – Inverted lists provide section-based search – Also support keyword based search within textual content

29 Future Directions  Optimize XML Query support – Translate XML Query to SQL – Works against distributed databases using Oracle connectivity solutions  Optimize support for highly variable documents – New storage and indexing techniques

30 Conclusions  Oracle XML DB is a robust platform for building XML applications  Strong support for XML Schema based storage, query, indexing and updates  Structured storage maintains XML fidelity  Support for standards such as SQL/XML, XPath and XSLT  Powerful XML/SQL Duality


Download ppt "XML Schemas in Oracle XML DB Ravi Murthy Sandeepan Banerjee Oracle Corporation."

Similar presentations


Ads by Google