XML and databases Chap. 12. Databases Today Data today: Structured - Info in databases – Data organized into chunks, similar entities groups together.

1 XML and databases Chap. 12

2 Databases Today
Data today:
Structured – info in databases
  – Data organized into chunks; similar entities grouped together
  – Descriptions for entities in a group have the same format, length, etc.
Semi-structured – data has a certain structure, but not all items are identical
  – Schema info may be mixed in with data values
  – Similar entities grouped together, but may have different attributes
  – Self-describing data
  – May be displayed as a graph
Unstructured data – data can be of any type, may have no format or sequence
  – Web pages in HTML
  – Video, sound, images

3 HTML

5 Why XML not HTML?
HTML is not suitable for specifying structured data from databases
  – Does not contain schema information; it is unstructured and specifies only how to display information
Want: data source – a database with a Web interface
Specify content and format of Web pages with HTML
  – Use (predefined) HTML tags for formatting Web documents

6 Types of XML documents
1. Data-centric – documents have small data items following a specific structure
2. Document-centric – documents with large amounts of text (little structured data)
3. Hybrid – documents with both structured and unstructured data

7 XML documents and DBMS
Options:
  – Use a DBMS to store XML documents as text
      – Works if the DBMS has a module for document processing
  – Use a DBMS to store XML document contents as data elements
      – Works if all documents have the same structure; map the XML schema to the DB schema
  – Design a special DBMS for storing XML data
      – New type of DBMS designed for XML, e.g. based on the hierarchical model
  – Create XML documents from preexisting relational databases and store them in a DB

8 XML
XML – standard for structuring and exchanging data over the Web
Basic object is the XML document
Structuring concepts:
  – Elements (tags)
  – Attributes – describe properties and characteristics of elements (tags)
  – Also entities, identifiers, references

9 Elements
Elements are identified by:
  – Start tag
  – End tag
Simple elements – contain data values
Complex elements – constructed hierarchically from other elements
  – XML is therefore called a tree or hierarchical model
  – No limit on the number of nesting levels
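A minimal Python sketch of the element tree described above, using only the standard library. The <project> document and all of its tag and attribute names are made up for illustration; they do not come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; names are illustrative only.
DOC = """\
<project id="p1">
  <name>ProductX</name>
  <worker>
    <ssn>123456789</ssn>
    <hours>32.5</hours>
  </worker>
</project>
"""

def describe(element, depth=0):
    """Return one line per element; indentation shows the nesting depth."""
    children = list(element)
    kind = "complex" if children else "simple"
    attrs = f" {dict(element.attrib)}" if element.attrib else ""
    value = "" if children else f" = {element.text!r}"
    lines = [f"{'  ' * depth}{element.tag} ({kind}){attrs}{value}"]
    for child in children:
        lines.extend(describe(child, depth + 1))
    return lines

root = ET.fromstring(DOC)
outline = describe(root)
print("\n".join(outline))
```

Simple elements carry a data value, complex elements carry only children, and the attribute id is reported as a property of its element.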

10 Well-formed
An XML document is well formed if:
  – It starts with an XML declaration indicating the version and other relevant attributes (optional in XML 1.0, but recommended)
  – It has a single root element
  – Every element has a matching pair of start/end tags nested within its parent element
  – It is syntactically correct
  – It can be processed to create an internal tree
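One way to try these conditions out is to hand a document to a parser and see whether it can build the internal tree. The sketch below assumes Python's standard xml.etree.ElementTree parser; the helper name is_well_formed and the sample strings are invented for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """True if the parser can build a tree from text, False otherwise."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

good = '<?xml version="1.0"?><a><b>1</b></a>'
two_roots = "<a/><b/>"            # violates the single-root rule
bad_nesting = "<a><b></a></b>"    # end tags do not match their start tags

print(is_well_formed(good), is_well_formed(two_roots), is_well_formed(bad_nesting))
```

This checks well-formedness only; it says nothing about validity against a DTD or XML Schema.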

11 Semi-structured data can be schemaless or have a schema

12 Semi-structured Schemaless
Semi-structured data without a schema corresponds to a schemaless XML document
Want to use the info in the XML document to determine the schema of the database
To do this, parse the document to create a tree structure of the data
A schemaless XML document is standalone (no corresponding file specifying a schema)
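As a rough illustration of deriving structure from a schemaless document, the sketch below parses the document into a tree and records which child elements were seen under each element name. The function name infer_structure and the sample document are assumptions made for this example; a real schema-inference step would also need types, cardinalities and attributes.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def infer_structure(xml_text):
    """Map each element name to the sorted child element names seen under it."""
    children_of = defaultdict(set)
    for parent in ET.fromstring(xml_text).iter():
        children_of[parent.tag].update(child.tag for child in parent)
    return {tag: sorted(kids) for tag, kids in children_of.items()}

# Hypothetical schemaless document: the two <project> elements differ.
SAMPLE = (
    "<projects>"
    "<project><name>X</name><budget>1</budget></project>"
    "<project><name>Y</name></project>"
    "</projects>"
)
structure = infer_structure(SAMPLE)
print(structure)
```

Note that budget shows up as a possible child of project even though only one of the two projects has it, which is typical of semi-structured data.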

15 Semi-structured schemaless: Parsing
The whole document must be parsed beforehand to generate the tree
Each time the data is accessed, the document must be parsed to create the tree structure of the data
Parsing slows down processing
Sets of API functions for parsing and manipulating the tree:
  – DOM (Document Object Model) – parses the entire document into main memory
  – SAX (Simple API for XML) – allows processing XML documents on the fly (also good for streaming XML documents)
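The difference between the two parsing models can be sketched with Python's standard xml.dom.minidom and xml.sax modules. The sample document and the handler class EmployeeHandler are invented for illustration.

```python
import xml.sax
from xml.dom import minidom

DOC = "<company><employee>Smith</employee><employee>Wong</employee></company>"

# DOM: parse the whole document into an in-memory tree, then navigate it.
dom = minidom.parseString(DOC)
dom_names = [node.firstChild.data for node in dom.getElementsByTagName("employee")]

# SAX: react to events while the document streams past; no tree is built.
class EmployeeHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.names = []
        self._buffer = None  # collects text only while inside <employee>

    def startElement(self, name, attrs):
        if name == "employee":
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "employee":
            self.names.append("".join(self._buffer))
            self._buffer = None

handler = EmployeeHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_names = handler.names
print(dom_names, sax_names)
```

Both approaches yield the same names; the SAX version keeps only the current element's text in memory, which is why it suits large or streamed documents.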

16 Semi-structured not Schemaless
To specify the structure of semi-structured data that is not schemaless, use:
  – DTD (Document Type Definition)
  – XML Schema
Once parsed, an XML document can be validated against:
  – a DTD file, or
  – an XML schema file
Valid means the elements must follow the structure and types specified in the separate schema

17 XML DTD
First specify the root tag
The parentheses following an element name can contain:
  – Type – names of other elements (children)
  – (#PCDATA) means the element is a leaf node (parsed character data, i.e. a string)
  – | indicates alternatives (either one or the other)
  – * – element can be repeated 0 or more times
  – + – element can be repeated 1 or more times
  – ? – element is optional (appears zero or one times)
  – If no symbol, element must appear exactly once
  – Parentheses can be nested
Fig. 12.4
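The occurrence symbols behave much like regular-expression quantifiers. The sketch below makes that analogy concrete by translating a simplified DTD content model into a Python regex over the sequence of child tag names. This is an illustration only, not a DTD validator: it handles element names, commas, |, *, + and ? but not #PCDATA, and the helper names and the sample model (name, phone*, email?) are assumptions, not taken from Fig. 12.4.

```python
import re
import xml.etree.ElementTree as ET

NAME = re.compile(r"[A-Za-z_][\w.\-]*")

def content_model_to_regex(model):
    """Turn e.g. "(name, phone*, email?)" into a regex over "tag;" tokens."""
    compact = re.sub(r"\s+", "", model)
    # Each element name becomes a group matching that tag plus a terminator.
    body = NAME.sub(lambda m: f"(?:{re.escape(m.group(0))};)", compact)
    # A comma means "followed by", which is plain concatenation in a regex.
    return re.compile(body.replace(",", ""))

def conforms(element, pattern):
    """Check the direct children of element against the compiled model."""
    child_sequence = "".join(f"{child.tag};" for child in element)
    return pattern.fullmatch(child_sequence) is not None

pattern = content_model_to_regex("(name, phone*, email?)")
ok = ET.fromstring("<person><name/><phone/><phone/></person>")
wrong_order = ET.fromstring("<person><phone/><name/></person>")
print(conforms(ok, pattern), conforms(wrong_order, pattern))
```

The second document fails because a DTD sequence fixes the order of the children.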

19 DTD cont’d
To check for conformance to a DTD, add a document type (DOCTYPE) declaration that references the DTD to the XML document
Could also include the DTD at the beginning of the XML document
Problems with DTD
  – Data types are not very general
  – Special syntax requires specialized processors
  – Elements must always follow the specified ordering (no unordered elements)

20 XML Schema
Alternative to (evolution from) DTD
Standard for specifying the structure of XML documents
xsd – XML Schema Definition

21 XML Schema
Uses the same syntax rules as XML documents, so the same processors can be used on both
Could display the entire Company database as a single document
Could store the DB in XML format instead of in a relational DB
Fig. 12.5

25 Features of XML Schema
1) To identify the XML Schema language elements used, specify a file at a Web site location
Each such definition is an XML namespace: http://www.w3.org/2001/XMLSchema
The namespace is assigned to the variable xsd, and this variable is used as a prefix to all XML Schema commands

26 Features cont’d
2) Annotations, documentation and language
  – Used for providing comments and other descriptions; e.g. “en” means English
3) Elements and types
  – xsd:element – specifies an element name
  – xsd:complexType – for elements that have child elements
  – xsd:sequence – ordered set of element types, e.g. dept, employee, etc.

27 Features cont’d
4) First-level elements are specified in element tags
5) Element type, minimum and maximum occurrences: minOccurs, maxOccurs
6) Keys
  – xsd:key – primary key
  – xsd:unique – uniqueness constraint; must give the constraint a name
  – xsd:keyref – foreign keys

28 Features cont’d
7) Structures of complex elements – complex types
8) Composite (compound) attributes – complex types
XSD 1.0 XSD 2.0 XSD 3.0 XSD 4.0
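Because an XML Schema is itself an XML document, the same parser can read it. The sketch below assumes a small hypothetical schema (the company, department and employee declarations are invented, not the one in Fig. 12.5) and uses the namespace URI to find every xsd:element together with its occurrence constraints. Python's standard library can inspect a schema this way, but it does not validate documents against one.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema for illustration only.
SCHEMA = """\
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:annotation>
    <xsd:documentation xml:lang="en">Sample company schema</xsd:documentation>
  </xsd:annotation>
  <xsd:element name="company">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="department" type="xsd:string"
                     minOccurs="0" maxOccurs="unbounded"/>
        <xsd:element name="employee" type="xsd:string"
                     minOccurs="1" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
"""

def declared_elements(schema_text):
    """List (name, minOccurs, maxOccurs) for each xsd:element, in document order."""
    root = ET.fromstring(schema_text)
    return [
        (el.get("name"), el.get("minOccurs"), el.get("maxOccurs"))
        for el in root.iter(f"{{{XSD_NS}}}element")
    ]

elements = declared_elements(SCHEMA)
print(elements)
```

The parser resolves the xsd prefix to the namespace URI, so the lookup works regardless of which prefix a schema author chose.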

29 To Query
XPointer – specifies a position in an XML document so that other documents can link to it
XPath – query language for selecting nodes from an XML document
  – Addresses parts of an XML document
  – Language mainly consists of location paths and expressions
  – Functionality: facilities for manipulating strings, numbers and booleans

30 To Query: Use XPath (XML Path Language) to retrieve data
XPath – expression language, based on a tree representation of the XML document
Small query language
Provides the ability to navigate around the tree
Addresses specific parts of an XML document
Provides a common syntax and shared model between XPointer and XSLT (Extensible Stylesheet Language Transformations)

31 XPath location paths and expressions
A location path is e.g. child::para[position()=1]
XPath expressions
  – Return a collection of element nodes that satisfy the patterns specified in the expression
  – Names with qualifier conditions
  – Separators:
      / means the tag must appear as a direct child of the previous (parent) tag
      // means the tag can appear as a descendant of the previous tag at any level

32 XPath
To access the whole XML document: doc("www.company.com/info.xml")/company
  /company/department
  //employee[employeeSalary gt 70000]/employeeName
  /company/employee[employeeSalary gt 70000]/employeeName
  /company/project/projectWorker[hours ge 20.0]
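The paths above can be approximated in Python with xml.etree.ElementTree, which implements only a subset of XPath (child and descendant steps, but no numeric comparisons such as gt or ge). In this sketch the comparison predicates are therefore applied in plain Python, and the company document is a made-up sample that merely reuses the element names from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical data; only the element names follow the slide.
COMPANY_XML = """\
<company>
  <department><departmentName>Research</departmentName></department>
  <employee>
    <employeeName>Smith</employeeName>
    <employeeSalary>80000</employeeSalary>
  </employee>
  <employee>
    <employeeName>Wong</employeeName>
    <employeeSalary>40000</employeeSalary>
  </employee>
  <project>
    <projectWorker><hours>32.5</hours></projectWorker>
    <projectWorker><hours>10.0</hours></projectWorker>
  </project>
</company>
"""

root = ET.fromstring(COMPANY_XML)  # root is the <company> element

# /company/department: "/" selects direct children only.
departments = root.findall("./department")

# //employee: ".//" selects descendants at any depth.
employees = root.findall(".//employee")

# [employeeSalary gt 70000]/employeeName, with the predicate done in Python.
high_paid = [
    e.findtext("employeeName")
    for e in employees
    if float(e.findtext("employeeSalary")) > 70000
]

# /company/project/projectWorker[hours ge 20.0]
busy_workers = [
    w for w in root.findall("./project/projectWorker")
    if float(w.findtext("hours")) >= 20.0
]
print(len(departments), high_paid, len(busy_workers))
```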

33 XPath Language
Provides a common syntax and shared model between XPointer and XSLT
  – XSLT – functional, domain-specific language with no state
  – XSLT – language for creating a new document by transforming XML data into other formats (human readable), e.g. XML data into HTML, plain text, PDF
  – XSLT – describes how files encoded in XML are to be formatted or transformed
With XSLT one can transform a document from XML to XML, from XML to HTML, etc.
  – XPath is used to query elements

34 XSLT Language
Uses XPath to query elements (select= )
Uses XSLT to specify the results of the transformation (xsl: )
XSLT template:

38 XQuery
No other query languages besides XPath and XSLT until XQuery
  – XQuery (like SQL) queries data, using XPath expressions
Based on SQL-like FLWOR expressions, which also support joins
For, Let, Where, Order by, Return

39 XQuery – querying in XML
  FOR
  LET
  WHERE
  ORDER BY
  RETURN

40 FLWOR
For and Let clauses can appear any number of times and in any order
Where and Order by are optional
Return is always needed
Can be nested
Can be the argument to a function (e.g. count(), max())

41 XQUERY examples (Fig. 12.7)
  FOR $x IN doc("www.company.com/info.xml")//employee[employeeSalary gt 70000]/employeeName
  RETURN $x/firstName, $x/lastName

  FOR $x IN doc("www.company.com/info.xml")
  WHERE $x/employeeSalary gt 70000
  RETURN $x/employeeName/firstName, $x/employeeName/lastName

42 XQUERY examples
  FOR $x IN doc("www.company.com/info.xml")/company/project[projectNumber=5]/projectWorker,
      $y IN doc("www.company.com/info.xml")/company/employee
  WHERE $x/hours gt 20.0 AND $y/ssn = $x/ssn
  RETURN $y/employeeName/firstName, $y/employeeName/lastName, $x/hours
What does this do?
What do you think of XQuery?

43 Another Example
  let $maxCredit := 3000
  let $overdrawnCustomers := //customer[overdraft > $maxCredit]
  return count($overdrawnCustomers)

  for $v in //video
  for $a in //actor
  where $v/actorRef = $a/@id
  order by $a, $v/year
  return concat($a, ":", $v/title)
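To see what the second query computes, here is a rough Python analogue of its FLWOR clauses, with each clause marked in a comment. The catalog document is made-up sample data, and the sketch assumes each video has exactly one actorRef and each actor element holds only its name as text.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog; the actors are deliberately out of order.
CATALOG = """\
<catalog>
  <actor id="a2">Baker</actor>
  <actor id="a1">Adams</actor>
  <video><title>Delta</title><year>1990</year><actorRef>a2</actorRef></video>
  <video><title>Gamma</title><year>1985</year><actorRef>a1</actorRef></video>
  <video><title>Beta</title><year>1979</year><actorRef>a1</actorRef></video>
</catalog>
"""
root = ET.fromstring(CATALOG)

matches = [
    (a.text, int(v.findtext("year")), v.findtext("title"))
    for v in root.iter("video")                 # for $v in //video
    for a in root.iter("actor")                 # for $a in //actor
    if v.findtext("actorRef") == a.get("id")    # where $v/actorRef = $a/@id
]
matches.sort(key=lambda m: (m[0], m[1]))        # order by $a, $v/year
result = [f"{actor}:{title}" for actor, _, title in matches]  # return concat(...)
print(result)
```

The two nested for clauses form a join between videos and actors, which is the role FLWOR plays where SQL would use a join condition.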

44 SQL/XML
How are the giants of the DB market incorporating XML?
  – Oracle supports SQL/XML (as does IBM)
  – These are extensions of SQL
  – Microsoft has a proprietary SQLXML for SQL Server
  – SQLXML is not the same as SQL/XML

45 SQL/XML
  – XML Publishing Functions
  – The XML Datatype
  – Mapping Rules

46 XML Publishing Functions
  – xmlelement() – Creates an XML element, allowing the name to be specified.
  – xmlattributes() – Creates XML attributes from columns, using the name of each column as the name of the corresponding attribute.
  – xmlroot() – Creates the root node of an XML document.
  – xmlcomment() – Creates an XML comment.
  – xmlpi() – Creates an XML processing instruction.
  – xmlparse() – Parses a string as XML and returns the resulting XML structure.
  – xmlforest() – Creates XML elements from columns, using the name of each column as the name of the corresponding element.
  – xmlconcat() – Combines a list of individual XML values to create a single value containing an XML forest.
  – xmlagg() – Combines a collection of rows, each containing a single XML value, to create a single value containing an XML forest.
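The effect of these functions can be imitated outside the database. The sketch below is a Python emulation, not SQL/XML itself: it reads rows from an in-memory SQLite table and builds the kind of XML that xmlelement(), xmlattributes(), xmlforest() and xmlagg() would produce together. The customer table, its columns and its rows are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id TEXT PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [("c1", "Ann", "Oslo"), ("c2", "Bob", "Rome")],
)

def row_to_element(row_id, name, city):
    # Plays the role of xmlelement(name "customer", xmlattributes(id), ...).
    elem = ET.Element("customer", {"id": row_id})
    # Plays the role of xmlforest(name, city): one child element per column.
    for column, value in (("name", name), ("city", city)):
        ET.SubElement(elem, column).text = value
    return elem

# Plays the role of xmlagg(): collect one XML value per row under a wrapper.
root = ET.Element("customers")
for row in conn.execute("SELECT id, name, city FROM customer ORDER BY id"):
    root.append(row_to_element(*row))

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

In a database that supports SQL/XML the same result would come from a single SELECT, with no application code involved.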

47 Another XML Example

48 SQL/XML I11 Jane I22 Niko I33 Mick

52 XML Datatype
The XML type also plays a second important role: relational databases now routinely store XML in individual columns.
The XML type provides a standard type for such columns, which is useful both in SQL and in JDBC.

53 XML Datatype

54 SQL/XML Mapping Rules
  – Mapping SQL character sets to Unicode.
  – Mapping SQL identifiers to XML Names.
  – Mapping SQL data types (as used in SQL-schemas to define SQL-schema objects such as columns) to XML Schema data types.
  – Mapping values of SQL data types to values of XML Schema data types.
  – Mapping an SQL table to an XML document and an XML Schema document.
  – Mapping an SQL schema to an XML document and an XML Schema document.
  – Mapping an SQL catalog to an XML document and an XML Schema document.
  – Mapping Unicode to SQL character sets.
  – Mapping XML Names to SQL identifiers.

55 XQuery vs. SQL/XML

56 Oracle XML DB
Documentation for Oracle XML DB
Some of the features are:
  – XML DB is not a separate server, but a group of technologies
  – Can utilize unstructured and structured data
  – Features: XMLType, DOM fidelity, XML schema, XPath search, XML indexes
  – Can even generate XML from an Oracle DB

57 Oracle XML
Creating a table of XMLType:
  CREATE TABLE XMLTABLE OF XMLType;
Creating a table with an XMLType column:
  CREATE TABLE Example1 (
    KEYVALUE  varchar2(10) primary key,
    XMLCOLUMN xmltype
  );
existsNode() finds a node matching the XPath expression
extractValue() is the same as extract() except that it returns the value without the XML element tags; the XPath expression must identify a single element

58 Oracle and XML
A schema is created for the XML when using SQL*Plus
  – A DTD is not needed, but you can register one; XSD serves as a more generic form of DTD (info)
  Create table Company of XMLType;
The rest of the definition
The same fields do not have to be specified in every element
Use: set long 500 to show all values in the table

59
  SELECT extractValue(OBJECT_VALUE, '/Company/Employee/Fname') FNAME
  FROM Company;

60
  SELECT extractValue(OBJECT_VALUE, '/Company/Employee/Fname') FNAME
  FROM Company
  WHERE existsNode(OBJECT_VALUE, '/Company/Employee[Salary < 40000]') = 1;

61 Examples in this section are based on the following PurchaseOrder XML document: <PurchaseOrder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://www.oracle.com/xdb/po.xsd"> ADAMS-20011127121040988PST SCOTT 2002-03-31 Julie P. Adams ADAMS R20 Julie P. Adams Redwood Shores, CA 94065 650 506 7300 Ground

62 The Ruling Class Diabolique 8 1/2

63 existsNode()
existsNode() finds a node matching the XPath expression
Given this sample XML document, the following existsNode() operators return true (1).
  SELECT existsNode(value(X), '/PurchaseOrder/Reference')
  FROM XMLTABLE X;

  SELECT existsNode(value(X), '/PurchaseOrder[Reference="ADAMS-20011127121040988PST"]')
  FROM XMLTABLE X;

64 extractValue()
Example 3-7 Valid Uses of extractValue()
  SELECT extractValue(value(X), '/PurchaseOrder/Reference')
  FROM XMLTABLE X;
Returns the following:
  EXTRACTVALUE(VALUE(X),'/PURCHASEORDER/REFERENCE')
  --------------------------------------------------
  ADAMS-20011127121040988PST

65 extractValue()
Non-Valid Uses of extractValue()
  SELECT extractValue(value(X), '/PurchaseOrder/LineItems/LineItem/Description')
  FROM XMLTABLE X;
  -- FROM XMLTABLE X;
  --      *
  -- ERROR at line 3:
  -- ORA-19025: EXTRACTVALUE returns value of only one node

66 extract()
Using extract() to Return an XML Fragment
The following extract() statement returns an XMLType that contains an XML document fragment containing the occurrences of the Description node that match the specified XPath expression.
Note: In this case the XML is not well formed, as it contains more than one root node.
  set long 20000
  SELECT extract(value(X), '/PurchaseOrder/LineItems/LineItem/Description')
  FROM XMLTABLE X;
  -- This returns:
  -- EXTRACT(VALUE(X),'/PURCHASEORDER/LINEITEMS/LINEITEM/DESCRIPTION')
  -- ------------------------------------------------------------------
  -- <Description>The Ruling Class</Description>
  -- <Description>Diabolique</Description>
  -- <Description>8 1/2</Description>
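The difference between the three operators can be mimicked in Python. The functions below are analogues written for this illustration, not Oracle's API: exists_node() reports whether a path matches, extract_value() returns the text of exactly one node and refuses multiple matches, and extract() returns the matching nodes as an XML fragment. The document structure is assumed from the paths used on the slides, and only simple absolute child paths are handled.

```python
import xml.etree.ElementTree as ET

# Structure assumed from the paths on the slides; not the full Oracle sample.
PO = """\
<PurchaseOrder>
  <Reference>ADAMS-20011127121040988PST</Reference>
  <LineItems>
    <LineItem><Description>The Ruling Class</Description></LineItem>
    <LineItem><Description>Diabolique</Description></LineItem>
    <LineItem><Description>8 1/2</Description></LineItem>
  </LineItems>
</PurchaseOrder>
"""

def _select(doc, path):
    """Resolve an absolute child path such as /PurchaseOrder/Reference."""
    root_name, _, rest = path.lstrip("/").partition("/")
    if doc.tag != root_name:
        return []
    return doc.findall(rest) if rest else [doc]

def exists_node(doc, path):
    return 1 if _select(doc, path) else 0

def extract(doc, path):
    # May return a fragment with several top-level elements.
    return "".join(
        ET.tostring(node, encoding="unicode").strip() for node in _select(doc, path)
    )

def extract_value(doc, path):
    nodes = _select(doc, path)
    if len(nodes) > 1:
        raise ValueError("extract_value returns the value of only one node")
    return nodes[0].text if nodes else None

doc = ET.fromstring(PO)
print(exists_node(doc, "/PurchaseOrder/Reference"))
print(extract_value(doc, "/PurchaseOrder/Reference"))
print(extract(doc, "/PurchaseOrder/LineItems/LineItem/Description"))
```

Calling extract_value() with the Description path raises ValueError, mirroring the ORA-19025 error shown earlier.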

67 updateXML()
Using updateXML() to Update a Text Node Value Identified by an XPath Expression
This example uses updateXML() to update the value of the text node identified by the XPath expression '/PurchaseOrder/Reference':
  UPDATE XMLTABLE t
  SET value(t) = updateXML(value(t), '/PurchaseOrder/Reference/text()', 'MILLER-200203311200000000PST')
  WHERE existsNode(value(t), '/PurchaseOrder[Reference="ADAMS-20011127121040988PST"]') = 1;
This returns:
  1 row updated.
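For comparison, the same kind of text-node update on an in-memory tree might look as follows in Python. The helper update_text() is a hypothetical name for this sketch and only imitates the idea of updateXML(); unlike the Oracle function it changes the tree in place instead of returning a new XMLType value.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<PurchaseOrder>"
    "<Reference>ADAMS-20011127121040988PST</Reference>"
    "</PurchaseOrder>"
)

def update_text(element, path, new_text):
    """Set the text of every node matched by a relative path; return the count."""
    nodes = element.findall(path)
    for node in nodes:
        node.text = new_text
    return len(nodes)

# Guard the update with a condition, as the WHERE existsNode(...) clause does.
updated = 0
if doc.findtext("Reference") == "ADAMS-20011127121040988PST":
    updated = update_text(doc, "Reference", "MILLER-200203311200000000PST")

print(updated, doc.findtext("Reference"))
```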

68 Oracle XML
Oracle XML DB complies with the W3C XSL/XSLT recommendation by supporting XSLT transformations in the database.
In Oracle XML DB, XSLT transformations can be performed using either of the following:
  – XMLTransform() function
  – XMLType datatype's transform() method

69 transform()
Using transform() to Apply an XSL Stylesheet
The following example shows how transform() can apply the XSL stylesheet PurchaseOrder.xsl to transform the PurchaseOrder.xml document:
  SELECT value(t).transform(xmltype(getDocument('purchaseOrder.xsl')))
  FROM XMLTABLE t
  WHERE existsNode(value(t), '/PurchaseOrder[Reference="MILLER-200203311200000000PST"]') = 1;
This returns:
  VALUE(T).TRANSFORM(XMLTYPE(GETDOCUMENT('PURCHASEORDER.XSL')))
  ---------...
Since the document transformed using XSLT is expected to be an instance of XMLType, the source could easily be a database table.

70 Research Topics
  – XML to relational data mapping
  – Updating XML in RDBMS
  – XML and access control
  – XML parsing – reference on this topic: A Brief History of XML

