XML, XML Schema, XPath and XQuery. Slides collated from various sources, many from Dan Suciu at Univ. of Washington.

1 XML, XML Schema, XPath and XQuery. Slides collated from various sources, many from Dan Suciu at Univ. of Washington

2 CS561 - Spring 2004.2 XML
W3C standard to complement HTML
Origins: structured text, SGML
Motivation:
–HTML describes presentation
–XML describes content
http://www.w3.org/TR/2000/REC-xml-20001006 (XML 1.0, second edition, 10/2000)

3 CS561 - Spring 2004.3 From HTML to XML HTML describes the presentation

4 CS561 - Spring 2004.4 HTML Bibliography Foundations of Databases Abiteboul, Hull, Vianu Addison Wesley, 1995 Data on the Web Abiteboul, Buneman, Suciu Morgan Kaufmann, 1999

5 CS561 - Spring 2004.5 XML Foundations… Abiteboul Hull Vianu Addison Wesley 1995 … XML describes the content

6 CS561 - Spring 2004.6 XML Terminology
–tags: book, title, author, …
–start tag: <book>, end tag: </book>
–elements: <book>…</book>, <author>…</author>
–elements are nested
–empty element: <author></author>, abbreviated <author/>
–an XML document: single root element
–well-formed XML document: if it has matching, properly nested tags
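
The well-formedness rules on this slide (a single root element, matching tags) can be checked mechanically by any XML parser. Below is a minimal sketch using Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample strings are illustrative assumptions, not part of the slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as one well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents.
good = "<book><title>Data on the Web</title><author>Suciu</author></book>"
bad_nesting = "<book><title>Data on the Web</book></title>"  # tags do not match
two_roots = "<book/><book/>"                                 # more than one root

if __name__ == "__main__":
    for name, doc in [("good", good), ("bad_nesting", bad_nesting), ("two_roots", two_roots)]:
        print(name, is_well_formed(doc))
```

Note that well-formedness says nothing about validity: a well-formed document may still violate its DTD or schema.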

7 CS561 - Spring 2004.7 More XML: Attributes Foundations of Databases Abiteboul … 1995 attributes are alternative ways to represent data

8 CS561 - Spring 2004.8 More XML: Oids and References Jane Mary John oids and references in XML are just syntax
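
Because oids and references are "just syntax", the application has to resolve them itself. The sketch below shows one way to do that in Python; the element and attribute names (`people`, `id`, `idref`, `mother`) and the oid values are assumptions made for illustration, since the slide's markup did not survive in this transcript.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: ids and references are plain attribute values.
doc = ET.fromstring("""
<people>
  <person id="o555"><name>Jane</name></person>
  <person id="o456"><name>Mary</name><children idref="o123 o555"/></person>
  <person id="o123" mother="o456"><name>John</name></person>
</people>
""")

# Build the oid -> element index by hand; the parser does not do it for us.
by_id = {p.get("id"): p for p in doc.iter("person")}


def children_of(person):
    """Follow the space-separated references in <children idref="...">."""
    ref = person.find("children")
    if ref is None:
        return []
    return [by_id[oid].findtext("name") for oid in ref.get("idref").split()]


def mother_of(person):
    """Follow the single reference in the `mother` attribute, if present."""
    oid = person.get("mother")
    return by_id[oid].findtext("name") if oid else None


if __name__ == "__main__":
    print(children_of(by_id["o456"]))
    print(mother_of(by_id["o123"]))
```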

9 CS561 - Spring 2004.9 More XML: CDATA Section
Syntax: <![CDATA[ … any text here … ]]>
Example: <![CDATA[ … <>]]>
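
A CDATA section only changes how text is written, not what the parser delivers. The following sketch, with a made-up `example` element, suggests that a CDATA section and the equivalent escaped text yield the same character data in Python's standard parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical element; inside CDATA, "<" and ">" need no escaping.
with_cdata = ET.fromstring(
    "<example><![CDATA[ some text </notAtag> <> ]]></example>"
)

# The same content written with entity references instead.
with_escapes = ET.fromstring(
    "<example> some text &lt;/notAtag&gt; &lt;&gt; </example>"
)

if __name__ == "__main__":
    print(repr(with_cdata.text))
    print(with_cdata.text == with_escapes.text)
```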

10 CS561 - Spring 2004.10 XML Namespaces http://www.w3.org/TR/REC-xml-names (1/99) name ::= [prefix:]localpart … 15 …. … 15 ….

11 CS561 - Spring 2004.11 … … XML Namespaces syntactic:, semantic: provide URL for schema defined here
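
To make the `[prefix:]localpart` rule concrete, here is a small Python sketch. The namespace URI and the element content are placeholders chosen for illustration. It shows that the prefix is only a local abbreviation: the parser identifies the element by URI plus local part.

```python
import xml.etree.ElementTree as ET

ISBN_NS = "http://example.org/isbn"  # hypothetical namespace URI

doc = ET.fromstring(f"""
<book xmlns:isbn="{ISBN_NS}">
  <number>15</number>
  <isbn:number>0-000-00000-0</isbn:number>
</book>
""")

# The unprefixed element lives in no namespace.
plain = doc.find("number")

# Any prefix works in the query as long as it maps to the same URI.
qualified = doc.find("x:number", {"x": ISBN_NS})

if __name__ == "__main__":
    print(plain.tag, plain.text)
    print(qualified.tag, qualified.text)
```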

12 CS561 - Spring 2004.12 XML Data Model
Several competing models:
Document Object Model (DOM):
–http://www.w3.org/TR/2001/WD-DOM-Level-3-CMLS-20010209/ (2/2001)
–class hierarchy (node, element, attribute, …)
–objects have behavior
–defines API to inspect/modify the document
Infoset - PSV (post schema validation)
XML Query data model

13 CS561 - Spring 2004.13 XML Schemas
http://www.w3.org/TR/xmlschema-1/ (10/2000)
generalizes DTDs
uses XML syntax
two documents: structure and datatypes
–http://www.w3.org/TR/xmlschema-1
–http://www.w3.org/TR/xmlschema-2
XML Schema is complex

14 CS561 - Spring 2004.14 XML Schemas DTD:

15 CS561 - Spring 2004.15 Elements vs. Types in XML Schema DTD:

16 CS561 - Spring 2004.16 Elements vs. Types in XML Schema
Types:
–Simple types (integers, strings, ...)
–Complex types (regular expressions, like in DTDs)
Element-type-element alternation:
–Root element has a complex type
–That type is a regular expression of elements
–Those elements have their complex types...
–...
–On the leaves we have simple types

17 CS561 - Spring 2004.17 Local and Global Types in XML Schema
Local type: [define locally the person’s type]
Global type: [define here the type ttt]
Global types: can be reused in other elements

18 CS561 - Spring 2004.18 Local vs. Global Elements in XML Schema
Local element: ...
Global element: ...
Global elements: like in DTDs

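19 CS561 - Spring 2004.19 Regular Expressions in XML Schema
Recall the element-type-element alternation: [regular expression on elements]
Regular expressions:
–<sequence> A B C </sequence> = A B C
–<choice> A B C </choice> = A | B | C
–<group> A B C </group> = (A B C)
–<… minOccurs="0" maxOccurs="unbounded"> .. </…> = (...)*
–<… minOccurs="0" maxOccurs="1"> .. </…> = (...)?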
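
The idea that a complex type is a regular expression over child elements can be tried out directly with Python's `re` module. This is only a sketch: the content model (a title, then any number of authors, then an optional year) and the helper `matches_model` are assumptions for illustration, and real schema validation checks much more than the order of child names.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content model for <book>:  title author* year?
# Each child name is followed by one space so that names cannot run together.
BOOK_MODEL = re.compile(r"(title )(author )*(year )?")


def matches_model(elem) -> bool:
    """Check the sequence of child element names against the content model."""
    child_names = "".join(child.tag + " " for child in elem)
    return BOOK_MODEL.fullmatch(child_names) is not None


ok_full = ET.fromstring("<book><title/><author/><author/><year/></book>")
ok_minimal = ET.fromstring("<book><title/></book>")
wrong_order = ET.fromstring("<book><author/><title/></book>")

if __name__ == "__main__":
    for book in (ok_full, ok_minimal, wrong_order):
        print([c.tag for c in book], matches_model(book))
```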

20 CS561 - Spring 2004.20 Attributes in XML Schema
...
Attributes are associated with the type, not with the element.
Only complex types can carry them; adding attributes to a simple type takes more work.

21 CS561 - Spring 2004.21 “Mixed” Content, “Any” Type
Mixed content: better than in DTDs; the type can still be enforced, but text may now appear between any elements.
Any type: means anything is permitted there.

22 CS561 - Spring 2004.22 Derived Types by Extensions Corresponds to inheritance

23 CS561 - Spring 2004.23 Derived Types by Restrictions
… [rewrite the entire content, with restrictions] ...
(*): may restrict cardinalities, e.g. (0, infinity) to (1,1); may restrict choices; other restrictions …
Corresponds to set inclusion

24 CS561 - Spring 2004.24 Keys in XML Schema Lawnmower Baby Monitor Lapis Necklace Sturdy Shelves Lawnmower Baby Monitor Lapis Necklace Sturdy Shelves XML: XML Schema:

25 CS561 - Spring 2004.25 Keys in XML Schema
In general, two flavors: ...
Note: all XPath expressions “start” at the element currently being defined
The fields must identify a single node

26 CS561 - Spring 2004.26 Keys in XML Schema
Unique = guarantees uniqueness
Key = guarantees uniqueness and existence
All XPath expressions are “restricted”:
–/a/b | /a/c OK for selector
–//a/b/*/c OK for field
Note: better than DTD’s ID mechanism
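
The difference between the two flavors (unique tolerates a missing field, key does not) can be sketched in a few lines of Python. This is not a schema validator: the function names, the `parts` document, and the `number` attribute are assumptions for illustration, and the selector and field are limited to what `ElementTree.findall` understands.

```python
import xml.etree.ElementTree as ET


def field_values(root, selector, field):
    """Evaluate `field` on every node picked by `selector` (None if absent)."""
    values = []
    for node in root.findall(selector):
        if field.startswith("@"):
            values.append(node.get(field[1:]))   # attribute field
        else:
            values.append(node.findtext(field))  # child-element field
    return values


def satisfies_unique(root, selector, field):
    """Unique: values that are present must not repeat."""
    present = [v for v in field_values(root, selector, field) if v is not None]
    return len(present) == len(set(present))


def satisfies_key(root, selector, field):
    """Key: every selected node must have a value, and values must not repeat."""
    values = field_values(root, selector, field)
    return None not in values and len(values) == len(set(values))


# Hypothetical data; the last part has no number at all.
parts = ET.fromstring("""
<parts>
  <part number="872-AA">Lawnmower</part>
  <part number="926-AA">Baby Monitor</part>
  <part number="833-AA">Lapis Necklace</part>
  <part>Sturdy Shelves</part>
</parts>
""")

if __name__ == "__main__":
    print(satisfies_unique(parts, "part", "@number"))
    print(satisfies_key(parts, "part", "@number"))
```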

27 CS561 - Spring 2004.27 Keys in XML Schema Examples Recall: must have A single forename, Single surname

28 CS561 - Spring 2004.28 Foreign Keys in XML Schema Example

29 XPATH

30 CS561 - Spring 2004.30 XPath
Goal = access some nodes of a document
XPath main construct: axis navigation
An XPath path consists of one or more navigation steps, separated by /
Navigation step: axis + node-test + predicates
Examples
–/descendant::node()/child::author
–/descendant::node()/child::author[parent/attribute::booktitle = "XML"][2]
XPath also offers shortcuts
–no axis means child
–// means /descendant-or-self::node()/

31 CS561 - Spring 2004.31 XPath - Child axis navigation
author is shorthand for child::author. Examples:
–aaa -- all the child nodes labeled aaa (1,3)
–aaa/bbb -- all the bbb grandchildren of aaa children (4)
–*/bbb -- all the bbb grandchildren of any child (4,6)
–. -- the context node
–/ -- the root node
[Figure: tree below the context node, with nodes labeled aaa, bbb, ccc and numbered 1 to 7; the numbers in parentheses above refer to these nodes.]

32 CS561 - Spring 2004.32 XPath - Child axis navigation
–/doc -- all the doc children of the root
–./aaa -- all the aaa children of the context node (equivalent to aaa)
–text() -- all the text children of the context node
–node() -- all the children of the context node (includes text nodes, but not attribute nodes, which are reached through the attribute axis)
–.. -- parent of the context node
–.// -- the context node and all its descendants
–// -- the root node and all its descendants
–//text() -- all the text nodes in the document
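
The child-axis examples from the last two slides can be tried with the limited XPath subset in Python's `xml.etree.ElementTree`. The tree below is an assumption: it is a small document built so that the node numbers agree with the answers quoted on the slides, not a copy of the original figure, and the `n` attribute and the `ids` helper exist only to make results easy to read.

```python
import xml.etree.ElementTree as ET

# Hypothetical tree; <ctx> plays the role of the context node.
ctx = ET.fromstring("""
<ctx>
  <aaa n="1"><bbb n="4"/><ccc n="5"/></aaa>
  <ccc n="2"><bbb n="6"/></ccc>
  <aaa n="3"><ccc n="7"/></aaa>
</ctx>
""")


def ids(path):
    """Evaluate `path` relative to the context node and list the node numbers."""
    return [int(e.get("n")) for e in ctx.findall(path)]


if __name__ == "__main__":
    print(ids("aaa"))      # children labeled aaa
    print(ids("aaa/bbb"))  # bbb children of aaa children
    print(ids("*/bbb"))    # bbb children of any child
    print(ids(".//ccc"))   # ccc descendants of the context node
```

ElementTree implements only part of XPath 1.0 (no explicit axes such as `child::`, no `text()` node test), so the unabbreviated forms on the slides need a full XPath engine.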

33 CS561 - Spring 2004.33 Predicates
–*[2] -- the second child element of the context node
–chapter[5] -- the fifth chapter child of the context node
–*[last()] -- the last child element of the context node
–chapter[title="introduction"] -- the chapter children of the context node that have one or more title children whose string-value is "introduction" (the string-value is the concatenation of all the text on descendant text nodes)
–person[.//firstname = "joe"] -- the person children of the context node that have among their descendants a firstname element with string-value "joe"
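
A quick way to experiment with predicates is again `xml.etree.ElementTree`, which supports positional predicates after a tag name and `[child='text']` comparisons. The documents below are made up for illustration. The last predicate on the slide, `person[.//firstname = "joe"]`, is outside ElementTree's subset, so the sketch spells it out in plain Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents.
book = ET.fromstring("""
<book>
  <chapter><title>introduction</title></chapter>
  <chapter><title>syntax</title></chapter>
  <chapter><title>semantics</title></chapter>
</book>
""")

staff = ET.fromstring("""
<staff>
  <person><name><firstname>joe</firstname></name></person>
  <person><name><firstname>ann</firstname></name></person>
</staff>
""")


def titles(path):
    """Titles of the chapters selected by `path`, relative to <book>."""
    return [c.findtext("title") for c in book.findall(path)]


def string_value(elem):
    """XPath string-value: concatenation of all descendant text."""
    return "".join(elem.itertext())


# person[.//firstname = "joe"], written out by hand.
joes = [
    p for p in staff.findall("person")
    if any(string_value(f) == "joe" for f in p.iter("firstname"))
]

if __name__ == "__main__":
    print(titles("chapter[2]"))
    print(titles("chapter[last()]"))
    print(titles("chapter[title='introduction']"))
    print(len(joes))
```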

34 CS561 - Spring 2004.34 Axis navigation
So far, nearly all our expressions have moved us down by moving to child nodes. Exceptions were
–. -- stay where you are
–/ -- go to the root
–// -- all descendants of the root
–.// -- all descendants of the context node
XPath has several axes: ancestor, ancestor-or-self, attribute, child, descendant, descendant-or-self, following, following-sibling, namespace, parent, preceding, preceding-sibling, self
–Some of these (self, parent) describe single nodes, others describe sequences of nodes.

35 CS561 - Spring 2004.35 XPath Navigation Axes
[Figure: diagram of the axes relative to the context node: ancestor, descendant, following, preceding, following-sibling, preceding-sibling, child, attribute, namespace, self.]

36 CS561 - Spring 2004.36 XPath abbreviated syntax
(nothing) = child::
@ = attribute::
// = /descendant-or-self::node()/
. = self::node()
.// = descendant-or-self::node()/
.. = parent::node()
/ = (document root)

37 CS561 - Spring 2004.37 XPath
Reasonably widely adopted -- in XML Schema and query languages.
Neither more expressive nor less expressive than regular path expressions: each can state queries that the other cannot.

38 CS561 - Spring 2004.38 Quilt
Proposed by Chamberlin, Robie and Florescu (from the authors’ slides)
–Leverage the most effective features of several existing and proposed query languages
–Design a small, clean, implementable language
–Cover the functionality required by all the XML Query use cases in a single language
–Write queries that fit on a slide
–Design a quilt, not a camel

39 CS561 - Spring 2004.39 Quilt/Kweelt URLs Quilt (the language) http://www.almaden.ibm.com/cs/people/chamberlin/quilt_lncs.pdf Kweelt (the implementation) http://db.cis.upenn.edu/Kweelt/ http://db.cis.upenn.edu/Kweelt/useCases (examples in these notes)

40 CS561 - Spring 2004.40 Quilt = XPath + “comprehension”
XML-QL: where … in …, … in … construct …
–where: bind variables
–construct: use variables
Quilt: for x in …, y in … where … return …
–for: bind variables
–where, return: use variables

41 CS561 - Spring 2004.41 Examples of Quilt (http://db.cis.upenn.edu/Kweelt/useCases/R/Q1.qlt)
<!DOCTYPE items [
  <!ELEMENT item_tuple (itemno, description, offered_by, start_date?, end_date?, reserve_price?)>
]>
<!DOCTYPE bids [ ]>

42 CS561 - Spring 2004.42 The data
Items (itemno, description, offered_by, start_date, end_date, reserve_price):
1001 Red Bicycle U01 1999-01-05 1999-01-20 40
1002 Motorcycle U02 1999-02-11 1999-03-15 500
…
Bids:
U02 1001 35 99-01-07
U04 1001 40 99-01-08
…

43 CS561 - Spring 2004.43 Query 1
FUNCTION date() { "1999-02-01" }
(
  FOR $i IN document("items.xml")//item_tuple
  WHERE $i/start_date LEQ date()
    AND $i/end_date GEQ date()
    AND contains($i/description, "Bicycle")
  RETURN $i/itemno, $i/description
  SORTBY (itemno)
)
Notes:
–XPath expressions (shown in orange on the original slide)
–simple function definitions
–dates are formatted so that lexicographic ordering gives the right result
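
For readers more at home in a general-purpose language, the same selection can be sketched in Python. The first two rows come from the data slide; the third row (item 1003 with its dates and price) is a hypothetical addition so that the query returns something, and the function name `query1` is likewise an assumption. As on the slide, dates are compared as strings, which works only because of the year-month-day format.

```python
import xml.etree.ElementTree as ET

ITEMS = ET.fromstring("""
<items>
  <item_tuple><itemno>1001</itemno><description>Red Bicycle</description>
    <offered_by>U01</offered_by><start_date>1999-01-05</start_date>
    <end_date>1999-01-20</end_date><reserve_price>40</reserve_price></item_tuple>
  <item_tuple><itemno>1002</itemno><description>Motorcycle</description>
    <offered_by>U02</offered_by><start_date>1999-02-11</start_date>
    <end_date>1999-03-15</end_date><reserve_price>500</reserve_price></item_tuple>
  <!-- hypothetical row, not taken from the slides -->
  <item_tuple><itemno>1003</itemno><description>Old Bicycle</description>
    <offered_by>U02</offered_by><start_date>1999-01-10</start_date>
    <end_date>1999-02-20</end_date><reserve_price>25</reserve_price></item_tuple>
</items>
""")


def query1(items, today="1999-02-01"):
    """Bicycles on offer on `today`, as (itemno, description), sorted by itemno."""
    hits = []
    for item in items.iter("item_tuple"):          # FOR $i IN ...//item_tuple
        start = item.findtext("start_date")
        end = item.findtext("end_date")
        if start is None or end is None:           # both dates are optional in the DTD
            continue
        description = item.findtext("description", "")
        if start <= today <= end and "Bicycle" in description:   # WHERE ...
            hits.append((item.findtext("itemno"), description))  # RETURN ...
    return sorted(hits)                            # SORTBY (itemno)


if __name__ == "__main__":
    print(query1(ITEMS))
```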

44 CS561 - Spring 2004.44 Output from Q1 1003 Old Bicycle 1007 Racing Bicycle

45 CS561 - Spring 2004.45 Query Q2
For all bicycles, list the item number, description, and highest bid (if any), ordered by item number.
(
  FOR $i IN document("items.xml")//item_tuple
  LET $b := document("bids.xml")//bid_tuple[itemno = $i/itemno]
  WHERE contains($i/description, "Bicycle")
  RETURN $i/itemno, $i/description,
    IF ($b) THEN NumFormat("#####.##", max(-1, $b/bid)) ELSE ""
  SORTBY (itemno)
)
Notes:
–use of variable in XPath
–lots of coercion
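
The join-plus-aggregate pattern of Q2 (FOR over items, LET over the matching bids, then a maximum) translates to Python roughly as follows. Only the two bids visible on the data slide are used, so the numbers differ from the output on the next slide, which is based on the full data set. The element names `userid` and `bid_date` are assumptions, because the bids DTD is elided in this transcript.

```python
import xml.etree.ElementTree as ET

ITEMS = ET.fromstring("""
<items>
  <item_tuple><itemno>1001</itemno><description>Red Bicycle</description></item_tuple>
  <item_tuple><itemno>1002</itemno><description>Motorcycle</description></item_tuple>
  <item_tuple><itemno>1008</itemno><description>Broken Bicycle</description></item_tuple>
</items>
""")

BIDS = ET.fromstring("""
<bids>
  <bid_tuple><userid>U02</userid><itemno>1001</itemno><bid>35</bid><bid_date>99-01-07</bid_date></bid_tuple>
  <bid_tuple><userid>U04</userid><itemno>1001</itemno><bid>40</bid><bid_date>99-01-08</bid_date></bid_tuple>
</bids>
""")


def highest_bids(items, bids):
    """(itemno, description, highest bid or None) for every bicycle."""
    rows = []
    for item in items.iter("item_tuple"):                    # FOR $i
        description = item.findtext("description", "")
        if "Bicycle" not in description:                     # WHERE contains(...)
            continue
        itemno = item.findtext("itemno")
        amounts = [                                          # LET $b := matching bids
            float(b.findtext("bid"))
            for b in bids.iter("bid_tuple")
            if b.findtext("itemno") == itemno
        ]
        rows.append((itemno, description, max(amounts) if amounts else None))
    return sorted(rows, key=lambda row: row[0])              # SORTBY (itemno)


if __name__ == "__main__":
    for row in highest_bids(ITEMS, BIDS):
        print(row)
```

Returning `None` for an item without bids plays the role of the empty string in the Quilt query's ELSE branch.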

46 CS561 - Spring 2004.46 Output from Q2 1001 Red Bicycle 55 1003 Old Bicycle 20 1007 Racing Bicycle 225 1008 Broken Bicycle

47 CS561 - Spring 2004.47 Query Q3
Find cases where a user with a rating worse (alphabetically greater) than "C" offers an item with a reserve price of more than 1000.
(
  FOR $u IN document("users.xml")//user_tuple,
      $i IN document("items.xml")//item_tuple
  WHERE $u/rating GT 'C'
    AND $i/reserve_price GT 1000
    AND $i/offered_by = $u/userid
  RETURN $u/name/text(), $u/rating/text(), $i/description/text(), $i/reserve_price
)
Notes:
–Comparing sets with singletons. Same rules as in XPath?
–In this case the DTD gives uniqueness

48 CS561 - Spring 2004.48 Quilt -- Attributes and IDs
...
<!DOCTYPE census [
  <!ATTLIST person
    name ID #REQUIRED
    spouse IDREF #IMPLIED
    job CDATA #IMPLIED >
]>

49 CS561 - Spring 2004.49 Query Q1
Find Martha's spouse:
FOR $m IN document("census.xml")//person[@name="Martha"]
RETURN shallow($m/@spouse->{person@name})
Notes:
–The shallow function strips an element of its subelements.
–Dereferencing ($m/@spouse->{person@name}) is a hack: Kweelt does not read the DTD

50 CS561 - Spring 2004.50 Query Q6
Find Bill's grandchildren.
(
  FOR $b IN document("census.xml")//person[@name = "Bill"],
      $c IN $b/person | $b/@spouse->{person@name}/person,
      $g IN $c/person | $c/@spouse->{person@name}/person
  RETURN shallow($g)
)

51 Query Languages - XQuery

52 CS561 - Spring 2004.52 Summary of XQuery
–FLWR expressions
–FOR and LET expressions
–Collections and sorting
Resources
–XQuery: A Query Language for XML, Chamberlin, Florescu, et al.
–W3C recommendation: www.w3.org/TR/xquery/

53 CS561 - Spring 2004.53 XQuery
Based on Quilt (which is based on XML-QL)
http://www.w3.org/TR/xquery/ (2/2001)
XML Query data model (ordered)

54 CS561 - Spring 2004.54 FLWR (“Flower”) Expressions FOR... LET... FOR... LET... WHERE... RETURN...

55 CS561 - Spring 2004.55 XQuery
Find the titles of all books published after 1995:
FOR $x IN document("bib.xml")/bib/book
WHERE $x/year > 1995
RETURN $x/title
Result: abc def ghi
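
A FOR/WHERE/RETURN query reads much like a list comprehension. The Python sketch below mirrors the query above on a made-up `bib` document; the years and the extra "old" title are assumptions, since the slide shows only the result titles.

```python
import xml.etree.ElementTree as ET

# Hypothetical bib.xml content.
bib = ET.fromstring("""
<bib>
  <book><title>abc</title><year>1998</year></book>
  <book><title>old</title><year>1990</year></book>
  <book><title>def</title><year>2001</year></book>
</bib>
""")

titles = [
    x.findtext("title")              # RETURN $x/title
    for x in bib.findall("book")     # FOR $x IN document("bib.xml")/bib/book
    if int(x.findtext("year")) > 1995  # WHERE $x/year > 1995
]

if __name__ == "__main__":
    print(titles)
```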

56 CS561 - Spring 2004.56 XQuery
For each author of a book published by Morgan Kaufmann, list all books she published:
FOR $a IN distinct(document("bib.xml")/bib/book[publisher="Morgan Kaufmann"]/author)
RETURN $a,
  FOR $t IN /bib/book[author=$a]/title
  RETURN $t
distinct = a function that eliminates duplicates

57 CS561 - Spring 2004.57 XQuery Result: Jones abc def Smith ghi

58 CS561 - Spring 2004.58 XQuery
FOR $x IN expr -- binds $x to each element in the list expr
LET $x := expr -- binds $x to the entire list expr
–Useful for common subexpressions and for aggregations

59 CS561 - Spring 2004.59 XQuery
count = an (aggregate) function that returns the number of elements
FOR $p IN distinct(document("bib.xml")//publisher)
LET $b := document("bib.xml")//book[publisher = $p]
WHERE count($b) > 100
RETURN $p
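
The interplay of FOR (iterate over distinct publishers) and LET (collect all books of one publisher, then aggregate) can be sketched in Python as follows. The sample data, the function name, and the threshold parameter `n` (standing in for the constant 100) are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical bib.xml content.
bib = ET.fromstring("""
<bib>
  <book><title>Data on the Web</title><publisher>Morgan Kaufmann</publisher></book>
  <book><title>Another Title</title><publisher>Morgan Kaufmann</publisher></book>
  <book><title>Foundations of Databases</title><publisher>Addison Wesley</publisher></book>
</bib>
""")


def publishers_with_more_than(bib, n):
    """Publishers that have more than `n` books in `bib`."""
    # distinct(...//publisher): a set removes duplicates; sorted for stable output.
    publishers = sorted({p.text for p in bib.iter("publisher")})
    result = []
    for p in publishers:                                   # FOR $p IN ...
        books = [b for b in bib.iter("book")               # LET $b := ...//book[publisher = $p]
                 if b.findtext("publisher") == p]
        if len(books) > n:                                 # WHERE count($b) > n
            result.append(p)                               # RETURN $p
    return result


if __name__ == "__main__":
    print(publishers_with_more_than(bib, 1))
```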

60 CS561 - Spring 2004.60 XQuery
Find books whose price is larger than average:
LET $a := avg(document("bib.xml")/bib/book/@price)
FOR $b IN document("bib.xml")/bib/book
WHERE $b/@price > $a
RETURN $b

61 CS561 - Spring 2004.61 XQuery
Summary: FOR-LET-WHERE-RETURN = FLWR
The FOR/LET clauses produce a list of tuples, the WHERE clause filters that list, and the RETURN clause turns each remaining tuple into an instance of the XQuery data model.

62 CS561 - Spring 2004.62 FOR vs. LET
FOR: binds node variables → iteration
LET: binds collection variables → one value

63 CS561 - Spring 2004.63 FOR vs. LET
FOR $x IN document("bib.xml")/bib/book
RETURN $x
Returns: ...
LET $x := document("bib.xml")/bib/book
RETURN $x
Returns: ...
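
The results on this slide were lost in the transcript, so the following Python sketch only illustrates the general point from the previous slide: FOR evaluates the RETURN clause once per book, while LET evaluates it once for the whole collection. It assumes, purely for illustration, that RETURN wraps its argument in a `<result>` element.

```python
import xml.etree.ElementTree as ET

# Hypothetical bib.xml content with three books.
bib = ET.fromstring("<bib><book>A</book><book>B</book><book>C</book></bib>")


def result(children):
    """Hypothetical RETURN clause: wrap the bound value in a <result> element."""
    wrapper = ET.Element("result")
    wrapper.extend(children)
    return wrapper


books = bib.findall("book")

# FOR $x IN .../bib/book RETURN <result> $x </result>
for_results = [result([x]) for x in books]   # one <result> per book

# LET $x := .../bib/book RETURN <result> $x </result>
let_results = [result(books)]                # a single <result> holding all books

if __name__ == "__main__":
    print(len(for_results), [len(r) for r in for_results])
    print(len(let_results), [len(r) for r in let_results])
```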

64 CS561 - Spring 2004.64 Collections in XQuery
Ordered and unordered collections
–/bib/book/author = an ordered collection
–distinct(/bib/book/author) = an unordered collection
LET $a := /bib/book → $a is a collection
$b/author → a collection (several authors...)
RETURN $b/author
Returns: ...

65 CS561 - Spring 2004.65 Sorting in XQuery
FOR $p IN distinct(document("bib.xml")//publisher)
RETURN $p/text(),
  FOR $b IN document("bib.xml")//book[publisher = $p]
  RETURN $b/title, $b/@price
  SORTBY(price DESCENDING)
SORTBY(name)

66 CS561 - Spring 2004.66 Sorting in XQuery
Sorting arguments refer to the name space of the RETURN clause, not the FOR clause.
To sort on an element you don’t want to display, first return it, then remove it with an additional query.

67 CS561 - Spring 2004.67 If-Then-Else
FOR $h IN //holding
RETURN $h/title,
  IF $h/@type = "Journal"
  THEN $h/editor
  ELSE $h/author
SORTBY (title)

68 CS561 - Spring 2004.68 Existential Quantifiers
FOR $b IN //book
WHERE SOME $p IN $b//para SATISFIES
  contains($p, "sailing") AND contains($p, "windsurfing")
RETURN $b/title

69 CS561 - Spring 2004.69 Universal Quantifiers
FOR $b IN //book
WHERE EVERY $p IN $b//para SATISFIES
  contains($p, "sailing")
RETURN $b/title
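
SOME and EVERY correspond closely to Python's `any` and `all`. The sketch below runs both quantified queries over a made-up library document; the titles, paragraphs, and helper names are assumptions. One detail worth noticing is that a book without any `para` satisfies the EVERY condition vacuously, just as `all([])` is true.

```python
import xml.etree.ElementTree as ET

# Hypothetical document.
lib = ET.fromstring("""
<lib>
  <book><title>Water Sports</title>
    <para>sailing and windsurfing</para><para>more on sailing</para></book>
  <book><title>Boats</title>
    <para>sailing</para><para>rowing</para></book>
  <book><title>Empty</title></book>
</lib>
""")


def text_of(elem):
    """String-value of an element: all descendant text concatenated."""
    return "".join(elem.itertext())


def some_para_mentions(book, *words):
    """SOME $p IN $b//para SATISFIES contains($p, w1) AND contains($p, w2) ..."""
    return any(all(w in text_of(p) for w in words) for p in book.iter("para"))


def every_para_mentions(book, word):
    """EVERY $p IN $b//para SATISFIES contains($p, word)"""
    return all(word in text_of(p) for p in book.iter("para"))


existential = [b.findtext("title") for b in lib.iter("book")
               if some_para_mentions(b, "sailing", "windsurfing")]
universal = [b.findtext("title") for b in lib.iter("book")
             if every_para_mentions(b, "sailing")]

if __name__ == "__main__":
    print(existential)
    print(universal)
```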

