CIS550 Handout 6: XPATH

1 CIS550 Handout 6 1 XPATH

2 XPath
Primary goal: to give access to selected nodes of a given document.
XPath's main construct is axis navigation.
An XPath path consists of one or more navigation steps, separated by /
A navigation step is a triplet: axis + node-test + list of predicates
Examples
–/descendant::node()/child::author
–/descendant::node()/child::author[parent/attribute::booktitle = "XML"][2]
XPath also offers some shortcuts
–no axis means child
–// means /descendant-or-self::node()/

3 XPath: child axis navigation
author is shorthand for child::author. Examples:
–aaa -- all the child nodes labeled aaa (1,3)
–aaa/bbb -- all the bbb grandchildren that are children of aaa children (4)
–*/bbb -- all the bbb grandchildren under any child (4,6)
–. -- the context node
–/ -- the root node
[Figure: tree below the context node, with nodes numbered 1-7 and labeled aaa, bbb and ccc]
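The three child-axis examples can be tried with Python's standard xml.etree.ElementTree, which supports a small subset of XPath. This is only a sketch: the exact shape of the slide's tree is not recoverable from the transcript, so the tree below is a hypothetical one chosen to give the same answers, with id attributes standing in for the node numbers.

```python
import xml.etree.ElementTree as ET

# Hypothetical tree (an assumption): ids play the role of the slide's node numbers.
context = ET.fromstring(
    '<context>'
    '<aaa id="1"><bbb id="4"/><ccc id="5"/></aaa>'
    '<ccc id="2"><bbb id="6"/></ccc>'
    '<aaa id="3"><ccc id="7"/></aaa>'
    '</context>'
)

def ids(path):
    """Evaluate path relative to the context node and return the ids it selects."""
    return [node.get("id") for node in context.findall(path)]

print(ids("aaa"))      # aaa children of the context node
print(ids("aaa/bbb"))  # bbb children of aaa children
print(ids("*/bbb"))    # bbb children of any child
```

With this tree the three calls select nodes (1,3), (4) and (4,6), matching the numbers quoted in the examples.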

4 XPath: child axis navigation (cont)
–/doc -- all the doc children of the root
–./aaa -- all the aaa children of the context node (equivalent to aaa)
–text() -- all the text children of the context node
–node() -- all the children of the context node (includes text nodes, but not attribute nodes, which are reached through the attribute axis)
–.. -- parent of the context node
–.// -- the context node and all its descendants
–// -- the root node and all its descendants
–//text() -- all the text nodes in the document
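A few of these shortcuts can be checked with xml.etree.ElementTree. The document below is a made-up example, and ElementTree has no text() node test, so itertext() is used as a rough stand-in for //text(); treat this as a sketch rather than a full XPath evaluation.

```python
import xml.etree.ElementTree as ET

# Made-up document used only for illustration.
doc = ET.fromstring(
    "<doc><aaa><bbb>one</bbb></aaa><ccc><bbb>two</bbb></ccc></doc>"
)

# ./aaa selects the same nodes as aaa.
same = doc.findall("./aaa") == doc.findall("aaa")

# .//bbb reaches bbb elements at any depth below the context node.
deep = [b.text for b in doc.findall(".//bbb")]

# .. steps from each selected node to its parent.
parents = [p.tag for p in doc.findall(".//bbb/..")]

# Stand-in for //text(): every text node in document order.
texts = list(doc.itertext())

print(same, deep, parents, texts)
```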

5 Predicates
–[2] -- the second child node of the context node
–chapter[5] -- the fifth chapter child of the context node
–[last()] -- the last child node of the context node
–chapter[title="introduction"] -- the chapter children of the context node that have one or more title children whose string-value is "introduction" (the string-value is the concatenation of all the text on descendant text nodes)
–person[.//firstname = "joe"] -- the person children of the context node that have among their descendants a firstname element with string-value "joe"
–From the XPath specification: NOTE: If $x is bound to a node set then $x = "foo" does not mean the same as not($x != "foo")...
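Positional and string-value predicates can be sketched with xml.etree.ElementTree, which accepts tag[n], tag[last()] and tag[child='text']. The book document is an assumption made for the example; its third title splits its text across a nested element to show that the comparison uses the concatenated string-value.

```python
import xml.etree.ElementTree as ET

# Made-up document; the third title spreads "introduction" over two text nodes.
book = ET.fromstring(
    "<book>"
    "<chapter><title>introduction</title></chapter>"
    "<chapter><title>axes</title></chapter>"
    "<chapter><title><b>intro</b>duction</title></chapter>"
    "</book>"
)

def titles(path):
    """Return the string-value of the title of every chapter selected by path."""
    return ["".join(ch.find("title").itertext()) for ch in book.findall(path)]

second = titles("chapter[2]")                     # the second chapter child
last = titles("chapter[last()]")                  # the last chapter child
intro = titles("chapter[title='introduction']")   # compared by string-value

print(second, last, len(intro))
```

The last query selects two chapters, the first and the third, because the string-value of the third title is also "introduction".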

6 Unions of Path Expressions
employee | consultant -- the union of the employee and consultant nodes that are children of the context node
For some reason person/(employee | consultant) -- as in regular path expressions -- is not allowed
However person/node()[boolean(employee | consultant)] is allowed!!
From the XPath specification:
–The boolean function converts its argument to a boolean as follows:
  –a number is true if and only if it is neither positive or negative zero nor NaN
  –a node-set is true if and only if it is non-empty
  –a string is true if and only if its length is non-zero
  –an object of a type other than the four basic types is converted to a boolean in a way that is dependent on that type
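The conversion rules quoted from the specification are easy to state as code. The function below is a sketch that uses Python values as stand-ins for the XPath basic types (a list plays the role of a node-set); the name xpath_boolean is invented for the example.

```python
import math

def xpath_boolean(value):
    """Mimic the XPath 1.0 boolean() rules on Python stand-ins for the basic types."""
    if isinstance(value, bool):            # already a boolean
        return value
    if isinstance(value, (int, float)):    # number: false for +0, -0 and NaN
        return not (value == 0 or math.isnan(value))
    if isinstance(value, str):             # string: true iff its length is non-zero
        return len(value) > 0
    if isinstance(value, (list, tuple)):   # node-set stand-in: true iff non-empty
        return len(value) > 0
    raise TypeError("conversion of other types depends on the type")

print(xpath_boolean(-0.0), xpath_boolean(float("nan")), xpath_boolean("false"))
```

Note that the string "false" converts to true, since only its length matters.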

7 Axis navigation
So far, nearly all our expressions have moved us down the tree by moving to child nodes. Exceptions were
–. -- stay where you are
–/ -- go to the root
–// -- all descendants of the root
–.// -- all descendants of the context node
XPath has several axes: ancestor, ancestor-or-self, attribute, child, descendant, descendant-or-self, following, following-sibling, namespace, parent, preceding, preceding-sibling, self
–Some of these (self, parent) describe single nodes, others describe sequences of nodes.

8 XPath Navigation Axes (merci, Arnaud)
[Figure: diagram of the axes ancestor, descendant, following, preceding, following-sibling, preceding-sibling, child, attribute, namespace, self]

9 XPath abbreviated syntax
–(nothing) -- child::
–@ -- attribute::
–// -- /descendant-or-self::node()/
–. -- self::node()
–.// -- descendant-or-self::node()
–.. -- parent::node()
–/ -- (document root)

10 XPath
Reasonably widely adopted -- in XML-Schema and query languages.
Neither more expressive nor less expressive than regular path expressions (can't do (ab)*)
Particularly messy in some areas:
–defining order of results
–overloading of operations, e.g. [chapter/title = "Introduction"]; why not ["Introduction" IN chapter/title]?
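The overloading complaint comes from the fact that, in XPath 1.0, comparing a node-set with a string is existential: the comparison holds if some node in the set satisfies it. A small sketch, with lists of strings standing in for the string-values of a node-set (the helper names are invented for the example):

```python
def nodeset_eq(values, s):
    """XPath-style '=': true if SOME node's string-value equals s."""
    return any(v == s for v in values)

def nodeset_ne(values, s):
    """XPath-style '!=': true if SOME node's string-value differs from s."""
    return any(v != s for v in values)

titles = ["Introduction", "Axes"]   # stand-in for chapter/title
empty = []                          # stand-in for an empty node-set

print(nodeset_eq(titles, "Introduction"), not nodeset_ne(titles, "Introduction"))
print(nodeset_eq(empty, "Introduction"), not nodeset_ne(empty, "Introduction"))
```

So = behaves like the proposed IN test, while not(... != ...) asks whether every node has the given value, which is also why the two forms differ on an empty node-set.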

11 Quilt
Proposed by Chamberlin, Robie and Florescu (from the authors' slides)
–Leverage the most effective features of several existing and proposed query languages
–Design a small, clean, implementable language
–Cover the functionality required by all the XML Query use cases in a single language
–Write queries that fit on a slide
–Design a quilt, not a camel

12 Quilt/Kweelt URLs
Quilt (the language)
–http://www.almaden.ibm.com/cs/people/chamberlin/quilt_lncs.pdf
Kweelt (the implementation)
–http://db.cis.upenn.edu/Kweelt/
–http://db.cis.upenn.edu/Kweelt/useCases (examples in these notes)

13 Quilt = XPath + "comprehension" syntax
XML-QL
–where … in …, … in … -- bind variables
–construct … -- use variables
Quilt
–for x in …, y in … -- bind variables
–where … return … -- use variables

14 Examples of Quilt
(from http://db.cis.upenn.edu/Kweelt/useCases/R/Q1.qlt)
Relational data -- two DTDs:
<!DOCTYPE items [
  <!ELEMENT item_tuple (itemno, description, offered_by, start_date?, end_date?, reserve_price?)>
]>
<!DOCTYPE bids [ ]>

15 The data
Items:
1001 Red Bicycle U01 1999-01-05 1999-01-20 40
1002 Motorcycle U02 1999-02-11 1999-03-15 500
…
Bids:
U02 1001 35 99-01-07
U04 1001 40 99-01-08
…

16 Query 1
FUNCTION date() { "1999-02-01" }
(
  FOR $i IN document("items.xml")//item_tuple
  WHERE $i/start_date LEQ date()
    AND $i/end_date GEQ date()
    AND contains($i/description, "Bicycle")
  RETURN
    $i/itemno,
    $i/description
  SORTBY (itemno)
)
–XPath expressions (shown in orange on the slide)
–simple function definitions
–dates are formatted so that lexicographic ordering gives the right result
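For readers more used to a general-purpose language, the same filter can be sketched in Python with xml.etree.ElementTree. The element names follow the item_tuple DTD above; the first two rows come from the sample data, while the 1003 row (in particular its seller, dates and price) is a hypothetical addition so that the query has something to return. The function name active_bicycles is also invented for the example.

```python
import xml.etree.ElementTree as ET

ITEMS_XML = """
<items>
  <item_tuple><itemno>1001</itemno><description>Red Bicycle</description>
    <offered_by>U01</offered_by><start_date>1999-01-05</start_date>
    <end_date>1999-01-20</end_date><reserve_price>40</reserve_price></item_tuple>
  <item_tuple><itemno>1002</itemno><description>Motorcycle</description>
    <offered_by>U02</offered_by><start_date>1999-02-11</start_date>
    <end_date>1999-03-15</end_date><reserve_price>500</reserve_price></item_tuple>
  <!-- hypothetical row: seller, dates and price are assumptions -->
  <item_tuple><itemno>1003</itemno><description>Old Bicycle</description>
    <offered_by>U02</offered_by><start_date>1999-01-10</start_date>
    <end_date>1999-02-20</end_date><reserve_price>25</reserve_price></item_tuple>
</items>
"""

TODAY = "1999-02-01"  # plays the role of the Quilt function date()

def active_bicycles(items_root, today=TODAY):
    """Return (itemno, description) for bicycles on offer today, sorted by itemno."""
    rows = []
    for item in items_root.iter("item_tuple"):      # //item_tuple
        start = item.findtext("start_date")
        end = item.findtext("end_date")
        desc = item.findtext("description", default="")
        if start is None or end is None:            # optional in the DTD
            continue
        # ISO-style dates compare correctly as plain strings.
        if start <= today <= end and "Bicycle" in desc:
            rows.append((item.findtext("itemno"), desc))
    return sorted(rows)

print(active_bicycles(ET.fromstring(ITEMS_XML)))
```

The string comparison on dates is the point made on the slide: with a year-month-day format, lexicographic order and chronological order coincide.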

17 Output from Q1
1003 Old Bicycle
1007 Racing Bicycle

18 Query Q2
For all bicycles, list the item number, description, and highest bid (if any), ordered by item number.
(
  FOR $i IN document("items.xml")//item_tuple
  LET $b := document("bids.xml")//bid_tuple[itemno = $i/itemno]
  WHERE contains($i/description, "Bicycle")
  RETURN
    $i/itemno,
    $i/description,
    IF ($b) THEN NumFormat("#####.##", max(-1, $b/bid)) ELSE ""
  SORTBY (itemno)
)
–use of variable in XPath
–lots of coercion
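The LET clause binds $b to the (possibly empty) set of bids for each item, which is what makes this behave like an outer join. A rough Python equivalent is sketched below. It uses only the two bid rows visible in the sample data, so the highest bid for item 1001 comes out as 40 rather than the value in the full data set; the userid element name and the function name highest_bids are assumptions.

```python
import xml.etree.ElementTree as ET

ITEMS = ET.fromstring(
    "<items>"
    "<item_tuple><itemno>1001</itemno><description>Red Bicycle</description></item_tuple>"
    "<item_tuple><itemno>1002</itemno><description>Motorcycle</description></item_tuple>"
    "<item_tuple><itemno>1008</itemno><description>Broken Bicycle</description></item_tuple>"
    "</items>"
)
# userid is an assumed element name; the bids DTD is not shown.
BIDS = ET.fromstring(
    "<bids>"
    "<bid_tuple><userid>U02</userid><itemno>1001</itemno><bid>35</bid></bid_tuple>"
    "<bid_tuple><userid>U04</userid><itemno>1001</itemno><bid>40</bid></bid_tuple>"
    "</bids>"
)

def highest_bids(items, bids):
    """Return (itemno, description, highest bid or '') for every bicycle."""
    rows = []
    for item in items.iter("item_tuple"):
        desc = item.findtext("description", default="")
        if "Bicycle" not in desc:
            continue
        itemno = item.findtext("itemno")
        # LET $b := //bid_tuple[itemno = $i/itemno], reduced to the bid amounts
        amounts = [
            float(b.findtext("bid"))
            for b in bids.iter("bid_tuple")
            if b.findtext("itemno") == itemno
        ]
        # IF ($b) THEN max(...) ELSE "": an empty node-set counts as false
        best = f"{max(amounts):g}" if amounts else ""
        rows.append((itemno, desc, best))
    return sorted(rows)

print(highest_bids(ITEMS, BIDS))
```

The explicit float() call is the coercion that Quilt performs implicitly when max is applied to $b/bid.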

19 Output from Q2
1001 Red Bicycle 55
1003 Old Bicycle 20
1007 Racing Bicycle 225
1008 Broken Bicycle

20 Query Q3
Find cases where a user with a rating worse than "C" (that is, alphabetically greater) offers an item with a reserve price of more than 1000.
(
  FOR $u IN document("users.xml")//user_tuple,
      $i IN document("items.xml")//item_tuple
  WHERE $u/rating GT 'C'
    AND $i/reserve_price GT 1000
    AND $i/offered_by = $u/userid
  RETURN
    $u/name/text(),
    $u/rating/text(),
    $i/description/text(),
    $i/reserve_price
)
–Comparing sets with singletons
–Same rules as in XPath?
–In this case the DTD gives uniqueness

21 Quilt -- Attributes and IDs
...
<!DOCTYPE census [
  <!ATTLIST person name ID #REQUIRED spouse IDREF #IMPLIED job CDATA #IMPLIED >
]>

22 Query Q1
Find Martha's spouse:
FOR $m IN document("census.xml")//person[@name="Martha"]
RETURN shallow($m/@spouse->{person@name})
–The shallow function strips an element of its subelements.
–Dereferencing
–A hack. Kweelt does not read the DTD

23 Query Q6
Find Bill's grandchildren.
(
  FOR $b IN document("census.xml")//person[@name = "Bill"],
      $c IN $b/person | $b/@spouse->{person@name}/person,
      $g IN $c/person | $c/@spouse->{person@name}/person
  RETURN shallow($g)
)
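The query treats a person's children as the person elements nested under either that person or that person's spouse, and applies the step twice. A Python sketch of the same idea follows. The census data is invented for the example (only the names Bill and Martha appear in the notes), and the dictionary by_name stands in for the @spouse->{person@name} dereference.

```python
import xml.etree.ElementTree as ET

# Invented census data: children are nested under one of their parents.
CENSUS = ET.fromstring(
    '<census>'
    '<person name="Bill" spouse="Martha">'
    '<person name="Joan"><person name="Tom"/></person>'
    '</person>'
    '<person name="Martha" spouse="Bill"><person name="Sue" spouse="Dave"/></person>'
    '<person name="Dave" spouse="Sue"><person name="Amy"/></person>'
    '</census>'
)

# Index every person by name: the stand-in for dereferencing an IDREF.
by_name = {p.get("name"): p for p in CENSUS.iter("person")}

def children(person):
    """person | @spouse->{person@name}/person: own children plus the spouse's."""
    kids = list(person.findall("person"))
    spouse = by_name.get(person.get("spouse"))
    if spouse is not None:
        kids += [k for k in spouse.findall("person") if k not in kids]
    return kids

def grandchildren(name):
    """Names of the children of the children of the named person."""
    return [g.get("name") for c in children(by_name[name]) for g in children(c)]

print(grandchildren("Bill"))
```

Here Tom is reached through Joan directly, while Amy is reached only by following Sue's spouse reference to Dave, which is why the dereference appears at both levels of the query.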

24 Status of XML types
DTDs -- widely used, but limited
–lack of base types
–untyped pointers (IDs and IDREFs)
–no tuple types (hence no record subtyping or inheritance)
XML-schema -- lots of hoopla, but
–not stable
–too complex
Others: RDF (not really types for XML), SOX, Relax, Schematron
Opinions:
–None of these is good for database design.
–Something new is needed (some core of XML-schema)

25 Status of XML Query languages
–None of them are really typed (by a DTD or anything else). Type errors show up as empty answers
–XML-QL probably the most elegant, but too powerful.
–XSL and descendants are working (in IE 5)
–Quilt -- nice extension of XPath, but XPath is quite complex.
–Nothing like an "algebra" for any of these (though some ideas are now emerging)
–Nothing like database optimization yet exists.
–Do we need something simpler?

