1 Lecture 11: XPath/XQuery. Friday, October 20, 2006.

2 Outline
XPath
XQuery
Useful pointers:
XPath: http://java.sun.com/webservices/docs/ea2/tutorial/doc/JAXPXSLT2.html
XQuery: http://www.w3.org/TR/xmlquery-use-cases/ and http://www.xmlportfolio.com/xquery.html

3 Querying XML Data
XPath = simple navigation through the tree
XQuery = the SQL of XML
XSLT = recursive traversal (will not discuss in class)

4 Sample Data for Queries
Book: publisher Addison-Wesley; authors Serge Abiteboul, Rick Hull, Victor Vianu; title Foundations of Databases; year 1995
Book: publisher Freeman; author Jeffrey D. Ullman; title Principles of Database and Knowledge Base Systems; year 1998

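5 Data Model for XPath
[Tree diagram: the root sits above the root element bib; bib has book children; each book has publisher, author, ... children, with leaf values such as Addison-Wesley and Serge Abiteboul]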

6 XPath: Simple Expressions
/bib/book/year
Result: 1995 1998
/bib/paper/year
Result: empty (there were no papers)
/bib versus /
What's the difference?
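A minimal Python sketch of these two path expressions, using the standard library's xml.etree.ElementTree. The XML markup below is reconstructed from the results quoted in the slides, so its exact shape is an assumption. ElementTree evaluates paths relative to an element, so /bib/book/year is written as book/year starting at the bib element.

```python
import xml.etree.ElementTree as ET

# Sample data reconstructed from the slides' results; the markup is an assumption.
BIB = """
<bib>
  <book>
    <publisher>Addison-Wesley</publisher>
    <author>Serge Abiteboul</author>
    <title>Foundations of Databases</title>
    <year>1995</year>
  </book>
  <book>
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database and Knowledge Base Systems</title>
    <year>1998</year>
  </book>
</bib>
"""

# fromstring returns the root element <bib>, not the document root "/".
bib = ET.fromstring(BIB)

# /bib/book/year, written relative to <bib>.
years = [y.text for y in bib.findall("book/year")]

# /bib/paper/year: there are no <paper> elements, so nothing matches.
paper_years = bib.findall("paper/year")

print(years)        # ['1995', '1998']
print(paper_years)  # []
```

The distinction between / and /bib shows up here as well: bib is the root element, while the document root above it has no direct counterpart in this API.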

7 XPath: Restricted Kleene Closure
//author
Result: Serge Abiteboul, Rick Hull, Victor Vianu, Jeffrey D. Ullman
/bib//first-name
Result: Rick
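The descendant step can be sketched in Python with ElementTree's ".//" prefix. The markup is again an assumed reconstruction, and string_value is a hypothetical helper that flattens an element to its text so that the nested Rick Hull author prints as one name.

```python
import xml.etree.ElementTree as ET

# Reconstructed sample data; the markup is an assumption.
BIB = """
<bib>
  <book>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name> <last-name>Hull</last-name></author>
    <author>Victor Vianu</author>
  </book>
  <book>
    <author>Jeffrey D. Ullman</author>
  </book>
</bib>
"""
bib = ET.fromstring(BIB)


def string_value(elem):
    """Concatenate all text below elem and normalize whitespace."""
    return " ".join("".join(elem.itertext()).split())


# //author: author elements at any depth.
authors = [string_value(a) for a in bib.findall(".//author")]

# /bib//first-name: first-name elements at any depth below <bib>.
first_names = [f.text for f in bib.findall(".//first-name")]

print(authors)      # ['Serge Abiteboul', 'Rick Hull', 'Victor Vianu', 'Jeffrey D. Ullman']
print(first_names)  # ['Rick']
```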

8 XPath: Attribute Nodes
/bib/book/@price
Result: "55"
@price means that price has to be an attribute
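A short Python sketch of the attribute step. ElementTree cannot return attribute nodes from a path, so the sketch selects the books that carry the attribute and reads it with get(). Which book has price="55" is an assumption; the slides only give the result.

```python
import xml.etree.ElementTree as ET

# Assumed markup: the slides do not show which book carries the attribute.
BIB = """
<bib>
  <book><title>Foundations of Databases</title></book>
  <book price="55"><title>Principles of Database and Knowledge Base Systems</title></book>
</bib>
"""
bib = ET.fromstring(BIB)

# /bib/book/@price: keep books that have a price attribute, then read it.
prices = [b.get("price") for b in bib.findall("book[@price]")]

print(prices)  # ['55']
```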

9 XPath: Wildcard
//author/*
Result: Rick Hull
* matches any element
@* matches any attribute

10 XPath: Text Nodes
/bib/book/author/text()
Result: Serge Abiteboul, Victor Vianu, Jeffrey D. Ullman
Rick Hull doesn't appear because he has firstname, lastname
Functions in XPath:
text() = matches the text value
node() = matches any node (= * or @* or text())
name() = returns the name of the current tag
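The text() step can be approximated in Python as follows. ElementTree stores text in the .text and .tail fields rather than as nodes, so text_children is a hypothetical helper that collects the non-blank text directly under an element. The markup is an assumed reconstruction of the sample data.

```python
import xml.etree.ElementTree as ET

# Reconstructed sample data; the markup is an assumption.
BIB = """
<bib>
  <book>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name> <last-name>Hull</last-name></author>
    <author>Victor Vianu</author>
  </book>
  <book>
    <author>Jeffrey D. Ullman</author>
  </book>
</bib>
"""
bib = ET.fromstring(BIB)


def text_children(elem):
    """Non-blank text that sits directly under elem (not inside its children)."""
    parts = [elem.text] + [child.tail for child in elem]
    return [p.strip() for p in parts if p and p.strip()]


# /bib/book/author/text(): Rick Hull's author has only element children.
author_text = [t for a in bib.findall("book/author") for t in text_children(a)]

# name() applied to //author/*: the tag names of the wildcard matches.
child_names = [c.tag for c in bib.findall("book/author/*")]

print(author_text)  # ['Serge Abiteboul', 'Victor Vianu', 'Jeffrey D. Ullman']
print(child_names)  # ['first-name', 'last-name']
```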

11 XPath: Predicates
/bib/book/author[firstname]
Result: Rick Hull

12 XPath: More Predicates
/bib/book/author[firstname][address[.//zip][city]]/lastname
Result: … …
How do we read this? First remove all qualifiers (predicates):
/bib/book/author/lastname
Then add them one by one:
/bib/book/author[firstname][address]/lastname
etc.
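The "remove the qualifiers, then add them one by one" reading can be tried out in Python. The data below is hypothetical (the slides do not show authors with addresses), and the element names first-name and last-name follow the spelling used on the Kleene closure slide. ElementTree supports chained [tag] predicates but not nested ones, so the last step is filtered in Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical data, invented for illustration only.
BIB = """
<bib>
  <book>
    <author>
      <first-name>Ann</first-name><last-name>Jones</last-name>
      <address><city>Seattle</city><zip>98195</zip></address>
    </author>
    <author>
      <last-name>Smith</last-name>
      <address><city>Seattle</city></address>
    </author>
    <author>
      <first-name>Bob</first-name><last-name>Lee</last-name>
      <address><city>Tacoma</city></address>
    </author>
  </book>
</bib>
"""
bib = ET.fromstring(BIB)

# Step by step: no predicates, then one, then two.
steps = [
    "book/author/last-name",
    "book/author[first-name]/last-name",
    "book/author[first-name][address]/last-name",
]
results = {path: [e.text for e in bib.findall(path)] for path in steps}

# Full predicate [first-name][address[.//zip][city]], nested part done in Python.
full = [
    a.findtext("last-name")
    for a in bib.findall("book/author[first-name]")
    if any(ad.find(".//zip") is not None and ad.find("city") is not None
           for ad in a.findall("address"))
]

for path in steps:
    print(path, results[path])
print(full)  # ['Jones']
```

Each added predicate can only shrink the result, which is why reading the path without predicates first gives a safe upper bound on what it returns.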

13 XPath: More Predicates
/bib/book[@price < 60]
/bib/book[author/@age < 25]
/bib/book[author/text()]
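Comparison predicates such as [@price < 60] are outside the XPath subset that ElementTree supports, so this Python sketch applies the comparison in a list comprehension instead. The data and the helper has_text_child are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical data, invented for illustration only.
BIB = """
<bib>
  <book price="55"><author>Jones</author><title>abc</title></book>
  <book price="75"><author>Smith</author><title>def</title></book>
  <book><author><last-name>Lee</last-name></author><title>ghi</title></book>
</bib>
"""
bib = ET.fromstring(BIB)

# /bib/book[@price < 60]: a book without a price attribute never qualifies.
cheap = [b.findtext("title")
         for b in bib.findall("book[@price]")
         if float(b.get("price")) < 60]


def has_text_child(elem):
    """True if elem has non-blank text directly under it."""
    return any(t and t.strip() for t in [elem.text, *(c.tail for c in elem)])


# /bib/book[author/text()]: books with some author that has direct text.
with_text_author = [b.findtext("title")
                    for b in bib.findall("book")
                    if any(has_text_child(a) for a in b.findall("author"))]

print(cheap)             # ['abc']
print(with_text_author)  # ['abc', 'def']
```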

14 XPath: More Axes
. means current node
/bib/book[.//review]
/bib/book[./review]
Same as /bib/book[review]
/bib/author/./firstname
Same as /bib/author/firstname

15 XPath: More Axes
.. means parent node
/bib/book[.//review/../comments]
Same as /bib/book[.//*[comments][review]], not /bib/book[.//comments/review]
/bib/author/../author/zip
Same as /bib/author/zip

16 XPath: Summary
bib = matches a bib element
* = matches any element
/ = matches the root
/bib = matches a bib element under root
bib/paper = matches a paper in bib
bib//paper = matches a paper in bib, at any depth
//paper = matches a paper at any depth
paper|book = matches a paper or a book
@price = matches a price attribute
bib/book/@price = matches price attribute in book, in bib
bib/book[@price<"55"]/author/lastname = matches…

17 XQuery
Based on Quilt, which is based on XML-QL
Uses XPath to express more complex queries

18 FLWR ("Flower") Expressions
FOR ... LET ... WHERE ... RETURN ...

19 FOR-WHERE-RETURN
Find all book titles published after 1995:
FOR $x IN document("bib.xml")/bib/book
WHERE $x/year/text() > 1995
RETURN $x/title
Result: abc def ghi

20 FOR-WHERE-RETURN
Equivalently (perhaps more geekish):
FOR $x IN document("bib.xml")/bib/book[year/text() > 1995]/title
RETURN $x
And even shorter:
document("bib.xml")/bib/book[year/text() > 1995]/title
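For readers more at home in a general-purpose language, the FOR-WHERE-RETURN query maps onto a loop with a filter. This is a rough Python analogue, not XQuery semantics; the function name titles_after and the sample markup are assumptions.

```python
import xml.etree.ElementTree as ET

# Reconstructed sample data; the markup is an assumption.
BIB = """
<bib>
  <book><title>Foundations of Databases</title><year>1995</year></book>
  <book><title>Principles of Database and Knowledge Base Systems</title><year>1998</year></book>
</bib>
"""
bib = ET.fromstring(BIB)


def titles_after(bib, year):
    """Titles of books published strictly after the given year."""
    result = []
    for x in bib.findall("book"):             # FOR $x IN .../bib/book
        y = x.findtext("year")
        if y is not None and int(y) > year:   # WHERE $x/year/text() > 1995
            result.append(x.findtext("title"))  # RETURN $x/title (text only)
    return result


print(titles_after(bib, 1995))
# ['Principles of Database and Knowledge Base Systems']
```

The book from 1995 is excluded because the comparison is strict.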

21 FOR-WHERE-RETURN
Find all book titles and the year when they were published:
FOR $x IN document("bib.xml")/bib/book
RETURN { $x/title/text() } { $x/year/text() }
Result: abc 1995 def 2002 ghk 1980

22 FOR-WHERE-RETURN
Notice the use of "{" and "}". What is the result without them?
FOR $x IN document("bib.xml")/bib/book
RETURN $x/title/text() $x/year/text()

23 Nesting
For each author of a book by Morgan Kaufmann, list all books she published:
FOR $b IN document("bib.xml")/bib,
    $a IN $b/book[publisher/text()="Morgan Kaufmann"]/author
RETURN { $a, FOR $t IN $b/book[author/text()=$a/text()]/title RETURN $t }
In the RETURN clause, a comma concatenates XML fragments.

24 Result
Jones abc def
Smith ghi

25 Aggregates
Find all books with more than 3 authors:
FOR $x IN document("bib.xml")/bib/book
WHERE count($x/author)>3
RETURN $x
count = a function that counts
avg = computes the average
sum = computes the sum
distinct-values = eliminates duplicates

26 Aggregates
Same thing:
FOR $x IN document("bib.xml")/bib/book[count(author)>3]
RETURN $x
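The count aggregate corresponds to taking the length of a selection. A small Python sketch, with a hypothetical function name and an assumed reconstruction of the sample markup:

```python
import xml.etree.ElementTree as ET

# Reconstructed sample data; the markup is an assumption.
BIB = """
<bib>
  <book>
    <author>Serge Abiteboul</author>
    <author>Rick Hull</author>
    <author>Victor Vianu</author>
    <title>Foundations of Databases</title>
  </book>
  <book>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database and Knowledge Base Systems</title>
  </book>
</bib>
"""
bib = ET.fromstring(BIB)


def books_with_more_authors_than(bib, n):
    """Titles of books for which count(author) > n."""
    return [b.findtext("title")
            for b in bib.findall("book")
            if len(b.findall("author")) > n]


print(books_with_more_authors_than(bib, 3))  # []
print(books_with_more_authors_than(bib, 2))  # ['Foundations of Databases']
```

With this sample data the threshold of 3 from the slide returns nothing, because the largest book has exactly three authors.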

27 Aggregates
Print all authors who published more than 3 books. Be aware of duplicates!
FOR $b IN document("bib.xml")/bib,
    $a IN distinct-values($b/book/author/text())
WHERE count($b/book[author/text()=$a])>3
RETURN { $a }
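A Python sketch of the same idea: first collect the distinct author names, then count the books each one appears in. The data and the function name prolific_authors are hypothetical; dict.fromkeys stands in for distinct-values while keeping first-occurrence order.

```python
import xml.etree.ElementTree as ET

# Hypothetical data, invented for illustration only.
BIB = """
<bib>
  <book><author>Jones</author><title>t1</title></book>
  <book><author>Jones</author><author>Smith</author><title>t2</title></book>
  <book><author>Jones</author><title>t3</title></book>
  <book><author>Smith</author><author>Jones</author><title>t4</title></book>
</bib>
"""
bib = ET.fromstring(BIB)


def prolific_authors(bib, k):
    """Authors that appear in more than k books, each listed once."""
    books = bib.findall("book")
    # distinct-values($b/book/author/text())
    names = dict.fromkeys(a.text for b in books for a in b.findall("author"))
    return [
        n for n in names
        # count($b/book[author/text()=$a]) > k
        if sum(any(a.text == n for a in b.findall("author")) for b in books) > k
    ]


print(prolific_authors(bib, 3))  # ['Jones']
```

Iterating over the raw author elements instead of the distinct names would report Jones four times, which is the duplicate problem the slide warns about.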

28 Aggregates
Find books whose price is larger than average:
FOR $b in document("bib.xml")/bib
LET $a := avg($b/book/price/text())
FOR $x in $b/book
WHERE $x/price/text() > $a
RETURN $x
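The LET clause binds a whole collection to a variable once, which in Python is simply an assignment before the loop. A sketch with hypothetical data, in which price is a child element as in the query above:

```python
import xml.etree.ElementTree as ET
from statistics import mean

# Hypothetical data, invented for illustration only.
BIB = """
<bib>
  <book><title>abc</title><price>10</price></book>
  <book><title>def</title><price>20</price></book>
  <book><title>ghi</title><price>60</price></book>
</bib>
"""
bib = ET.fromstring(BIB)

# LET $a := avg($b/book/price/text())
average = mean(float(p.text) for p in bib.findall("book/price"))

# FOR $x in $b/book WHERE $x/price/text() > $a RETURN $x
above_average = [x.findtext("title")
                 for x in bib.findall("book")
                 if float(x.findtext("price")) > average]

print(average)        # 30.0
print(above_average)  # ['ghi']
```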

29 Flattening
"Flatten" the authors, i.e. return a list of (author, title) pairs
FOR $b IN document("bib.xml")/bib/book,
    $x IN $b/title/text(),
    $y IN $b/author/text()
RETURN { $x } { $y }
Result: abc efg abc hkj

30 Re-grouping
For each author, return all titles of her/his books
FOR $b IN document("bib.xml")/bib,
    $x IN $b/book/author/text()
RETURN { $x } { FOR $y IN $b/book[author/text()=$x]/title RETURN $y }
Result: efg abc klm ....
What about duplicate authors?

31 Re-grouping
Same, but eliminate duplicate authors:
FOR $b IN document("bib.xml")/bib
LET $a := distinct-values($b/book/author/text())
FOR $x IN $a
RETURN $x { FOR $y IN $b/book[author/text()=$x]/title RETURN $y }

32 Re-grouping
Same thing:
FOR $b IN document("bib.xml")/bib,
    $x IN distinct-values($b/book/author/text())
RETURN $x { FOR $y IN $b/book[author/text()=$x]/title RETURN $y }
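Re-grouping with duplicate elimination can be sketched in Python as an outer loop over distinct author names and an inner selection of matching books. The data is hypothetical, and the dictionary result stands in for the nested XML that the query returns.

```python
import xml.etree.ElementTree as ET

# Hypothetical data, invented for illustration only.
BIB = """
<bib>
  <book><author>Jones</author><author>Smith</author><title>abc</title></book>
  <book><author>Jones</author><title>def</title></book>
  <book><author>Smith</author><title>ghi</title></book>
</bib>
"""
bib = ET.fromstring(BIB)

# distinct-values($b/book/author/text()), keeping first-occurrence order.
authors = list(dict.fromkeys(a.text for a in bib.findall("book/author")))

# For each distinct author $x: titles of $b/book[author/text()=$x].
regrouped = {
    x: [b.findtext("title")
        for b in bib.findall("book")
        if any(a.text == x for a in b.findall("author"))]
    for x in authors
}

print(regrouped)  # {'Jones': ['abc', 'def'], 'Smith': ['abc', 'ghi']}
```

Looping over every author element instead of the distinct names would emit the Jones group twice and the Smith group twice.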

33 Another Example
Find book titles by the coauthors of "Database Theory":
FOR $b IN document("bib.xml")/bib,
    $x IN $b/book[title/text() = "Database Theory"],
    $y IN $b/book[author/text() = $x/author/text()]
RETURN { $y/title/text() }
Result: abc def abc ghk
Question: Why do we get duplicates?

34 Distinct-values
Same as before, but eliminate duplicates:
FOR $b IN document("bib.xml")/bib,
    $x IN $b/book[title/text() = "Database Theory"]/author/text(),
    $y IN distinct-values($b/book[author/text() = $x]/title/text())
RETURN { $y }
Result: abc def ghk
distinct-values = a function that eliminates duplicates
Need to apply it to a collection of text values, not of elements; note how the query has changed.

35 SQL and XQuery Side-by-side
Product(pid, name, maker, price)
Find all product names, prices, sort by price
SQL:
SELECT x.name, x.price FROM Product x ORDER BY x.price
XQuery:
FOR $x in document("db.xml")/db/Product/row
ORDER BY $x/price/text()
RETURN { $x/name, $x/price }

36 XQuery's Answer
abc 7 def 23 ....
Notice: this is NOT a well-formed document! (Why?)

37 Producing a Well-Formed Answer
{ FOR $x in document("db.xml")/db/Product/row
  ORDER BY $x/price/text()
  RETURN { $x/name, $x/price } }

38 XQuery's Answer
abc 7 def 23 ....
Now it is well-formed!
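One way to see the difference between the two answers is to try parsing them. In this Python sketch the element names result and answer are hypothetical (the transcript lost the original tags), and prices are compared as numbers. A sequence of sibling elements has no single root, so an XML parser rejects it; wrapping the sequence in one element fixes that.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows, invented for illustration only.
DB = """
<db>
  <Product>
    <row><name>def</name><price>23</price></row>
    <row><name>abc</name><price>7</price></row>
  </Product>
</db>
"""
db = ET.fromstring(DB)

# ORDER BY $x/price/text(), comparing prices numerically.
rows = sorted(db.findall("Product/row"), key=lambda r: float(r.findtext("price")))

# One wrapper element around all answers ("result"/"answer" are made-up names).
result = ET.Element("result")
for r in rows:
    answer = ET.SubElement(result, "answer")
    answer.append(r.find("name"))
    answer.append(r.find("price"))

wrapped = ET.tostring(result, encoding="unicode")

# The same answers without the wrapper: several top-level elements.
unwrapped = "".join(ET.tostring(a, encoding="unicode") for a in result)


def is_well_formed(text):
    """True if text parses as a single XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(unwrapped))  # False
print(is_well_formed(wrapped))    # True
```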

39 SQL and XQuery Side-by-side
Product(pid, name, maker, price)
Company(cid, name, city, revenues)
Find all products made in Seattle
SQL:
SELECT x.name FROM Product x, Company y WHERE x.maker=y.cid and y.city="Seattle"
XQuery:
FOR $r in document("db.xml")/db,
    $x in $r/Product/row,
    $y in $r/Company/row
WHERE $x/maker/text()=$y/cid/text() and $y/city/text() = "Seattle"
RETURN { $x/name }
Cool XQuery:
FOR $y in /db/Company/row[city/text()="Seattle"],
    $x in /db/Product/row[maker/text()=$y/cid/text()]
RETURN { $x/name }
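The join can be sketched in Python along the lines of the second XQuery form: select the Seattle companies with a path predicate, then match products on maker = cid. The rows are hypothetical and follow the row-per-tuple layout that the queries assume.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows, invented for illustration only.
DB = """
<db>
  <Product>
    <row><pid>1</pid><name>abc</name><maker>c1</maker></row>
    <row><pid>2</pid><name>def</name><maker>c2</maker></row>
    <row><pid>3</pid><name>ghi</name><maker>c1</maker></row>
  </Product>
  <Company>
    <row><cid>c1</cid><name>Acme</name><city>Seattle</city></row>
    <row><cid>c2</cid><name>Bolt</name><city>Portland</city></row>
  </Company>
</db>
"""
db = ET.fromstring(DB)

seattle_products = [
    x.findtext("name")
    for y in db.findall("Company/row[city='Seattle']")  # FOR $y ... [city="Seattle"]
    for x in db.findall("Product/row")                  # FOR $x in Product/row
    if x.findtext("maker") == y.findtext("cid")         # maker = cid
]

print(seattle_products)  # ['abc', 'ghi']
```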

40 123 abc efg …. ….......

41 SQL and XQuery Side-by-side
For each company with revenues < 1M count the products over $100
SQL:
SELECT y.name, count(*) FROM Product x, Company y WHERE x.price > 100 and x.maker=y.cid and y.revenue < 1000000 GROUP BY y.cid, y.name
XQuery:
FOR $r in document("db.xml")/db,
    $y in $r/Company/row[revenue/text() < 1000000]
RETURN { $y/name/text() } { count($r/Product/row[maker/text()=$y/cid/text()][price/text()>100]) }

42 SQL and XQuery Side-by-side
Find companies with more than 30 products, and their average price
SQL:
SELECT y.name, avg(x.price) FROM Product x, Company y WHERE x.maker=y.cid GROUP BY y.cid, y.name HAVING count(*) > 30
XQuery:
FOR $r in document("db.xml")/db,
    $y in $r/Company/row
LET $p := $r/Product/row[maker/text()=$y/cid/text()]
WHERE count($p) > 30
RETURN { $y/name/text() } { avg($p/price/text()) }
$p (bound by LET) is a collection; $y (bound by FOR) is an element.
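The GROUP BY / HAVING pattern becomes a loop over companies in which a LET-style variable holds each company's products. A Python sketch with hypothetical rows and a hypothetical function name; the threshold is a parameter so that the small sample can exercise it.

```python
import xml.etree.ElementTree as ET
from statistics import mean

# Hypothetical rows, invented for illustration only.
DB = """
<db>
  <Product>
    <row><name>p1</name><maker>c1</maker><price>10</price></row>
    <row><name>p2</name><maker>c1</maker><price>20</price></row>
    <row><name>p3</name><maker>c1</maker><price>30</price></row>
    <row><name>p4</name><maker>c2</maker><price>5</price></row>
  </Product>
  <Company>
    <row><cid>c1</cid><name>Acme</name></row>
    <row><cid>c2</cid><name>Bolt</name></row>
  </Company>
</db>
"""
db = ET.fromstring(DB)


def avg_price_by_company(db, min_count):
    """Average product price per company with more than min_count products."""
    out = {}
    for y in db.findall("Company/row"):                   # FOR $y: one element
        p = [r for r in db.findall("Product/row")         # LET $p: a collection
             if r.findtext("maker") == y.findtext("cid")]
        if len(p) > min_count:                            # WHERE count($p) > ...
            out[y.findtext("name")] = mean(float(r.findtext("price")) for r in p)
    return out


print(avg_price_by_company(db, 2))  # {'Acme': 20.0}
```

The grouping is implicit: each iteration of the outer loop is one group, and the filter on len(p) plays the role of HAVING.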

