Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Lecture 16: Querying XML Data: XPath, XQuery Friday, February 11, 2005.

Similar presentations


Presentation on theme: "1 Lecture 16: Querying XML Data: XPath, XQuery Friday, February 11, 2005."— Presentation transcript:

1 1 Lecture 16: Querying XML Data: XPath, XQuery Friday, February 11, 2005

2 2 Outline XML: syntax, semantics, data, DTDs XPath XQuery Strongly recommended readings: XPath: http://java.sun.com/webservices/docs/ea2/tutorial/doc/JAXPXSLT2.html XQuery: http://www.w3.org/TR/xmlquery-use-cases/ http://www.xmlportfolio.com/xquery.html

3 3 Querying XML Data XPath = simple navigation through the tree XQuery = the SQL of XML XSLT = recursive traversal –will not discuss in class

4 4 Sample Data for Queries A Wesley Serge Abiteboul Rick Hull Victor Vianu Foundations of Databases Freeman Jeffrey D. Ullman Database Systems A Wesley Serge Abiteboul Rick Hull Victor Vianu Foundations of Databases Freeman Jeffrey D. Ullman Database Systems

5 5 Data Model for XPath bib book publisherauthor.. A WesleySerge Abiteboul The root The root element publisher Freeman..

6 6 XPath: Simple Expressions A Wesley Freeman Result: empty (there are no papers) /bib/book/publisher /bib/paper/publisher A Wesley Serge Abiteboul Rick Hull Victor Vianu Foundations of Databases Freeman Jeffrey D. Ullman Database Systems A Wesley Serge Abiteboul Rick Hull Victor Vianu Foundations of Databases Freeman Jeffrey D. Ullman Database Systems

7 7 XPath: Restricted Kleene Closure Serge Abiteboul Rick Hull Victor Vianu Jeffrey D. Ullman Rick //author /bib//first-name A Wesley Serge Abiteboul Rick Hull Victor Vianu Foundations of Databases Freeman Jeffrey D. Ullman Database Systems A Wesley Serge Abiteboul Rick Hull Victor Vianu Foundations of Databases Freeman Jeffrey D. Ullman Database Systems

8 8 Xpath: Text Nodes Result: Serge Abiteboul Victor Vianu Jeffrey D. Ullman Rick Hull doesn’t appear because he has firstname, lastname Functions in XPath: –text() = matches the text value –node() = matches any node (= * or @* or text()) –name() = returns the name of the current tag /bib/book/author/text() A Wesley Serge Abiteboul Rick Hull Victor Vianu Foundations of Databases Freeman Jeffrey D. Ullman Database Systems A Wesley Serge Abiteboul Rick Hull Victor Vianu Foundations of Databases Freeman Jeffrey D. Ullman Database Systems

9 9 Xpath: Wildcard Rick Hull * Matches any element //author/* A Wesley Serge Abiteboul Rick Hull Victor Vianu Foundations of Databases Freeman Jeffrey D. Ullman Database Systems A Wesley Serge Abiteboul Rick Hull Victor Vianu Foundations of Databases Freeman Jeffrey D. Ullman Database Systems

10 10 Xpath: Attribute Nodes Result: “55” @price means that price is has to be an attribute /bib/book/@price A Wesley Serge Abiteboul Rick Hull Victor Vianu Foundations of Databases Freeman Jeffrey D. Ullman Database Systems A Wesley Serge Abiteboul Rick Hull Victor Vianu Foundations of Databases Freeman Jeffrey D. Ullman Database Systems

11 11 Xpath: Predicates Result: Rick Hull /bib/book/author[firstname] A Wesley Serge Abiteboul Rick Hull Victor Vianu Foundations of Databases Freeman Jeffrey D. Ullman Database Systems A Wesley Serge Abiteboul Rick Hull Victor Vianu Foundations of Databases Freeman Jeffrey D. Ullman Database Systems

12 12 Xpath: More Predicates Result: … … /bib/book/author[firstname][address[.//zip][city]]/lastname

13 13 Xpath: More Predicates /bib/book[@price < 60] /bib/book[author/@age < 25] /bib/book[author/text()]

14 14 Xpath: Summary bibmatches a bib element *matches any element /matches the root element /bibmatches a bib element under root bib/papermatches a paper in bib bib//papermatches a paper in bib, at any depth //papermatches a paper at any depth paper|bookmatches a paper or a book @pricematches a price attribute bib/book/@pricematches price attribute in book, in bib bib/book[@price<“55”]/author/lastname matches…

15 15 XQuery Based on Quilt, which is based on XML-QL Uses XPath to express more complex queries

16 16 FLWR (“Flower”) Expressions FOR... LET... WHERE... RETURN... FOR... LET... WHERE... RETURN...

17 17 Addison-Wesley Serge Abiteboul Rick Hull Victor Vianu Foundations of Databases 1995 Freeman Jeffrey D. Ullman Principles of Database and Knowledge Base Systems 1998 Addison-Wesley Serge Abiteboul Rick Hull Victor Vianu Foundations of Databases 1995 Freeman Jeffrey D. Ullman Principles of Database and Knowledge Base Systems 1998 Sample Data for Queries (more or less)

18 18 FOR-WHERE-RETURN Find all book titles published after 1995: FOR $x IN document("bib.xml")/bib/book WHERE $x/year/text() > 1995 RETURN $x/title FOR $x IN document("bib.xml")/bib/book WHERE $x/year/text() > 1995 RETURN $x/title Result: abc def ghi

19 19 FOR-WHERE-RETURN Equivalently (perhaps more geekish) FOR $x IN document("bib.xml")/bib/book[year/text() > 1995] /title RETURN $x FOR $x IN document("bib.xml")/bib/book[year/text() > 1995] /title RETURN $x And even shorter: document("bib.xml")/bib/book[year/text() > 1995] /title

20 20 FOR-WHERE-RETURN Find all book titles and the year when they were published: FOR $x IN document("bib.xml")/ bib/book RETURN { $x/title/text() } { $x/year/text() } Result: abc 1995 def 2002 ghk 1980 Notice“{“ and “}”

21 21 Nesting For each author of a book by Morgan Kaufmann, list all books she published: FOR $b IN document(“bib.xml”)/bib, $a IN $b/book[publisher /text()=“Morgan Kaufmann”]/author RETURN { $a, FOR $t IN $b/book[author/text()=$a/text()]/title RETURN $t } FOR $b IN document(“bib.xml”)/bib, $a IN $b/book[publisher /text()=“Morgan Kaufmann”]/author RETURN { $a, FOR $t IN $b/book[author/text()=$a/text()]/title RETURN $t } In the RETURN clause comma concatenates XML fragments

22 22 Result Jones abc def Smith ghi Jones abc def Smith ghi

23 23 Aggregates Find all books with more than 3 authors: count = a function that counts avg = computes the average sum = computes the sum distinct-values = eliminates duplicates FOR $x IN document("bib.xml")/bib/book WHERE count($x/author)>3 RETURN $x

24 24 Aggregates Print all authors who published more than 3 books – be aware of duplicates ! FOR $b IN document("bib.xml")/bib, $a IN distinct-values($b/book/author/text()) WHERE count($b/book[author/text()=$a])>3 RETURN { $a }

25 25 Flattening “Flatten” the authors, i.e. return a list of (author, title) pairs FOR $b IN document("bib.xml")/bib/book, $x IN $b/title/text(), $y IN $b/author/text() RETURN { $x } { $y } Result: abc efg abc hkj

26 26 Re-grouping For each author, return all titles of her/his books FOR $b IN document("bib.xml")/bib, $x IN $b/book/author/text() RETURN { $x } { FOR $y IN $b/book[author/text()=$x]/title RETURN $y } What about duplicate authors ? Result: efg abc klm....

27 27 Re-grouping Same, but eliminate duplicate authors: FOR $b IN document("bib.xml")/bib LET $a := distinct-values($b/book/author/text()) FOR $x IN $a RETURN $x { FOR $y IN $b/book[author/text()=$x]/title RETURN $y }

28 28 Re-grouping Same thing: FOR $b IN document("bib.xml")/bib, $x IN distinct-values($b/book/author/text()) RETURN $x { FOR $y IN $b/book[author/text()=$x]/title RETURN $y }

29 29 Another Example Find book titles by the coauthors of “Database Theory”: FOR $b IN document("bib.xml")/bib, $x IN $b/book[title/text() = “Database Theory”], $y IN $b/book[author/text() = $x/author/text()] RETURN { $y/title/text() } Result: abc def abc ghk Question: Why do we get duplicates ?

30 30 Distinct-values Same as before, but eliminate duplicates: Result: abc def ghk distinct-values = a function that eliminates duplicates Need to apply to a collection of text values, not of elements – note how query has changed FOR $b IN document("bib.xml")/bib, $x IN $b/book[title/text() = “Database Theory”]/author/text(), $y IN distinct-values($b/book[author/text() = $x] /title/text()) RETURN { $y } FOR $b IN document("bib.xml")/bib, $x IN $b/book[title/text() = “Database Theory”]/author/text(), $y IN distinct-values($b/book[author/text() = $x] /title/text()) RETURN { $y }

31 31 SQL and XQuery Side-by-side Product(pid, name, maker, price) Find all product names, prices, sort by price SELECT x.name, x.price FROM Product x ORDER BY x.price SQL FOR $x in document(“db.xml”)/db/Product/row ORDER BY $x/price/text() RETURN { $x/name, $x/price } XQuery

32 32 abc 7 def 23.... Xquery’s Answer Notice: this is NOT a well-formed document ! (WHY ???)

33 33 Producing a Well-Formed Answer { FOR $x in document(“db.xml”)/db/Product/row ORDER BY $x/price/text() RETURN { $x/name, $x/price } }

34 34 abc 7 def 23.... Xquery’s Answer Now it is well-formed !

35 35 SQL and XQuery Side-by-side Product(pid, name, maker, price) Company(cid, name, city, revenues) Find all products made in Seattle SELECT x.name FROM Product x, Company y WHERE x.maker=y.cid and y.city=“Seattle” SQL FOR $r in document(“db.xml”)/db, $x in $r/Product/row, $y in $r/Company/row WHERE $x/maker/text()=$y/cid/text() and $y/city/text() = “Seattle” RETURN { $x/name } XQuery FOR $y in /db/Company/row[city/text()=“Seattle”], $x in /db/Product/row[maker/text()=$y/cid/text()] RETURN { $x/name } Cool XQuery

36 36 123 abc efg …. ….......

37 37 SQL and XQuery Side-by-side For each company with revenues < 1M count the products over $100 SELECT y.name, count(*) FROM Product x, Company y WHERE x.price > 100 and x.maker=y.cid and y.revenue < 1000000 GROUP BY y.cid, y.name FOR $r in document(“db.xml”)/db, $y in $r/Company/row[revenue/text() { $y/name/text() } { count($r/Product/row[maker/text()=$y/cid/text()][price/text()>100]) }

38 38 SQL and XQuery Side-by-side Find companies with at least 30 products, and their average price SELECT y.name, avg(x.price) FROM Product x, Company y WHERE x.maker=y.cid GROUP BY y.cid, y.name HAVING count(*) > 30 FOR $r in document(“db.xml”)/db, $y in $r/Company/row LET $p := $r/Product/row[maker/text()=$y/cid/text()] WHERE count($p) > 30 RETURN { $y/name/text() } avg($p/price/text()) A collection An element


Download ppt "1 Lecture 16: Querying XML Data: XPath, XQuery Friday, February 11, 2005."

Similar presentations


Ads by Google