Querying XML, Part II Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems February 5, 2008.

1 Querying XML, Part II Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems February 5, 2008

2 Today
- Two languages building upon XPath:
  - XSLT
  - XQuery
- Reminder: Assignment 1 milestone 2 due Feb. 12th
- Readings for next time:
  - www.wikipedia.org: DNS, Sections 1-3
  - http://quark.humbug.org.au/publications/ldap/

3 Users of XPath
- XML Schema uses simple XPaths in defining keys and uniqueness constraints
- XLink and XPointer, hyperlinks for XML
- XSLT: useful for converting from XML to other representations (e.g., HTML, PDF, SVG)
- XQuery: useful for restructuring an XML document or combining multiple documents

4 XSLT: Transforming an XML Document
- XSLT: Extensible Stylesheet Language Transformations
- Companion to XSL-FO, formatting for XML
- A language for substituting structured fragments for XML content
- Transforms single document → single document
- Useful for XML → XML conversions, XML → HTML
- Runs on server side (Apache Cocoon) or client side (modern browsers)

5 A Functional Language for XML
- XSLT is based on a series of templates that match different parts of an XML document
  - There’s a policy for what rule or template is applied if more than one matches (it’s not what you’d think!)
- XSLT templates can invoke other templates
- XSLT templates can be nonterminating (beware!)
- XSLT templates are based on XPath “match”es, and we can also apply other templates (potentially to “select”ed XPaths)
- Within each template, directly describe what should be output

6 An XSLT Template
- An XML document itself
- XML tags create output OR are XSL operations
  - All XSL tags are prefixed with the “xsl” namespace
  - All non-XSL tags are part of the XML output
- Common XSL operations:
  - template with a match XPath
  - Recursive call to apply-templates, which may also select where it should be applied
- Attach to XML document with a processing-instruction:
    <?xml-stylesheet type="text/xsl" href="http://www.com/my.xsl"?>

7 An Example XSLT Stylesheet
This is DBLP …

8 XSLT Processing Model
- List of source nodes → result tree fragment(s)
- Start with root
- Find all template rules with matching patterns from root
  - Find “best” match according to some heuristics
  - Set the current node list to be the set of things it matches
- Iterate over each node in the current node list
  - Apply the operations of the template
  - “Append” the results of the matching template rule to the result tree structure
  - Repeat recursively if specified to by apply-templates
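The processing loop on this slide can be sketched in Python. This is only an analogy under stated assumptions, not how an XSLT engine is implemented: the "templates" are hypothetical Python functions keyed by tag name, "best match" is reduced to an exact tag lookup with a fallback default rule, and the result is a string rather than a result tree. The sample input and the title "On Joins" are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input; element names and text are illustrative only.
SOURCE = "<dblp><inproceedings><title>On Joins</title></inproceedings></dblp>"

def default_rule(node, templates):
    # Stand-in for a built-in rule: emit the node's text, then recurse into children.
    out = [node.text or ""]
    for child in node:
        out.append(apply_templates(child, templates))
    return "".join(out)

def title_rule(node, templates):
    # A "template" that directly describes what should be output.
    return "<h1>" + (node.text or "") + "</h1>"

def apply_templates(node, templates):
    # Pick the rule for this node (exact tag match, else the default rule)
    # and return its output, which the caller appends to its own result.
    rule = templates.get(node.tag, default_rule)
    return rule(node, templates)

TEMPLATES = {"title": title_rule}

def transform(xml_text):
    root = ET.fromstring(xml_text)          # start with the root
    return apply_templates(root, TEMPLATES)
```

Calling `transform(SOURCE)` walks from the root down, falls through the default rule for `dblp` and `inproceedings`, and applies `title_rule` when it reaches `title`.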

9 What If There’s More than One Match?
- Eliminate rules of lower precedence due to importing
- Break a rule into any | branches and consider separately
- Choose rule with highest computed or specified priority
- Simple rules for computing priority based on “precision”:
  - QName preceded by child or attribute axis specifier: priority 0
  - NCName:* preceded by child or attribute axis specifier: priority -0.25
  - Any other NodeTest preceded by child or attribute axis specifier: priority -0.5
  - else priority 0.5
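As a rough illustration of the priority rules listed above, the following Python sketch classifies a match pattern with regular expressions. It assumes the "axis specifier" is an optional child or attribute axis (`child::`, `attribute::`, or `@`), uses a simplified ASCII-only notion of a name, and splits on `|` naively, so it is a teaching aid rather than a conforming pattern parser. The function names are hypothetical.

```python
import re

NAME = r"[A-Za-z_][\w.\-]*"            # simplified NCName (ASCII only)
AXIS = r"(?:child::|attribute::|@)?"   # optional child or attribute axis specifier
KIND_TESTS = r"(?:node|text|comment|processing-instruction)\(\)"

def default_priority(pattern):
    """Default priority of a single pattern that contains no '|' alternatives."""
    p = pattern.strip()
    if re.fullmatch(AXIS + NAME + r"(?::" + NAME + r")?", p):
        return 0.0      # a QName, e.g. "title" or "@key"
    if re.fullmatch(AXIS + NAME + r":\*", p):
        return -0.25    # NCName:*, e.g. "xsl:*"
    if re.fullmatch(AXIS + r"(?:\*|" + KIND_TESTS + r")", p):
        return -0.5     # any other node test, e.g. "*" or "text()"
    return 0.5          # everything else, e.g. paths or predicates

def priorities(pattern):
    # Each '|' branch is considered as a separate rule.
    # Naive split: assumes '|' does not occur inside predicates or literals.
    return [default_priority(branch) for branch in pattern.split("|")]
```

For example, `priorities("title | text()")` treats the two branches separately, and a more specific pattern such as `dblp/inproceedings` outranks a bare `inproceedings`.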

10 Other Common Operations
- Iteration (xsl:for-each):
- Conditionals (xsl:if):
- Copying current node and children to the result set (xsl:copy-of):

11 Creating Output Nodes
- Return text/attribute data (this is a default rule), with xsl:value-of:
- Create an element from text (attribute is similar), with xsl:element:
- Copy nodes matching a path, with xsl:copy-of

12 Embedding Stylesheets
- You can “import” or “include” one stylesheet from another:
    <xsl:import href="http://www.com/my.xsl"/>
    <xsl:include href="http://www.com/my.xsl"/>
- “Include”: the rules get the same precedence as in the including stylesheet
- “Import”: the rules are given lower precedence

13 XSLT Summary
- A very powerful, template-based transformation language for XML document → other structured document
  - Commonly used to convert XML → PDF, SVG, GraphViz DOT format, HTML, WML, …
  - Primarily useful for presentation of XML or for very simple conversions
- But sometimes we need more complex operations when converting data from one source to another
  - Joins: combining and correlating information from multiple sources
  - Aggregation: computing averages, counts, etc.

14 Why XSLT Isn’t Enough
XSLT is focused on reformatting documents
- Stylesheets are focused around one XML file
- XML file must reference the stylesheet
What if we want to:
- Manage and combine collections of XML documents?
- Make Web service requests for XML?
- “Glue together” different Web service requests?
- Query for keywords within documents, with ranked answers?
This is where XQuery plays a role

15 XQuery’s Basic Form
- The model: bind nodes (or node sets) to variables; operate over each legal combination of bindings; produce a set of nodes
- “FLWOR” statement pattern:
    for {iterators that bind variables}
    let {collections}
    where {conditions}
    order by {order-conditions}
    return {output constructor}
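One way to get a feel for the FLWOR pattern is to map each clause onto ordinary Python. The sketch below is an analogy, not XQuery semantics: the input is a hypothetical list of dictionaries standing in for nodes bound by XPath expressions, and the titles "On Joins" and "Query Plans" and the function name `flwor` are invented for the example.

```python
# Hypothetical in-memory data standing in for nodes selected by XPath.
PAPERS = [
    {"title": "On Joins", "year": 1997, "authors": ["Paul R."]},
    {"title": "PRPL", "year": 1992, "authors": ["Kurt Brown"]},
    {"title": "Query Plans", "year": 1995, "authors": ["Paul R.", "Kurt Brown"]},
]

def flwor(items, min_year):
    results = []
    for p in items:                        # for: iterate, binding p to each item
        n_authors = len(p["authors"])      # let: bind a value derived from p
        if p["year"] >= min_year:          # where: keep only bindings that qualify
            results.append((p["year"], p["title"], n_authors))
    results.sort()                         # order by: year, then title
    # return: construct one output item per surviving binding
    return [{"title": t, "authors": n} for (_, t, n) in results]
```

Here `flwor(PAPERS, 1995)` drops the 1992 entry and yields the remaining two in year order.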

16 Inputs in XQuery
- In XPath, we can only ask for one thing
  - Predicates, in [brackets], can “branch”, but we can’t ask for an item and its name
- In XQuery, we solve this by introducing variable bindings
  - We can name different parts of the tree (based on XPaths)
  - We can express XPaths relative to named XPaths

17 “Iterations” in XQuery
- A series of (possibly nested) FOR statements assigning the results of XPaths to variables
- Something like a template that pattern-matches, produces a tuple of bindings
- For each of these, we evaluate the WHERE and possibly output the RETURN template
- document() or doc() function specifies an input file as a URI

    for $dRoot in doc("http://.../dblp.xml")
    for $rtEl in $dRoot/dblp,
        $conf in $rtEl/conf, …

[Figure: table of bindings with columns $dRoot, $rtEl, $conf; visible cells: (root), <conf key="2"…]
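The idea that nested FOR clauses produce a tuple of bindings per combination can be sketched with Python's standard `xml.etree.ElementTree`. The document string below is a hypothetical stand-in for dblp.xml (only the element names `dblp` and `conf` and the `key` attribute come from the slide), and the variable names mirror `$dRoot`, `$rtEl`, and `$conf`.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for dblp.xml; the keys are illustrative.
DOC = """<dblp>
  <conf key="1"/>
  <conf key="2"/>
</dblp>"""

def bindings(xml_text):
    d_root = ET.ElementTree(ET.fromstring(xml_text))    # plays the role of $dRoot
    tuples = []
    top = d_root.getroot()
    # $rtEl in $dRoot/dblp: the document's top-level element, if it is <dblp>
    for rt_el in ([top] if top.tag == "dblp" else []):
        # $conf in $rtEl/conf: each <conf> child of the current $rtEl
        for conf in rt_el.findall("conf"):
            tuples.append((d_root, rt_el, conf))
    return tuples
```

Each returned tuple corresponds to one row of bindings; a WHERE condition would then filter these rows, and a RETURN template would be evaluated once per surviving row.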

18 Example XML Data
[Figure: tree of the example document. Root has children ?xml and dblp. Elements shown, with sample values:]
- mastersthesis: mdate 2002…, key ms/Brown92, author Kurt Brown, title PRPL…, year 1992, school wisc
- inproceedings: mdate 2002.., key conf/sigm../, author Paul R., title On…, year 1997, crossref sigmod-97, ee www…
- university: key wisc, name Wisconsin, country USA

19 An XQuery Example

    for $i in doc("dblp.xml")/dblp/inproceedings[author/text() = "Paul R."]
    return { $i/title/text() } { $i/@key } { $i/crossref/text() }
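To make the selection logic of this query concrete, here is a Python analogue using `xml.etree.ElementTree`. It is a sketch under assumptions: the XML fragment is hypothetical (keys, titles, and the second crossref are invented), the function name `papers_by` is made up, and the output is a list of tuples rather than constructed XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modelled on the example data; values are illustrative.
DBLP = """<dblp>
  <inproceedings key="conf/example/1">
    <author>Paul R.</author><title>On Joins</title><crossref>sigmod-97</crossref>
  </inproceedings>
  <inproceedings key="conf/example/2">
    <author>Kurt Brown</author><title>PRPL</title><crossref>vldb-92</crossref>
  </inproceedings>
</dblp>"""

def papers_by(xml_text, author):
    root = ET.fromstring(xml_text)           # the root element is <dblp>
    rows = []
    for i in root.findall("inproceedings"):  # for $i in .../dblp/inproceedings
        # predicate [author/text() = ...]: true if any author child matches
        if any(a.text == author for a in i.findall("author")):
            # return: title text, key attribute, crossref text
            rows.append((i.findtext("title"), i.get("key"), i.findtext("crossref")))
    return rows
```

The predicate acts as a filter on the iteration, and the tuple plays the role of the three `{ ... }` expressions in the RETURN clause.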

20 Recall the Relational Join Operator
- Two sets of items (e.g., tables) with attributes
- We want to combine + correlate info across tables, based on matching the sid attributes
- Join STUDENT with EnrolledIn, where STUDENT.sid = EnrolledIn.sid

STUDENT
sid  name
1    Jill
2    Qun
3    Nitin
4    Martha

EnrolledIn
sid  course
2    380-f03
3    330-f03
2    555-s06
3    455-s06
4
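The join on sid can be written out in a few lines of Python. This sketch uses a hash join, which is one common evaluation strategy and not something the slide prescribes. The rows are read from the slide's tables, with the assumption that fused tokens such as "2380-f03" split into sid 2 and course "380-f03"; the incomplete final EnrolledIn row is left out.

```python
# Rows as read from the slide; the sid/course split is an assumption.
STUDENT = [(1, "Jill"), (2, "Qun"), (3, "Nitin"), (4, "Martha")]
ENROLLED_IN = [(2, "380-f03"), (3, "330-f03"), (2, "555-s06"), (3, "455-s06")]

def hash_join(students, enrolled):
    """Equi-join on sid: build a hash table on one input, probe with the other."""
    by_sid = {}
    for sid, name in students:
        by_sid.setdefault(sid, []).append(name)
    # Emit one combined row for every matching (student, enrollment) pair.
    return [(sid, name, course)
            for sid, course in enrolled
            for name in by_sid.get(sid, [])]
```

Students with no enrollment row (here Jill and Martha) do not appear in the result, which is the usual inner-join behavior.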

21 How Might This Extend to XQuery?

    for $i in doc("dblp.xml")/dblp/inproceedings,
        $r in $i/crossref/text(),
        $c in doc("dblp.xml")/dblp/conf,
        $n in $c/@name
    where ___________
    return { $i, $c }

