CS 157B: Database Management Systems II February 13 Class Meeting Department of Computer Science San Jose State University Spring 2013 Instructor: Ron Mak

Presentation transcript:

CS 157B: Database Management Systems II February 13 Class Meeting Department of Computer Science San Jose State University Spring 2013 Instructor: Ron Mak

Department of Computer Science Spring 2013: February 13 CS 157B: Database Management Systems II © R. Mak 2
XML Namespaces
XML namespaces are similar to Java packages.
  They prevent element name clashes.
  An element name can be in the scope of a namespace.
A namespace name must be unique.
  Use a URI (uniform resource identifier) as the name.
  Start with your unique domain name. A URL is a common form of URI.
  The URL doesn’t have to point to an actual file.
Declare a namespace in an element tag.
  The scope of the namespace is that element and its descendants.
  Example: A namespace declared in the root element has the entire XML document in its scope.

XML Namespaces
Example: a declaration without a prefix declares the default namespace.
  All elements in its scope are in the default namespace.
  Elements not in any namespace scope are “in no namespace”.
Declare a namespace with a prefix:
  Non-default namespace.
  Prefix element names that are in the namespace scope.
  The element containing the declaration is itself in the scope.
  The prefix is considered part of the element name.

XML Namespaces
Nested namespaces: Why are the book and author namespaces necessary?
  They prevent a name clash between the book title and the author title.
<library xmlns=" xmlns:bk=" xmlns:au=" Java Programming Dr....

XML Namespaces
Alternate: > Java Programming Dr....
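The book-title versus author-title clash can be sketched in Java with a namespace-aware DOM parser. The namespace URIs, the author name, and the class and method names below are hypothetical stand-ins, since the slide's own markup did not survive in this transcript.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceDemo {
    // Hypothetical namespace names: unique URIs that need not point to real files.
    static final String LIB_NS = "http://example.com/library";
    static final String BOOK_NS = "http://example.com/book";
    static final String AUTHOR_NS = "http://example.com/author";

    // Assumed document: a default namespace plus two prefixed namespaces
    // declared in the root element, so the whole document is in their scope.
    static final String XML =
        "<library xmlns='" + LIB_NS + "' xmlns:bk='" + BOOK_NS + "' xmlns:au='" + AUTHOR_NS + "'>"
      + "<bk:book><bk:title>Java Programming</bk:title>"
      + "<au:author><au:title>Dr.</au:title><au:name>Jane Doe</au:name></au:author>"
      + "</bk:book></library>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in JAXP
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Returns the text of the first <title> element in the given namespace.
    static String firstTitle(Document doc, String namespaceUri) {
        NodeList titles = doc.getElementsByTagNameNS(namespaceUri, "title");
        return titles.item(0).getTextContent();
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        System.out.println("book title:   " + firstTitle(doc, BOOK_NS));
        System.out.println("author title: " + firstTitle(doc, AUTHOR_NS));
        System.out.println("root is in:   " + doc.getDocumentElement().getNamespaceURI());
    }
}
```

Both elements have the local name title, but the lookup by namespace URI keeps them apart.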

Parsing XML Documents
In order for your Java program to work with an XML document, it must be able to read and understand its contents.
  We say that a program “parses” an XML document when it reads the document and decomposes it into elements, attributes, content, etc.
  XML has a very simple syntax, which makes XML documents relatively easy to parse.
Java has packages specifically for parsing XML:
  JAXP DOM parser
  JAXP SAX parser
  JAXP StAX parser
  JAXP = Java API for XML Processing

DOM Parser
DOM is the Document Object Model.
The Java DOM parser parses an XML document and builds a DOM tree consisting of node objects.
  The nodes represent elements, attributes, content, etc.
  Compiler writers would call this a “parse tree”.
Once the DOM tree has been built, your program can navigate the tree and visit each node.
  node.getNodeType()
  node.getNodeName()
  node.getNodeValue()
  node.getChildNodes()
  node.getAttributes()
DOMParserDemo
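A minimal sketch of navigating a DOM tree with the node methods listed above. This is not the course's DOMParserDemo; the class name, the sample document, and the output layout are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomWalkDemo {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Recursively visits a node and its children, appending one line per
    // element (with its attributes) and one line per non-blank text node.
    static void walk(Node node, int depth, StringBuilder out) {
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            indent.append("  ");
        }
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            out.append(indent).append(node.getNodeName());
            NamedNodeMap attrs = node.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                out.append(" @").append(attr.getNodeName()).append('=').append(attr.getNodeValue());
            }
            out.append('\n');
        } else if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) {
                out.append(indent).append(text).append('\n');
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), depth + 1, out);
        }
    }

    // Convenience wrapper: parse the text and dump the tree from the root element.
    static String dump(String xml) {
        StringBuilder out = new StringBuilder();
        walk(parse(xml).getDocumentElement(), 0, out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump("<book year='2013'><title>Java Programming</title></book>"));
    }
}
```

Because the whole tree is in memory, the walk could just as easily revisit or modify nodes, which is the random-access advantage the DOM parser offers.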

DOM Parser
Advantages
  Easy to build.
  Easy to navigate and visit the nodes.
    Random access is possible.
  You can modify the nodes.
  You can modify the tree structure.
Disadvantages
  If the XML document is large, the DOM tree will take up lots of memory.

SAX Parser
SAX is the Simple API for XML.
A SAX parser is a “push” parser.
  As it reads an XML document, it pushes an event to your program as it recognizes each element, attribute, content, etc.
  Your program contains “event handlers” which are methods that the parser automatically calls as each event occurs.
A SAX parser is in control of your program while it’s parsing a document.
  “Inversion of control”
SAXParserDemo
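The push model can be sketched with a handler that records each event it receives. This is an illustrative sketch, not the course's SAXParserDemo; the class name, the log format, and the sample document are assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventDemo {
    // Parses the XML text and returns a log of the events the parser pushed.
    static List<String> events(String xml) {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            // The parser may deliver one run of text in several chunks,
            // so text is buffered until the next start or end tag.
            private final StringBuilder text = new StringBuilder();

            private void flushText() {
                String s = text.toString().trim();
                if (!s.isEmpty()) {
                    log.add("text " + s);
                }
                text.setLength(0);
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                flushText();
                log.add("start " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                flushText();
                log.add("end " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        try {
            // The parser drives: it calls the handler methods as it reads.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return log;
    }

    public static void main(String[] args) {
        for (String event : events("<book><title>Java Programming</title></book>")) {
            System.out.println(event);
        }
    }
}
```

The program never asks for the next item; it only reacts, which is the inversion of control mentioned above.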

SAX Parser
Advantages
  Your event handlers do all the work of processing each component as the parser encounters it while reading the document.
  Very low memory requirements compared to the DOM parser.
    There is no tree that represents the entire document in memory, unless your program builds it.
Disadvantages
  You process the document’s components in the order that they’re parsed.
  You cannot randomly visit the components.
  Read only: you cannot modify the XML document.

StAX Parser
StAX is the Streaming API for XML.
A StAX parser is a “pull” parser.
  Your program remains in control as it parses an XML document.
  Your program requests the delivery of parsing events.
Two APIs to navigate an XML document.
  Cursor API
  Iterator API
Your program can both read and write XML documents.

StAX Cursor API
Use the cursor API to do a walk-through of the XML components in document order.
  Lowest-level access to a document’s structure and content.
XMLStreamReader interface.
  next() and hasNext() methods to scan a document’s content.
  The next() method returns an integer token that represents the next parse event.
  Depending on the next event, call the appropriate methods of the interface.
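A short sketch of the cursor API follows. The class name, the choice to collect title elements, and the sample document are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxCursorDemo {
    // Pulls events one at a time and collects the text of every <title> element.
    static List<String> titles(String xml) {
        List<String> result = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next(); // integer token for the next parse event
                if (event == XMLStreamConstants.START_ELEMENT
                        && "title".equals(reader.getLocalName())) {
                    // Reads the text content and leaves the cursor on the end tag.
                    result.add(reader.getElementText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<journal><article><title>A</title></article>"
                   + "<article><title>B</title></article></journal>";
        System.out.println(titles(xml));
    }
}
```

The loop belongs to the program, not the parser, so it could stop early once it has what it needs.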

StAX Iterator API
Use the iterator API to access a document’s structure and content in the form of “event objects”.
XMLEventReader interface
  nextEvent() and hasNext() methods to iterate over a document’s structure and contents.
  The nextEvent() method returns an XMLEvent object.
  The XMLEvent interface has methods to determine and process the next event type.
StAXParserDemo
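The iterator API can be sketched the same way. This is not the course's StAXParserDemo; the class name and the decision to list element names are assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class StaxIteratorDemo {
    // Iterates over event objects and lists element names in document order.
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLEventReader reader =
                    XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                // Each XMLEvent can report its own type and be narrowed to it.
                if (event.isStartElement()) {
                    names.add(event.asStartElement().getName().getLocalPart());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<a><b/><c>x</c></a>"));
    }
}
```

Compared with the cursor API, each event here is a separate object that can be stored or passed to other methods.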

Object-XML Mapping (The Hard Way)
Recall how to do object-relational mapping using JDBC.
  As your program extracts field values from a row set, it can create objects with those values.
  Your program has to explicitly create the objects.
Similarly, your program can use any of the XML parsing APIs to create objects from a document.
  The objects can represent the elements.
  Define a different Java class for each element type.
Hibernate automates mapping Java classes to database tables and automatically creates objects at run time.
  We’ll soon see how to automate object-XML mapping.
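A minimal sketch of the hand-written approach, using the DOM parser. The Article class, its fields, and the element names article, title, and author are assumptions; any of the parsing APIs could be used in place of DOM.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ManualMappingDemo {
    // One Java class per element type; here only <article> is mapped.
    static final class Article {
        final String title;
        final String author;

        Article(String title, String author) {
            this.title = title;
            this.author = author;
        }

        @Override
        public String toString() {
            return title + " by " + author;
        }
    }

    // Text of the first descendant element with the given name.
    static String childText(Element parent, String name) {
        return parent.getElementsByTagName(name).item(0).getTextContent();
    }

    // The program explicitly creates one object per <article> element.
    static List<Article> load(String xml) {
        List<Article> articles = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("article");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element article = (Element) nodes.item(i);
                articles.add(new Article(childText(article, "title"),
                                         childText(article, "author")));
            }
        } catch (Exception ex) {
            throw new IllegalStateException("could not parse XML", ex);
        }
        return articles;
    }

    public static void main(String[] args) {
        System.out.println(load("<journal><article><title>Advance DAO Programming</title>"
                + "<author>Sean Sullivan</author></article></journal>"));
    }
}
```

Every new element type needs another class and more extraction code, which is why automating the mapping is attractive.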

XPath
XPath views an XML document as a node tree.
  Everything in the document is a node.
    element, attribute, text content
  Every node is related to another node.
    parent, child, ancestor, descendant, sibling
An XPath expression is a location path that walks the tree starting from the root in order to select a single node or a set of nodes.
  The selection is based on the node relations and conditional tests on attribute values.

Location Paths
XPath expressions look like Unix file paths.
  / represents the root node of the document.
  Add more element names separated by / to step down the tree.
A location path can select a single node in the tree or a set of nodes.

Location Path Examples
Design XML Schemas Using UML
  Ayesha Malik
Design service-oriented architecture frameworks with J2EE technology
  Naveen Balani
Advance DAO Programming
  Sean Sullivan
XML document adapted from the book Pro XML Development with Java Technology, by Ajay Vohra and Deepak Vohra, Apress, 2006

Location Path Examples
/catalog returns the entire document tree.
/catalog/journal/article returns all the article nodes.
/catalog/journal/* returns all the child nodes of journal nodes.
XPathDemo

Location Path Examples
//title returns all title nodes.
  // means “all descendants of the root node”.
returns all date attributes.
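The location paths above can be tried from Java with a sketch like the following. The catalog markup is a guess: only the titles and authors come from the slides, while the element nesting, the attribute values, and the //@date expression are assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LocationPathDemo {
    // Assumed layout of the catalog document (attribute values are invented).
    static final String CATALOG =
        "<catalog>"
      + "<journal title='XML'>"
      + "<article level='Intermediate' date='February-2003'>"
      + "<title>Design XML Schemas Using UML</title><author>Ayesha Malik</author>"
      + "</article>"
      + "</journal>"
      + "<journal title='Java Technology'>"
      + "<article level='Advanced' date='January-2004'>"
      + "<title>Design service-oriented architecture frameworks with J2EE technology</title>"
      + "<author>Naveen Balani</author>"
      + "</article>"
      + "<article level='Advanced' date='October-2003'>"
      + "<title>Advance DAO Programming</title><author>Sean Sullivan</author>"
      + "</article>"
      + "</journal>"
      + "</catalog>";

    // Evaluates a location path against the catalog and counts the selected nodes.
    static int count(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(CATALOG)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + expr, e);
        }
    }

    public static void main(String[] args) {
        String[] paths = {
            "/catalog", "/catalog/journal/article", "/catalog/journal/*", "//title", "//@date"
        };
        for (String path : paths) {
            System.out.println(path + " selects " + count(path) + " node(s)");
        }
    }
}
```

Note that //title matches only title elements; the title attributes on journal would need //@title.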

Location Path Examples
/catalog/journal/article[@level='Advanced']/title
  Title nodes of all journal articles at the advanced level.
  Also: /child::catalog/child::journal/child::article[attribute::level='Advanced']/child::title
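A sketch that checks the abbreviated and unabbreviated forms of this predicate select the same nodes. The catalog markup is assumed (only the titles and authors come from the slides), and the class and method names are illustrative.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PredicateDemo {
    // Assumed catalog layout; the level values are invented for the example.
    static final String CATALOG =
        "<catalog>"
      + "<journal title='XML'>"
      + "<article level='Intermediate'>"
      + "<title>Design XML Schemas Using UML</title><author>Ayesha Malik</author>"
      + "</article>"
      + "</journal>"
      + "<journal title='Java Technology'>"
      + "<article level='Advanced'>"
      + "<title>Design service-oriented architecture frameworks with J2EE technology</title>"
      + "<author>Naveen Balani</author>"
      + "</article>"
      + "<article level='Advanced'>"
      + "<title>Advance DAO Programming</title><author>Sean Sullivan</author>"
      + "</article>"
      + "</journal>"
      + "</catalog>";

    static final String SHORT_FORM =
        "/catalog/journal/article[@level='Advanced']/title";
    static final String LONG_FORM =
        "/child::catalog/child::journal/child::article[attribute::level='Advanced']/child::title";

    // Returns the text content of every node the expression selects.
    static List<String> texts(String expr) {
        List<String> result = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(CATALOG)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + expr, e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(texts(SHORT_FORM));
        System.out.println(texts(SHORT_FORM).equals(texts(LONG_FORM)));
    }
}
```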

Location Path Examples
Technology']/article
  All article nodes in journals with title “Java Technology”.
/catalog/journal/article[2]
  The second article child of each journal.

Location Path Examples
Technology']]
  All article nodes whose ancestor is a journal with title “Java Technology”.

Location Path Examples
//article[preceding-sibling::article]
  All article nodes that have an earlier (to the left) sibling that’s an article.
//author[. = 'Sean Sullivan']/ancestor::journal
  The journal ancestors of every author node whose text is “Sean Sullivan”.
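Predicates on attributes, positions, siblings, and ancestors can be exercised with a sketch like this one. The catalog markup is assumed, and the two expressions that filter on the journal title are written out here as plausible examples rather than recovered from the slides.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class StepPredicateDemo {
    // Assumed catalog layout; only the titles and authors come from the slides.
    static final String CATALOG =
        "<catalog>"
      + "<journal title='XML'>"
      + "<article><title>Design XML Schemas Using UML</title><author>Ayesha Malik</author></article>"
      + "</journal>"
      + "<journal title='Java Technology'>"
      + "<article>"
      + "<title>Design service-oriented architecture frameworks with J2EE technology</title>"
      + "<author>Naveen Balani</author>"
      + "</article>"
      + "<article><title>Advance DAO Programming</title><author>Sean Sullivan</author></article>"
      + "</journal>"
      + "</catalog>";

    // Evaluates the expression and returns the result converted to a string.
    // For a node set, that is the string value of the first node in document order.
    static String first(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(CATALOG)));
            return XPathFactory.newInstance().newXPath().evaluate(expr, doc);
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + expr, e);
        }
    }

    public static void main(String[] args) {
        // Attribute test on an intermediate step (assumed expression).
        System.out.println(first("count(/catalog/journal[@title='Java Technology']/article)"));
        // Position test: the second article child of each journal.
        System.out.println(first("/catalog/journal/article[2]/title"));
        // Ancestor test inside a predicate (assumed expression).
        System.out.println(first("count(//article[ancestor::journal[@title='Java Technology']])"));
        // Sibling test, then a walk back up the tree.
        System.out.println(first("//article[preceding-sibling::article]/title"));
        System.out.println(first("//author[. = 'Sean Sullivan']/ancestor::journal/@title"));
    }
}
```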

XPath Axes
child::
  Shorthand: just the element name of the child
descendant::
attribute::
  Shorthand: @
self::
  Shorthand: .
descendant-or-self::
  Shorthand: // stands for /descendant-or-self::node()/
following-sibling::
preceding-sibling::
following::
parent::
  Shorthand: ..
ancestor::
preceding::
ancestor-or-self::
namespace::
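A sketch comparing abbreviated steps with their full axis forms by counting the nodes each one selects. The two-article document and the class name are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AxisShorthandDemo {
    // Small assumed document: one journal holding two articles.
    static final String XML =
        "<catalog><journal title='Java Technology'>"
      + "<article level='Advanced'><title>First</title><author>A</author></article>"
      + "<article level='Advanced'><title>Second</title><author>B</author></article>"
      + "</journal></catalog>";

    // Counts the nodes selected by the expression.
    static int count(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + expr, e);
        }
    }

    public static void main(String[] args) {
        // Each row pairs an abbreviated path with its unabbreviated equivalent.
        String[][] pairs = {
            {"//title", "/descendant-or-self::node()/child::title"},
            {"//article/@level", "/descendant-or-self::node()/child::article/attribute::level"},
            {"//author/..", "//author/parent::node()"},
            {"//title/.", "//title/self::node()"},
        };
        for (String[] pair : pairs) {
            System.out.println(pair[0] + " -> " + count(pair[0])
                    + ", " + pair[1] + " -> " + count(pair[1]));
        }
        // Axes without a shorthand must be spelled out.
        System.out.println(count("//article/following-sibling::article"));
    }
}
```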

XPath Expressions and Functions
XPath expressions can also include
  arithmetic
  comparisons
  let (local variables; XPath 3.0)
  if, for, some, every (XPath 2.0 and later)
XPath functions include
  count()
  format-number(), round()
  substring-before(), substring-after()
  contains()
  string-length()
  translate()
Exercise for the reader!
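A few of these functions can be tried from Java as sketched below. The sketch sticks to XPath 1.0 functions, which is what the JDK's built-in javax.xml.xpath engine evaluates; the document and the class name are assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XPathFunctionDemo {
    // Assumed document; the date values are invented for the example.
    static final String XML =
        "<catalog>"
      + "<journal title='XML'>"
      + "<article date='February-2003'>"
      + "<title>Design XML Schemas Using UML</title><author>Ayesha Malik</author>"
      + "</article>"
      + "</journal>"
      + "<journal title='Java Technology'>"
      + "<article date='January-2004'><title>Advance DAO Programming</title>"
      + "<author>Sean Sullivan</author></article>"
      + "<article date='October-2003'><title>Another Title</title>"
      + "<author>Naveen Balani</author></article>"
      + "</journal>"
      + "</catalog>";

    static Document parse() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Evaluates an expression whose result is a number.
    static double number(String expr) {
        try {
            return (Double) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, parse(), XPathConstants.NUMBER);
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + expr, e);
        }
    }

    // Evaluates an expression and converts the result to a string.
    static String string(String expr) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(expr, parse());
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + expr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(number("count(//article)"));
        System.out.println(number("round(count(//article) div 2)"));
        System.out.println(number("count(//title[contains(., 'XML')])"));
        System.out.println(number("string-length(/catalog/journal[1]/article/author)"));
        System.out.println(string("substring-before(/catalog/journal[1]/article/@date, '-')"));
        System.out.println(string("substring-after(/catalog/journal[1]/article/@date, '-')"));
        System.out.println(string("translate(/catalog/journal[1]/@title, 'XML', 'xml')"));
    }
}
```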

XPath and Java
Create an XPath object:
  XPathFactory factory = XPathFactory.newInstance();
  XPath xPath = factory.newXPath();
Parse an XML document with the DOM parser:
  DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
  Document document = builder.parse(xmlFile);
Evaluate an XPath expression:
  String expr = "//title";
  xPath.reset();
  NodeList nodeList = (NodeList) xPath.evaluate(expr, document, XPathConstants.NODESET);
You can also pre-compile an XPath expression.
  Similar to a JDBC prepared statement.
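Pre-compiling might look like the sketch below, which compiles //title once and reuses it on several documents. The class name, the helper method, and the sample documents are assumptions; XPath.compile() and XPathExpression.evaluate() are the standard javax.xml.xpath calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CompiledXPathDemo {
    // Compiles the expression once, then evaluates it against each document,
    // returning the number of <title> nodes found in each.
    static List<Integer> titleCounts(String... xmlDocuments) {
        List<Integer> counts = new ArrayList<>();
        try {
            XPathExpression titles = XPathFactory.newInstance().newXPath().compile("//title");
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            for (String xml : xmlDocuments) {
                Document document = builder.parse(new InputSource(new StringReader(xml)));
                NodeList nodes = (NodeList) titles.evaluate(document, XPathConstants.NODESET);
                counts.add(nodes.getLength());
            }
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate //title", e);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(titleCounts(
                "<a><title>x</title></a>",
                "<a><title>x</title><b><title>y</title></b></a>"));
    }
}
```

As with a prepared statement, the parsing cost of the expression is paid once rather than on every evaluation.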