Published by Martha Henderson. Modified over 10 years ago.
1
XML and Java: XML, dom4j and Xpath – Eran Toch Methodologies in Information System Development
Tutorial: Introduction to XML and Java: XML, dom4j and XPath
Eran Toch
Methodologies in the Development of Information Systems
December 2003
2
Sources
Major sources:
  – http://www.cis.upenn.edu/~cis550/slides/xml.ppt (CIS550 Course Notes, U. Penn, source for many slides)
  – http://www.cs.technion.ac.il/~oshmu/ (236804 - Seminar in Computer Science 4: XML - Technology, Systems and Theory)
  – http://dom4j.org
3
Agenda
Short Introduction to XML
  – What is XML
  – Structure and Terminology
  – Java APIs for XML: an Overview
dom4j
  – Parsing an XML document
  – Writing to an XML document
XPath
  – XPath Queries
  – XPath in dom4j
References
4
The Structure of XML
XML consists of tags and text.
Tags come in pairs: an opening tag and a matching closing tag.
They must be properly nested:
  ... --- good
  ... --- bad
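The nesting rule can be checked mechanically. The sketch below is a minimal illustration that uses the JDK's built-in JAXP parser rather than dom4j, so it runs without extra jars. The tag names `a` and `b` and the class name `NestingCheck` are hypothetical, since the slide's original tag examples did not survive in this transcript.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class NestingCheck {
    /** Returns true if the parser accepts the text as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the markup
        } catch (Exception e) {
            throw new IllegalStateException(e); // parser setup or I/O problem
        }
    }

    public static void main(String[] args) {
        // Hypothetical tags: <b> is closed before <a>, so the nesting is proper.
        System.out.println(isWellFormed("<a><b>text</b></a>"));
        // Here <a> is closed while <b> is still open, so the nesting is broken.
        System.out.println(isWellFormed("<a><b>text</a></b>"));
    }
}
```

The first call should report a well-formed document and the second should not, because a closing tag must match the most recently opened tag.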
5
XML text
XML has only one “basic” type: text.
It is bounded by tags, e.g. The Big Sleep 1935 --- 1935 is still text.
XML text is called PCDATA (for parsed character data). It uses Unicode, and any character can be written as a numeric character reference, e.g. &#x05DE; for the Hebrew letter Mem.
Later we shall see how new types are specified by XML-data.
6
XML structure
Nesting tags can be used to express various structures, e.g. a tuple (record):
  Jeff Cohen
  04-828-1345
  054-470-778
  jeffc@cs.technion.ac.il
7
XML structure (cont.)
We can represent a list by using the same tag repeatedly: ...
8
XML structure (cont.)
Nested tags can be part of a list too:
  Yossi Orr, 04-828-1345, yossio@cs.technion.ac.il
  Irma Levy, 03-426-1142, irmal@yourmail.com
9
Terminology
The segment of an XML document between an opening tag and its corresponding closing tag is called an element.
Metadata about an element can appear in an attribute.
Example record: Ortal Derech, 04-8732122, 054-646888, oderech@tx.technion.ac.il
Labels in the diagram: element, sub-element, attribute, text
10
XML is tree-like
[Figure: a tree with root person and children name, email and tel; the leaf values are Malcolm Atchison, (215) 898 4321 and mp@dcs.gla.ac.sc]
11
A Complete XML Document
Example record: Jeff Cohen, 04-828-1345, 054-470-778, jeffc@cs.technion.ac.il
Annotation on the XML declaration (the standalone attribute): tells whether or not this document references an external entity or an external data type specification.
12
XML Structure Definitions
DTD
  – Document Type Definition: defines structure constraints for XML documents.
XML Schema
  – Serves the same purpose as a DTD but is more powerful: it can specify the data types of elements, and it is itself written in XML.
Namespaces
  – A way of preventing name clashes among elements from more than one source within the same XML document.
13
More Standards
XPath
  – XML Path Language, a language for locating parts of an XML document.
XQuery
  – A query language for XML documents (like SQL…).
XSLT
  – XSL Transformations, a language for transforming XML documents into other XML documents.
RDF
  – Resource Description Framework, a formal knowledge model from the World Wide Web Consortium (W3C).
14
Why Is XML Important?
Because it exists, and everybody uses it.
Plain text: you can create and edit files with anything.
Data identification: XML tells you what kind of data you have, not how to display it.
Separation from style.
Hierarchical, and easily processed.
15
An Overview of the APIs
JAXP: Java API for XML Processing
  – Provides a common interface for creating and using the standard SAX, DOM, and XSLT APIs.
JAXB: Java Architecture for XML Binding
  – Defines a mechanism for writing out Java objects as XML.
JDOM
  – Represents an XML file as a tree of objects (sophisticated version of DOM).
dom4j
  – Lightweight version of JDOM.
16
Agenda
Introduction to XML
  – What is XML
  – Structure and Terminology
  – Java APIs for XML: an Overview
dom4j
  – Parsing an XML document
  – Writing to an XML document
XPath
  – XPath Queries
  – XPath in dom4j
References
17
dom4j
An open source XML framework for Java.
Allows you to read, write, navigate, create and modify XML documents.
Integrates with DOM and SAX.
Full XPath support.
XSLT support.
18
Download and Use
Go to: http://dom4j.org
Go to http://dom4j.org/download.html and download the latest release (1.4 at the time of writing).
Unzip.
Don’t forget the classpath.
When working in an IDE, don’t forget to add the log4j.jar library.
Javadoc: http://dom4j.org/apidocs/index.html
Quick start guide: http://dom4j.org/guide.html
19
Opening an XML Document
  import org.dom4j.*;
  import org.dom4j.io.SAXReader;

  public class Foo {
      // Parses the XML document found at the given file path or URL
      public Document parse(String id) throws DocumentException {
          SAXReader reader = new SAXReader();
          Document document = reader.read(id);
          return document;
      }
  }
We can read: file, URL, InputStream, String
20
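Example XML File
  <salesdata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="C:\Documents and Settings\eran\My Documents\Academic\Courses\XML\xpath_ass_schema.xsd">
    <year>
      <theyear>1997</theyear>
      <region> <name>central</name> <sales>34</sales> </region>
      <region> <name>east</name> <sales>34</sales> </region>
      <region> <name>west</name> <sales>32</sales> </region>
    </year>
    <year>
      <theyear>1998</theyear>
      <region> <name>east</name> <sales>35</sales> </region>
      <region> <name>west</name> <sales>42</sales> </region>
    </year>
  </salesdata>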
21
Accessing XML Elements
  public void dump(Document document) throws DocumentException {
      // Accessing the root element
      Element root = document.getRootElement();
      // Retrieving child elements (requires java.util.Iterator)
      for (Iterator i = root.elementIterator(); i.hasNext(); ) {
          Element element = (Element) i.next();
          // Retrieving the element name
          System.out.println(element.getQualifiedName());
          // Retrieving the element text
          System.out.println(element.getTextTrim());
          // Retrieving the text of the child element "theyear"
          System.out.println(element.elementText("theyear"));
      }
  }
22
Accessing XML Elements – cont’d
What will be the output of dump()?
  year
  1997
  year
  1998
Why?
23
Accessing XML Elements Recursively
  public void go(Element element, int depth) {
      // Indent according to the depth in the tree
      for (int d = 0; d < depth; d++) {
          System.out.print(" ");
      }
      // Print the element name followed by its own text
      System.out.print(element.getQualifiedName());
      System.out.println(" " + element.getTextTrim());
      // Recurse into every child element
      for (Iterator i = element.elementIterator(); i.hasNext(); ) {
          Element son = (Element) i.next();
          go(son, depth + 1);
      }
  }
What will be the output?
24
Accessing Recursively – cont’d
  salesdata
   year
    theyear 1997
    region
     name central
     sales 34
    region
     name east
     sales 34
    region
     name west
     sales 32
   year
    theyear 1998
    region
     name east
     sales 35
    region
     name west
     sales 42
The whole XML tree, element names + values
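For comparison, the same recursive walk can be sketched with the JDK's built-in DOM API instead of dom4j, which makes it runnable without extra jars. The class name `TreeWalk` and the helper `render` are hypothetical. Unlike go(), this version returns the dump as a string and omits the trailing space for elements that have no text of their own.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class TreeWalk {
    /** Appends one line per element: indentation, tag name, then its own text if any. */
    static void walk(Element element, int depth, StringBuilder out) {
        for (int d = 0; d < depth; d++) {
            out.append(' ');
        }
        out.append(element.getTagName());
        // Collect only the text that belongs directly to this element.
        StringBuilder text = new StringBuilder();
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.TEXT_NODE) {
                text.append(n.getNodeValue());
            }
        }
        String own = text.toString().trim();
        if (!own.isEmpty()) {
            out.append(' ').append(own);
        }
        out.append('\n');
        // Recurse into child elements, one level deeper.
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                walk((Element) n, depth + 1, out);
            }
        }
    }

    /** Parses the XML text and returns the indented dump of its element tree. */
    static String render(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            StringBuilder out = new StringBuilder();
            walk(root, 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(render(
            "<year><theyear>1997</theyear>"
            + "<region><name>east</name><sales>34</sales></region></year>"));
    }
}
```

Container elements such as year and region print only their name, because their direct text consists of nothing but whitespace (or nothing at all).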
25
Creating an XML document
  public Document createDocument() {
      Document document = DocumentHelper.createDocument();
      // Creating the root element
      Element root = document.addElement("phonebook");
      // Adding elements
      Element address1 = root.addElement("address")
          .addAttribute("name", "Yuval")
          .addAttribute("category", "family")
          .addText("Ehud 3, Jerusalem");
      Element address2 = root.addElement("address")
          .addAttribute("name", "Ortal")
          .addAttribute("category", "friends")
          .addText("Kibbutz Givaat Haim");
      return document;
  }
What will we get when running go()?
26
Creating an XML document – cont’d
XML tree structure of the new document:
  phonebook
   address Ehud 3, Jerusalem
   address Kibbutz Givaat Haim
Writing the XML document to a file:
  FileWriter out = new FileWriter("C:\\addresses.xml");
  document.write(out);
  out.close();  // flush and close, otherwise the file may be left empty
Retrieving the XML itself as a string:
  String xml = document.asXML();
27
Client Program
  public static void main(String[] args) {
      Foo foo = new Foo();
      try {
          // Opening the file
          Document doc = foo.parse("C:\\Documents and Settings\\eran\\My Documents\\Academic\\Courses\\XML\\sales.xml");
          // Dumping and printing recursively
          foo.dump(doc);
          foo.go(doc.getRootElement(), 0);
          foo.xpath(doc);
          // Creating a new document
          Document newDoc = foo.createDocument();
          foo.go(newDoc.getRootElement(), 0);
          FileWriter out = new FileWriter("C:\\addresses.xml");
          newDoc.write(out);
          out.close();  // flush and close the file
      } catch (Exception e) {
          System.out.println(e);
      }
  }
28
Agenda
Introduction to XML
  – What is XML
  – Structure and Terminology
  – Java APIs for XML: an Overview
dom4j
  – Parsing an XML document
  – Writing to an XML document
XPath
  – XPath Queries
  – XPath in dom4j
References
29
XPath - Introduction
XML Path Language.
XPath is a language for addressing parts of an XML document.
Enables locating and retrieving nodes, very much like accessing directories in a file system.
Limited (but not bad) filtering and querying abilities.
Retrieves the actual PCDATA or node sets.
30
XPath – Simple Path Selection
XPath expression: /salesdata/year/theyear
  Result: 1997, 1998
  “/” signifies child-of
XPath expression: /salesdata/year[2]/theyear
  Result: 1998
  Filtering the level: getting only the second year element
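These two paths can be tried out with the JDK's built-in javax.xml.xpath engine, used here in place of dom4j so that the sketch runs without extra jars. The embedded XML is an assumed reconstruction of the example sales file (element names taken from the recursive dump), and `SimplePathDemo` and `select` are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SimplePathDemo {
    // Assumed reconstruction of the example sales file.
    static final String SALES_XML =
        "<salesdata>"
        + "<year><theyear>1997</theyear>"
        + "<region><name>central</name><sales>34</sales></region>"
        + "<region><name>east</name><sales>34</sales></region>"
        + "<region><name>west</name><sales>32</sales></region></year>"
        + "<year><theyear>1998</theyear>"
        + "<region><name>east</name><sales>35</sales></region>"
        + "<region><name>west</name><sales>42</sales></region></year>"
        + "</salesdata>";

    /** Evaluates an XPath expression and returns the text of each selected node. */
    static List<String> select(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(SALES_XML)),
                XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("/salesdata/year/theyear"));    // [1997, 1998]
        System.out.println(select("/salesdata/year[2]/theyear")); // [1998]
    }
}
```

Note that XPath positions are 1-based, so year[2] is the second year element, not the third.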
31
XPath – Conditions
XPath expression: /salesdata/year/region[sales > 34]
  Result: the regions east (35) and west (42)
  Going down to region, and filtering according to the sales element
XPath expression: /salesdata/year/region[sales > 34]/name
  Result: ?
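A predicate such as [sales > 34] can be checked with a short program. This is a sketch using the JDK's javax.xml.xpath engine rather than dom4j; the embedded XML is an assumed reconstruction of the example sales file, and `ConditionDemo` and `select` are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ConditionDemo {
    // Assumed reconstruction of the example sales file.
    static final String SALES_XML =
        "<salesdata>"
        + "<year><theyear>1997</theyear>"
        + "<region><name>central</name><sales>34</sales></region>"
        + "<region><name>east</name><sales>34</sales></region>"
        + "<region><name>west</name><sales>32</sales></region></year>"
        + "<year><theyear>1998</theyear>"
        + "<region><name>east</name><sales>35</sales></region>"
        + "<region><name>west</name><sales>42</sales></region></year>"
        + "</salesdata>";

    /** Evaluates an XPath expression and returns the text of each selected node. */
    static List<String> select(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(SALES_XML)),
                XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The predicate keeps only regions whose sales child is greater than 34;
        // the trailing /name step then steps down to each surviving region's name.
        System.out.println(select("/salesdata/year/region[sales > 34]/name"));
        // A stricter threshold leaves a single region.
        System.out.println(select("/salesdata/year/region[sales > 40]/name"));
    }
}
```

The comparison is numeric: XPath converts the text of each sales element to a number before comparing it with 34.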
32
XPath – Traveling Up the Tree
XPath expression: /salesdata/year/region[sales > 34]/parent::year/theyear
  Result: 1998
  Going up the XML tree (and then down again)
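The parent axis can be exercised in the same way. The sketch below again relies on the JDK's javax.xml.xpath engine instead of dom4j and on an assumed reconstruction of the example sales file; `ParentAxisDemo` and `select` are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ParentAxisDemo {
    // Assumed reconstruction of the example sales file.
    static final String SALES_XML =
        "<salesdata>"
        + "<year><theyear>1997</theyear>"
        + "<region><name>central</name><sales>34</sales></region>"
        + "<region><name>east</name><sales>34</sales></region>"
        + "<region><name>west</name><sales>32</sales></region></year>"
        + "<year><theyear>1998</theyear>"
        + "<region><name>east</name><sales>35</sales></region>"
        + "<region><name>west</name><sales>42</sales></region></year>"
        + "</salesdata>";

    /** Evaluates an XPath expression and returns the text of each selected node. */
    static List<String> select(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(SALES_XML)),
                XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Two regions match, but both share one parent, so the year appears once.
        System.out.println(
            select("/salesdata/year/region[sales > 34]/parent::year/theyear")); // [1998]
    }
}
```

The result is a node set, so a parent reached from several matching children is still returned only once.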
33
XPath – Traveling Down Fast
XPath expression: /descendant::sales
  Result: 34, 34, 32, 35, 42
  Going all the way down, until the sales elements
XPath expression: //sales
  Gives the same result
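That the two forms select the same nodes can be confirmed with a small check. This is a sketch that uses the JDK's javax.xml.xpath engine rather than dom4j, with an assumed reconstruction of the example sales file; `DescendantDemo` and `select` are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DescendantDemo {
    // Assumed reconstruction of the example sales file.
    static final String SALES_XML =
        "<salesdata>"
        + "<year><theyear>1997</theyear>"
        + "<region><name>central</name><sales>34</sales></region>"
        + "<region><name>east</name><sales>34</sales></region>"
        + "<region><name>west</name><sales>32</sales></region></year>"
        + "<year><theyear>1998</theyear>"
        + "<region><name>east</name><sales>35</sales></region>"
        + "<region><name>west</name><sales>42</sales></region></year>"
        + "</salesdata>";

    /** Evaluates an XPath expression and returns the text of each selected node. */
    static List<String> select(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(SALES_XML)),
                XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<String> longForm = select("/descendant::sales");
        List<String> shortForm = select("//sales");
        System.out.println(longForm);                   // [34, 34, 32, 35, 42]
        System.out.println(longForm.equals(shortForm)); // true
    }
}
```

Every sales element is returned, including both elements whose value is 34, because a node set contains distinct nodes rather than distinct values.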
34
XPath – Advanced Queries
The years in which the west region had sales above 32, expressed in millions:
  //region[name="west" and sales > 32]/sales[@unit='millions']/ancestor::year/theyear
  Result: 1998
Notes:
  – and: logical operators
  – @unit: accessing attributes
  – ancestor: like parent, but keeps going up the tree (here, all the way up to year)
35
XPath – Advanced Queries (cont’d)
The years (text nodes) in which the west region sales were higher than the east region sales; sales may be expressed in thousands or in millions. One way to write this in XPath 1.0 is to convert both sides to thousands before comparing:
  /salesdata/year[
      (sum(region[name="west"]/sales[@unit='millions']) * 1000
        + sum(region[name="west"]/sales[@unit='thousands']))
    > (sum(region[name="east"]/sales[@unit='millions']) * 1000
        + sum(region[name="east"]/sales[@unit='thousands']))
  ]/theyear/text()
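The unit-aware comparison described above can be tested on a small data set. The sketch below uses the JDK's javax.xml.xpath engine rather than dom4j. The sales figures and unit values are invented for illustration, and the query is one possible XPath 1.0 formulation (not necessarily the author's): it converts each region's sales to thousands with sum(), which yields 0 when no element carries the given unit. `UnitCompareDemo` and `select` are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class UnitCompareDemo {
    // Hypothetical data: 1997 has west = 2 million, east = 900 thousand;
    // 1998 has west = 2500 thousand, east = 3 million.
    static final String SALES_XML =
        "<salesdata>"
        + "<year><theyear>1997</theyear>"
        + "<region><name>east</name><sales unit='thousands'>900</sales></region>"
        + "<region><name>west</name><sales unit='millions'>2</sales></region></year>"
        + "<year><theyear>1998</theyear>"
        + "<region><name>east</name><sales unit='millions'>3</sales></region>"
        + "<region><name>west</name><sales unit='thousands'>2500</sales></region></year>"
        + "</salesdata>";

    // Both sides are normalised to thousands before the comparison.
    static final String WEST_BEATS_EAST =
        "/salesdata/year["
        + "(sum(region[name='west']/sales[@unit='millions']) * 1000"
        + " + sum(region[name='west']/sales[@unit='thousands']))"
        + " > "
        + "(sum(region[name='east']/sales[@unit='millions']) * 1000"
        + " + sum(region[name='east']/sales[@unit='thousands']))"
        + "]/theyear/text()";

    /** Evaluates an XPath expression and returns the text of each selected node. */
    static List<String> select(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(SALES_XML)),
                XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // 1997: 2000 > 900 holds; 1998: 2500 > 3000 does not.
        System.out.println(select(WEST_BEATS_EAST)); // [1997]
    }
}
```

Normalising inside the predicate keeps the query to a single expression; the alternative is to select the raw values and do the unit conversion in Java.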
36
XPath in dom4j
XPath queries can be used in dom4j:
  public void xpath(Document document) {
      // The XPath expression is fed to the xpathSelector
      XPath xpathSelector = DocumentHelper.createXPath("/salesdata/year/theyear");
      // The nodes are selected from the document, according to the XPath query
      List results = xpathSelector.selectNodes(document);
      for (Iterator iter = results.iterator(); iter.hasNext(); ) {
          Element element = (Element) iter.next();
          System.out.println(element.asXML());
      }
  }
37
Agenda
Introduction to XML
  – What is XML
  – Structure and Terminology
  – Java APIs for XML: an Overview
dom4j
  – Parsing an XML document
  – Writing to an XML document
XPath
  – XPath Queries
  – XPath in dom4j
References
38
References - XML
XML tutorial:
  – http://www.w3schools.com/xml/default.asp
XML Specification from W3C:
  – http://www.w3.org/XML/
The Java/XML Tutorial:
  – http://java.sun.com/xml/tutorial_intro.html
DTD Tutorial:
  – http://www.xmlfiles.com/dtd/
XML Schema Tutorial:
  – http://www.w3schools.com/schema/default.asp
XML Schema Resource Page:
  – http://www.w3.org/XML/Schema
39
References - dom4j
Web site:
  – http://dom4j.org/
Javadocs:
  – http://dom4j.org/apidocs/index.html
Quick Start:
  – http://dom4j.org/guide.html
Cookbook (main functionality):
  – http://dom4j.org/cookbook.html
40
References - XPath
XPath specification:
  – http://www.w3.org/TR/xpath
XPath tutorial:
  – http://www.w3schools.com/xpath/default.asp
XPath tutorial (extended):
  – http://www.zvon.org/xxl/XPathTutorial/General/examples.html
XPath reference:
  – http://www.vbxml.com/xsl/XPathRef.asp