Java XML Programming Svetlin Nakov Bulgarian Association of Software Developers www.devbg.org.


1 Java XML Programming Svetlin Nakov Bulgarian Association of Software Developers www.devbg.org

2 Contents
- Introduction to XML Parsers
- The DOM Parser
- The SAX Parser
- The StAX Parser
- Introduction to JAXP
- Using DOM
- Using StAX
- Java API for XPath
- Java API for XSLT

3 XML Parsers

4 XML parsers are programming libraries that make working with XML easier
They are used for:
- Extracting data from XML documents
- Building XML documents
- Validating XML documents against a given schema

5 XML Parsers – Models
- DOM (Document Object Model)
  - Represents XML documents as a tree in memory
  - Allows flexible and easy processing
  - Supports changing the document
- SAX (Simple API for XML)
  - Reads XML documents sequentially (like a stream)
  - Allows read-only / write-only access
- StAX (Streaming API for XML)
  - Similar to SAX but simplified

6 Using an XML Parser
- Three basic steps to using an XML parser
  - Create a parser object
  - Pass your XML document to the parser
  - Process the results
- Generally, writing out XML is outside the scope of parsers
  - Some parsers may implement such mechanisms

7 Types of Parser
There are several different ways to categorize parsers:
- Validating versus non-validating parsers
- Parsers that support the Document Object Model (DOM)
- Parsers that support the Simple API for XML (SAX)
- Streaming parsers (StAX)
- Parsers written in a particular language (Java, C#, C++, Perl, etc.)

8 The DOM Parser

9 DOM Parser Architecture

10 DOM Key Features
- The DOM API is generally an easier API to use
  - It provides a familiar tree structure of objects
  - You can use it to manipulate the hierarchy of an XML document
- The DOM API is ideal for interactive applications
  - The entire object model is present in memory

11 The DOM Parser – Example
The following XML document is given:
    Programming Microsoft.NET
    Jeff Prosise
    0-7356-1376-1
    Microsoft.NET for Programmers
    Fergal Grimes
    1-930110-19-7
    </library>

12 The DOM Parser – Example
This document is represented in memory as a DOM tree in the following way:
[Figure: DOM tree of the document, with the parts labelled "Header part" and "Root node"]

13 The SAX Parser

14 SAX Key Features
- The Simple API for XML (SAX)
  - Event-driven
  - Serial-access mechanism
  - Element-by-element processing
- Does not allow going backwards or jumping ahead
- Requires many times fewer resources
  - Memory
  - CPU time
- Works over streams

15 The SAX Parser
- Working with SAX is much more complex
- Old technology
- Use its newer equivalent – the StAX parser
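To make the event-driven, callback-based style concrete, here is a minimal sketch of a SAX handler that collects book titles. The class name `SaxTitleCollector` and the element names `library`, `book` and `title` are assumptions for illustration; the element names of the sample document are not visible in this transcript.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxTitleCollector {

    // Collects the text of every <title> element (assumed element name).
    static List<String> collectTitles(String xml) throws Exception {
        List<String> titles = new ArrayList<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();

        DefaultHandler handler = new DefaultHandler() {
            // Non-null only while the parser is inside a <title> element.
            private StringBuilder text;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if ("title".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // The parser may deliver one text run in several chunks.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("title".equals(qName)) {
                    titles.add(text.toString().trim());
                    text = null;
                }
            }
        };

        // The parser drives the handler; the application only reacts to events.
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return titles;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<library>"
                + "<book><title>Programming Microsoft.NET</title></book>"
                + "<book><title>Microsoft.NET for Programmers</title></book>"
                + "</library>";
        System.out.println(collectTitles(xml));
    }
}
```

The handler has to remember its own state (here, whether it is inside a title) between callbacks, which is a large part of why SAX code tends to be harder to write than DOM or StAX code.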

16 The StAX Parser
- Like SAX but
  - Not event driven (not callback based)
  - "Pull"-based
  - The developer manually says "go to the next element" and analyzes it
- It's a new feature in Java 6.0!

17 When to Use DOM and When to Use SAX/StAX?
The DOM processing model is suitable when:
- Processing small documents
- Flexibility is needed
- Direct access to different nodes of the document is needed
- The document needs to be changed

18 When to Use DOM and When to Use SAX/StAX?
The SAX/StAX processing model is suitable when:
- Processing big documents
  - Big XML documents (e.g. > 20-30 MB) are impractical to process with DOM, because the whole tree is held in memory
- Performance is important
- There is no need to change the document nodes
  - SAX/StAX is read-only / write-only (like the streams)

19 Introduction to JAXP

20 JAXP
- Java API for XML Processing
  - Designed to be flexible
  - Facilitates the use of XML on the Java platform
- Provides a common interface for these standard APIs
  - DOM
  - SAX, StAX
  - XPath and XSL Transformations (XSLT)

21 JAXP – Pluggability
- JAXP allows you to use any XML-compliant parser
  - Regardless of which vendor's implementation is actually being used
- Pluggability layer
  - Lets you plug in an implementation of the SAX or DOM API
  - Lets you control how your XML data is displayed

22 JAXP – Independence
- To achieve the goal of XML processor independence
  - The application should limit itself to the JAXP API
  - Avoid using implementation-dependent APIs and behavior

23 JAXP Packages
- javax.xml.parsers
  - The JAXP APIs
  - Provides a common interface for different vendors' SAX and DOM parsers
- org.w3c.dom
  - Defines the DOM classes
  - The Document class and all the components of a DOM

24 JAXP Packages (2)
- org.xml.sax
  - Defines the basic SAX APIs
- javax.xml.stream
  - Defines the basic StAX classes
- javax.xml.xpath
  - Defines the API for the evaluation of XPath expressions
- javax.xml.transform
  - Defines the XSLT APIs that let you transform XML into other forms

25 Using the DOM Parser

26 DOM Document Structure
XML input:
    <dots>
    this is before the first dot
    and it continues on multiple lines
    flip is on
    flip is off
    stuff
    </dots>
Document structure:
    Document
    +---Element
        +---Text "this is before the first dot
        |         and it continues on multiple lines"
        +---Element
        +---Text ""
        +---Element
        +---Text ""
        +---Element
        |   +---Text "flip is on"
        |   +---Element
        |   +---Text ""
        |   +---Element
        |   +---Text ""
        +---Text "flip is off"
        +---Element
        +---Text ""
        +---Element
        |   +---Text "stuff"
        +---Text ""
        +---Comment "a final comment"
        +---Text ""

27 DOM Document Structure
- Text between element nodes becomes a text node, even when it consists only of whitespace (shown as "" in the tree)
- XML comments appear in special comment nodes
- Element attributes do not appear in the tree
  - They are available through the Element object
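The points above can be checked with a small sketch that lists the kinds of nodes found directly under the root element. The class name `DomChildKinds` and the sample string are hypothetical, loosely modelled on the dots document; the parser is used with its default settings, which keep whitespace and comments.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomChildKinds {

    // Hypothetical sample input: one element, one comment, whitespace between.
    static final String SAMPLE =
            "<dots>\n  <dot x=\"9\" y=\"81\"/>\n  <!-- a final comment -->\n</dots>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Names the kind of each direct child of the root, in document order.
    static List<String> childKinds(Document doc) {
        NodeList children = doc.getDocumentElement().getChildNodes();
        List<String> kinds = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            switch (children.item(i).getNodeType()) {
                case Node.ELEMENT_NODE: kinds.add("element"); break;
                case Node.TEXT_NODE:    kinds.add("text");    break;
                case Node.COMMENT_NODE: kinds.add("comment"); break;
                default:                kinds.add("other");
            }
        }
        return kinds;
    }

    // Attributes are read from the Element itself, not from its child list.
    static String firstDotX(Document doc) {
        Element dot = (Element) doc.getElementsByTagName("dot").item(0);
        return dot.getAttribute("x");
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        System.out.println(childKinds(doc));
        System.out.println(firstDotX(doc));
    }
}
```

The whitespace-only text nodes are the ones that usually surprise newcomers: code that assumes every child is an Element will hit a ClassCastException on them.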

28 DOM Classes Hierarchy

29 Using DOM
Here's the basic recipe for getting started:

    import javax.xml.parsers.*;
    import org.w3c.dom.*;

    // Get a DocumentBuilder object
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = null;
    try {
        db = dbf.newDocumentBuilder();
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
    }

    // Invoke parser to get a Document (alternatives: use the one that matches your input)
    Document doc = db.parse(inputStream);
    Document doc = db.parse(file);
    Document doc = db.parse(url);

30 DOM Document Access Idioms
OK, say we have a Document. How do we get at the pieces of it?
Here are some common idioms:

    // get the root of the Document tree
    Element root = doc.getDocumentElement();

    // get nodes in subtree by tag name
    NodeList dots = root.getElementsByTagName("dot");

    // get first dot element
    Element firstDot = (Element) dots.item(0);

    // get x attribute of first dot
    String x = firstDot.getAttribute("x");

31 More Document Accessors
Node access methods:
    String    getNodeName()
    short     getNodeType()    (e.g. DOCUMENT_NODE, ELEMENT_NODE, TEXT_NODE, COMMENT_NODE, etc.)
    Document  getOwnerDocument()
    boolean   hasChildNodes()
    NodeList  getChildNodes()
    Node      getFirstChild()
    Node      getLastChild()
    Node      getParentNode()
    Node      getNextSibling()
    Node      getPreviousSibling()
    boolean   hasAttributes()
    ... and more ...

32 More Document Accessors
Element extends Node and adds these access methods:
    String    getTagName()
    boolean   hasAttribute(String name)
    String    getAttribute(String name)
    NodeList  getElementsByTagName(String name)
    ... and more ...
Document extends Node and adds these access methods:
    Element       getDocumentElement()
    DocumentType  getDoctype()
    ... plus the Element methods just mentioned ...
    ... and more ...
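As a rough illustration of how these accessors combine, the sketch below walks a tree with `getFirstChild()` and `getNextSibling()` and prints an indented outline of the element names. The class name `DomOutline` and the sample element names (`dots`, `flip`, `dot`) are assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomOutline {

    // Depth-first walk; only element nodes contribute a line and a level.
    static String outline(Node node, int depth) {
        StringBuilder sb = new StringBuilder();
        int childDepth = depth;
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            for (int i = 0; i < depth; i++) {
                sb.append("  ");
            }
            sb.append(node.getNodeName()).append('\n');
            childDepth = depth + 1;
        }
        // Sibling iteration avoids building a NodeList for every level.
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            sb.append(outline(child, childDepth));
        }
        return sb.toString();
    }

    static String outlineOf(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return outline(doc, 0);
    }

    public static void main(String[] args) throws Exception {
        System.out.print(outlineOf("<dots><flip><dot/></flip><dot/></dots>"));
    }
}
```

Because Document is itself a Node, the same method handles the document, elements, text and comments without special cases.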

33 Writing a Document as XML
- JAXP does not specify how to write an XML document to a file
- Most JAXP implementations have their own classes for writing XML files
  - E.g. the class XMLSerializer in Apache Xerces (the standard parser in J2SE 5.0)

    import com.sun.org.apache.xml.internal.serialize.XMLSerializer;

    XMLSerializer xmlser = new XMLSerializer();
    xmlser.setOutputByteStream(System.out);
    xmlser.serialize(doc);
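A vendor-neutral alternative that the slide does not mention is to run the Document through an "identity" Transformer from javax.xml.transform, which copies its input to its output unchanged. The sketch below is one possible way to do that, not the approach used in the original demo; the class and method names (`DomToString`, `serialize`, `sampleDocument`) are made up for the example.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomToString {

    // Serializes a DOM tree using only standard JAXP classes.
    static String serialize(Document doc) throws Exception {
        // newTransformer() with no stylesheet yields an identity transform.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Builds a tiny document so the example runs without any input file.
    static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element root = doc.createElement("dots");
        doc.appendChild(root);
        Element dot = doc.createElement("dot");
        dot.setAttribute("x", "9");
        dot.setAttribute("y", "81");
        root.appendChild(dot);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(sampleDocument()));
    }
}
```

To write to a file instead of a string, the StreamResult can wrap a java.io.File or an OutputStream.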

34 Reading and Parsing XML Documents with the DOM Parser Live Demo

35 Creating & Manipulating DOM Documents
The DOM API also includes lots of methods for creating and manipulating Document objects:

    // Get new empty Document from DocumentBuilder
    Document doc = docBuilder.newDocument();

    // Create a new element
    // and add it to the document as root
    Element root = doc.createElement("dots");
    doc.appendChild(root);

    // Create a new element
    // and add as child of the root
    Element dot = doc.createElement("dot");
    dot.setAttribute("x", "9");
    dot.setAttribute("y", "81");
    root.appendChild(dot);

36 More Document Manipulators
Node manipulation methods:
    void  setNodeValue(String nodeValue)
    Node  appendChild(Node newChild)
    Node  insertBefore(Node newChild, Node refChild)
    Node  removeChild(Node oldChild)
    ... and more ...
Element manipulation methods:
    void  setAttribute(String name, String value)
    void  removeAttribute(String name)
    ... and more ...
Document manipulation methods:
    Text     createTextNode(String data)
    Comment  createComment(String data)
    ... and more ...
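A short sketch of some of these manipulators in use follows. It builds a small tree in memory, inserts one element before another, removes one, and reports what is left. The class name `DomEdit` and the helper `newDot` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class DomEdit {

    // Returns the x attributes of the <dot> elements left after the edits.
    static List<String> editedOrder() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element root = doc.createElement("dots");
        doc.appendChild(root);

        Element first = newDot(doc, "1");
        Element third = newDot(doc, "3");
        root.appendChild(first);
        root.appendChild(third);

        Element second = newDot(doc, "2");
        root.insertBefore(second, third);              // order: 1, 2, 3
        root.removeChild(first);                       // order: 2, 3
        root.appendChild(doc.createComment("edited")); // comments are nodes too

        List<String> xs = new ArrayList<>();
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                xs.add(((Element) n).getAttribute("x"));
            }
        }
        return xs;
    }

    // Nodes must be created by the Document that will own them.
    private static Element newDot(Document doc, String x) {
        Element dot = doc.createElement("dot");
        dot.setAttribute("x", x);
        return dot;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(editedOrder());
    }
}
```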

37 Building Documents with the DOM Parser Live Demo

38 Using The StAX Parser

39 The StAX Parser in Java
- As of Java 6 the StAX parser is available as part of Java
- Two basic StAX classes
  - XMLStreamReader
    - Pull-based XML streaming API for parsing XML documents – read-only
  - XMLStreamWriter
    - Streaming-based builder for XML documents – write-only

40 Parsing Documents with the StAX Parser – Example

    FileReader fileReader = new FileReader("Student.xml");
    XMLInputFactory factory = XMLInputFactory.newInstance();
    XMLStreamReader reader = factory.createXMLStreamReader(fileReader);
    String element = "";
    while (reader.hasNext()) {
        if (reader.isStartElement()) {
            // remember the name of the element we are inside
            element = reader.getLocalName();
        } else if (reader.isCharacters() && !reader.isWhiteSpace()) {
            System.out.printf("%s - %s%n", element, reader.getText());
        }
        reader.next();
    }
    reader.close();

41 Parsing Documents with the StAX Parser Live Demo

42 Creating Documents with the StAX Parser – Example

    String fileName = "Customers.xml";
    FileWriter fileWriter = new FileWriter(fileName);
    XMLOutputFactory factory = XMLOutputFactory.newInstance();
    XMLStreamWriter writer = factory.createXMLStreamWriter(fileWriter);
    writer.writeStartDocument();
    writer.writeStartElement("Customers");
    writer.writeStartElement("Customer");
    writer.writeStartElement("Name");
    writer.writeCharacters("ABC Pizza");
    writer.writeEndElement();
    writer.writeStartElement("Address");
    writer.writeCharacters("1 Main Street");
    writer.writeEndElement();
    // writeEndDocument() also closes the still-open Customer and Customers elements
    writer.writeEndDocument();
    writer.flush();
    // closing the XMLStreamWriter does not close the underlying FileWriter
    writer.close();
    fileWriter.close();
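The two StAX examples can be combined into an in-memory round trip that needs no files, which may be handy for experimenting. This is only a sketch: the class name `StaxRoundTrip` and its methods are invented for illustration, and coalescing is switched on so that each text run arrives as a single CHARACTERS event.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

class StaxRoundTrip {

    // Writes one customer record to a string with XMLStreamWriter.
    static String write(String name, String address) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        w.writeStartDocument();
        w.writeStartElement("Customers");
        w.writeStartElement("Customer");
        w.writeStartElement("Name");
        w.writeCharacters(name);
        w.writeEndElement();
        w.writeStartElement("Address");
        w.writeCharacters(address);
        w.writeEndElement();
        w.writeEndElement(); // Customer
        w.writeEndElement(); // Customers
        w.writeEndDocument();
        w.close();
        return out.toString();
    }

    // Pulls events one by one and maps element name to its text content.
    static Map<String, String> read(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader r = factory.createXMLStreamReader(new StringReader(xml));
        Map<String, String> fields = new LinkedHashMap<>();
        String element = "";
        while (r.hasNext()) {
            int event = r.next(); // the application asks for the next event
            if (event == XMLStreamConstants.START_ELEMENT) {
                element = r.getLocalName();
            } else if (event == XMLStreamConstants.CHARACTERS && !r.isWhiteSpace()) {
                fields.put(element, r.getText());
            }
        }
        r.close();
        return fields;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(read(write("ABC Pizza", "1 Main Street")));
    }
}
```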

43 Parsing Documents with the StAX Parser Live Demo

44 Using XPath in Java Searching nodes in XML documents

45 Parsing XML Documents with XPath
- To evaluate an XPath expression in Java, create an XPath object

    XPathFactory xpfactory = XPathFactory.newInstance();
    XPath xpath = xpfactory.newXPath();

- Then call the evaluate method

    String result = xpath.evaluate(expression, doc);

  - expression is an XPath expression
  - doc is the Document object that represents the XML document

46 Sample XML Document
    Zagorka
    0.54
    kepab
    0.48
    Amstel
    0.56
    </items>

47 Parsing with XPath – Example
For example,

    String result = xpath.evaluate("/items/item[2]/price", doc);

obtains as a result the string "0.48".
XPath can also match multiple nodes and return a NodeList:

    NodeList nodes = (NodeList) xpath.evaluate(
        "/items/item[@type='beer']/price", doc,
        XPathConstants.NODESET);
    for (int i = 0; i < nodes.getLength(); i++) {
        Node priceNode = nodes.item(i);
        System.out.println(priceNode.getTextContent());
    }
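For readers who want to run the XPath calls end to end, here is a self-contained sketch. The sample document is a reconstruction under assumptions: the `items`, `item`, `price` names and the `type` attribute come from the XPath expressions above, while the `name` element and the `type="food"` value are guesses, since the original markup is not visible in this transcript.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathPrices {

    // Assumed reconstruction of the sample document (see the note above).
    static final String ITEMS = "<items>"
            + "<item type=\"beer\"><name>Zagorka</name><price>0.54</price></item>"
            + "<item type=\"food\"><name>kepab</name><price>0.48</price></item>"
            + "<item type=\"beer\"><name>Amstel</name><price>0.56</price></item>"
            + "</items>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Single result: the two-argument evaluate() returns a String.
    static String secondPrice(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate("/items/item[2]/price", doc);
    }

    // Multiple results: ask for a NODESET and cast to NodeList.
    static List<String> beerPrices(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                "/items/item[@type='beer']/price", doc, XPathConstants.NODESET);
        List<String> prices = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            prices.add(nodes.item(i).getTextContent());
        }
        return prices;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(ITEMS);
        System.out.println(secondPrice(doc));
        System.out.println(beerPrices(doc));
    }
}
```

Note that XPath positions are 1-based, so `item[2]` is the second item, while `NodeList.item(i)` is 0-based.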

48 Using XPath Live Demo

49 Modifying XML with DOM and XPath Live Demo

50 XSL Transformations in JAXP javax.xml.transform.Transformer

51 XSLT API

52 Transforming with XSLT in Java with JAXP
- JAXP uses a factory design pattern
  - This hides the implementation classes
- The procedure for XSL transforming is:
  1. Create a TransformerFactory instance
  2. Load your stylesheet into a Transformer instance
  3. Transform your source to your output using the Transformer instance

53 Transforming with XSLT in Java with JAXP (2)
1. Establish the factory and environment

    TransformerFactory tFactory = TransformerFactory.newInstance();

2. Load and compile the XSL stylesheet

    Transformer xslTransformer = tFactory.newTransformer(
        new StreamSource("stylesheet.xsl"));

3. Apply the stylesheet to the given document

    xslTransformer.transform(
        new StreamSource("input.xml"),
        new StreamResult("output.xml"));
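The same three steps also work on in-memory strings, which makes it easy to try a stylesheet without creating files. The sketch below assumes a `library`/`book`/`title` structure and a deliberately tiny text-output stylesheet; neither is the stylesheet used in the original demo, and the class name `XsltInMemory` is invented.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltInMemory {

    // Illustrative stylesheet: emits every book title followed by ';'.
    static final String STYLESHEET =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"/library\">"
            + "<xsl:for-each select=\"book\">"
            + "<xsl:value-of select=\"title\"/><xsl:text>;</xsl:text>"
            + "</xsl:for-each>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    static String transform(String xsl, String xml) throws TransformerException {
        // 1. Establish the factory
        TransformerFactory factory = TransformerFactory.newInstance();
        // 2. Load and compile the stylesheet
        Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xsl)));
        // 3. Apply it to the input document
        StringWriter out = new StringWriter();
        transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String xml = "<library>"
                + "<book><title>Programming Microsoft.NET</title></book>"
                + "<book><title>Microsoft.NET for Programmers</title></book>"
                + "</library>";
        System.out.println(transform(STYLESHEET, xml));
    }
}
```

If the same stylesheet is applied many times, `TransformerFactory.newTemplates(...)` can compile it once and hand out a fresh Transformer per use.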

54 Transforming with XSL – Example
library.xml
    Programming Microsoft.NET
    Jeff Prosise
    0-7356-1376-1
    Microsoft.NET for Programmers
    Fergal Grimes
    1-930110-19-7
    </library>

55 Transforming with XSL – Example (2)
library-xml2html.xsl
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" encoding="utf-8"
        indent="yes" omit-xml-declaration="yes"/>
    <meta http-equiv="Content-Type"
        content="text/html; charset=utf-8" />
    Моята библиотека
    (example continues)
The heading text "Моята библиотека" is Bulgarian for "My Library".

56 Transforming with XSL – Example (3)
library-xml2html.xsl
    Заглавие
    Автор
    </xsl:template></xsl:stylesheet>
The column headings "Заглавие" and "Автор" are Bulgarian for "Title" and "Author".

57 Transforming with XSL – Example (4)
XSLTransformDemo.java
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerException;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    public class XSLTransformDemo {
        public static void main(String[] args)
                throws TransformerException {
            TransformerFactory tFactory =
                TransformerFactory.newInstance();
            // Load and compile the stylesheet
            Transformer xslTransformer =
                tFactory.newTransformer(
                    new StreamSource("library-xml2html.xsl"));
            // Apply it to library.xml and write library.html
            xslTransformer.transform(
                new StreamSource("library.xml"),
                new StreamResult("library.html"));
        }
    }

58 Transforming with XSL – Example (5)
Result: library.html
    <html><head>
    <meta http-equiv="Content-Type"
        content="text/html; charset=utf-8"/>
    </head><body>
    Моята библиотека
    Заглавие
    Автор
    (example continues)
The Bulgarian text reads "My Library", "Title" and "Author".

59 Transforming with XSL – Example (6)
Result: library.html
    Programming Microsoft.NET
    Jeff Prosise
    Microsoft.NET for Programmers
    Fergal Grimes
    </body></html>

60 XSL Transformations Live Demo

61 Exercises
1. Write a program that extracts from the file "students.xml" all available information about the students from the 3rd course (name, exams, etc.). Use the DOM parser.
2. Write a program that appends a new student "Peter Petrov" to the file students.xml and produces a new XML file as a result.
3. Write a program that appends a new exam to a given student. The students and their exams are taken from the file students.xml and the results should be stored in a new XML file newStudents.xml.

62 Exercises (2)
4. Write a program that extracts from the file "students.xml" all available information about the students from the 3rd course (name, exams, etc.). Use XPath.
5. Write a program that changes all grades for the student "Peter Petrov" to "6". Produce a new XML file as a result. Use the StAX parser.
6. Write a program that builds an XML file catalog.xml containing a catalog of books (author, title, isbn, pages). Use the StAX parser.

63 Exercises (3)
7. Using the StAX parser, write a program that extracts all book names from the file catalog.xml.
8. Using the StAX parser, write a program that extracts all students' names from students.xml. Process only students with more than one excellent grade.
9. Write an XML file containing orders. Each order is described by date, customer name and a list of order items. Each order item consists of product name, amount and price.
10. Write an XSL stylesheet to transform the XML file to a human-readable XHTML document. Sort the products in alphabetical order.

64 Exercises (4)
11. Write a JAXP-based Java program to apply the XSL stylesheet to the XML document.
12. Test the produced XHTML file in your Web browser.
13. Write your CV in XML format. It should have the following structure:
  - Personal information (name, DOB, ...)
  - Education
  - Skills
  - Work experience
  - ...

65 Exercises (5)
14. Write an XSL stylesheet for transforming the CV to HTML and to XML with a different structure. Write a program to apply the stylesheet.

