Chapter 24 XML.


1 Chapter 24 XML

2 CHAPTER GOALS
Understanding XML elements and attributes
Understanding the concept of an XML parser
Being able to read and write XML documents
Being able to design Document Type Definitions for XML documents

3 XML
Stands for Extensible Markup Language
Lets you encode complex data in a form that the recipient can parse easily
Is independent of any programming language

4 XML Encoding of Coin Data
<coin>
   <value>0.5</value>
   <name>half dollar</name>
</coin>

5 Advantages of XML
XML files are readable by both computers and humans
XML-formatted data is resilient to change
It is easy to add new data elements
Old programs can process the old information in the new data format

6 Differences Between XML and HTML
Both are descendants of SGML (Standard Generalized Markup Language)
XML is a simplified version of SGML
XML is very strict, but HTML (as used today) is not
XML tells what the data means; HTML tells how to display data

7 Differences Between XML and HTML
XML tags are case-sensitive
   <LI> is different from <li>
Every XML start tag must have a matching end tag
If a tag has no end tag, it must end in />
   <img src="hamster.jpeg"/>
XML attribute values must be enclosed in quotes
   <img src="hamster.jpeg" width="400" height="300"/>
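
The strictness rules above can be checked with a standard JAXP parser. The following is a minimal sketch, not part of the original slides: the class name WellFormedCheck and the method isWellFormed are made up for illustration, and the sketch assumes the JDK's built-in parser.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not from the slides
public class WellFormedCheck
{
   /** Returns true if the text parses as well-formed XML. */
   public static boolean isWellFormed(String xml)
   {
      try
      {
         DocumentBuilder builder
               = DocumentBuilderFactory.newInstance().newDocumentBuilder();
         // Replace the default handler, which prints messages to System.err
         builder.setErrorHandler(new DefaultHandler());
         builder.parse(new ByteArrayInputStream(
               xml.getBytes(StandardCharsets.UTF_8)));
         return true;
      }
      catch (SAXException e)
      {
         return false; // the parser rejected the text
      }
      catch (ParserConfigurationException | IOException e)
      {
         throw new IllegalStateException(e);
      }
   }

   public static void main(String[] args)
   {
      // Properly closed tag with a quoted attribute value
      System.out.println(isWellFormed("<img src=\"hamster.jpeg\"/>"));
      // Missing end tag
      System.out.println(isWellFormed("<img src=\"hamster.jpeg\">"));
      // Unquoted attribute value
      System.out.println(isWellFormed("<img src=hamster.jpeg/>"));
      // Start and end tags differ in case
      System.out.println(isWellFormed("<LI>item</li>"));
   }
}
```

Only the first call should report true; the other three inputs each break one of the rules listed on this slide.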

8 Structure of an XML Document
An XML data set is called a document
The document starts with a header
   <?xml version="1.0"?>
The data are contained in a root element
   <purse> more data </purse>
The document contains elements and text

9 Structure of an XML Document
An XML element has one of two forms
   <elementTag optional attributes> contents </elementTag>
or
   <elementTag optional attributes/>
The contents can be elements or text or both
An example of an element with both elements and text (mixed content):
   <p>Use XML for <strong>robust</strong> data formats.</p>
Avoid mixed content for data descriptions

10 Structure of an XML Document
An element can have attributes
The a element in HTML has an href attribute
   <a href=" ... </a>
An attribute has a name (such as href) and a value
The attribute value is enclosed in either single or double quotes
An attribute is intended to provide information about the content
   <value currency="USD">0.5</value>
or
   <value currency="EUR">0.5</value>
An element can have multiple attributes

11 Parsing XML Documents
A parser is a program that
   Reads a document
   Checks whether it is syntactically correct
   Takes some action as it processes the document
There are two kinds of XML parsers
   SAX (Simple API for XML)
   DOM (Document Object Model)

12 Parsing XML Documents
SAX parser
   Event-driven
   It calls a method you provide to process each construct it encounters
   More efficient for handling large XML documents
DOM parser
   Builds a tree that represents the document
   When the parser is done, you can analyze the tree
   Easier to use for most applications
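
The rest of these slides use the DOM parser, so a brief sketch of the event-driven SAX style may help as a contrast. The class name SaxTagLister and the method listTags are assumptions made for this example; SAXParserFactory, SAXParser, and DefaultHandler are the standard JAXP/SAX classes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not from the slides
public class SaxTagLister
{
   /** Collects the names of all elements in document order. */
   public static List<String> listTags(String xml)
   {
      final List<String> tags = new ArrayList<String>();
      DefaultHandler handler = new DefaultHandler()
      {
         @Override
         public void startElement(String uri, String localName,
               String qName, Attributes attributes)
         {
            // The parser calls this method once for each start tag
            tags.add(qName);
         }
      };
      try
      {
         SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
         parser.parse(new ByteArrayInputStream(
               xml.getBytes(StandardCharsets.UTF_8)), handler);
      }
      catch (Exception e)
      {
         throw new IllegalStateException("Could not parse XML", e);
      }
      return tags;
   }

   public static void main(String[] args)
   {
      System.out.println(listTags(
            "<items><item><quantity>8</quantity></item></items>"));
   }
}
```

No tree is built here: the handler sees each start tag as the parser reaches it, which is why SAX suits very large documents.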

13 JAXP
Stands for Java API for XML Processing
Provides a standard mechanism for DOM parsers to read and create documents
Part of Java 1.4 and above
With earlier versions, you need to download additional libraries

14 Parsing XML Documents
The Document interface describes the tree structure of an XML document
A DocumentBuilder can generate an object of a class that implements the Document interface
Get a DocumentBuilderFactory by calling the static newInstance method of the DocumentBuilderFactory class
Call the newDocumentBuilder method of the factory to get a DocumentBuilder
   DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
   DocumentBuilder builder = factory.newDocumentBuilder();

15 Parsing XML Documents
To read a document from a file
   String fileName = . . . ;
   File f = new File(fileName);
   Document doc = builder.parse(f);
To read a document from a URL on the Internet
   String urlName = . . . ;
   URL u = new URL(urlName);
   Document doc = builder.parse(u.openStream());
To read from an input stream
   InputStream in = . . . ;
   Document doc = builder.parse(in);
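
As a rough illustration of the file and input stream variants, here is a sketch that parses the same XML text both ways. The class name DocumentLoader and its methods are hypothetical; the temporary file stands in for whatever file name a real program would use.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical example class, not from the slides
public class DocumentLoader
{
   private static DocumentBuilder newBuilder() throws Exception
   {
      return DocumentBuilderFactory.newInstance().newDocumentBuilder();
   }

   /** Parses XML text through an input stream. */
   public static Document fromStream(String xml)
   {
      try
      {
         InputStream in = new ByteArrayInputStream(
               xml.getBytes(StandardCharsets.UTF_8));
         return newBuilder().parse(in);
      }
      catch (Exception e)
      {
         throw new IllegalStateException("Could not parse stream", e);
      }
   }

   /** Writes XML text to a temporary file and parses that file. */
   public static Document fromTempFile(String xml)
   {
      try
      {
         File f = File.createTempFile("purse", ".xml");
         f.deleteOnExit();
         Files.write(f.toPath(), xml.getBytes(StandardCharsets.UTF_8));
         return newBuilder().parse(f);
      }
      catch (Exception e)
      {
         throw new IllegalStateException("Could not parse file", e);
      }
   }

   public static void main(String[] args)
   {
      String xml = "<purse><coin/></purse>";
      System.out.println(fromStream(xml).getDocumentElement().getTagName());
      System.out.println(fromTempFile(xml).getDocumentElement().getTagName());
   }
}
```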

16 Parsing XML Documents
You can inspect or modify the document
The document tree consists of nodes
Two node types are Element and Text
Element and Text are subinterfaces of the Node interface

17 An XML Document
<?xml version="1.0"?>
<items>
   <item>
      <product>
         <description>Ink Jet Refill Kit</description>
         <price>29.95</price>
      </product>
      <quantity>8</quantity>
   </item>
   <item>
      <product>
         <description>4-port Mini Hub</description>
         <price>19.95</price>
      </product>
      <quantity>4</quantity>
   </item>
</items>

18 Tree View of XML Document
[Figure: tree view of the XML document]

19 Parsing XML Documents
Start inspection of the tree by getting the root element
   Element root = doc.getDocumentElement();
To get the child elements of an element, use the getChildNodes method of the Element interface
The nodes are stored in an object of a class that implements the NodeList interface
Use a NodeList to visit the child nodes of an element
   The getLength method gives the number of nodes
   The item method gets an item in the node list
Code to get a child node
   NodeList nodes = root.getChildNodes();
   int i = . . . ; // a value between 0 and nodes.getLength() - 1
   Node child = nodes.item(i);
The XML parser keeps all white space if you don't use a DTD
You can include a test to ignore the white space
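
The white space issue can be made concrete with a short sketch. The class ChildElements and its two methods are made up for this example; the instanceof test is the same one that the ItemListParser listing later in these slides relies on.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical example class, not from the slides
public class ChildElements
{
   /** Parses XML text (without a DTD) and returns its root element. */
   public static Element parseRoot(String xml)
   {
      try
      {
         return DocumentBuilderFactory.newInstance().newDocumentBuilder()
               .parse(new ByteArrayInputStream(
                     xml.getBytes(StandardCharsets.UTF_8)))
               .getDocumentElement();
      }
      catch (Exception e)
      {
         throw new IllegalStateException("Could not parse XML", e);
      }
   }

   /** Returns only those child nodes that are elements. */
   public static List<Element> childElements(Element parent)
   {
      List<Element> result = new ArrayList<Element>();
      NodeList nodes = parent.getChildNodes();
      for (int i = 0; i < nodes.getLength(); i++)
      {
         Node child = nodes.item(i);
         if (child instanceof Element) // skips white-space Text nodes
            result.add((Element) child);
      }
      return result;
   }

   public static void main(String[] args)
   {
      Element root = parseRoot("<items>\n  <item/>\n  <item/>\n</items>");
      // Counts the two elements plus three runs of white space
      System.out.println(root.getChildNodes().getLength());
      // Counts the two item elements only
      System.out.println(childElements(root).size());
   }
}
```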

20 Parsing XML Documents
Get an element name with the getTagName method
   Element priceElement = . . . ;
   String name = priceElement.getTagName();
To find the value of the currency attribute
   String attributeValue = priceElement.getAttribute("currency");
You can also iterate through all attributes
   Use a NamedNodeMap
   Each attribute is stored in a Node

21 Parsing XML Documents
Some elements have children that contain text
The document builder creates nodes of type Text
If you don't use mixed content elements
   Any element containing text has a single Text child node
   Use the getFirstChild method to get it
   Use the getData method to read the text
To determine the price stored in the price element
   Element priceNode = . . . ;
   Text priceData = (Text) priceNode.getFirstChild();
   String priceString = priceData.getData();
   double price = Double.parseDouble(priceString);
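
The attribute and text access described on the last two slides can be combined into one small sketch. The class PriceReader and its methods are hypothetical names, and the taxed attribute is invented so that there is more than one attribute to iterate over.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.Text;

// Hypothetical example class, not from the slides
public class PriceReader
{
   /** Parses XML text and returns its root element. */
   public static Element parseRoot(String xml)
   {
      try
      {
         return DocumentBuilderFactory.newInstance().newDocumentBuilder()
               .parse(new ByteArrayInputStream(
                     xml.getBytes(StandardCharsets.UTF_8)))
               .getDocumentElement();
      }
      catch (Exception e)
      {
         throw new IllegalStateException("Could not parse XML", e);
      }
   }

   /** Copies all attributes of an element into a sorted map. */
   public static Map<String, String> attributesOf(Element e)
   {
      Map<String, String> result = new TreeMap<String, String>();
      NamedNodeMap attributes = e.getAttributes();
      for (int i = 0; i < attributes.getLength(); i++)
      {
         Node attribute = attributes.item(i); // each attribute is a Node
         result.put(attribute.getNodeName(), attribute.getNodeValue());
      }
      return result;
   }

   /** Reads the number stored in an element with a single Text child. */
   public static double priceOf(Element priceElement)
   {
      Text priceData = (Text) priceElement.getFirstChild();
      return Double.parseDouble(priceData.getData());
   }

   public static void main(String[] args)
   {
      Element price = parseRoot(
            "<price currency=\"USD\" taxed=\"yes\">29.95</price>");
      System.out.println(price.getTagName());
      System.out.println(price.getAttribute("currency"));
      System.out.println(attributesOf(price));
      System.out.println(priceOf(price));
   }
}
```

Note that priceOf assumes the element is not empty; for an element such as <price/>, getFirstChild returns null and the cast result cannot be used.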

22 File ItemListParser.java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.SAXException;

/**
   An XML parser for item lists
*/
public class ItemListParser
{
   /**
      Constructs a parser that can parse item lists
   */
   public ItemListParser()
         throws ParserConfigurationException
   {
      DocumentBuilderFactory factory
            = DocumentBuilderFactory.newInstance();
      builder = factory.newDocumentBuilder();
   }

   /**
      Parses an XML file containing an item list
      @param fileName the name of the file
      @return an array list containing all items in the XML file
   */
   public ArrayList parse(String fileName)
         throws SAXException, IOException
   {
      File f = new File(fileName);
      Document doc = builder.parse(f);

      // get the <items> root element

      Element root = doc.getDocumentElement();
      return getItems(root);
   }

   /**
      Obtains an array list of items from a DOM element
      @param e an <items> element
      @return an array list of all <item> children of e
   */
   private static ArrayList getItems(Element e)
   {
      ArrayList items = new ArrayList();

      // get the <item> children

      NodeList children = e.getChildNodes();
      for (int i = 0; i < children.getLength(); i++)
      {
         Node childNode = children.item(i);
         if (childNode instanceof Element)
         {
            Element childElement = (Element) childNode;
            if (childElement.getTagName().equals("item"))
            {
               Item c = getItem(childElement);
               items.add(c);
            }
         }
      }
      return items;
   }

   /**
      Obtains an item from a DOM element
      @param e an <item> element
      @return the item described by the given element
   */
   private static Item getItem(Element e)
   {
      NodeList children = e.getChildNodes();
      Product p = null;
      int quantity = 0;
      for (int j = 0; j < children.getLength(); j++)
      {
         Node childNode = children.item(j);
         if (childNode instanceof Element)
         {
            Element childElement = (Element) childNode;
            String tagName = childElement.getTagName();
            if (tagName.equals("product"))
               p = getProduct(childElement);
            else if (tagName.equals("quantity"))
            {
               Text textNode = (Text) childElement.getFirstChild();
               String data = textNode.getData();
               quantity = Integer.parseInt(data);
            }
         }
      }
      return new Item(p, quantity);
   }

   /**
      Obtains a product from a DOM element
      @param e a <product> element
      @return the product described by the given element
   */
   private static Product getProduct(Element e)
   {
      NodeList children = e.getChildNodes();
      String name = "";
      double price = 0;
      for (int j = 0; j < children.getLength(); j++)
      {
         Node childNode = children.item(j);
         if (childNode instanceof Element)
         {
            Element childElement = (Element) childNode;
            String tagName = childElement.getTagName();
            Text textNode = (Text) childElement.getFirstChild();

            String data = textNode.getData();
            if (tagName.equals("description"))
               name = data;
            else if (tagName.equals("price"))
               price = Double.parseDouble(data);
         }
      }
      return new Product(name, price);
   }

   private DocumentBuilder builder;
}

29 File ItemListParserTest.java
import java.util.ArrayList;

/**
   This program parses an XML file containing an item list.
   It prints out the items that are described in the XML file.
*/
public class ItemListParserTest
{
   public static void main(String[] args) throws Exception
   {
      ItemListParser parser = new ItemListParser();
      ArrayList items = parser.parse("items.xml");
      for (int i = 0; i < items.size(); i++)
      {
         Item anItem = (Item) items.get(i);
         System.out.println(anItem.format());
      }
   }
}
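
The listings use Item and Product classes that do not appear in this transcript. The sketch below is one possible shape for them, inferred only from the constructor and method calls in the listings (format, getProduct, getQuantity, getDescription, getPrice). The field names and the output of format are assumptions, not the original classes.

```java
// Hypothetical stand-ins for the Item and Product classes used by the
// listings; only the method names are taken from the slides.
class Product
{
   private final String description;
   private final double price;

   Product(String description, double price)
   {
      this.description = description;
      this.price = price;
   }

   String getDescription() { return description; }
   double getPrice() { return price; }
}

class Item
{
   private final Product product;
   private final int quantity;

   Item(Product product, int quantity)
   {
      this.product = product;
      this.quantity = quantity;
   }

   Product getProduct() { return product; }
   int getQuantity() { return quantity; }

   /** Formats this item for printing (assumed layout). */
   String format()
   {
      return product.getDescription() + ", price " + product.getPrice()
            + ", quantity " + quantity;
   }
}
```

With classes along these lines in the same directory, the parser and test program above should compile as written.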

30 Creating XML Documents
We can build a Document object in a Java program and then save it as an XML document
We need a DocumentBuilder object to create a new, empty document
   DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
   DocumentBuilder builder = factory.newDocumentBuilder();
   Document doc = builder.newDocument(); // empty document
The Document interface has methods to create elements and text nodes

31 Creating XML Documents
To create an element, use the createElement method and pass it a tag
   Element itemElement = doc.createElement("item");
To create a text node, use createTextNode and pass it a string
   Text quantityText = doc.createTextNode("8");
Use the setAttribute method to add an attribute to the tag
   priceElement.setAttribute("currency", "USD");

32 Creating XML Documents
To construct the tree structure of a document
   Start with the root
   Add children with appendChild
To build an XML tree that describes an item
   // create elements
   Element itemElement = doc.createElement("item");
   Element productElement = doc.createElement("product");
   Element descriptionElement = doc.createElement("description");
   Element priceElement = doc.createElement("price");
   Element quantityElement = doc.createElement("quantity");
   Text descriptionText = doc.createTextNode("Ink Jet Refill Kit");
   Text priceText = doc.createTextNode("29.95");
   Text quantityText = doc.createTextNode("8");

33 Creating XML Documents
   // add elements to the document
   doc.appendChild(itemElement);
   itemElement.appendChild(productElement);
   itemElement.appendChild(quantityElement);
   productElement.appendChild(descriptionElement);
   productElement.appendChild(priceElement);
   descriptionElement.appendChild(descriptionText);
   priceElement.appendChild(priceText);
   quantityElement.appendChild(quantityText);

34 Creating XML Documents
Use a Transformer to write an XML document to a stream
Create a transformer
   Transformer t = TransformerFactory.newInstance().newTransformer();
Create a DOMSource from your document
Create a StreamResult from your output stream
Call the transform method of your transformer
   t.transform(new DOMSource(doc), new StreamResult(System.out));
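
The same transformer steps also work with a Writer instead of System.out, which is handy when the XML text is needed as a string. This is a sketch under that assumption: the class XmlWriterDemo and its methods are made-up names, and the OMIT_XML_DECLARATION setting is an optional extra that the slides do not mention.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class, not from the slides
public class XmlWriterDemo
{
   /** Builds a small document: an item with a quantity of 8. */
   public static Document sampleDocument()
   {
      try
      {
         Document doc = DocumentBuilderFactory.newInstance()
               .newDocumentBuilder().newDocument();
         Element item = doc.createElement("item");
         Element quantity = doc.createElement("quantity");
         quantity.appendChild(doc.createTextNode("8"));
         item.appendChild(quantity);
         doc.appendChild(item);
         return doc;
      }
      catch (ParserConfigurationException e)
      {
         throw new IllegalStateException(e);
      }
   }

   /** Writes a document into a string instead of a stream. */
   public static String toXml(Document doc)
   {
      try
      {
         Transformer t = TransformerFactory.newInstance().newTransformer();
         // Optional: leave out the <?xml ...?> header
         t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
         StringWriter out = new StringWriter();
         t.transform(new DOMSource(doc), new StreamResult(out));
         return out.toString();
      }
      catch (TransformerException e)
      {
         throw new IllegalStateException(e);
      }
   }

   public static void main(String[] args)
   {
      System.out.println(toXml(sampleDocument()));
   }
}
```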

35 File ItemListBuilder.java
import java.util.ArrayList;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;

/**
   Builds a DOM document for an array list of items.
*/
public class ItemListBuilder
{
   /**
      Constructs an item list builder.
   */
   public ItemListBuilder()
         throws ParserConfigurationException
   {
      DocumentBuilderFactory factory
            = DocumentBuilderFactory.newInstance();
      builder = factory.newDocumentBuilder();
   }

   /**
      Builds a DOM document for an array list of items.
      @param items the items
      @return a DOM document describing the items
   */
   public Document build(ArrayList items)
   {
      doc = builder.newDocument();
      Element root = createItemList(items);
      doc.appendChild(root);
      return doc;
   }

   /**
      Builds a DOM element for an array list of items.
      @param items the items
      @return a DOM element describing the items
   */
   private Element createItemList(ArrayList items)
   {
      Element itemsElement = doc.createElement("items");
      for (int i = 0; i < items.size(); i++)
      {
         Item anItem = (Item) items.get(i);
         Element itemElement = createItem(anItem);
         itemsElement.appendChild(itemElement);
      }
      return itemsElement;
   }

   /**
      Builds a DOM element for an item.
      @param anItem the item
      @return a DOM element describing the item
   */
   private Element createItem(Item anItem)
   {
      Element itemElement = doc.createElement("item");
      Element productElement
            = createProduct(anItem.getProduct());
      Text quantityText = doc.createTextNode(
            "" + anItem.getQuantity());
      Element quantityElement = doc.createElement("quantity");
      quantityElement.appendChild(quantityText);

      itemElement.appendChild(productElement);
      itemElement.appendChild(quantityElement);
      return itemElement;
   }

   /**
      Builds a DOM element for a product.
      @param p the product
      @return a DOM element describing the product
   */
   private Element createProduct(Product p)
   {
      Text descriptionText
            = doc.createTextNode(p.getDescription());
      Text priceText = doc.createTextNode("" + p.getPrice());

      Element descriptionElement
            = doc.createElement("description");
      Element priceElement = doc.createElement("price");

      descriptionElement.appendChild(descriptionText);
      priceElement.appendChild(priceText);

      Element productElement = doc.createElement("product");

      productElement.appendChild(descriptionElement);
      productElement.appendChild(priceElement);

      return productElement;
   }

   private DocumentBuilder builder;
   private Document doc;
}

41 File ItemListBuilderTest.java
import java.util.ArrayList;
import org.w3c.dom.Document;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

/**
   This program tests the item list builder. It prints the
   XML file corresponding to a DOM document containing a list
   of items.
*/
public class ItemListBuilderTest
{
   public static void main(String[] args) throws Exception
   {
      ArrayList items = new ArrayList();
      items.add(new Item(new Product("Toaster", 29.95), 3));
      items.add(new Item(new Product("Hair dryer", 24.95), 1));

      ItemListBuilder builder = new ItemListBuilder();
      Document doc = builder.build(items);
      Transformer t = TransformerFactory
            .newInstance().newTransformer();
      t.transform(new DOMSource(doc),
            new StreamResult(System.out));
   }
}

43 Document Type Definitions
A DTD is a set of rules for correctly formed documents of a particular type
   Describes the legal attributes for each element type
   Describes the legal child elements for each element type
Legal child elements are described with an ELEMENT rule
   <!ELEMENT items (item*)>
The items element (the root in this case) can have 0 or more item elements
Definition of an item node
   <!ELEMENT item (product, quantity)>
Children of the item node must be a product node followed by a quantity node

44 Document Type Definitions
Definition of a product node
   <!ELEMENT product (description, price)>
The other nodes
   <!ELEMENT quantity (#PCDATA)>
   <!ELEMENT description (#PCDATA)>
   <!ELEMENT price (#PCDATA)>
#PCDATA stands for parsed character data, which is just text
   Can contain any characters
   Special characters have to be encoded when they occur in character data

45 Encodings for Special Characters

46 DTD for Item List
<!ELEMENT items (item)*>
<!ELEMENT item (product, quantity)>
<!ELEMENT product (description, price)>
<!ELEMENT quantity (#PCDATA)>
<!ELEMENT description (#PCDATA)>
<!ELEMENT price (#PCDATA)>

47 Regular Expressions for Element Content

48 Document Type Definitions
A DTD gives you control over the allowed attributes of an element
   <!ATTLIST Element Attribute Type Default>
The type can be any sequence of character data, specified as CDATA
The type can also specify a finite number of choices
   <!ATTLIST price currency (USD | EUR | JPY) #REQUIRED>

49 Common Attribute Types

50 Attribute Defaults

51 Document Type Definitions
The #IMPLIED keyword means you can supply an attribute or not
   <!ATTLIST price currency CDATA #IMPLIED>
If you omit the attribute, the application processing the XML data implicitly assumes some default value
You can specify a default to be used if the attribute is not specified
   <!ATTLIST price currency CDATA "USD">
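
The difference between #IMPLIED and an explicit default can be observed through getAttribute. The sketch below embeds a one-element DTD in the document itself; the class AttributeDefaults and its helper methods are invented for illustration, and the behavior shown is what the JDK's built-in parser is expected to do for a DTD in the internal subset.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical example class, not from the slides
public class AttributeDefaults
{
   /** Wraps a price element in a document with the given ATTLIST rule. */
   public static String priceDocument(String attlistDefault, String element)
   {
      return "<?xml version=\"1.0\"?>\n"
            + "<!DOCTYPE price [\n"
            + "<!ELEMENT price (#PCDATA)>\n"
            + "<!ATTLIST price currency CDATA " + attlistDefault + ">\n"
            + "]>\n"
            + element;
   }

   /** Parses the document and returns the root's currency attribute. */
   public static String currencyOf(String xml)
   {
      try
      {
         Document doc = DocumentBuilderFactory.newInstance()
               .newDocumentBuilder()
               .parse(new ByteArrayInputStream(
                     xml.getBytes(StandardCharsets.UTF_8)));
         return doc.getDocumentElement().getAttribute("currency");
      }
      catch (Exception e)
      {
         throw new IllegalStateException("Could not parse XML", e);
      }
   }

   public static void main(String[] args)
   {
      String plain = "<price>29.95</price>";
      // Default supplied by the DTD: the parser fills in USD
      System.out.println(currencyOf(priceDocument("\"USD\"", plain)));
      // #IMPLIED: nothing is filled in, so the value is the empty string
      System.out.println(currencyOf(priceDocument("#IMPLIED", plain)));
      // An explicit attribute always wins over the default
      System.out.println(currencyOf(priceDocument("\"USD\"",
            "<price currency=\"EUR\">29.95</price>")));
   }
}
```

With #IMPLIED, the program itself has to decide what an empty result means, which is the "implicitly assumes some default value" case from the slide.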

52 Parsing with Document Type Definitions
Specify a DTD with every XML document
Instruct the parser to check that the document follows the rules of the DTD
Then the parser can be more intelligent about parsing
If the parser knows that the children of an element are elements, it can suppress white space

53 Parsing with Document Type Definitions
An XML document can reference a DTD in one of two ways
   The document may contain the DTD
   The document may refer to a DTD stored elsewhere
A DTD is introduced with a DOCTYPE declaration

54 Parsing with Document Type Definitions
If the document contains the DTD, the declaration looks like this:
   <!DOCTYPE rootElement [ rules ]>
Example
<?xml version="1.0"?>
<!DOCTYPE items [
<!ELEMENT items (item*)>
<!ELEMENT item (product, quantity)>
<!ELEMENT product (description, price)>
<!ELEMENT quantity (#PCDATA)>
<!ELEMENT description (#PCDATA)>
<!ELEMENT price (#PCDATA)>
]>

55 Parsing with Document Type Definitions
<items>
   <item>
      <product>
         <description>Ink Jet Refill Kit</description>
         <price>29.95</price>
      </product>
      <quantity>8</quantity>
   </item>
   <item>
      <product>
         <description>4-port Mini Hub</description>
         <price>19.95</price>
      </product>
      <quantity>4</quantity>
   </item>
</items>

56 Parsing with Document Type Definitions
If the DTD is stored outside the document, use the SYSTEM keyword inside the DOCTYPE declaration
This indicates that the system must locate the DTD
The location of the DTD follows the SYSTEM keyword
A DOCTYPE declaration can point to a local file
   <!DOCTYPE items SYSTEM "items.dtd">
A DOCTYPE declaration can point to a URL
   <!DOCTYPE items SYSTEM "

57 Parsing with Document Type Definitions
When your XML document has a DTD, use validation when parsing
Then the parser will check that all child elements and attributes conform to the ELEMENT and ATTLIST rules in the DTD
The parser throws an exception if the document is invalid
Use the setValidating method of the DocumentBuilderFactory before calling the newDocumentBuilder method
   DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
   factory.setValidating(true);
   DocumentBuilder builder = factory.newDocumentBuilder();
   Document doc = builder.parse(. . .);
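
A sketch of validation in practice follows. One caveat, stated as my understanding rather than something the slides say: JAXP reports validity errors to the builder's ErrorHandler, and the JDK's default handler appears only to print a message for them, so this sketch installs a handler that rethrows. The class DtdValidator and its methods are hypothetical; the DTD is the item list DTD from the slides.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not from the slides
public class DtdValidator
{
   private static final String DTD =
         "<?xml version=\"1.0\"?>\n"
         + "<!DOCTYPE items [\n"
         + "<!ELEMENT items (item*)>\n"
         + "<!ELEMENT item (product, quantity)>\n"
         + "<!ELEMENT product (description, price)>\n"
         + "<!ELEMENT quantity (#PCDATA)>\n"
         + "<!ELEMENT description (#PCDATA)>\n"
         + "<!ELEMENT price (#PCDATA)>\n"
         + "]>\n";

   /** Prepends the item list DTD to the given root element text. */
   public static String withDtd(String body)
   {
      return DTD + body;
   }

   /** Returns true if the document conforms to its own DTD. */
   public static boolean isValid(String xml)
   {
      try
      {
         DocumentBuilderFactory factory
               = DocumentBuilderFactory.newInstance();
         factory.setValidating(true);
         DocumentBuilder builder = factory.newDocumentBuilder();
         builder.setErrorHandler(new DefaultHandler()
            {
               @Override
               public void error(SAXParseException e) throws SAXException
               {
                  throw e; // turn validity errors into exceptions
               }
            });
         builder.parse(new ByteArrayInputStream(
               xml.getBytes(StandardCharsets.UTF_8)));
         return true;
      }
      catch (SAXException e)
      {
         return false;
      }
      catch (ParserConfigurationException | IOException e)
      {
         throw new IllegalStateException(e);
      }
   }

   public static void main(String[] args)
   {
      System.out.println(isValid(withDtd(
            "<items><item><product><description>Hub</description>"
            + "<price>19.95</price></product>"
            + "<quantity>4</quantity></item></items>")));
      // The product child is missing, so the item rule is violated
      System.out.println(isValid(withDtd(
            "<items><item><quantity>4</quantity></item></items>")));
   }
}
```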

58 Parsing with Document Type Definitions
If the parser validates the document with a DTD, you can avoid validity checks in your code
You can tell the parser to ignore white space in non-text elements
   factory.setValidating(true);
   factory.setIgnoringElementContentWhitespace(true);
If the parser has access to a DTD, it can fill in defaults for attributes
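
The effect of setIgnoringElementContentWhitespace can be seen by counting the children of the root element with the setting on and off. This is a small sketch with made-up names (WhitespaceDemo, childCount) and a deliberately tiny DTD in which item is declared EMPTY.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical example class, not from the slides
public class WhitespaceDemo
{
   private static final String XML =
         "<?xml version=\"1.0\"?>\n"
         + "<!DOCTYPE items [\n"
         + "<!ELEMENT items (item*)>\n"
         + "<!ELEMENT item EMPTY>\n"
         + "]>\n"
         + "<items>\n"
         + "  <item/>\n"
         + "  <item/>\n"
         + "</items>\n";

   /** Counts the child nodes of the root element of the sample. */
   public static int childCount(boolean ignoreWhitespace)
   {
      try
      {
         DocumentBuilderFactory factory
               = DocumentBuilderFactory.newInstance();
         factory.setValidating(true);
         factory.setIgnoringElementContentWhitespace(ignoreWhitespace);
         Document doc = factory.newDocumentBuilder().parse(
               new ByteArrayInputStream(
                     XML.getBytes(StandardCharsets.UTF_8)));
         return doc.getDocumentElement().getChildNodes().getLength();
      }
      catch (Exception e)
      {
         throw new IllegalStateException("Could not parse XML", e);
      }
   }

   public static void main(String[] args)
   {
      // White space kept: three Text nodes plus two item elements
      System.out.println(childCount(false));
      // White space suppressed: only the two item elements remain
      System.out.println(childCount(true));
   }
}
```

This is what lets the validating ItemListParser below index children directly with item(0) and item(1) instead of testing each node with instanceof.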

59 File ItemListParser.java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.SAXException;

/**
   An XML parser for item lists
*/
public class ItemListParser
{
   /**
      Constructs a parser that can parse item lists
   */
   public ItemListParser()
         throws ParserConfigurationException
   {
      DocumentBuilderFactory factory
            = DocumentBuilderFactory.newInstance();
      factory.setValidating(true);
      factory.setIgnoringElementContentWhitespace(true);
      builder = factory.newDocumentBuilder();
   }

   /**
      Parses an XML file containing an item list
      @param fileName the name of the file
      @return an array list containing all items in the XML file
   */
   public ArrayList parse(String fileName)
         throws SAXException, IOException
   {
      File f = new File(fileName);
      Document doc = builder.parse(f);

      // get the <items> root element

      Element root = doc.getDocumentElement();
      return getItems(root);
   }

   /**
      Obtains an array list of items from a DOM element
      @param e an <items> element
      @return an array list of all <item> children of e
   */
   private static ArrayList getItems(Element e)
   {
      ArrayList items = new ArrayList();

      // get the <item> children

      NodeList children = e.getChildNodes();
      for (int i = 0; i < children.getLength(); i++)
      {
         Element childElement = (Element) children.item(i);
         Item c = getItem(childElement);
         items.add(c);
      }
      return items;
   }

   /**
      Obtains an item from a DOM element
      @param e an <item> element
      @return the item described by the given element
   */
   private static Item getItem(Element e)
   {
      NodeList children = e.getChildNodes();

      Product p = getProduct((Element) children.item(0));

      Element quantityElement = (Element) children.item(1);
      Text quantityText
            = (Text) quantityElement.getFirstChild();
      int quantity = Integer.parseInt(quantityText.getData());

      return new Item(p, quantity);
   }

   /**
      Obtains a product from a DOM element
      @param e a <product> element
      @return the product described by the given element
   */
   private static Product getProduct(Element e)
   {
      NodeList children = e.getChildNodes();

      // the DTD guarantees the order: description first, then price
      Element descriptionElement = (Element) children.item(0);
      Text descriptionText
            = (Text) descriptionElement.getFirstChild();
      String description = descriptionText.getData();

      Element priceElement = (Element) children.item(1);
      Text priceText
            = (Text) priceElement.getFirstChild();
      double price = Double.parseDouble(priceText.getData());

      return new Product(description, price);
   }

   private DocumentBuilder builder;
}

65 File ItemListParserTest.java
import java.util.ArrayList;

/**
   This program parses an XML file containing an item list.
   The XML file should reference the items.dtd
*/
public class ItemListParserTest
{
   public static void main(String[] args) throws Exception
   {
      ItemListParser parser = new ItemListParser();
      ArrayList items = parser.parse("items.xml");
      for (int i = 0; i < items.size(); i++)
      {
         Item anItem = (Item) items.get(i);
         System.out.println(anItem.format());
      }
   }
}

