Modernās Programmēšanas Tehnoloģijas (Advanced Programming Technologies) Edgars Celms, Mārtiņš Opmanis Latvijas Universitātes Matemātikas.

Modernās Programmēšanas Tehnoloģijas (Advanced Programming Technologies) Edgars Celms, Mārtiņš Opmanis (askola@mii.lu.lv) Latvijas Universitātes Matemātikas un informātikas institūts 2007, Rīga, Latvija

Java and XML
Overview of XML
What is Java's XML API for?
Overview of JAXP
  JAXP XML parsers
  SAX vs. DOM
SAX
  SAX architecture
  SAX usage
  SAXExample.java
DOM
  DOM architecture
  DOM usage
  DOMExample.java
JDOM

XML (EXtensible Markup Language) When people talk about XML, they are typically talking about XML together with its related technologies

XML Resources
XML 1.0 Specification (Extensible Markup Language (XML) 1.0 (Fourth Edition), W3C Recommendation 16 August 2006, edited in place 29 September 2006): http://www.w3.org/TR/REC-xml
WWW Consortium’s Home Page on XML: http://www.w3.org/XML/
Sun Page on XML and Java: http://java.sun.com/xml/
Apache XML Project: http://xml.apache.org/
XML Resource Collection: http://xml.coverpages.org/
O’Reilly XML Resource Center: http://www.xml.com/

XML EXtensible Markup Language (XML) is a meta-language that describes the content of the document (self-describing data) Java = Portable Programs XML = Portable Data Some advantages of XML  Human-readable  Machine-readable (easy to parse)  Standard format for data interchange  Possible to validate  Extensible  can represent any data  can add new tags for new data formats  Hierarchical structure (nesting)

Applications of XML Configuration files  Used extensively in J2EE architectures Media for data interchange  A better alternative to proprietary data formats B2B transactions on the Web  Electronic business orders (ebXML)  Financial Exchange (IFX)  Messaging exchange (SOAP)

XML versus HTML XML fundamentally separates content (data and language) from presentation; HTML specifies the presentation HTML explicitly defines a set of legal tags as well as the grammar (intended meaning)  … XML allows any tags or grammar to be used (hence, eXtensible)  … Note: Both are based on Standard Generalized Markup Language (SGML)

Simple XML Example Larry Brown Marty Hall...

XML Components Prolog  Defines the xml version, entity definitions, and DOCTYPE Components of the document  Tags and attributes  CDATA (character data)  Entities  Processing instructions  Comments

XML Prolog XML Files always start with a prolog   The version of XML is required  The encoding identifies character set (default UTF-8)  The value standalone identifies if an external document is referenced for DTD or entity definition Note: the prolog can contain entities and DTD definitions Document Type Definition (DTD)  DTD defines the syntax and structure of elements in the document

XML Root Element Required for XML-aware applications to recognize beginning and end of document Example  Core Web Programming Designing Web Pages with HTML Block-level Elements in HTML 4.0 Text-level Elements in HTML 4.0...

XML Tags
Tag names:
- Case sensitive
- Start with a letter or underscore
- After the first character, digits, hyphens (-), and periods (.) are also allowed
- Cannot contain whitespace
- Avoid colons except for indicating namespaces
For a well-formed XML document:
- Every tag must have an end tag …
- All tags are completely nested (tag order cannot be mixed)

XML Tags and Attributes
Tags can also have attributes
We put the . in .com. What did you do?
Element Attributes
- Attributes provide metadata for the element
- Every attribute value must be enclosed in quotes (" "), with no commas between attributes
- Same naming conventions as elements

Document Entities
Entities refer to a data item, typically text
- General entity references start with & and end with ;
- The entity reference is replaced by its true value when parsed
- The characters < > & " ' require entity references to avoid conflicts with the XML application (parser): &lt; &gt; &amp; &quot; &apos;
Entities are user definable
<!DOCTYPE book [ ]> Core Web Programming, &COPYRIGHT;
Document Type Declarations
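To make the replacement of entity references concrete, here is a minimal sketch that parses two small documents with the JAXP DOM parser and prints the text the application actually sees. The class name EntityDemo, the helper rootText(), and the entity value "(c) Example Press" are assumptions made for this illustration; the entity value on the slide is not shown.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EntityDemo {
    /** Parses the XML text and returns the text content of the root element. */
    public static String rootText(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = db.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Predefined entities: the parser hands back the real characters.
        System.out.println(rootText("<t>a &lt; b &amp; c</t>"));

        // A user-defined general entity declared in the internal DTD subset.
        // The replacement text is a hypothetical value chosen for this sketch.
        String xml = "<!DOCTYPE book [<!ENTITY COPYRIGHT \"(c) Example Press\">]>"
                   + "<book>Core Web Programming, &COPYRIGHT;</book>";
        System.out.println(rootText(xml));
    }
}
```

The application never sees the reference itself, only its replacement text, which is why markup characters in data have to be written as entity references.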

Well-Formed versus Valid
An XML document is well-formed if it follows the basic syntax rules
An XML document is valid if it is well-formed and its structure matches a Document Type Definition (DTD)
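The difference can be sketched with the JAXP DOM parser: a non-validating parse only checks well-formedness, while a validating parse also checks the document against its DTD. The class name WellFormedVsValid, the helper parses(), and the small DTD for dots/dot are assumptions made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedVsValid {
    /** Returns true if the text parses; with validating=true the DTD is checked as well. */
    public static boolean parses(String xml, boolean validating) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setValidating(validating);
            DocumentBuilder db = dbf.newDocumentBuilder();
            // Validity problems arrive through error(); rethrow them so the parse fails.
            db.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXParseException {
                    throw e;
                }
            });
            db.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical DTD: a dots element holding dot elements with required x and y.
        String dtd = "<!DOCTYPE dots [<!ELEMENT dots (dot*)><!ELEMENT dot EMPTY>"
                   + "<!ATTLIST dot x CDATA #REQUIRED y CDATA #REQUIRED>]>";
        String valid = dtd + "<dots><dot x=\"9\" y=\"81\"/></dots>";
        String invalid = dtd + "<dots><dot x=\"9\"/></dots>";   // y is missing
        String broken = "<dots><dot></dots>";                   // tags not nested

        System.out.println("valid doc, validating:       " + parses(valid, true));
        System.out.println("invalid doc, non-validating: " + parses(invalid, false));
        System.out.println("invalid doc, validating:     " + parses(invalid, true));
        System.out.println("broken doc, non-validating:  " + parses(broken, false));
    }
}
```

A document that fails the validating parse can still pass the non-validating one, but a document that is not well-formed fails both.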

Document Type Definition (DTD) Defines Structure of the Document  Allowable tags and their attributes  Attribute values constraints  Nesting of tags  Number of occurrences for tags  Entity definitions Limitations of DTDs  DTD itself is not in XML format – more work for parsers  Does not express data types (weak data typing)  No namespace support  Document can override external DTD definitions  No DOM support  XML Schema is intended to resolve these issues but … DTDs are going to be around for a while DTD and Data Exchange  Both sides must agree on DTD ahead of time  DTD can be part of document or stored separately

XML Schema
W3C recommendation released May 2001
- http://www.w3.org/TR/xmlschema-0/
- http://www.w3.org/TR/xmlschema-1/
- http://www.w3.org/TR/xmlschema-2/
- Depends on the following specifications: XML-Infoset, XML-Namespaces, XPath
(The current version of XML itself is Extensible Markup Language (XML) 1.1 (Second Edition), W3C Recommendation 16 August 2006: http://www.w3.org/TR/2006/REC-xml11-20060816/)
Benefits:
- Standard and user-defined data types
- Express data types as patterns
- Higher degree of type checking
- Better control of occurrences
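As an illustration of the data-type benefit, the following sketch validates a one-element document against a small schema using the javax.xml.validation package. The schema text, the class name SchemaCheck, and the method isValid() are assumptions invented for this example, not part of the course material.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

public class SchemaCheck {
    // Hypothetical schema: a single dot element whose x and y attributes must be integers.
    static final String XSD =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"dot\"><xs:complexType>"
        + "<xs:attribute name=\"x\" type=\"xs:int\" use=\"required\"/>"
        + "<xs:attribute name=\"y\" type=\"xs:int\" use=\"required\"/>"
        + "</xs:complexType></xs:element></xs:schema>";

    /** Returns true if the document conforms to the schema above. */
    public static boolean isValid(String xml) {
        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = sf.newSchema(new StreamSource(new StringReader(XSD)));
            // validate() throws an exception when the document violates the schema.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<dot x=\"9\" y=\"81\"/>"));     // integer attributes
        System.out.println(isValid("<dot x=\"nine\" y=\"81\"/>"));  // x is not an integer
    }
}
```

A DTD could only declare x as CDATA, so the second document would pass DTD validation; the schema rejects it because of the xs:int type.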

SML / Minimal XML SML – Simplified Markup Language Subset of XML Easier to understand, parse No  DTDs  Processing instructions  etc.

XML Summary
XML is a meta-language for self-describing data
DOCTYPE defines the root element and location of the DTD
Document Type Definition (DTD) defines the grammar of the document
- Required to validate the document
- Constrains grouping and cardinality of elements
DTD processing is expensive
Schema uses XML to specify the grammar
- More complex to express but easier to process

What are XML APIs good for? You want to read/write data from/to XML files, and you don't want to write an XML parser. Applications:  processing an XML-tagged corpus  saving configs, prefs, parameters, etc. as XML files  sharing results with outside users in portable format example: typed dependency relations  alternative to serialization for persistent stores doesn't break with changes to class definition human-readable

Overview of JAXP
JAXP = Java API for XML Processing
- Provides a common interface for creating and using the standard SAX, DOM, and XSLT APIs in Java.
- All JAXP packages are included as standard in Java 5.0
The key packages are:
javax.xml.parsers
- The main JAXP APIs, which provide a common interface for various SAX and DOM parsers.
org.w3c.dom
- Defines the Document interface (a DOM), as well as interfaces for all of the components of a DOM. Java 1.4 includes the core module of the Level 2 DOM, and Java 5.0 includes the core, events, and load/save modules of the Level 3 DOM.
org.xml.sax
- Defines the basic SAX APIs.
javax.xml.transform
- Defines the XSLT (Extensible Stylesheet Language Transformations) APIs that let you transform XML into other forms. (Not covered in this course.)
javax.xml.validation
- This Java 5.0 package provides support for validating an XML document against a schema. (Not covered in this course.)
javax.xml.xpath
- New in Java 5.0. This package supports the evaluation of XPath for selecting nodes in an XML document. (Not covered in this course.)

JAXP XML Parsers
XML parsers are the bottom layer of any XML processing system.
An XML parser is either validating or non-validating.
- Validating parsers check documents for conformity with DTDs
- Validation is slow and often ignored.
- It is harder to write a validating parser.
Non-validating parsers only check that the document is well-formed XML

JAXP XML Parsers
javax.xml.parsers defines abstract classes DocumentBuilder (for DOM) and SAXParser (for SAX).
- It also defines factory classes DocumentBuilderFactory and SAXParserFactory. By default, these give you the “reference implementation” of DocumentBuilder and SAXParser, but they are intended to be vendor-neutral factory classes, so that you could swap in a different implementation if you preferred.
The JDK includes three XML parser implementations from Apache:
- Crimson: The original. Small and fast. Based on code donated to Apache by Sun. Standard implementation for J2SE 1.4.
- Xerces: More features. Supports XML Schema. Based on code donated to Apache by IBM.
- Xerces 2: Standard implementation for J2SE 5.0.
There are many other parsers available: Aelfred, Lark, Expat, MSXML, xmlproc …

SAX vs. DOM
SAX = Simple API for XML
- Java-specific
- interprets XML as a stream of events
- you supply event-handling callbacks
- SAX parser invokes your event-handlers as it parses
- doesn't build data model in memory
- serial access
- very fast, lightweight
- good choice when no data model is needed, or natural structure for data model is list, matrix, etc.
DOM = Document Object Model
- W3C standard for representing structured documents
- platform and language neutral (not Java-specific!)
- interprets XML as a tree of nodes
- builds data model in memory
- enables random access to data
- therefore good for interactive apps
- more CPU- and memory-intensive
- good choice when data model has natural tree structure
There is also JDOM … more about it later
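The contrast can be sketched by solving one small task both ways: counting the elements with a given tag name. The class name SaxVsDom and its two methods are assumptions made for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxVsDom {
    private static InputStream stream(String xml) {
        return new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
    }

    /** SAX: count matching elements as the events stream past; no tree is built. */
    public static int countWithSax(String xml, final String tag) {
        final int[] count = {0};
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(stream(xml), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    if (qName.equals(tag)) {
                        count[0]++;
                    }
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return count[0];
    }

    /** DOM: build the whole tree in memory first, then query it. */
    public static int countWithDom(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(stream(xml));
            return doc.getElementsByTagName(tag).getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<dots><dot/><flip><dot/></flip></dots>";
        System.out.println("SAX: " + countWithSax(xml, "dot"));
        System.out.println("DOM: " + countWithDom(xml, "dot"));
    }
}
```

Both methods give the same count; the SAX version never holds more than the current event, while the DOM version keeps the full tree available for further random access.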

SAX Architecture

Using SAX
Here’s the standard recipe for starting with SAX:
    import javax.xml.parsers.*;
    import org.xml.sax.*;
    import org.xml.sax.helpers.*;
    // get a SAXParser object
    SAXParserFactory factory = SAXParserFactory.newInstance();
    SAXParser saxParser = factory.newSAXParser();
    // invoke parser using your custom content handler
    // (the input can be an InputStream, a File, or a URI string)
    saxParser.parse(inputStream, myContentHandler);
    saxParser.parse(file, myContentHandler);
    saxParser.parse(url, myContentHandler);
(This reflects SAX 1, which you can still use, but SAX 2 prefers a new incantation…)

Using SAX2
In SAX 2, the following usage is preferred:
    // tell SAX which XML parser you want (here, it’s Crimson)
    System.setProperty("org.xml.sax.driver", "org.apache.crimson.parser.XMLReaderImpl");
    // get an XMLReader object
    XMLReader reader = XMLReaderFactory.createXMLReader();
    // tell the XMLReader to use your custom content handler
    reader.setContentHandler(myContentHandler);
    // have the XMLReader parse input from Reader myReader:
    reader.parse(new InputSource(myReader));
But where does myContentHandler come from?
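On current JDKs the Crimson driver class is not bundled and XMLReaderFactory is deprecated, so the following sketch obtains the XMLReader through JAXP instead and needs no system property. This is an alternative to the recipe above, not the method used in the course; the class name XmlReaderDemo and the method elementNames() are assumptions for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class XmlReaderDemo {
    /** Returns the qualified names of all elements in document order. */
    public static List<String> elementNames(String xml) {
        final List<String> names = new ArrayList<String>();
        try {
            // Ask JAXP for a SAXParser and take the XMLReader it wraps.
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            // Register the content handler, exactly as in the SAX 2 recipe.
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    names.add(qName);
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<dots><dot/><flip/></dots>"));
    }
}
```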

Defining a ContentHandler
Easiest route: define a new class which extends org.xml.sax.helpers.DefaultHandler.
Override event-handling methods from DefaultHandler:
    startDocument()  // receive notice of start of document
    endDocument()    // receive notice of end of document
    startElement()   // receive notice of start of each element
    endElement()     // receive notice of end of each element
    characters()     // receive a chunk of character data
    error()          // receive notice of recoverable parser error
    //...plus more...

startElement() and endElement()
The SAXParser invokes your callbacks to notify you of events:
    startElement(String namespaceURI, // for use w/ namespaces
                 String localName,    // for use w/ namespaces
                 String qName,        // "qualified" name -- use this one!
                 Attributes atts)
    endElement(String namespaceURI, String localName, String qName)
For simple usage, ignore namespaceURI and localName, and just use qName (the “qualified” name).
startElement() and endElement() events always come in pairs:
- “<foo></foo>” will generate calls:
    startElement("", "", "foo", null)
    endElement("", "", "foo")
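Because the two events always come in pairs, a handler can keep a simple counter to know how deeply nested the current element is. The sketch below uses that to measure the maximum nesting depth of a document; the class name DepthHandler and the method measure() are assumptions for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class DepthHandler extends DefaultHandler {
    private int depth = 0;     // number of elements currently open
    private int deepest = 0;   // largest depth seen so far

    @Override
    public void startElement(String namespaceURI, String localName,
                             String qName, Attributes atts) {
        depth++;
        if (depth > deepest) {
            deepest = depth;
        }
    }

    @Override
    public void endElement(String namespaceURI, String localName, String qName) {
        depth--;   // each startElement() is matched by one endElement()
    }

    /** Parses the text and returns the maximum element nesting depth. */
    public static int measure(String xml) {
        DepthHandler handler = new DepthHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.deepest;
    }

    public static void main(String[] args) {
        // dots (1) contains flip (2) contains dot (3)
        System.out.println(measure("<dots><flip><dot/></flip><dot/></dots>"));
    }
}
```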

SAX Attributes
Every call to startElement() includes an Attributes object which represents all the XML attributes for that element.
Methods in the Attributes interface:
    getLength()             // return number of attributes
    getIndex(String qName)  // look up attribute's index by qName
    getValue(String qName)  // look up attribute's value by qName
    getValue(int index)     // look up attribute's value by index
    //... and others...
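A short sketch of how these methods are typically combined: loop from 0 to getLength() and read each attribute's name and value by index. The class name AttributeDump and the method firstElementAttributes() are assumptions for this example; getQName(int) is one of the "other" methods of the Attributes interface.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class AttributeDump {
    /** Returns the attributes of the first element in the document as name/value pairs. */
    public static Map<String, String> firstElementAttributes(String xml) {
        final Map<String, String> result = new LinkedHashMap<String, String>();
        final boolean[] seen = {false};
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        if (seen[0]) {
                            return;          // only look at the first element
                        }
                        seen[0] = true;
                        for (int i = 0; i < atts.getLength(); i++) {
                            result.put(atts.getQName(i), atts.getValue(i));
                        }
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(firstElementAttributes("<dot x=\"9\" y=\"81\"/>"));
    }
}
```

The Attributes object is only guaranteed to be usable during the startElement() call, which is why the sketch copies the values into its own map.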

SAX characters()
The characters() event handler receives notification of character data (i.e. text content that is not markup):
    public void characters(char[] ch,  // buffer containing chars
                           int start,  // start position in buffer
                           int length) // num of chars to read
May be called multiple times within each block of character data, for example once per line. So, you may want to use calls to characters() to accumulate characters in a StringBuffer, and stop accumulating at the next call to startElement() or endElement().
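The accumulation idea can be sketched as follows: clear a buffer when an element starts, append every chunk delivered to characters(), and read the buffer when the element ends. The class name TextCollector and the method textOf() are assumptions for this example, and the sketch only handles elements that contain text and no child elements.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class TextCollector extends DefaultHandler {
    private final String tag;                                  // element name to collect
    private final List<String> texts = new ArrayList<String>();
    private final StringBuilder buffer = new StringBuilder();

    private TextCollector(String tag) {
        this.tag = tag;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        buffer.setLength(0);                 // start a fresh block of character data
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        buffer.append(ch, start, length);    // may be called several times per block
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals(tag)) {
            texts.add(buffer.toString().trim());
        }
    }

    /** Returns the trimmed text of every element named tag, in document order. */
    public static List<String> textOf(String xml, String tag) {
        TextCollector handler = new TextCollector(tag);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.texts;
    }

    public static void main(String[] args) {
        System.out.println(textOf("<dots><extra>stuff</extra></dots>", "extra"));
    }
}
```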

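SAXExample : Input XML
    <dots>
      this is before the first dot
      and it continues on multiple lines
      <dot x="9" y="81" />
      <dot x="11" y="121" />
      <flip>
        flip is on
        <dot x="196" y="14" />
        <dot x="169" y="13" />
      </flip>
      flip is off
      <dot x="12" y="144" />
      <extra>stuff</extra>
      <!-- a final comment -->
    </dots>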

SAXExample : Code
Please see SAXExample.java

    // SAXExample.java
    // Example of using SAX for XML parsing
    import java.io.*;
    import java.util.*;
    import javax.xml.parsers.*;
    import org.xml.sax.*;
    import org.xml.sax.helpers.*;
    public class SAXExample extends DefaultHandler {
        /** XML tag strings */
        public final String DOTS = "dots";
        public final String DOT = "dot";
        public final String X = "x";
        public final String Y = "y";
        public final String FLIP = "flip";
        /** Data model */
        private class Dot {
            int x;
            int y;
            public Dot(int x, int y) { this.x = x; this.y = y; }
            public String toString() { return "(" + x + ", " + y + ")"; }
        }
        public List<Dot> dotList = new ArrayList<Dot>();
        /** State variables */
        private int x;
        private int y;
        private boolean flip;
        /** Constructor: initialize state */
        public SAXExample() {
            clear();
        }
        /** Clear state. */
        public void clear() {
            x = -1;
            y = -1;
            flip = false;
        }
        /** Read XML from input stream and parse, generating SAX events */
        public void readXML(InputStream inStream) {
            try {
                clear();
                // SAX 1 approach:
                SAXParserFactory factory = SAXParserFactory.newInstance();
                SAXParser saxParser = factory.newSAXParser();
                saxParser.parse(inStream, this);
                // SAX 2 approach:
                //System.setProperty("org.xml.sax.driver", "org.apache.crimson.parser.XMLReaderImpl");
                //XMLReader reader = XMLReaderFactory.createXMLReader();
                //reader.setContentHandler(this);
                //reader.parse(new InputSource(new InputStreamReader(inStream)));
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        // SAX ContentHandler methods =====================================
        /* Receive notice of start of document. */
        public void startDocument() throws SAXException {
            System.out.println("startDocument");
        }
        /* Receive notice of end of document. */
        public void endDocument() throws SAXException {
            System.out.println("endDocument");
        }
        /* Receive notice of start of XML element. */
        public void startElement(String namespaceURI, String localName,
                                 String qName, Attributes atts) throws SAXException {
            System.out.println("startElement: " + qName + " (" + atts.getLength() + " attributes)");
            if (qName.equals(DOT)) {
                x = Integer.parseInt(atts.getValue(X));
                y = Integer.parseInt(atts.getValue(Y));
                if (flip) {               // inside <flip>, swap the coordinates
                    int temp = x;
                    x = y;
                    y = temp;
                }
                dotList.add(new Dot(x, y)); // add to data model
            } else if (qName.equals(FLIP)) {
                flip = true;
            }
        }
        /* Receive notice of end of XML element. */
        public void endElement(String namespaceURI, String localName,
                               String qName) throws SAXException {
            System.out.println("endElement: " + qName);
            if (qName.equals(FLIP)) {
                flip = false;
            }
        }
        /* Receive notice of character data (text content, not markup). */
        public void characters(char[] ch, int start, int length) throws SAXException {
            String s = new String(ch, start, length);
            s = s.trim();
            if (!s.equals(""))
                System.out.println("characters: " + s);
        }
        // end SAX ContentHandler methods =================================
        /** Test by running "java SAXExample <filename>" */
        public static void main(String[] args) {
            if (args.length != 1) {
                System.err.println("Usage: cmd filename");
                System.exit(1);
            }
            try {
                SAXExample example = new SAXExample();
                InputStream in = new BufferedInputStream(new FileInputStream(new File(args[0])));
                example.readXML(in);
                System.out.println("\nFinished parsing input. Got the following dots:");
                System.out.println(example.dotList);
            } catch (Throwable t) {
                t.printStackTrace();
            }
        }
    }

SAXExample : Result
    C:\Edgars\LUMII\LU_Kursi\Java\EC>java SAXExample dots.xml
    startDocument
    startElement: dots (0 attributes)
    characters: this is before the first dot
    characters: and it continues on multiple lines
    startElement: dot (2 attributes)
    endElement: dot
    startElement: dot (2 attributes)
    endElement: dot
    startElement: flip (0 attributes)
    characters: flip is on
    startElement: dot (2 attributes)
    endElement: dot
    startElement: dot (2 attributes)
    endElement: dot
    endElement: flip
    characters: flip is off
    startElement: dot (2 attributes)
    endElement: dot
    startElement: extra (0 attributes)
    characters: stuff
    endElement: extra
    endElement: dots
    endDocument
    Finished parsing input. Got the following dots:
    [(9, 81), (11, 121), (14, 196), (13, 169), (12, 144)]

SAXExample : Input → Output (the input XML shown side by side with the event trace and dot list printed above)

DOM Architecture

DOM Document Virtual representation of the HTML or XML document Used to retrieve elements from the document Used to create new document components which can later be inserted into the document Holds the root element

DOM Document Structure
There’s a text node between every pair of element nodes in this document, because the whitespace between tags is kept as text (shown as "" below).
XML comments appear in special comment nodes.
Element attributes do not appear in the tree; they are available through the Element object.
XML Input:
    <dots>
      this is before the first dot
      and it continues on multiple lines
      <dot x="9" y="81" />
      <dot x="11" y="121" />
      <flip>
        flip is on
        <dot x="196" y="14" />
        <dot x="169" y="13" />
      </flip>
      flip is off
      <dot x="12" y="144" />
      <extra>stuff</extra>
      <!-- a final comment -->
    </dots>
Document structure:
    Document
    +---Element <dots>
        +---Text "this is before the first dot
        |         and it continues on multiple lines"
        +---Element <dot>
        +---Text ""
        +---Element <dot>
        +---Text ""
        +---Element <flip>
        |   +---Text "flip is on"
        |   +---Element <dot>
        |   +---Text ""
        |   +---Element <dot>
        |   +---Text ""
        +---Text "flip is off"
        +---Element <dot>
        +---Text ""
        +---Element <extra>
        |   +---Text "stuff"
        +---Text ""
        +---Comment "a final comment"
        +---Text ""

Using DOM
Here’s the basic recipe for getting started with DOM:
    import javax.xml.parsers.*;
    import org.w3c.dom.*;
    // get a DocumentBuilder object
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = null;
    try {
        db = dbf.newDocumentBuilder();
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
    }
    // invoke parser to get a Document; use ONE of the following,
    // depending on whether the input is an InputStream, a File, or a URI string
    Document doc = db.parse(inputStream);
    // Document doc = db.parse(file);
    // Document doc = db.parse(url);

DOM Document access idioms
OK, say we have a Document. How do we get at the pieces of it? Here are some common idioms:
    // get the root of the Document tree
    Element root = doc.getDocumentElement();
    // get nodes in subtree by tag name
    NodeList dots = root.getElementsByTagName("dot");
    // get first dot element
    Element firstDot = (Element) dots.item(0);
    // get x attribute of first dot
    String x = firstDot.getAttribute("x");

More Document accessors
Node access methods:
    String    getNodeName()
    short     getNodeType()    // e.g. DOCUMENT_NODE, ELEMENT_NODE, TEXT_NODE, COMMENT_NODE, etc.
    Document  getOwnerDocument()
    boolean   hasChildNodes()
    NodeList  getChildNodes()
    Node      getFirstChild()
    Node      getLastChild()
    Node      getParentNode()
    Node      getNextSibling()
    Node      getPreviousSibling()
    boolean   hasAttributes()
    ... and more ...
Element extends Node and adds these access methods:
    String    getTagName()
    boolean   hasAttribute(String name)
    String    getAttribute(String name)
    NodeList  getElementsByTagName(String name)
    ... and more ...
Document extends Node and adds these access methods:
    Element       getDocumentElement()
    DocumentType  getDoctype()
    ... plus the Element methods just mentioned ...
    ... and more ...
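A small sketch of these accessors in use: it walks the children of the root element and tallies them by getNodeType(). It also shows where the text nodes between elements come from. The class name ChildKinds and the method countChildren() are assumptions for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ChildKinds {
    /** Returns {number of element children, number of text children} of the root element. */
    public static int[] countChildren(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList kids = doc.getDocumentElement().getChildNodes();
            int elements = 0;
            int texts = 0;
            for (int i = 0; i < kids.getLength(); i++) {
                Node kid = kids.item(i);
                if (kid.getNodeType() == Node.ELEMENT_NODE) {
                    elements++;
                } else if (kid.getNodeType() == Node.TEXT_NODE) {
                    texts++;
                }
            }
            return new int[] {elements, texts};
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] spaced = countChildren("<dots>\n<dot/>\n<dot/>\n</dots>");
        int[] packed = countChildren("<dots><dot/><dot/></dots>");
        System.out.println("with line breaks:    " + spaced[0] + " elements, " + spaced[1] + " text nodes");
        System.out.println("without line breaks: " + packed[0] + " elements, " + packed[1] + " text nodes");
    }
}
```

Code that iterates over getChildNodes() therefore usually has to check the node type before casting to Element.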

Creating & manipulating Documents
The DOM API also includes lots of methods for creating and manipulating Document objects:
    // get new empty Document from DocumentBuilder
    Document doc = db.newDocument();
    // create a new Element and add to Document as root
    Element root = doc.createElement("dots");
    doc.appendChild(root);
    // create a new Element and add as child of root
    Element dot = doc.createElement("dot");
    dot.setAttribute("x", "9");
    dot.setAttribute("y", "81");
    root.appendChild(dot);

More Document manipulators
Node manipulation methods:
    void  setNodeValue(String nodeValue)
    Node  appendChild(Node newChild)
    Node  insertBefore(Node newChild, Node refChild)
    Node  removeChild(Node oldChild)
    ... and more ...
Element manipulation methods:
    void  setAttribute(String name, String value)
    void  removeAttribute(String name)
    ... and more ...
Document manipulation methods:
    Text     createTextNode(String data)
    Comment  createComment(String data)
    ... and more ...

Creating & manipulating Documents
Strangely, since JAXP 1.1, there is no simple, documented way to write out a Document object as XML. Instead, you can exploit an undocumented trick: cast the Document to a Crimson XmlDocument, which knows how to write itself out:
    import org.apache.crimson.tree.XmlDocument;
    XmlDocument x = (XmlDocument) doc;
    x.write(out, "UTF-8");
The cast only works when the Document was built by the Crimson parser. There is a supported way to write Documents as XML via the XSLT library, but it is far more clumsy than this two-line trick. Of course, one could just walk the Document tree and write XML using printlns. JDOM remedies this with easy XML output!
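For comparison, here is a minimal sketch of the supported route through the XSLT library: an identity Transformer copies a DOMSource into a StreamResult. It does not depend on which parser built the Document. The class name DomWriter and the methods sampleDocument() and toXml() are assumptions for this example.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomWriter {
    /** Builds a tiny Document: a dots root with one dot child. */
    public static Document sampleDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("dots");
            doc.appendChild(root);
            Element dot = doc.createElement("dot");
            dot.setAttribute("x", "9");
            dot.setAttribute("y", "81");
            root.appendChild(dot);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serializes the Document to a String using an identity transformation. */
    public static String toXml(Document doc) {
        try {
            // A Transformer created without a stylesheet copies input to output unchanged.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(sampleDocument()));
    }
}
```

Passing new StreamResult(System.out) or a file stream instead of the StringWriter writes the XML directly to that destination.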

DOMExample : Code
Please see DOMExample.java

    // DOMExample.java
    // Example of using DOM for XML parsing
    import java.util.*;
    import java.io.*;
    import javax.xml.parsers.*;
    import org.w3c.dom.*;
    import javax.xml.transform.*; // For transforming a DOM tree to an XML file.
    class DOMExample {
        /** XML tag strings */
        public final String DOTS = "dots";
        public final String DOT = "dot";
        public final String X = "x";
        public final String Y = "y";
        public final String FLIP = "flip";
        /** Data model */
        private class Dot {
            int x;
            int y;
            public Dot(int x, int y) { this.x = x; this.y = y; }
            public String toString() { return "(" + x + ", " + y + ")"; }
        }
        public List<Dot> dotList = new ArrayList<Dot>();
        private DocumentBuilder db;
        /** Construct instance and initialize DocumentBuilder. */
        public DOMExample() {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            try {
                db = dbf.newDocumentBuilder();
            } catch (ParserConfigurationException e) {
                e.printStackTrace();
            }
        }
        /** Read XML from input stream, construct Document (i.e. XML node
         *  tree) in memory, and process nodes. */
        private void readXML(InputStream in) {
            try {
                // Invoke DocumentBuilder to parse input and create Document
                Document doc = db.parse(in);
                System.out.println("\nHere's the Document tree I just read:");
                printDocument(doc);
                // Get the root element
                Element root = doc.getDocumentElement();
                // Get all the DOT descendants
                NodeList dots = root.getElementsByTagName(DOT);
                // Iterate through them and add them to data model
                for (int i = 0; i < dots.getLength(); i++) {
                    Element dotElement = (Element) dots.item(i);
                    int x = Integer.parseInt(dotElement.getAttribute(X));
                    int y = Integer.parseInt(dotElement.getAttribute(Y));
                    dotList.add(new Dot(x, y));
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        /** Create the XML element for a single dot. */
        private Element createDotElement(Document doc, int x, int y) {
            Element dot = doc.createElement(DOT);
            dot.setAttribute(X, Integer.toString(x));
            dot.setAttribute(Y, Integer.toString(y));
            return dot;
        }
        /** Construct Document (i.e. XML node tree) from data model. */
        public Document createDocument() {
            // Make new empty Document
            Document doc = db.newDocument();
            // Create the root node and add to the document
            Element root = doc.createElement(DOTS);
            doc.appendChild(root);
            // Go through all the dots and append them to the DOTS node
            for (Dot dot : dotList) {
                Element dotElement = createDotElement(doc, dot.x, dot.y);
                root.appendChild(dotElement);
            }
            return doc;
        }
        /** Construct Document (i.e. XML node tree) from data model and
         *  ask it to write itself out.
         *
         *  Note 1: Since JAXP 1.1, there is no simple, documented way to
         *  write out a Document object. The commented-out code uses an
         *  undocumented trick. There is a supported way via the XSLT library,
         *  used below, but it is more clumsy than the two-line trick.
         *
         *  Note 2: Another strategy would be just to println() the XML
         *  text straight from our data model. */
        public void writeXML(OutputStream o) {
            try {
                Document doc = createDocument();
                // -----------------------------------------------------
                // Here's the trick:
                //Writer out = new OutputStreamWriter(o);
                // 1. Downcast Document to a Crimson XmlDocument
                //org.apache.crimson.tree.XmlDocument x =
                //    (org.apache.crimson.tree.XmlDocument) doc;
                // 2. XmlDocument knows how to write itself out -- woo hoo!
                //x.write(out, "UTF-8");
                //out.close();
                // -----------------------------------------------------
                // Output the DOM tree as an XML document. Note that this version
                // writes to the file dots2.xml and does not use the stream o.
                PrintWriter printWriter = new PrintWriter(new FileOutputStream("dots2.xml"));
                try {
                    TransformerFactory factory = TransformerFactory.newInstance();
                    Transformer transformer = factory.newTransformer();
                    transformer.transform(new javax.xml.transform.dom.DOMSource(doc),
                                          new javax.xml.transform.stream.StreamResult(printWriter));
                } finally {
                    printWriter.close(); // flush the output and release the file
                }
            } catch (Exception e) {
                System.err.println("Save XML err:" + e);
            }
        }
        /** Test by running "java DOMExample <filename>" */
        public static void main(String[] args) {
            if (args.length != 1) {
                System.err.println("Usage: cmd filename");
                System.exit(1);
            }
            try {
                DOMExample example = new DOMExample();
                InputStream in = new BufferedInputStream(new FileInputStream(new File(args[0])));
                example.readXML(in);
                System.out.println("\nFinished parsing input. Got the following dots:");
                System.out.println(example.dotList);
                System.out.println("\nRecreated XML document:");
                example.writeXML(System.out);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        // The remaining methods just enable tree printing.
        // They are not especially instructive to read.
        private String nodeToString(Node node, int indentLevel) {
            StringBuffer buf = new StringBuffer("\n");
            for (int i = 0; i < indentLevel; i++)
                buf.append(" ");
            if (node.hasChildNodes()) {
                buf.append("(" + nodeLabel(node));
                NodeList kids = node.getChildNodes();
                for (int i = 0; i < kids.getLength(); i++) {
                    buf.append(" " + nodeToString(kids.item(i), indentLevel + 1));
                }
                buf.append(")");
            } else {
                buf.append(nodeLabel(node));
            }
            return buf.toString();
        }
        // An array of names for DOM node-types
        // (Array indexes = getNodeType() values.)
        static final String[] typeName = {
            "none", "Element", "Attr", "Text", "CDATA", "EntityRef", "Entity",
            "ProcInstr", "Comment", "Document", "DocType", "DocFragment", "Notation",
        };
        private String nodeLabel(Node node) {
            short typeCode = node.getNodeType();
            String type = typeName[typeCode];
            String name = node.getNodeName();
            String value = node.getNodeValue();
            if (value != null)
                value = value.trim();
            // Only text and comment nodes with non-empty values show their value.
            return type + " \"" + name +
                (((typeCode != Node.TEXT_NODE && typeCode != Node.COMMENT_NODE) ||
                  value.equals("") || value.equals("null"))
                    ? ""
                    : "\", value=\"" + value) + "\"";
        }
        public void printDocument(Document doc) {
            System.out.println(nodeToString(doc, 0));
        }
    }

DOMExample : Result
    C:\Edgars\LUMII\LU_Kursi\Java\EC>java DOMExample dots.xml
    Here's the Document tree I just read:
    (Document "#document"
     (Element "dots"
      Text "#text", value="this is before the first dot
                           and it continues on multiple lines"
      Element "dot"
      Text "#text"
      Element "dot"
      Text "#text"
      (Element "flip"
       Text "#text", value="flip is on"
       Element "dot"
       Text "#text"
       Element "dot"
       Text "#text")
      Text "#text", value="flip is off"
      Element "dot"
      Text "#text"
      (Element "extra" Text "#text", value="stuff")
      Text "#text"
      Comment "#comment", value="a final comment"
      Text "#text"))
    Finished parsing input. Got the following dots:
    [(9, 81), (11, 121), (196, 14), (169, 13), (12, 144)]
    Recreated XML document:

DOM Document Structure
Some useful methods to work with an XML file:
- getElementsByTagName()
- getAttribute()
- getChildNodes()
- getNodeType()
- getNodeName()
- getNodeValue()
- getTextContent()
Document tree (excerpt):
    (Element "Komanda"
     Text "#text"
     (Element "Speletajs"
      Text "#text"
      (Element "Minutes" Text "#text", value="2")
      Text "#text"
      Element "Sodi_mesti"
      Text "#text"
      Element "Sodi_iemesti"
      Text "#text"
      Element "Divpunktu_mesti"
      Text "#text"
      Element "Divpunktu_iemesti"
      Text "#text"
      Element "Trispunktu_mesti"
      Text "#text"
      Element "Trispunktu_iemesti"
      Text "#text")
     Text "#text"
     (Element "Speletajs"
      Text "#text"
      (Element "Minutes" Text "#text", value="25")
      Text "#text"
      (Element "Sodi_mesti" Text "#text", value="9")
      Text "#text"
      (Element "Sodi_iemesti" Text "#text", value="6")
      Text "#text"
      (Element "Divpunktu_mesti" Text "#text", value="8")
      Text "#text"
      (Element "Divpunktu_iemesti" Text "#text", value="3")
      Text "#text"
      (Element "Trispunktu_mesti" Text "#text", value="1")
      Text "#text"
      (Element "Trispunktu_iemesti" Text "#text", value="0")
      Text "#text")
     Text "#text"
     (Element "Speletajs"
      Text "#text"
      ...
See DOMExample.java

DOM Document Structure
Output:
    Team2=Vilki
     Augusts Septembris:4
      Minutes=2
      Sodi_mesti=
      Sodi_iemesti=
      Divpunktu_mesti=
      Divpunktu_iemesti=
      Trispunktu_mesti=
      Trispunktu_iemesti=
     Zigurds Zvilnis:11
      Minutes=25
      Sodi_mesti=9
      Sodi_iemesti=6
      Divpunktu_mesti=8
      Divpunktu_iemesti=3
      Trispunktu_mesti=1
      Trispunktu_iemesti=0
Code:
    ...
    tmpElement = (Element) example.teamsList.item(1);
    System.out.println("Team2=" + tmpElement.getAttribute("Nosaukums"));
    for (int i = 0; i < example.teamPlayerList2.getLength(); i++) {
        tmpElement = (Element) example.teamPlayerList2.item(i);
        System.out.println(" " + tmpElement.getAttribute("Vards") + " " +
                           tmpElement.getAttribute("Uzvards") + ":" +
                           tmpElement.getAttribute("Numurs"));
        if (tmpElement.hasChildNodes()) {
            NodeList kids = tmpElement.getChildNodes();
            for (int j = 0; j < kids.getLength(); j++) {
                Node tmpNode = kids.item(j);
                // skip the whitespace text nodes between the child elements
                if (tmpNode.getNodeType() == Node.ELEMENT_NODE)
                    System.out.println(" " + tmpNode.getNodeName() + "=" +
                                       tmpNode.getTextContent());
                                       //tmpNode.getFirstChild().getNodeValue());
            }
    ...
See DOMParseXML2.java
See spele1.xml

JDOM Overview
DOM can be awkward for Java programmers
- Language-neutral, so it does not use Java features
  Example: getChildNodes() returns a NodeList, which is not a List. (NodeList.iterator() is not defined.)
- Written by C programmers
- Multiple ways to traverse, with different interfaces
- Tedious to walk around the tree to do simple tasks
- Doesn't support Java standards (the java.util collections)
JDOM looks like a good alternative:
- open source project, Apache license
- builds on top of JAXP, integrates with SAX and DOM
- similar to DOM model, but no shared code
- API designed to be easy & obvious for Java programmers
- exploits power of Java language: collections, method overloading
- XML output is easy!
- Key packages: org.jdom, org.jdom.transform, org.jdom.input, org.jdom.output.
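The NodeList complaint can be worked around in plain DOM with a thin adapter, sketched below, that presents a NodeList as a read-only java.util.List so that the for-each loop works. The class name NodeLists and the methods asList() and xValues() are assumptions for this illustration; JDOM avoids the issue by returning real Java collections.

```java
import java.io.StringReader;
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NodeLists {
    /** Wraps a NodeList in a read-only List view; no nodes are copied. */
    public static List<Node> asList(final NodeList nodes) {
        return new AbstractList<Node>() {
            @Override
            public Node get(int index) {
                if (index < 0 || index >= nodes.getLength()) {
                    throw new IndexOutOfBoundsException("index " + index);
                }
                return nodes.item(index);
            }

            @Override
            public int size() {
                return nodes.getLength();
            }
        };
    }

    /** Collects the x attribute of every dot element using a for-each loop. */
    public static List<String> xValues(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> xs = new ArrayList<String>();
            for (Node node : asList(doc.getElementsByTagName("dot"))) {
                xs.add(((Element) node).getAttribute("x"));
            }
            return xs;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(xValues("<dots><dot x=\"9\"/><dot x=\"11\"/></dots>"));
    }
}
```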

DOM vs. JDOM
The DOM way:
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document doc = builder.newDocument();
    Element root = doc.createElement("root");
    Text text = doc.createTextNode("This is the root");
    root.appendChild(text);
    doc.appendChild(root);
The JDOM way:
    Document doc = new Document();
    Element e = new Element("root");
    e.setText("This is the root");
    doc.addContent(e);

Information
There’s a good JAXP/SAX/DOM tutorial at: http://java.sun.com/xml/jaxp/dist/1.1/docs/tutorial/
You can learn more about JDOM at: http://www.jdom.org/docs/faq.html

Questions?
