XML Parsers Overview  Types of parsers  Using XML parsers  SAX  DOM  DOM versus SAX  Products  Conclusion.


1 XML Parsers Overview
– Types of parsers
– Using XML parsers
– SAX
– DOM
– DOM versus SAX
– Products
– Conclusion

2 Types of Parsers There are several different ways to categorise parsers: – Validating versus non-validating parsers – Parsers that support the Document Object Model (DOM) – Parsers that support the Simple API for XML (SAX) – Parsers written in a particular language (Java, C++, Perl, etc.)

3 Non-validating Parsers Speed and efficiency: it takes a significant amount of effort for an XML parser to process a DTD and make sure that every element in an XML document follows the rules of the DTD. If you only want to find tags and extract information, use a non-validating parser.
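To make the point concrete, here is a minimal sketch using Python's standard-library SAX binding, which is built on the non-validating Expat parser. The document, its DTD and the class name TagCollector are made up for illustration. The DTD says a note must contain a to element, the body contains a from element instead, and the parser still reports the tags without complaint.

```python
import xml.sax

# The internal DTD requires <note> to contain exactly one <to>,
# but the document body holds a <from> element instead.
DOC = b"""<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to)>
  <!ELEMENT to (#PCDATA)>
]>
<note><from>Alice</from></note>"""


class TagCollector(xml.sax.ContentHandler):
    """Hypothetical handler that records every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


collector = TagCollector()
# The document is well-formed, so a non-validating parser accepts it
# even though it breaks the rules of its own DTD.
xml.sax.parseString(DOC, collector)
print(collector.tags)
```

A validating parser would reject the same input, at the cost of reading and checking the DTD.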

4 Using XML Parsers Three basic steps to use an XML parser:
– Create a parser object
– Pass your XML document to the parser
– Process the results
Generally, writing out XML is outside the scope of parsers (though some may implement proprietary mechanisms).
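The three steps can be sketched with Python's standard-library xml.sax module. This is only one possible binding; the class name ElementCounter and the sample document are assumptions made for the example, and an in-memory byte stream stands in for a real file.

```python
import io
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Hypothetical handler that counts how often each element name occurs."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


# Step 1: create a parser object.
parser = xml.sax.make_parser()
counter = ElementCounter()
parser.setContentHandler(counter)

# Step 2: pass the XML document to the parser.
parser.parse(io.BytesIO(b"<catalog><book/><book/><cd/></catalog>"))

# Step 3: process the results gathered during parsing.
print(counter.counts)
```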

5 Parsing XML Two established APIs:
– SAX (Simple API for XML): defines handlers whose methods are called as the XML is parsed
– DOM (Document Object Model): defines a logical tree representing the parsed XML

6 Parsing XML: DOM Document Object Model
– standard API for accessing and creating XML data
– tree-based
– programming-language independent
– developed by W3C
– whole document is read into memory
– read and write
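The "read and write" property can be illustrated with Python's xml.dom.minidom, a lightweight DOM implementation from the standard library. The catalog document and the ids b1 and b2 are invented for this sketch.

```python
from xml.dom import minidom

doc = minidom.parseString('<catalog><book id="b1">XML Basics</book></catalog>')

# Read: look up an existing element, its attribute and its text.
book = doc.getElementsByTagName("book")[0]
print(book.getAttribute("id"), book.firstChild.data)

# Write: create new nodes through the Document and attach them to the tree.
new_book = doc.createElement("book")
new_book.setAttribute("id", "b2")
new_book.appendChild(doc.createTextNode("SAX and DOM"))
doc.documentElement.appendChild(new_book)

# Serialise the modified tree back to XML text.
result = doc.documentElement.toxml()
print(result)
```

Because the whole tree is in memory, the new element can be added anywhere before the document is serialised again.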

7 Creating a DOM Tree A DOM implementation provides a factory object to which you pass an XML file; it returns a Document object that represents the whole document, with the root element as its child. After this, the application uses the standard DOM interfaces to interact with the XML structure.
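In Python's xml.dom.minidom the factory role is played by the module-level parse function, so a rough equivalent looks like this. The byte stream is a stand-in for a real XML file, and the document content is made up.

```python
import io
from xml.dom import minidom

# An in-memory stand-in for an XML file on disk.
xml_file = io.BytesIO(b"<catalog><book>XML Basics</book></catalog>")

# The factory call returns a Document node for the whole document.
doc = minidom.parse(xml_file)
print(doc.nodeType == doc.DOCUMENT_NODE)

# The root element hangs below the Document node.
root = doc.documentElement
print(root.tagName)
print(root.parentNode is doc)
```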

8 Parsing XML: DOM (diagram: XML file and the resulting DOM tree)

9 DOM Interfaces The DOM defines several interfaces:
– Node: the base data type of the DOM
– Element: represents an element
– Attr: represents an attribute of an element
– Text: the content of an element or attribute
– Document: represents the entire XML document. A Document object is often referred to as a DOM tree
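A short sketch, again with xml.dom.minidom, shows one node of each kind taken from a single made-up document. Every object is a Node, and its nodeType tells which of the more specific interfaces it implements.

```python
from xml.dom import minidom, Node

doc = minidom.parseString('<book id="b1">XML Basics</book>')

element = doc.documentElement           # Element
attr = element.getAttributeNode("id")   # Attr
text = element.firstChild               # Text

# Map the standard node-type constants to the interface names on the slide.
kinds = {
    Node.DOCUMENT_NODE: "Document",
    Node.ELEMENT_NODE: "Element",
    Node.ATTRIBUTE_NODE: "Attr",
    Node.TEXT_NODE: "Text",
}

for node in (doc, element, attr, text):
    print(kinds[node.nodeType], repr(node.nodeName), repr(node.nodeValue))
```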

10 DOM Levels
– DOM Level 1: basic functionality for document navigation and manipulation.
– DOM Level 2: includes a style sheet object model, defines an event model and provides support for XML namespaces.
– DOM Level 3: addresses document loading and saving, and content models (DTDs and schemas) with document validation support. Its Core, Load and Save, and Validation modules became W3C Recommendations in 2004.
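The namespace support introduced with Level 2 can be seen in a small xml.dom.minidom sketch. The namespace URI http://example.com/catalog and the element names are placeholders chosen for the example.

```python
from xml.dom import minidom

NS = "http://example.com/catalog"  # placeholder namespace URI

doc = minidom.parseString(
    '<cat:catalog xmlns:cat="http://example.com/catalog">'
    "<cat:book>XML Basics</cat:book>"
    "<book>No namespace</book>"
    "</cat:catalog>"
)

# The namespace-aware lookup matches only the <cat:book> element.
books = doc.getElementsByTagNameNS(NS, "book")
titles = [b.firstChild.data for b in books]
print(titles)
print(books[0].localName, books[0].namespaceURI)
```

The Level 1 style lookup by tag name alone cannot tell apart two elements that share a local name but belong to different namespaces.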

11 Parsing XML: SAX Simple API for XML
– API for accessing XML data
– event-based
– programming-language independent
– application has to store fragments in memory itself
– read only

12 Parsing XML: SAX SAX is an interface to the XML parser based on streaming and callbacks. You extend the HandlerBase class (SAX 1; replaced by DefaultHandler in SAX 2) and override its callback methods:
– startDocument, endDocument
– startElement, endElement
– characters
– warning, error, fatalError
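Python's xml.sax uses the same callback names, split across a ContentHandler and an ErrorHandler rather than a single base class. The sketch below logs each callback for a small made-up document and then feeds the parser a malformed one; the class names EventLogger and RecordingErrorHandler are assumptions for the example.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Hypothetical handler that records each callback as it is fired."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("startDocument")

    def endDocument(self):
        self.events.append("endDocument")

    def startElement(self, name, attrs):
        self.events.append(f"startElement {name} {dict(attrs.items())}")

    def endElement(self, name):
        self.events.append(f"endElement {name}")

    def characters(self, content):
        if content.strip():
            self.events.append(f"characters {content}")


class RecordingErrorHandler(xml.sax.ErrorHandler):
    """Hypothetical error handler that keeps the message, then aborts parsing."""

    def __init__(self):
        self.messages = []

    def fatalError(self, exception):
        self.messages.append(exception.getMessage())
        raise exception


logger = EventLogger()
xml.sax.parseString(b'<note lang="en"><to>Bob</to></note>', logger)
for event in logger.events:
    print(event)

# A document that is not well-formed triggers fatalError.
errors = RecordingErrorHandler()
try:
    xml.sax.parseString(b"<note><to>Bob</note>", EventLogger(), errors)
except xml.sax.SAXParseException:
    print("fatal:", errors.messages)
```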

13 Parsing XML: SAX (diagram: XML file and the sequence of SAX calls it triggers)

14 SAX versus DOM
DOM:
– read and write
– need to move back and forth in the data
– document is human created
SAX:
– read only
– huge data or streams
– data is machine generated

15 DOM pro and contra
PRO
– The file is parsed only once.
– High navigation abilities: this is the aim of the DOM design.
CONTRA
– More memory needed since the XML tree is in memory.
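A minimal xml.dom.minidom sketch of both advantages: the made-up catalog below is parsed once, and every later lookup walks the in-memory tree in whichever direction is needed.

```python
from xml.dom import minidom

# Parsed once; the element and attribute names are invented for the example.
doc = minidom.parseString(
    "<catalog>"
    '<section name="A"><item>a1</item><item>a2</item></section>'
    '<section name="B"><item>b1</item></section>'
    "</catalog>"
)

sections = doc.getElementsByTagName("section")
second = sections[1]

# Navigate down, sideways and up from the same node without re-parsing.
child_texts = [child.firstChild.data for child in second.childNodes]
sibling_name = second.previousSibling.getAttribute("name")
parent_tag = second.parentNode.tagName
print(child_texts, sibling_name, parent_tag)
```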

16 SAX pro and contra
PRO
– Low memory needs, since the XML file is never entirely in memory.
– Can deal with XML streams.
CONTRA
– The file has to be parsed again to access any node. Thus, fetching the 10 nodes of a catalog one at a time ends up parsing the same file 10 times.
– Poor navigation abilities: there is no easy way to get the children of a given node or the list of "B" nodes.
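The navigation problem shows up as bookkeeping in the handler. In this xml.sax sketch, finding the direct children of the "B" nodes means keeping a stack of open elements by hand; the class name ChildCollector and the sample document are assumptions for the example. Collecting everything of interest in a single pass like this is also the usual way to avoid parsing the same file repeatedly.

```python
import xml.sax


class ChildCollector(xml.sax.ContentHandler):
    """Hypothetical handler that records the direct children of one element name."""

    def __init__(self, parent_tag):
        super().__init__()
        self.parent_tag = parent_tag
        self.stack = []      # currently open elements, maintained by hand
        self.children = []

    def startElement(self, name, attrs):
        # The element on top of the stack is the parent of the one just opened.
        if self.stack and self.stack[-1] == self.parent_tag:
            self.children.append(name)
        self.stack.append(name)

    def endElement(self, name):
        self.stack.pop()


collector = ChildCollector("B")
xml.sax.parseString(b"<root><A><x/></A><B><y/><z><w/></z></B></root>", collector)
print(collector.children)
```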

17 SAX versus DOM
– If your document is very large and you only need a few elements, use SAX.
– If you need to process many elements and perform operations on the XML, use DOM.
– If you need to access the XML many times, use DOM.

18 Parser Products
– Xerces4J / Xerces4C++ (Apache)
– James Clark’s XP (Java)
– IBM XML4J / XML4C++
– Java Project X (Sun)
– Oracle’s XML Parser for Java
– MSXML (Microsoft)
– Dan Connolly’s XML Parser (Python)
– …

19 Conclusion
– The parser is a key building block for every XML application.
– When building XML applications, you have to think about how you will handle large chunks of data.
– Choosing between SAX and DOM is not always trivial.

20 The End Questions? Thank you!

