XML Study-Session: Part III Parsing XML Documents

Objectives By completing this study-session, you should be able to: use the IBM XML4J Java XML parser; understand the Document Object Model (DOM); and create a parsing application that displays, navigates, and modifies an XML document.

What is parsing? Parsing is the interpretation of text. The XML parser's job is to load the document, check that it follows all necessary rules (at minimum, the rules for well-formedness), and build a document tree structure that can be passed on to the application. The application is any program (e.g. browser, reader, middleware) that acts upon the tree structure, processing the data it contains.
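
As a rough illustration of the well-formedness check, the sketch below asks a parser to load a document and reports whether it was accepted. It uses the JAXP DocumentBuilder that ships with the JDK rather than the XML4J classes, and the class name WellFormedCheck and the sample strings are assumptions made for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of XML4J.
class WellFormedCheck {
    /** Returns true if the text parses as well-formed XML; no DTD validation is done. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("XML parser unavailable", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<Pets><Pet/></Pets>")); // properly nested
        System.out.println(isWellFormed("<Pets><Pet></Pets>"));  // <Pet> is never closed
    }
}
```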

Overview of XML parsing Fig. 1 (from “Building XML Applications,” St. Laurent and Cerami): XML Document -> XML Parser -> packets of parsed XML data -> XML Application (the application that manipulates the XML data). Every XML application includes at least two pieces: an XML parser and an application to manipulate the parsed XML data.

Types of parsers Validating vs. non-validating: A validating parser checks a document against a declared DTD; a non-validating parser checks only that the document is well-formed. Tree-based vs. event-driven interface: A parser with a tree-based interface reads the entire document and creates an internal tree representation of the data, which the application can then traverse. A standardized API for this interface is the W3C DOM. In the event-driven model, the parser reads through the document and signals each significant parsing event (e.g. start of document, start of element, end of element). Callback methods handle these events as they occur. This approach is used by the Simple API for XML (SAX).
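
To make the event-driven model concrete, here is a minimal sketch that counts elements with a SAX callback. It uses the JDK's JAXP SAXParser rather than XML4J, and the class name SaxElementCounter and the sample document in the usage note are assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: no tree is built, the handler only reacts to events.
class SaxElementCounter {
    static int countElements(String xml, String name) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                // Called by the parser each time it reads a start tag.
                if (qName.equals(name)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return count[0];
    }
}
```

For example, SaxElementCounter.countElements("<Pets><Pet/><Pet/></Pets>", "Pet") counts the two Pet start tags as the parser streams past them. A tree-based parser would instead hold the whole document in memory before the application sees any of it.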

The IBM XML4J parser An open-source Java parser developed by IBM and now available as part of the xml.apache.org project under the codename Xerces. The version 3.1.1 API supports DOM Level 1 and SAX Level 1. It can be downloaded as a .zip file from www.alphaworks.ibm.com/tech/xml4j. It is well suited to standalone Java applications and to Java servlets.

Setting up your environment To use the classes in XML4J, you must set your Java CLASSPATH variable so that Java can locate the xerces.jar and xercesSamples.jar files. To set the classpath in JCreator: Configure -> Options -> JDK Profiles -> select JDK version -> Edit -> Add Package -> add d:/xml4j/xerces.jar and d:/xml4j/xercesSamples.jar. To run a project with command-line arguments: Project -> Project Settings -> JDK Tools -> Select tool type: Run Application -> select <Default> -> Edit -> Parameters -> set the “Prompt for main function argument” checkbox to ‘True’.

Understanding DOM The W3C DOM specifies an interface for treating a document as a tree of nodes. In the Java DOM binding, a Node object has methods such as getChildNodes(), getFirstChild(), getNextSibling(), getParentNode(), and getNodeType(). Possible node types in DOM include: Element, Attribute, Comment, Text, CDATA section, Entity reference, Entity, Processing Instruction, Document, Document type, Document fragment, and Notation.
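
The sketch below walks the children of a root element with getFirstChild() and getNextSibling() and names each node's type via getNodeType(). It relies on the JDK's JAXP parser instead of XML4J; the class name NodeTypes and the sample document in the usage note are assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical example class.
class NodeTypes {
    /** Maps a few of the DOM node-type constants to readable names. */
    static String typeName(Node node) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE: return "Element";
            case Node.TEXT_NODE: return "Text";
            case Node.COMMENT_NODE: return "Comment";
            case Node.CDATA_SECTION_NODE: return "CDATA section";
            case Node.PROCESSING_INSTRUCTION_NODE: return "Processing Instruction";
            default: return "Other";
        }
    }

    /** Lists the node types of the root element's direct children, in document order. */
    static List<String> childTypes(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        List<String> types = new ArrayList<>();
        Node root = doc.getDocumentElement();
        for (Node child = root.getFirstChild(); child != null; child = child.getNextSibling()) {
            types.add(typeName(child));
        }
        return types;
    }
}
```

Calling NodeTypes.childTypes("<Pet ID='001'><!--note--><Name>Rover</Name>tail</Pet>") visits a comment, an element, and a text node. The ID attribute does not appear, because attributes are reached through getAttributes() rather than as children.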

Example: (petfile.xml)
    <?xml version='1.0' encoding='UTF-8'?>
    <Pets>
      <Pet ID='001' Registered='030801'>
        <Name>Rover</Name>
        <Age>3</Age>
        <Description Species='Dog'> Yellow colored Golden Retriever </Description>
      </Pet>
      <Pet ID='002' Registered='101100'>
        <Name>Ella</Name>
        <Age>1</Age>
        <Description Species='Tortoise'> Green and black shelled pond crawler </Description>
      </Pet>
    </Pets>

Example DOM structure (first Pet expanded)
    Pets
      Pet
        ID = 001
        Registered = 030801
        Name: Rover
        Age: 3
        Description: Yellow colored Golden Retriever
          Species = Dog
      Pet

Understanding DOM (contd.) In XML4J, the interfaces that make up the W3C DOM are in the org.w3c.dom package, and the DOM parser is the org.apache.xerces.parsers.DOMParser class. High-level constructs such as Element and Attr (attribute) extend the Node interface. So, for instance, an Attr object has its own methods such as getName() and getValue(), and also inherits getNodeName() from Node. Complete API documentation can be found online at http://xml.apache.org/apiDocs/index.html.
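
A small sketch of how attribute nodes expose both their own methods and the inherited Node methods. It uses the org.w3c.dom interfaces with the JDK's JAXP parser; the class name AttrDemo and its helper parseRoot are hypothetical.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

// Hypothetical example class.
class AttrDemo {
    /** Collects an element's attributes into a name-sorted map. */
    static Map<String, String> attributesOf(Element element) {
        Map<String, String> result = new TreeMap<>();
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Attr attr = (Attr) attrs.item(i);
            // getName() and getValue() are declared on Attr itself;
            // getNodeName() comes from Node and returns the same name.
            result.put(attr.getName(), attr.getValue());
        }
        return result;
    }

    /** Parses the text and returns its root element. */
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }
}
```

Applied to the first Pet element of petfile.xml, attributesOf would map ID to 001 and Registered to 030801.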

Creating a parser From the XML Reference page, download and view the FirstParser.java sample code. This program parses an XML document (“customer.xml”, passed as a command-line argument) and displays how many times a given element occurs in it (in this case, the number of <Customer> elements).
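
FirstParser.java itself is not reproduced here. As a stand-in, the following sketch shows one way such a count could be done with the JDK's JAXP DOM parser. The class name ElementCount and the inline sample document are assumptions, since the real customer.xml is not shown.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical stand-in for the idea behind FirstParser.java.
class ElementCount {
    /** Parses the text into a DOM tree and counts elements with the given tag name. */
    static int count(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // getElementsByTagName searches the whole tree, at any depth.
            return doc.getElementsByTagName(tagName).getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample data standing in for customer.xml.
        String xml = "<Customers><Customer/><Customer/><Customer/></Customers>";
        System.out.println("Customer elements: " + count(xml, "Customer"));
    }
}
```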

Displaying a document From the XML Reference page, download and view the IndentingParser.java sample code. This program will parse and display an entire XML document (passed as a command-line argument) with proper indentation. Separate handler methods are used to handle the document (i.e. root) node, element nodes, attributes, CDATA sections, text nodes, and Processing Instruction nodes.
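
IndentingParser.java is not reproduced here. The sketch below shows the general shape of a recursive, per-node-type printer, assuming the JDK's JAXP parser. The class name IndentingPrinter is hypothetical, comments are skipped, and attribute values are written without escaping, so treat it as an outline rather than a full serializer.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical example class.
class IndentingPrinter {
    static String render(Node node) {
        StringBuilder out = new StringBuilder();
        print(node, 0, out);
        return out.toString();
    }

    private static void print(Node node, int depth, StringBuilder out) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:
                printChildren(node, depth, out);
                break;
            case Node.ELEMENT_NODE: {
                indent(depth, out).append('<').append(node.getNodeName());
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node a = attrs.item(i);
                    out.append(' ').append(a.getNodeName())
                       .append("=\"").append(a.getNodeValue()).append('"');
                }
                out.append(">\n");
                printChildren(node, depth + 1, out);
                indent(depth, out).append("</").append(node.getNodeName()).append(">\n");
                break;
            }
            case Node.CDATA_SECTION_NODE:
                indent(depth, out).append("<![CDATA[").append(node.getNodeValue()).append("]]>\n");
                break;
            case Node.TEXT_NODE: {
                String text = node.getNodeValue().trim();
                if (!text.isEmpty()) { // skip whitespace-only text nodes
                    indent(depth, out).append(text).append('\n');
                }
                break;
            }
            case Node.PROCESSING_INSTRUCTION_NODE:
                // For a processing instruction, the node name is the target and the value is the data.
                indent(depth, out).append("<?").append(node.getNodeName())
                        .append(' ').append(node.getNodeValue()).append("?>\n");
                break;
            default:
                break; // other node types are ignored in this sketch
        }
    }

    private static void printChildren(Node node, int depth, StringBuilder out) {
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            print(c, depth, out);
        }
    }

    private static StringBuilder indent(int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        return out;
    }

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }
}
```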

Navigating a document From the XML Reference page, download and view the nav.java sample code. This program parses the “meetings.xml” document and navigates the tree structure to locate the name of the third person. Note that the XML4J parser treats the whitespace used for indentation in the XML document as text nodes. We can set the parser to ignore this whitespace by calling the parser method setIncludeIgnorableWhitespace(false).
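
Neither nav.java nor meetings.xml is reproduced here, so the sketch below invents a small document (the meeting, person, and name elements are assumptions) to show why whitespace text nodes matter when navigating. Rather than configuring the parser, it skips non-element siblings while walking the tree; the class name NavDemo is hypothetical and the JDK's JAXP parser is used in place of XML4J.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical example class.
class NavDemo {
    // Assumed structure; the real meetings.xml may differ.
    static final String SAMPLE =
            "<meetings>\n"
          + "  <meeting>\n"
          + "    <person><name>Ann</name></person>\n"
          + "    <person><name>Bob</name></person>\n"
          + "    <person><name>Cy</name></person>\n"
          + "  </meeting>\n"
          + "</meetings>";

    /** Returns the n-th (1-based) child that is an element, skipping text and other nodes. */
    static Element nthChildElement(Node parent, int n) {
        int seen = 0;
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE && ++seen == n) {
                return (Element) c;
            }
        }
        return null; // fewer than n child elements
    }

    /** Assumes the document has at least one meeting with three persons, each with a name. */
    static String thirdPersonName(Document doc) {
        Element meeting = nthChildElement(doc.getDocumentElement(), 1);
        Element person = nthChildElement(meeting, 3);
        return nthChildElement(person, 1).getTextContent().trim();
    }

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }
}
```

Without the element-type test, getFirstChild() on the meeting element would return the text node holding the newline and indentation, not the first person element. Skipping by node type works whether or not the parser has been told to drop ignorable whitespace, which typically also requires a DTD so the parser can tell which whitespace is ignorable.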

Modifying a document From the XML Reference page, download and view the XMLWriter.java sample code. This program will parse an XML document (“customer.xml”, passed as a command-line argument) and modify it by adding a new <Middle_Name>XML</Middle_Name> element to every customer. The modified document tree is then written to a new file with the name “customer2.xml”.
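
XMLWriter.java is not reproduced here. One possible shape for the modification step is sketched below, using the JDK's JAXP parser and an identity Transformer for output instead of the XML4J classes. The class name AddMiddleName and the First_Name element in the usage note are assumptions about customer.xml.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical stand-in for the idea behind XMLWriter.java.
class AddMiddleName {
    /** Appends <Middle_Name>XML</Middle_Name> to every Customer element and returns the new XML. */
    static String addMiddleNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList customers = doc.getElementsByTagName("Customer");
            for (int i = 0; i < customers.getLength(); i++) {
                // New nodes must be created by the document that will own them.
                Element middle = doc.createElement("Middle_Name");
                middle.appendChild(doc.createTextNode("XML"));
                customers.item(i).appendChild(middle);
            }
            // An identity transform serializes the modified tree back to text.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not process XML", e);
        }
    }
}
```

To write a file such as customer2.xml instead of returning a string, the StreamResult could wrap a java.io.File rather than the StringWriter.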

Next session: Presenting XML Documents. Topics: stylesheets; writing your own XSL applications.