SE 5145 – eXtensible Markup Language (XML ) DOM (Document Object Model) (Part I) 2011-12/Spring, Bahçeşehir University, Istanbul.


1 SE 5145 – eXtensible Markup Language (XML ) DOM (Document Object Model) (Part I) 2011-12/Spring, Bahçeşehir University, Istanbul

2 4th Assignment: DOM implementation of XML Resume. Implement DOM for your Resume. Required: There must be a checkbox group with the labels "Technical" and "Management". Upon checking a label, the relevant sections of your Resume must be highlighted in different colors. Optional: A "Show experience duration" button should calculate the total duration of the checked experiences. You can write the implementation in one of the following ways: a) JavaScript (due April 27) or b) Java, C#, or any other language you prefer (due May 11). Notes: 1. A sample JavaScript-based DOM implementation is displayed on the next slide. Relevant files will also be sent to you. 2. Details of the assignment were already discussed at the last lecture. If you did not attend it, please ask your classmates for the details rather than me, since I am currently not available to reply to individual emails.

3 Sample DOM Implementation. Inspect the files JoesCafe.html and JoesCafeCode.js.

4 DOM (Document Object Model). Q: How to provide uniform access to structured documents in diverse applications (parsers, browsers, editors, databases)? A: Use an XML API (DOM or SAX).

5 XML APIs. XML processors make the structure and contents of XML documents available to applications through APIs. Event-based APIs notify the application through parsing events (e.g., the SAX callback interfaces). Object-model (or tree-based) APIs provide a full parse tree (e.g., DOM, a W3C Recommendation); they are more convenient, but may require too many resources for the largest documents. Major parsers support both SAX and DOM.

6 DOM: What is it? A W3C standard adopted in 1998. An object-based API for XML and HTML documents. Developed to support "dynamic HTML": it provides a standard tree interface to document structure across browsers, for use in JavaScript. Provides a standardized way of building documents, navigating their structure, and adding, modifying or deleting elements and content programmatically (by programs and scripts). Provides a foundation for developing querying, filtering, transformation, rendering and similar applications on top of DOM implementations.

7 DOM: What is it? Platform-, browser- and language-neutral. Programming-language-specific mappings for JavaScript and Java are part of the specification. Implementations exist in other languages: C++, Python, C#, ... In contrast to "Serial Access XML" (for SAX), one could think of DOM as "Directly Obtainable in Memory".

8 SAX vs. DOM. DOM reads the entire XML document into memory and stores it as a tree data structure. SAX reads the XML document and sends an event for each element it encounters. Consequences: DOM provides "random access" into the XML document; SAX provides only sequential access. DOM is comparatively slow and memory-hungry, because the whole tree must fit in memory, so it is a poor fit for very large XML documents. SAX is fast and requires very little memory, so it can be used for huge documents (or large numbers of documents), which makes it attractive for high-volume processing such as web sites. DOM includes methods for changing the XML document in memory; SAX does not.
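The contrast can be sketched with the JAXP parsers that ship with the JDK. This is a minimal illustration, not part of the original slides: the class name SaxVsDom and the sample menu document are hypothetical.

```java
import java.io.StringReader;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxVsDom {
    // Hypothetical sample document, not taken from the slides.
    static final String XML = "<menu><item>Coffee</item><item>Tea</item></menu>";

    // DOM: the whole tree is built first, then any node can be reached directly.
    static String secondItemWithDom(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("item").item(1).getTextContent();
    }

    // SAX: no tree is built; the parser calls back once per start tag, in document order.
    static int countItemsWithSax(String xml) throws Exception {
        AtomicInteger count = new AtomicInteger();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if (qName.equals("item")) {
                    count.incrementAndGet();
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return count.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(secondItemWithDom(XML)); // Tea
        System.out.println(countItemsWithSax(XML)); // 2
    }
}
```

With DOM the second item is fetched by index after parsing; with SAX the application has to keep its own state (here, a counter) while the events stream past.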

9 Overview of W3C DOM Specification. The second one in the "XML family" of recommendations. Different levels of implementation: Level 1 (W3C Rec, Oct. 1998): flat object model (two features: Core and HTML). Level 2 (W3C Rec, Nov. 2000): API structured into multiple modules (Core, XML, HTML, Range, Traversal, ...). Level 3 (W3C Working Draft, January 2002; Core and Load and Save became W3C Recommendations in April 2004): new and revised modules; new: Load and Save, Validation; revised: Core, Events; in progress or published as W3C Notes: XPath, Views and Formatting, ...

10 DOM Features. Core: represent the basic structure of well-formed XML documents. XML: access entities, notations, ... Events: communicate user interaction and document changes to the application (HTMLEvents, MutationEvents, UIEvents, ...). Range: select portions of a document. Traversal: process/filter nodes in sequence. Views: access alternative representations of a document. StyleSheets/CSS: represent style sheets. HTML: represent HTML documents. LS, LS-Async: Load and Save.

11 DOM Tree & Nodes. Text nodes are separate nodes in the tree; they are not considered to be part of the tags.

12 DOM Tree & Nodes 12

13 DOM Tree & Nodes 13

14 DOM Tree & Nodes 14

15 DOM structure model. Based on O-O concepts: methods (to access or change an object's state), interfaces (declarations of a set of methods), objects (encapsulation of data and methods). Roughly similar to the XSLT/XPath data model (to be discussed later): a parse tree. The tree-like structure is implied by the abstract relationships defined by the programming interfaces; it does not necessarily reflect the data structures used by an implementation (but probably does).

16 Structure of DOM Level 1. I: DOM Core Interfaces. Fundamental interfaces: basic interfaces to structured documents. Extended interfaces: XML-specific (CDATASection, DocumentType, Notation, Entity, EntityReference, ProcessingInstruction). II: DOM HTML Interfaces: more convenient access to HTML documents (we ignore these).

17 DOM Level 2. Level 1: basic representation and manipulation of document structure and content (no access to the contents of a DTD). DOM Level 2 adds support for namespaces; accessing elements by ID attribute values; and optional features: interfaces to document views and style sheets, an event model (for, say, user actions on elements), and methods for traversing the document tree and manipulating regions of a document (e.g., selected by the user of an editor). Loading and writing of documents is not specified (→ Level 3).
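As a rough illustration of the Level 2 namespace support in the Java binding, the sketch below looks up an element by namespace URI and local name. The class name NamespaceDemo, the sample document and the namespace URI urn:example:x are assumptions made for this example only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceDemo {
    // Hypothetical document: two "a" elements, only one of them in urn:example:x.
    static final String XML = "<r xmlns:x=\"urn:example:x\"><x:a/><a/></r>";

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // JAXP factories are not namespace-aware unless asked to be.
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        // Level 2 lookup by (namespace URI, local name) instead of by qualified tag name.
        NodeList matches = doc.getElementsByTagNameNS("urn:example:x", "a");
        Element first = (Element) matches.item(0);
        System.out.println(matches.getLength());
        System.out.println(first.getPrefix() + " | " + first.getLocalName()
                + " | " + first.getNamespaceURI());
    }
}
```

The lookup depends on the namespace URI rather than on the prefix, so the same code keeps working if the document binds the namespace to a different prefix.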

18 Structure of the DOM tree. The DOM tree is composed of Node objects. Node is an interface; some of the more important subinterfaces are Element, Attr, and Text. An Element node may have children; Attr and Text nodes are leaves. Additional types are Document, ProcessingInstruction, Comment, Entity, CDATASection and several others. Hence, the DOM tree is composed entirely of Node objects, but the Node objects can be downcast into more specific types as needed.

19 DOM structure model [Figure: a sample invoice document, beginning <invoicepage form="00" type="estimatedbill"> and containing the text "Leila Laskuprintti", "Pyynpolku 1" and "70460 KUOPIO", drawn as a DOM tree. Elements shown: invoice, invoicepage, addressee, addressdata, name, address, streetaddress, postoffice. Node kinds shown: Document, Element, NamedNodeMap (holding the attributes form="00" and type="estimatedbill"), Text.]

20 Core Interfaces: Node & its variants [Figure: interface hierarchy rooted at Node, showing Document, DocumentFragment, Element, Attr, CharacterData, Text, Comment, CDATASection, ProcessingInstruction, DocumentType, Notation, Entity and EntityReference; part of the diagram is labelled "Extended interfaces".]

21 DOM interfaces: Node [Figure: the sample invoice tree with Document, Element, NamedNodeMap and Text nodes.] Node methods: getNodeType, getNodeValue, getOwnerDocument, getParentNode, hasChildNodes, getChildNodes, getFirstChild, getLastChild, getPreviousSibling, getNextSibling, hasAttributes, getAttributes, appendChild(newChild), insertBefore(newChild, refChild), replaceChild(newChild, oldChild), removeChild(oldChild).

22 Operations on Nodes, I. The results returned by getNodeName(), getNodeValue(), getNodeType() and getAttributes() depend on the subtype of the node, as follows:
Element: getNodeName() = tag name; getNodeValue() = null; getNodeType() = ELEMENT_NODE; getAttributes() = NamedNodeMap
Text: getNodeName() = "#text"; getNodeValue() = text contents; getNodeType() = TEXT_NODE; getAttributes() = null
Attr: getNodeName() = name of attribute; getNodeValue() = value of attribute; getNodeType() = ATTRIBUTE_NODE; getAttributes() = null

23 Distinguishing Node types. Here's an easy way to tell what kind of a node you are dealing with:
    switch (node.getNodeType()) {
        case Node.ELEMENT_NODE:
            Element element = (Element) node;
            // ... work with the element
            break;
        case Node.TEXT_NODE:
            Text text = (Text) node;
            // ... work with the text
            break;
        case Node.ATTRIBUTE_NODE:
            Attr attr = (Attr) node;
            // ... work with the attribute
            break;
        default:
            // ... any other node type
    }

24 Operations on Nodes, II. Tree-walking operations that return a Node: getParentNode(), getFirstChild(), getNextSibling(), getPreviousSibling(), getLastChild(). Tests that return a boolean: hasAttributes(), hasChildNodes().
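The tree-walking operations combine naturally with the node-type test into a recursive traversal. The following is a minimal sketch under stated assumptions: the class name TreeWalk and its outline format are made up for illustration, and the sample XML reuses element names from the invoice figure.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class TreeWalk {
    // Small fragment modelled on the invoice example; the exact nesting is an assumption.
    static final String XML = "<addressee><name>Leila Laskuprintti</name></addressee>";

    // Depth-first walk using only getFirstChild() and getNextSibling().
    static void walk(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE:
                out.append("Element ").append(node.getNodeName());
                break;
            case Node.TEXT_NODE:
                out.append("Text \"").append(node.getNodeValue()).append('"');
                break;
            default:
                out.append("Other ").append(node.getNodeName());
        }
        out.append('\n');
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, depth + 1, out);
        }
    }

    static String outline(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        StringBuilder out = new StringBuilder();
        walk(doc.getDocumentElement(), 0, out);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(outline(XML));
    }
}
```

Note that the text "Leila Laskuprintti" shows up as its own Text node one level below the name element, not as a property of the element.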

25 DOM interfaces: Document [Figure: the sample invoice tree with Node, Document, Element, NamedNodeMap and Text nodes.] Document methods: getDocumentElement, createAttribute(name), createElement(tagName), createTextNode(data), getDoctype(), getElementById(idVal).

26 DOM interfaces: Element [Figure: the sample invoice tree with Node, Document, Element, NamedNodeMap and Text nodes.] Element methods: getTagName, getAttributeNode(name), setAttributeNode(attr), removeAttribute(name), getElementsByTagName(name), hasAttribute(name).

27 Operations for Elements
String getTagName(): returns the name of the tag.
boolean hasAttribute(String name): returns true if this Element has the named attribute.
String getAttribute(String name): returns the (String) value of the named attribute.
boolean hasAttributes(): returns true if this Element has any attributes. This method is actually inherited from Node; it returns false if it is applied to a Node that isn't an Element.
NamedNodeMap getAttributes(): returns a NamedNodeMap of all the Element's attributes. This method is actually inherited from Node; it returns null if it is applied to a Node that isn't an Element.
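A short sketch of these Element operations follows. The class name ElementOps is hypothetical; the attribute names and values come from the invoice example, while the text content is invented.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ElementOps {
    static final String XML =
            "<invoicepage form=\"00\" type=\"estimatedbill\">some text</invoicepage>";

    static Element root(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element page = root(XML);
        System.out.println(page.getTagName());                    // invoicepage
        System.out.println(page.hasAttribute("form"));            // true
        System.out.println(page.getAttribute("type"));            // estimatedbill
        // In the Java binding a missing attribute yields "", not null.
        System.out.println(page.getAttribute("missing").isEmpty()); // true

        // hasAttributes() and getAttributes() are inherited from Node,
        // so they can also be called on the Text child.
        Node text = page.getFirstChild();
        System.out.println(text.hasAttributes());                 // false
        System.out.println(text.getAttributes() == null);         // true
    }
}
```

Because getAttribute() returns an empty string for an absent attribute, hasAttribute() is the reliable way to tell "absent" from "present but empty".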

28 Operations on Texts
Text is a subinterface of CharacterData which, in turn, is a subinterface of Node. In addition to inheriting the Node methods, it inherits these methods (among others) from CharacterData:
public String getData() throws DOMException: returns the text contents of this Text node.
public int getLength(): returns the length of the text in 16-bit units (UTF-16 code units).
public String substringData(int offset, int count) throws DOMException: returns a substring of the text contents.
Text also declares some methods of its own:
public String getWholeText(): returns a concatenation of all logically adjacent text nodes.
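The Text methods can be tried out as below. This is an illustrative sketch: the class name TextOps and the appended " Oy" suffix are assumptions, and getWholeText() is a DOM Level 3 method, so it relies on a Level 3 capable parser such as the one bundled with the JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

class TextOps {
    static final String XML = "<name>Leila Laskuprintti</name>";

    // Returns the Text node that is the first child of the root element.
    static Text firstText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return (Text) doc.getDocumentElement().getFirstChild();
    }

    // Adds a second, adjacent Text node and asks the first one for the combined text.
    static String wholeTextAfterAppend(String xml, String extra) throws Exception {
        Text first = firstText(xml);
        Node parent = first.getParentNode();
        parent.appendChild(first.getOwnerDocument().createTextNode(extra));
        return first.getWholeText();
    }

    public static void main(String[] args) throws Exception {
        Text text = firstText(XML);
        System.out.println(text.getData());              // Leila Laskuprintti
        System.out.println(text.getLength());            // 18
        System.out.println(text.substringData(0, 5));    // Leila
        System.out.println(wholeTextAfterAppend(XML, " Oy")); // Leila Laskuprintti Oy
    }
}
```

After the append the element has two Text children; getData() on the first still returns only its own contents, while getWholeText() spans both.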

29 Operations on Attrs
String getName(): returns the name of this attribute.
Element getOwnerElement(): returns the Element node this attribute is attached to, or null if this attribute is not in use.
boolean getSpecified(): returns true if this attribute was explicitly given a value in the original document.
String getValue(): returns the value of the attribute as a String.

30 Object Creation in DOM. Each DOM object X lives in the context of a Document: X.getOwnerDocument(). Objects implementing interface X are created by factory methods D.createX(…), where D is a Document object, e.g. createElement("A"), createAttribute("href"), createTextNode("Hello!"). Creation and persistent saving of Documents is left to be specified by implementations.
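In Java, the implementation-specific step of obtaining an empty Document is typically done through JAXP's DocumentBuilder.newDocument(); everything else goes through the Document's factory methods. A minimal sketch, in which the class name Creation and the URL http://example.org/ are placeholders:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class Creation {
    static Document build() throws Exception {
        // Obtaining the empty Document is left to the implementation; here JAXP does it.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();

        // Every other node is created by a factory method on the Document.
        Element a = doc.createElement("A");
        Attr href = doc.createAttribute("href");
        href.setValue("http://example.org/"); // placeholder URL
        a.setAttributeNode(href);
        a.appendChild(doc.createTextNode("Hello!"));

        // Created nodes are not part of the tree until they are attached.
        doc.appendChild(a);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = build();
        Element root = doc.getDocumentElement();
        System.out.println(root.getTagName());                // A
        System.out.println(root.getAttribute("href"));        // http://example.org/
        System.out.println(root.getTextContent());            // Hello!
        System.out.println(root.getOwnerDocument() == doc);   // true
    }
}
```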

31 Accessing properties of a Node. Node.getNodeName(): for an Element, the same as getTagName(); for an Attr, the name of the attribute; for Text, "#text"; etc. Node.getNodeValue(): content of a text node, value of an attribute, …; null for an Element (!!) (in XSLT/XPath: the full textual content). Node.getNodeType(): numeric constants (1, 2, 3, …, 12) for ELEMENT_NODE, ATTRIBUTE_NODE, TEXT_NODE, …, NOTATION_NODE.
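The three accessors can be compared side by side on an element, one of its attributes and its text child. This is a small sketch; the class name NodeProperties, the describe() helper and the "Hello" text are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeProperties {
    static final String XML = "<invoicepage form=\"00\">Hello</invoicepage>";

    static Element root(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    // Formats "<type> <name> <value>" for any node.
    static String describe(Node node) {
        return node.getNodeType() + " " + node.getNodeName() + " " + node.getNodeValue();
    }

    public static void main(String[] args) throws Exception {
        Element page = root(XML);
        System.out.println(describe(page));                          // 1 invoicepage null
        System.out.println(describe(page.getAttributeNode("form"))); // 2 form 00
        System.out.println(describe(page.getFirstChild()));          // 3 #text Hello
        // The element's own value is null; its textual content needs getTextContent().
        System.out.println(page.getTextContent());                   // Hello
    }
}
```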

32 Content and element manipulation
Manipulating CharacterData D: D.substringData(offset, count), D.appendData(string), D.insertData(offset, string), D.deleteData(offset, count), D.replaceData(offset, count, string) (= delete + insert).
Accessing attributes of an Element object E: E.getAttribute(name), E.setAttribute(name, value), E.removeAttribute(name).
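The editing methods can be followed step by step on a freshly created Text node. The sketch below is illustrative only: the class name CharacterDataEdit, the element name item and the attribute price are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;

class CharacterDataEdit {
    static Document newDocument() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
    }

    // Applies each CharacterData edit in turn and records the text after every step.
    static List<String> editSteps() throws Exception {
        Text d = newDocument().createTextNode("Hello");
        List<String> steps = new ArrayList<>();
        d.appendData(" World");
        steps.add(d.getData());      // Hello World
        d.insertData(5, ",");
        steps.add(d.getData());      // Hello, World
        d.deleteData(0, 7);
        steps.add(d.getData());      // World
        d.replaceData(0, 5, "DOM");
        steps.add(d.getData());      // DOM
        return steps;
    }

    // Sets, reads and removes an attribute on a hypothetical element.
    static String attributeSteps() throws Exception {
        Element e = newDocument().createElement("item");
        e.setAttribute("price", "2.50");
        String value = e.getAttribute("price");
        e.removeAttribute("price");
        return value + " / " + e.hasAttribute("price");
    }

    public static void main(String[] args) throws Exception {
        for (String step : editSteps()) {
            System.out.println(step);
        }
        System.out.println(attributeSteps()); // 2.50 / false
    }
}
```

Offsets count from 0, so insertData(5, ",") places the comma right after "Hello" and deleteData(0, 7) removes "Hello, " including the space.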

33 Additional Core Interfaces (1)
NodeList: for ordered lists of nodes, e.g. from Node.getChildNodes() or Element.getElementsByTagName("name"), which returns all descendant elements of type "name" in document order (the wild-card "*" matches any element type).
Accessing a specific node, or iterating over all nodes of a NodeList, e.g. Java code to process all children:
    for (int i = 0; i < node.getChildNodes().getLength(); i++)
        process(node.getChildNodes().item(i));
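A self-contained variant of that loop, plus the "*" wild-card, might look as follows. The class name NodeListDemo and the menu document are assumptions; the loop fetches the NodeList and its length once instead of on every iteration, which is a stylistic choice rather than something the slides prescribe.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeListDemo {
    // Hypothetical document; written without whitespace so no whitespace-only Text nodes appear.
    static final String XML = "<menu><item>Coffee</item><item><b>Tea</b></item></menu>";

    static Element root(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    // Visits the direct children only, in document order.
    static List<String> childTexts(String xml) throws Exception {
        NodeList children = root(xml).getChildNodes();
        List<String> texts = new ArrayList<>();
        for (int i = 0, n = children.getLength(); i < n; i++) {
            texts.add(children.item(i).getTextContent());
        }
        return texts;
    }

    // "*" matches every descendant element, at any depth (the root itself is not included).
    static int descendantElementCount(String xml) throws Exception {
        return root(xml).getElementsByTagName("*").getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(childTexts(XML));             // [Coffee, Tea]
        System.out.println(descendantElementCount(XML)); // 3
    }
}
```

Caching the length is only safe while the loop does not modify the tree, because a NodeList reflects later changes to the document.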

34 Additional Core Interfaces (2)
NamedNodeMap: for unordered sets of nodes accessed by their name, e.g. from Node.getAttributes().
NodeLists and NamedNodeMaps are "live": changes to the document structure are reflected in their contents.
Some operations on a NamedNodeMap are:
getNamedItem(String name): returns (as a Node) the attribute with the given name.
getLength(): returns (as an int) the number of Nodes in this NamedNodeMap.
item(int index): returns (as a Node) the item at position index. This operation lets you conveniently step through all the nodes in the NamedNodeMap; the DOM does not guarantee the order in which the nodes are returned.
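The sketch below steps through a NamedNodeMap and then shows its "live" behaviour. The class name NamedNodeMapDemo and the added lang="fi" attribute are assumptions; the names are copied into a sorted map precisely because the order of item(i) is not guaranteed.

```java
import java.io.StringReader;
import java.util.SortedMap;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NamedNodeMapDemo {
    static final String XML = "<invoicepage form=\"00\" type=\"estimatedbill\"/>";

    static Element root(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    // Copies all attributes into a map sorted by name, giving a stable order.
    static SortedMap<String, String> attributes(Element element) {
        NamedNodeMap attrs = element.getAttributes();
        SortedMap<String, String> result = new TreeMap<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            result.put(attr.getNodeName(), attr.getNodeValue());
        }
        return result;
    }

    // The map obtained before the change already reflects the new attribute afterwards.
    static int lengthAfterAdding(Element element) {
        NamedNodeMap attrs = element.getAttributes();
        element.setAttribute("lang", "fi"); // hypothetical extra attribute
        return attrs.getLength();
    }

    public static void main(String[] args) throws Exception {
        Element page = root(XML);
        System.out.println(attributes(page));   // {form=00, type=estimatedbill}
        System.out.println(page.getAttributes().getNamedItem("type").getNodeValue()); // estimatedbill
        System.out.println(lengthAfterAdding(page)); // 3
    }
}
```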

35 Exercises (JavaScript): Discovering Document Information; Creating New Document Content

36 Discovering Document Information 36

37 Exercise 1: Discovering Document Information 37

38 Exercise 1: Discovering Document Information 38

39 Exercise 1: Discovering Document Information 39

40 Exercise 2: Discovering Document Information 40

41 Exercise 2: Discovering Document Information 41

42 Creating New Document Content 42

43 Exc. 3: Creating New Document Content 43

44 Exc. 3: Creating New Document Content 44

45 Exc. 3: Creating New Document Content 45

46 Exc. 3: Creating New Document Content Now, try: Adding more s Modifying Tags (change, remove, etc) 46

