Stein XML 2.1 XML a first course Part 2 Yaakov J. Stein Chief Scientist RAD Data Communications.


1 Stein XML 2.1 XML a first course Part 2 Yaakov J. Stein Chief Scientist RAD Data Communications

2 Stein XML 2.2 Course Objectives
XML what and why?
Well-formed XML
–Displaying XML in IE
Valid XML and DTDs
Parsing XML using JavaScript
Processing XML using XSL

3 Stein XML 2.3 XML Parsing XML using JavaScript

4 Stein XML 2.4 XML Parsers
All XML parsers MUST check for well-formed input
Some XML parsers are validating, others nonvalidating
There are two XML parser “philosophies”
Event driven parsers (SAX)
–Fast and small memory footprint
–Output parsing results on-the-fly
–Application must store information it needs
–Can use stack to track hierarchy
Tree parsers (DOM)
–Slow and large memory footprint
–Build full tree first, then user can traverse tree
–Exploit “Object Oriented” languages

5 Stein XML 2.5 SAX
Simple API for XML (present version SAX 2.0)
Not developed by W3C BUT de-facto standard
Versions for Java (Apache Xerces parser), C++, VB, Python, Perl
Some ContentHandler methods (callbacks)
–void setDocumentLocator (Locator locator): supplies application with event location
–void startDocument() throws SAXException: receive notification of XML beginning
–void endDocument() throws SAXException: receive notification of XML end
–void startElement (…) throws SAXException: receive notification of element start tag
–void endElement (…) throws SAXException: receive notification of element end tag
–void characters (…) throws SAXException: receive notification of text
–void ignorableWhitespace(…) throws SAXException: receive notification of ignorable whitespace
Example
<quote>to be <bold>or</bold> not to be</quote>
–startElement quote
–characters “to be”
–startElement bold
–characters “or”
–endElement bold
–characters “not to be”
–endElement quote
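The callback style of the quote example can be sketched in a few lines of JavaScript. This is a toy scanner, not a SAX implementation: the function name parseEvents and the handler object are assumptions made for illustration, and attributes, entities, comments, and well-formedness checks are all ignored.

```javascript
// Toy event-driven scanner: NOT a real SAX parser. It only shows how a
// parser can report events to callbacks without building a tree.
function parseEvents(xml, handler) {
  // Alternatives: an end tag, a start tag, or a run of text.
  const token = /<\/([^>]+)>|<([^>\/]+)>|([^<]+)/g;
  handler.startDocument();
  let m;
  while ((m = token.exec(xml)) !== null) {
    if (m[1] !== undefined) handler.endElement(m[1]);
    else if (m[2] !== undefined) handler.startElement(m[2]);
    else if (m[3].trim() !== "") handler.characters(m[3].trim());
  }
  handler.endDocument();
}

// The application stores what it needs: a stack of open elements tracks
// the hierarchy, and a log records the events in order.
const log = [];
const stack = [];
const handler = {
  startDocument() {},
  endDocument() {},
  startElement(name) { stack.push(name); log.push("startElement " + name); },
  endElement(name) { stack.pop(); log.push("endElement " + name); },
  characters(text) { log.push('characters "' + text + '"'); },
};

parseEvents("<quote>to be <bold>or</bold> not to be</quote>", handler);
console.log(log.join("\n"));
```

The scanner never holds more than the current token, which is the reason event-driven parsers have a small memory footprint; anything the application wants to remember has to go into its own structures, such as the stack here.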

6 Stein XML 2.6 Document Object Model
DOM
–API that provides access to XML/HTML document structure
–Enables reading, deleting, changing, adding elements/attributes
There is a good match between XML, tree hierarchies, and object oriented programming
vehicles
  vehicles.airplanes
  vehicles.motor_vehicles
    vehicles.motor_vehicles.trucks
    vehicles.motor_vehicles.cars
  vehicles.bicycles

7 Stein XML 2.7 Nodes
The basic unit in the DOM tree is the Node object
Nodes that are not null also implement more specialized interfaces
Node properties
–nodeName (readonly String)
–nodeType (readonly unsigned short)
–nodeValue (String)
–attributes (readonly NamedNodeMap)
–parentNode (readonly Node)
–childNodes (readonly NodeList)
–firstChild (readonly Node)
–lastChild (readonly Node)
–previousSibling (readonly Node)
–nextSibling (readonly Node)
–ownerDocument (readonly Document)
–prefix (String)
–localName (readonly String)
–namespaceURI (readonly String)
Node methods
–boolean hasChildNodes()
–Node cloneNode(…)
–Node appendChild(…)
–Node removeChild(…)
–Node replaceChild(…)
–Node insertBefore(…)
–void normalize()
–boolean hasAttributes()
–boolean isSupported(…)

8 Stein XML 2.8 Node Types
The W3C DOM defines the following types (as constants in the Node object, but IE doesn’t implement them)
value  constant’s name               nodeName            nodeValue   data type
1      ELEMENT_NODE                  tag’s name          null        Element
2      ATTRIBUTE_NODE                attribute’s name    value       Attr
3      TEXT_NODE                     #text               text        Text
4      CDATA_SECTION_NODE            #cdata-section      text        CDATASection
5      ENTITY_REFERENCE_NODE         referenced name     null        EntityReference
6      ENTITY_NODE                   entity’s name       null        Entity
7      PROCESSING_INSTRUCTION_NODE   PI’s target         rest of PI  ProcessingInstruction
8      COMMENT_NODE                  #comment            text        Comment
9      DOCUMENT_NODE                 #document           null        Document
10     DOCUMENT_TYPE_NODE            dtd name            null        DocumentType
11     DOCUMENT_FRAGMENT_NODE        #document-fragment  null        DocumentFragment
12     NOTATION_NODE                 notation’s name     null        Notation
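Where the named constants are not available on the Node object, a script can carry its own lookup table. The sketch below is one way to do that; NODE_TYPE_NAMES and nodeTypeName are hypothetical names chosen for this example, while the numeric values are the ones listed in the table.

```javascript
// Hypothetical lookup table mapping numeric nodeType values to the
// W3C DOM constant names.
const NODE_TYPE_NAMES = {
  1: "ELEMENT_NODE",
  2: "ATTRIBUTE_NODE",
  3: "TEXT_NODE",
  4: "CDATA_SECTION_NODE",
  5: "ENTITY_REFERENCE_NODE",
  6: "ENTITY_NODE",
  7: "PROCESSING_INSTRUCTION_NODE",
  8: "COMMENT_NODE",
  9: "DOCUMENT_NODE",
  10: "DOCUMENT_TYPE_NODE",
  11: "DOCUMENT_FRAGMENT_NODE",
  12: "NOTATION_NODE",
};

// Translate a numeric nodeType into a readable name.
function nodeTypeName(nodeType) {
  return NODE_TYPE_NAMES[nodeType] || "UNKNOWN(" + nodeType + ")";
}

// Example: a sequence of nodeType values as a script might read them
// from the top-level children of a document.
const names = [7, 7, 8, 10, 1].map(nodeTypeName);
console.log(names.join("\n"));
```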

9 Stein XML 2.9 Elements
Element nodes have the following properties and methods (for full list see W3C site)
Property
–tagName (readonly String)
Methods
–boolean hasAttribute(String name)
–String getAttribute(String name)
–void setAttribute(String name, String value)
–Attr getAttributeNode(String name)
–Attr setAttributeNode(Attr newAttr)
–void removeAttribute(String name)
–Attr removeAttributeNode(Attr oldAttr)
–NodeList getElementsByTagName(String name)

10 Stein XML 2.10 Attributes
Attr nodes have the following properties (no methods)
Properties
–name (readonly String)
–ownerElement (readonly Element)
–specified (readonly boolean)
–value (String)

11 Stein XML 2.11 NodeList and NamedNodeMap
NodeList is an array of nodes (e.g. Node.childNodes)
Property
–length (readonly unsigned long)
Method
–Node item(unsigned long index)
nl.item(k) is the same as nl[k]
NamedNodeMap is a collection of Nodes indexed by names (e.g. Node.attributes)
Property
–length (readonly unsigned long)
Methods
–Node item(unsigned long index)
–Node getNamedItem(name)
–Node setNamedItem(…)
–Node removeNamedItem(name)

12 Stein XML 2.12 Character Data
CharacterData is the parent interface of Text and Comment nodes
Text is the parent interface of CDATASection nodes
Properties
–data (String)
–length (readonly unsigned long)
Methods
–appendData()
–deleteData()
–insertData()
–replaceData()
–substringData()
Inheritance: Node > CharacterData > Text, Comment; Text > CDATASection
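To make the five CharacterData methods concrete, here is a small stand-in class built on an ordinary string. ToyCharacterData is a hypothetical name, not a DOM class; the argument order (offset, count, text) follows the W3C DOM definitions, but unlike a real DOM this sketch does not raise an error for out-of-range offsets.

```javascript
// Hypothetical stand-in for a DOM CharacterData node, backed by a string.
class ToyCharacterData {
  constructor(data) { this.data = data; }
  get length() { return this.data.length; }
  // Return count characters starting at offset.
  substringData(offset, count) {
    return this.data.slice(offset, offset + count);
  }
  // Add text to the end.
  appendData(arg) { this.data += arg; }
  // Insert text at offset.
  insertData(offset, arg) {
    this.data = this.data.slice(0, offset) + arg + this.data.slice(offset);
  }
  // Remove count characters starting at offset.
  deleteData(offset, count) {
    this.data = this.data.slice(0, offset) + this.data.slice(offset + count);
  }
  // Replace count characters starting at offset with new text.
  replaceData(offset, count, arg) {
    this.deleteData(offset, count);
    this.insertData(offset, arg);
  }
}

const t = new ToyCharacterData("Hello world!");
t.replaceData(6, 5, "XML");           // data is now "Hello XML!"
const greeting = t.substringData(0, 5);
t.appendData(" Bye");                 // data is now "Hello XML! Bye"
console.log(t.data, t.length, greeting);
```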

13 Stein XML 2.13 Document
Document nodes are needed to start everything
Properties
–documentElement (readonly Element): root element of the XML
–doctype (readonly DocumentType): the DTD
Methods
–Element createElement(name)
–Attr createAttribute(name)
–Text createTextNode(…)
–Comment createComment(…)
–createEntityReference(…)
–createCDATASection(…)
–createProcessingInstruction(…)
–createDocumentFragment(…)
–Element getElementById(id)
–NodeList getElementsByTagName(name)
–createNodeIterator(…)
–createTreeWalker(…)

14 Stein XML 2.14 Parsing with JavaScript
There are DOM interfaces for many (object oriented) languages
–Java
–JavaScript, ECMAScript, JScript
–C++
–VBScript
It is easier to use a scripting language
–Many required features are pre-programmed
–Interpreted, not compiled
–Platform independent
JavaScript runs only inside a browser
JavaScript is easier than Java, which is easier than C++ (kids use it!)
JavaScript is FUN (kids use it!)

15 Stein XML 2.15 How to use JavaScript
Use JavaScript by placing script tags in the HTML document, containing either internal JavaScript code or a URL
You can place the SCRIPT tag anywhere, in HEAD or in BODY
It is recommended to hide scripts from older non-scripting browsers
<!-- HTML COMMENT // JAVASCRIPT COMMENT -->
This page requires a modern browser!

16 Stein XML 2.16 Quick overview of JavaScript
ECMAScript, see ECMA-262
Object oriented (object has properties, methods and events)
Loosely typed (string (default), numbers, boolean)
Functions with arguments (not checked, even for number), optional return value
var declares local scope
new allocates object
Don’t need ;
Operators
–++ -- + (numbers, strings) - * / % (mod)
–>= == != ~ (bit negation) ! && || ?: (conditional)
–NaN Infinity
Flow
–if if/else while for (C-like) for/in continue break return with
Math
–PI E SQRT2 abs ceil floor round max min sqrt pow eval sin cos tan acos asin atan exp log random
Date
–WeekDay DayFromTime DaysInYear etc. etc. etc.
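A few of these points are easiest to see by running them. The following sketch is illustrative only; the function name describe and the variable results are made up for this example.

```javascript
// Arguments are not checked, not even their number: a missing argument
// arrives as undefined and an extra one is silently ignored.
function describe(a, b) {
  return typeof a + "/" + typeof b;
}

const results = {
  bothGiven: describe(1, "x"),      // "number/string"
  oneMissing: describe(1),          // "number/undefined"
  oneExtra: describe(1, 2, 3),      // "number/number"
  addNumbers: 1 + 2,                // + adds numbers
  addStrings: "1" + 2,              // + concatenates when a string is involved
  remainder: 7 % 3,                 // % is the mod operator
  notANumber: 0 / 0,                // NaN
  infinite: 1 / 0,                  // Infinity
  conditional: 5 > 3 ? "yes" : "no" // ?: conditional operator
};

// for/in iterates over the property names of an object.
for (const key in results) {
  console.log(key + " = " + results[key]);
}
```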

17 Stein XML 2.17 Javascript Events
EVENTS
onclick       Mouse click
ondblclick    Mouse double click
onmouseover   Mouse enters an element
onmouseout    Mouse leaves an element
onmousemove   Mouse moves
onmousedown   Mouse button is pressed
onmouseup     Mouse button is released
onkeypress    Visible character is pressed
onkeydown     Key is pressed
onkeyup       Key is released
onload        Document has finished loading
onblur        Element loses the focus
onfocus       Element gains the focus

18 Stein XML 2.18 Javascript Example
function hi() {
  with (hello.style) {
    posLeft = event.clientX;
    posTop = event.clientY;
  }
}
function flying() {
  with (fly.style) {
    if (posLeft < 300) { posLeft += 5; posTop += 5; }
    else { posLeft = 10; posTop = 10; }
  }
  setTimeout('flying()', 10);
}
Hello World! I'm Flying!!! DHTML

19 Stein XML 2.19 XML Islands
What happens when we define an XML island inside an HTML file?
XML Island Demo
Nothing happens: the XML is in the DOM, but the browser doesn’t know what to do with it!
(When we directly display an XML file, the browser uses a default XSL)
We have to manually extract the data from the XML DOM and insert it into the browser window as HTML!

20 Stein XML 2.20 An IE-specific feature
XML islands are Microsoft-specific, and Microsoft supplies some non-standard ways of retrieving info
XML island Demo
printout Hello world!
printout Hello world!

21 Stein XML 2.21 Javascript to the rescue
Using javascript we can access the XML DOM in a standard way!
Hello world!
XML DOM Demo
alert(hellodata.xml)
document.write(hellodata.xml)
alert displays the DOM object
write displays the text (suppresses tags)

22 Stein XML 2.22 Let’s try a more interesting file! ASM-20 products/family/asm-20/asm-20.htm 4-wire D1 synchronous unmanaged 19.2 256 7.5 V.24...... Try alert and document.write !!!

23 Stein XML 2.23 Javascript Access to DOM
What happens when we walk through the DOM tree?
// main section of DOM (DTD after xsl please!)
document.writeln("The document has " + modemdata.childNodes.length + " sections. ")
for (n=0; n<modemdata.childNodes.length; n++) {
  document.writeln(
    " " + n + " " +
    " nodeType=" + modemdata.childNodes(n).nodeType +
    " nodeName=" + modemdata.childNodes(n).nodeName +
    " nodeValue=" + modemdata.childNodes(n).nodeValue + " " )
}
The document has 5 sections.
0 nodeType=7 nodeName=xml nodeValue=version="1.0"
1 nodeType=7 nodeName=xml-stylesheet nodeValue=type="text/xsl" href="modems.xsl"
2 nodeType=8 nodeName=#comment nodeValue= modems.xml
3 nodeType=10 nodeName=modems nodeValue=null
4 nodeType=1 nodeName=modems nodeValue=null (the XML tree)

24 Stein XML 2.24 Let’s walk through the real tree!
// first get the XML root node
var rootnode = modemdata.documentElement
// var rootnode = modemdata.childNodes(modemdata.childNodes.length-1)
var nmodems = rootnode.childNodes.length
document.writeln(" The root is " + rootnode.nodeName + " " + " and it has " + nmodems + " child nodes. ")
// now traverse XML tree
for (n=0; n<nmodems; n++) {
  // find the modem
  var thismodem = rootnode.childNodes(n)
  document.writeln(" " + n + ". " + thismodem.nodeName + " ")
  var numfields = thismodem.childNodes.length
  // print all the child nodes for this modem
  for (i=0; i<numfields; i++) {
    document.writeln(
      " " + i + " " +
      " " + thismodem.childNodes(i).nodeType +
      " " + thismodem.childNodes(i).nodeName +
      " " + thismodem.childNodes(i).text + " ")
  }
}

25 Stein XML 2.25 And the answer is …
The root is modems and it has 7 child nodes.
0. copper
0 1 name ASM-20
1 1 webpage products/family/asm-20/asm-20.htm
2 1 medium 4-wire
3 1 linecode D1
4 1 sync synchronous
5 1 management unmanaged
6 1 minrate 19.2
7 1 maxrate 256
8 1 maxrange 7.5
9 1 interfaces V.24 RS-232 V.35 V.36 X.21
…

26 Stein XML 2.26 More generally
There are more levels and we have to recursively walk through the tree
function parseChildren(node) {
  var x = node.childNodes
  var n = x.length
  if (n>0) {
    for (var i=0; i<n; i++) {
      ...
      parseChildren( x(i) )
    }
  }
}
There will usually be attributes (etc) as well
We often want to jump to specific nodes, etc
We may want to append, delete, change nodes and persist the changes
EXERCISE TIME!!
See NodeIterator and TreeWalker
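A complete version of the recursive walk might look like the sketch below. To keep it runnable outside a browser it uses plain objects in place of real DOM nodes; the helpers element and text and the three-field sample tree are assumptions for illustration (the field names and values are taken from the modem output shown earlier), and children are indexed with [i] rather than the IE-style (i).

```javascript
// Plain objects standing in for DOM nodes. A real DOM parser would
// supply nodeType, nodeName, nodeValue and childNodes itself.
function element(nodeName, ...childNodes) {
  return { nodeType: 1, nodeName, nodeValue: null, childNodes };
}
function text(nodeValue) {
  return { nodeType: 3, nodeName: "#text", nodeValue, childNodes: [] };
}

// A small hypothetical excerpt of the modems tree.
const tree = element("modems",
  element("copper",
    element("name", text("ASM-20")),
    element("medium", text("4-wire")),
    element("minrate", text("19.2"))));

// Depth-first walk: report this node, then recurse into each child.
// Text nodes are shown by value, elements by name, indented by depth.
function parseChildren(node, depth, out) {
  const label = node.nodeType === 3
    ? JSON.stringify(node.nodeValue)
    : node.nodeName;
  out.push("  ".repeat(depth) + label);
  for (let i = 0; i < node.childNodes.length; i++) {
    parseChildren(node.childNodes[i], depth + 1, out);
  }
  return out;
}

const lines = parseChildren(tree, 0, []);
console.log(lines.join("\n"));
```

The explicit depth argument plays the role that the stack plays in an event-driven parser: it is the only state needed to know where in the hierarchy the walk currently is.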

27 Stein XML 2.27 XML Processing XML using XSL

28 Stein XML 2.28 Stylesheets
Stylesheets are commonplace in presentation tools
They enable customization and standardization of documents
A stylesheet is usually a set of rules describing how different elements are to be displayed
For example
–look of headers
–font face and size
–effects (underline, bullets)
–use of color
Cascading Style Sheets are used to change HTML defaults
SGML had the Document Style Semantics and Specification Language (DSSSL)
–Based on Scheme (LISP variant)
–Influenced XSLT’s philosophy, but not its syntax

29 Stein XML 2.29 CSS
We can add style to XML using CSS, just like HTML
book {display:block}
article {display:block}
talk {display:block}
title {display:block; background:red; color:yellow; font-size:20pt;}
author {color:blue; font-size:20pt;}
<?xml-stylesheet type="text/css" href="biblio3.css"?>...
But such style is very limited
–Treatment of tags is not environment dependent
–Can hide tags (display:none) but can’t sort or filter them
–CSS is not a full programming language
–CSS is not XML-based and not extensible

30 Stein XML 2.30 XSL
One can process with procedural languages (e.g. javascript)
But instead one can use an XML-based pattern matching language
–First step of compilation is XML
–Declarative languages are more suitable for transformation applications
XSL: eXtensible Stylesheet Language
XSL has 2 components: XSLT and XSL-FO
Both are XML applications (can be verified using DTD)
XSLT has 2 versions
–NEW VERSION (MSXML3, IE6?)
–OLD VERSION (IE5+)
XSLT is supported by IE5+, XMLSPY, Apache’s Xalan, Saxon, XP, Sablotron, Unicorn, Xesalt

31 Stein XML 2.31 XSL Transformations
If we are already processing the XML file (XML in, XML out) we can do a lot more!
Examples:
–Change tag names (e.g. … to … )
–Change attributes to child elements or vice-versa
–Manipulate fields (including numeric computation)
–Reorder elements
–Change entire hierarchical structure
–Filter elements or SELECT records
Hence there are two equivalent opening tags
–for “embedded” XSL
–for “standalone” XSLT (not in IE)
XML format conversion

32 Stein XML 2.32 XSLT Processing
XSLT
–Inputs 2 XML files: XML and XSL
–Outputs 1 XML file (can be HTML for display)
XSLT supports recursion and iteration (it relies on an XML DOM parser)
XSLT supports XPath (although IE support is minimal)
XSLT supports internationalization (languages)
Unfortunately, present-day XSLT processors are limited
–they require the tree in memory and are hence limited in database size (write SAX programs for large applications)
–they are relatively slow
Processing features:
–template matching commands
–value commands extract fields
–standard programming constructs (e.g. basic math, loops, conditionals)
–special features (e.g. filtering, sorting)
–noncommands are passed to output

33 Stein XML 2.33 Simple XSL Example Bibliography Digital Signal Processing... Y. Stein Critical Temperature... Y. Stein Storage Capacity for Neural Network Models Y. Stein <?xml-stylesheet type="text/css" href="biblio3.css"?>... Bibliography

34 Stein XML 2.34 template match
The heart of XSLT is template matching (triggering)
The xsl:template element with the match attribute is used
... Put here whatever you want to do!
Actually the match attribute’s value is not merely a nodename; it is a complex expression matching any of the children of the current node
We must always start processing by matching to the document node, which is nicknamed / (WARNING - this is NOT the XML root!)
...

35 Stein XML 2.35 Recursion and Iteration
At every moment there is a current node
We will need to match the current node’s children
We can do this by recursion
Or by iteration (looping)
...
When recursing XSL should perform default actions on all the child nodes, but IE doesn’t
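The idea of a current node, matching templates, and a default action for unmatched children can be mimicked in JavaScript. The sketch below is an analogy only and not how an XSLT processor is implemented: plain objects stand in for XML nodes, "templates" are functions keyed by element name (real XSLT matches XPath patterns), and the names el, templates, applyTemplates and applyToChildren, as well as the one-entry sample document, are all assumptions made for illustration.

```javascript
// Plain objects stand in for elements; strings stand in for text nodes.
const el = (name, ...children) => ({ name, children });

// "Templates" keyed by element name.
const templates = {
  // Emit the element's processed content in brackets.
  title: (node) => "[" + applyToChildren(node) + "]",
  // Produce no output for this element.
  author: () => "",
};

// Recursion: process a node with its matching template, or fall back to
// the default action (copy text, recurse into an element's children).
function applyTemplates(node) {
  if (typeof node === "string") return node;
  const template = templates[node.name];
  return template ? template(node) : applyToChildren(node);
}

function applyToChildren(node) {
  return node.children.map(applyTemplates).join("");
}

// Hypothetical one-entry bibliography.
const doc = el("bibliography",
  el("article",
    el("title", "Storage Capacity for Neural Network Models"),
    el("author", "Y. Stein")));

const output = applyTemplates(doc);

// Iteration: loop explicitly over the current node's children instead.
const childNames = doc.children.map((child) => child.name);

console.log(output);
console.log(childNames.join(","));
```

Neither bibliography nor article has a template here, so both fall through to the default action, which is what lets the walk reach the title at all; without that default, every level of the hierarchy would need its own explicit rule.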

36 Stein XML 2.36 value-of select
The explicit value of a node is obtained using <xsl:value-of select="nodename"/>
where as usual nodename is actually an expression
For the current node’s value use “.”
Example :

37 Stein XML 2.37 XPath expressions
The expressions in match and select attributes are in XPath
XPath expressions are NOT XML syntax
Here are some XPath goodies
/        like in directories, is both the “top” and the hierarchy divider
*        wildcard
@        attribute
//       any number of intervening levels
type()   e.g. text(), comment(): nodes of a particular type
[xxx]    test brackets: only nodes with a child or attribute which matches
Examples

38 Stein XML 2.38 Sorting
The for-each element has several ordering options
Sorting is specified using the order-by attribute
By default ordering is lexicographical (unless explicitly converted with number) and ascending (use - for descending)
Multiple keys can be specified (separate by ; )
<xsl:for-each select="copper|fiber" order-by="number(minrate); -interfaces">...
There is also a command not implemented by IE
Also, you can count with (position in current node)
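The effect of the two-key ordering number(minrate); -interfaces can be imitated with an ordinary JavaScript comparison function. This is a sketch of the ordering rule only, not of XSL itself; the four records and the names compareModems, sorted and lexical are invented for this example.

```javascript
// Hypothetical records; values are strings, as they would be in XML.
const modems = [
  { name: "A", minrate: "19.2", interfaces: "V.24" },
  { name: "B", minrate: "2.4",  interfaces: "X.21" },
  { name: "C", minrate: "19.2", interfaces: "V.35" },
  { name: "D", minrate: "128",  interfaces: "V.24" },
];

// First key: minrate compared as a number, ascending.
// Second key: interfaces compared as a string, descending.
function compareModems(a, b) {
  const byRate = Number(a.minrate) - Number(b.minrate);
  if (byRate !== 0) return byRate;
  if (a.interfaces < b.interfaces) return 1;
  if (a.interfaces > b.interfaces) return -1;
  return 0;
}

const sorted = modems.slice().sort(compareModems).map((m) => m.name);

// For contrast: the default lexicographical ordering on minrate alone,
// where "128" sorts before "19.2" and "19.2" before "2.4".
const lexical = modems
  .slice()
  .sort((a, b) => (a.minrate < b.minrate ? -1 : a.minrate > b.minrate ? 1 : 0))
  .map((m) => m.name);

console.log("numeric then descending:", sorted.join(" "));
console.log("lexicographical:", lexical.join(" "));
```

The contrast between the two results is the reason for the explicit number conversion: as strings, rates of different digit counts sort in a misleading order.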

39 Stein XML 2.39 Default (IE) XSL... <!DOCTYPE (View source for full doctype... )>...............

40 Stein XML 2.40 XSLing on-the-fly
By defining two XML islands, one for the XML and one for the XSL, we can process the XML before displaying it
function load() {
  var result = xmli.transformNode(xsli.documentElement);
  fakeDiv.innerHTML = result;
}

41 Stein XML 2.41 XML and XSL and Javascript!
XSL is great - but it has NO GUI !!!!!!
Javascript is great - but it is tedious to use
Idea:
–process XML with XSL
–use HTML buttons, forms, etc.
–events trigger Javascript functions
–Javascript changes XSL in DOM
–XSL retransforms XML to HTML

42 Stein XML 2.42 XML+XSL+JS Example
function load() {
  var result = xmli.transformNode(xsli.documentElement);
  fakeDiv.innerHTML = result;
}
function change(value) {
  // parse XSL and make changes
  load()
}
0 1

43 Stein XML 2.43 An Example modems.xml -->... modems.xsl...... find.html... function load()... function selectmedium(key)... function selectman(key)... function inputrate(key)... function inputrange(key)......

