Presentation is loading. Please wait.

Presentation is loading. Please wait.

XML Overview Andy Scholand 2/29/00 For Georgia Tech courses: ME 6754 et al. 11/13/00 - minor updates (R. Peak)

Similar presentations

Presentation on theme: "XML Overview Andy Scholand 2/29/00 For Georgia Tech courses: ME 6754 et al. 11/13/00 - minor updates (R. Peak)"— Presentation transcript:

1 XML Overview Andy Scholand 2/29/00 For Georgia Tech courses: ME 6754 et al. 11/13/00 - minor updates (R. Peak)

2 2 Motivation - Why XML? zIndustry Focus (Resume Fodder!) zHTML is broken

3 3 Microsoft Goals for XML Incubate/Integrate/Innovate zMake XML mainstream zDeeply integrate XML with the platform zEnable applications zCreate opportunity zEvolve in steps “XML is a breakthrough technology” Bill Gates, Oct 97, Seybold

4 4 Motivation

5 5 HTML Evolution zStarted with very few tags … zLanguage evolved, as more tags were added yforms ytables yfonts yframes

6 6 HTML Problems zDesire for personalized tags zWant to put data into HTML form ymathematics, database entries, literary text, poems, purchase orders …. zHTML just isn’t designed for that!

7 7 Goals for XML zBetter search results zPresenting various views of same data zIntegration of data from different sources zEasy use over the Internet zEasy development of applications that process documents zDocuments readable by humans zDocuments easy to create zInterchange of data

8 8 XML Background zWhere does XML come from? zWhat is it (in general) zWho specifies it?

9 9 Idea: Back to the Basics zHTML was defined using SGML yStandard Generalized Markup Language yA meta-language for defining languages. zComplex, sophisticated, powerful zIdea: Use SGML

10 10 Problems with SGML zToo complicated a language zRules are too strict zNot good in a distributed environment zCan’t mix different data together

11 11 Idea (2): “Webified” SGML zNew eXtensible Markup Language: XML zCan use XML to define new languages zDistributes easily on the Web zCan mix different types of data together

12 12 Basic XML Rules zTags like in HTML, but... zTechnical details yAlways need end tags ySpecial empty-element tags yAlways quote attribute values

13 13 Just what is XML? zIt's a markup language used for annotating text zIt is concerned with logical structure yto identify sections, titles, section headers, chapters, paragraphs,… zIt is not concerned with appearance yyou say 'this is a subtitle' not 'this is in bold, 14pt, centered' yyou say 'this is an example' not 'this is in verbatim, indented by 5pts, ragged right'

14 14 Like this example ….. Title of text XHTML Document Heading of Page ….. And here is another paragraph, this one containing an inline image, and a line break.

15 15 Who Specifies XML? zeXtensible Markup Language zA text-based, data description meta language  Design your own markup language zA streamlined subset of SGML zDesigned for use on the Internet  A W3C Technical Recommendation (February 10, 1998)

16 16 The W3C zThe W3C is The World Wide Web Consortium, a voluntary association of companies and non-profit organisations. Membership costs serious money, confers voting rights. Complex procedures, with the Chairman (Tim Berners-Lee) holding all the high cards, but the big vendors (e.g. Microsoft, Adobe, Netscape) have a lot of power.

17 17 XML in Depth zWhat is an XML Document? zWhat is it - syntax

18 18 Documents zWell-formed document: obeys the syntax of XML zValid documents: a well-formed document that contains a proper document type declaration and obeys the constraints of that declaration

19 19 Structure of XML Document zProlog yXML declaration - information about document XML version and encoding Example: yDocument Type Declaration (DTD) xinternal - embedded in the document xexternal - pointer to the file with defined grammar zBody - structured data with one root element

20 20 XML Markup zComments zEntity references zCharacter references zProcessing instructions zCDATA sections zStart tags and end tags zEmpty elements XML markup specifies the structure of the document. All text that is not markup is the character data of the document:

21 21 Comments zComments make the structure of the document clearer zCan appear anywhere in a document zComments are not part of the document data (content of comments may be ignored by XML parsers) zExample: Smith

22 22 Entity References zEntity is a term that represents certain data zXML parser will substitute that data for the entity zEntities can be used to store binary data  Predefined entities: amp, lt, gt, apos, quot that stand for: &,, ‘, “ zExample: 5 < 8

23 23 Character References zCharacter reference is a character in the ISO 10646 character set, usually not directly accessible from available input devices zCharacter reference is specified as a hexadecimal or decimal code for a character zExample: #x000d is a carriage return

24 24 Processing Instructions zProcessing instructions are not part of the document’s data but must be passed through to the application yBy-passing the XML processor and delivering instructions directly to a process zPI begins with the target application identifier

25 25 CDATA Sections zCDATA section can be used to store marked-up text so that the markup is not evaluated zCDATA sections are useful if the user wants to store XML markup as a data zExample: 50 ]]>

26 26 Elements- Start & End Tags zFormat: start tag, data, and end tag zThe tags describe the data contained between tags zThe data within the tags is the value of the element The Fish Slapping Dance

27 27 Attributes zAttributes associate name-value pairs with elements zAttributes may appear only within start- tags and empty-element tags 829

28 28 Empty Elements  Empty element tag has special form zRepresent elements that have no content zExample:

29 29 Example XML Document Collaborative Virtual Workspace Spellman Peter 1997 collaboration framework virtual environments

30 30 Documents - Example zThe example shows plain XML document displayed in Internet Explorer 5.0 zURL:

31 31 The XML Success Story zXML has been a runaway success, on a much greater scale than its designers anticipated yNot for the reason they had hoped xBecause separation of form from content is right yBut for a reason they barely thought about xData must travel the web zTree structured documents are a useable transfer syntax for just about anything ySo data-oriented web users think of XML as a transfer mechanism for their data

32 32 XML is ASCII for the 21st century zASCII (ISO 646) solved a fundamental interchange problem yWhat bits encode what characters zUNICODE/ISO 10646 extends to world-wide zXML thought it was doing the same for simple tree-structured documents yThe emphasis in the XML design was on simplifying SGML to move it to the Web yXML didn't touch SGML's architectural vision xflexible linearization/transfer syntax xfor tree-structured documents with internal links

33 33 Take Two: Just what is XML? zIt's a markup language used for transferring data zIt is concerned with data models yto convert between application-appropriate and transfer-appropriate forms zIt is not concerned with human beings yIt's produced and consumed by programs

34 34 Application Data Part NamePart IDPriceInStock window001 $ 40yes muffler002 $ 150yes door003 $ 30no

35 35 Structured Markup window 40 yes muffler 150 yes

36 36 The Cambridge Communiqué  A W3C Note resulting from a meeting Aug’99 ( ) zSignaled a widespread acceptance of XML as a data layer: "XML has defined a transfer syntax for tree- structured documents; "Many data-oriented applications are being defined which build their own data structures on top of an XML document layer, effectively using XML documents as a transfer mechanism for structured data; "

37 37 The Communiqué, cont'd zCalled for support in XML Schema for specifying mapping between the XML document data model (or XML Infoset) and application-specific data models zXML Schema is a W3C recommendation- in-progress for defining the structure of document families zA grammar for markup structure artice -> title, subtitle?, section+ or POORDERHDR -> DATETIME, ORDERAMT

38 38 Other XML Basic Concepts zContent Descriptions yDTDs zNamespaces zContent Descriptions ySchemas zDocument Object Model zData Islands

39 39 Content Description zDocuments have structure yDocument types yDocument instances zStructure can be defined yInformally ySGML DTD yXML DTD ySchema using XML

40 40 Document Grammar Specifications Document Type Definition (DTD) zProvides formal definition of: ytags used in XML document yorder of the tags ycontainment relationships between tags ytypes of data contained in the elements zUsed for: XML document validation Describing grammar for other users

41 41 More about a DTD zControls the manipulation of data yRequires everyone to use the same set of tags the same way zAssociation with XML document: (External Reference)

42 42 Document Type Definition - Example

43 43 Internal vs. External DTD zExternal DTD - usual case Hello zInternal DTD Hello

44 44 DTD Notation Used for Element Content Declarations (expression) - expression treated as a unit (a, b) - sequence: a followed by b (a|b) - choice: a or b but not both a? - a or nothing a+ - one or more occurrences of a a* - zero or more occurrences of a Example: (title,author+,date,keywords?)*

45 45 Problem Namespaces Addresses zSometimes XML elements have the same name but mean different things The Fish Slapping Dance Mr  Namespaces are a mechanism for solving elements and attributes name conflicts

46 46 Namespaces zCollection of related XML elements and attributes identified by a URI reference NOTE: URIs are used to avoid collisions in namespace’s names zProvides unique names for elements and attributes by adding context to the tags zEnables reuse of grammar specifications

47 47 XML Namespaces zMakes XML truly extensible zEnables developers to mix data described by multiple schemas into one XML document instance  Schema components are reusable The Fish Slapping Dance Mr

48 48 Namespaces - Declaration zDefault declaration The Fish Slapping Dance zExplicit declaration The Fish Slapping Dance 14.76

49 49 Document Grammar Specifications zXML Schema yProvides greater functionality than Document Type Definition xDeveloped later (still in WD form) xAimed at Structured Data, not just Documents xTherefore, Full Data-type support yUses XML syntax yAssociation with XML document:

50 50 XML Schema zValidation of documents with markup from different namespaces zExtensibility ySchema authors can add their own elements and attributes to XML Schema documents zDefault element content zData types with possibility of constraint specifications zUser defined data types

51 51 XML Schema - MS XDR Example

52 52 More on XML Schema zFortunately, XML Schema is actually notated in XML itself yElements yAttributes yTypes zA type is a collection of constraints on element content and attribute values zA type may be either ysimple, for constraining string values ycomplex, for constraining elements which contain other elements

53 53 The XML Schema worldview zValidity and well-formedness are XML 1.0 concepts yThey are defined over character sequences zNamespace-compliant yIt's defined over character sequences too zXML Schema Schema-validity is layered on top of XML 1.0 well-formedness plus Namespaces yXML document Infosets = Validity + WF + NS

54 54 Valid and Well-Formed XML Documents zWell-formed document: ycontains one or more elements ythere is precisely one root element yall other tags nest within each other correctly zValid document: document complies with DTD/Schema ycontent model validity: order and nesting ydata type validity: correct type and other constraints satisfaction e.g. value range

55 55 Why validate? zDTD/Schema guarantees an interface zProducers validate to ensure they are providing what they promised zConsumers validate to check up on producers yand to protect their applications zApplication authors validate to simplify their task yLeave error detection and analysis to the validating parser

56 56 Programming API for XML documents Describes logical structure of document and the way a document is accessed and manipulated Defines naming convention for document components Enables straightforward access to the document components Implemented by the tools that enable manipulation of XML documents Document Object Model

57 57 Document Object Model - Example

58 58 Document Object Model - Example root = doc.getDocumentElement(); //print tag name System.out.println(root.getTagName()); //get first child element of the root docElem = (Node) root.getFirstChild(); //print tag name System.out.println(docElem.getTagName()); //print element type System.out.println(docElem.getNodeType()); //print node value docElem1 = (Text) docElem.getFirstChild(); System.out.println(docElem1.getNodeValue());

59 59 XML Data Islands zXML code embedded in an HTML page zEnables integration of XML with HTML page (XML data can be accessed by scripts)  Contains well-formed XML document within tags The Fish Slapping Dance 829 clipXML.documentElement.childNodes.item(0).text

60 60 Uses for XML zSpecific Languages for Specific Purposes zXML for Message Exchange zProgramming with XML ySAX vs. DOM Parsers yValidation zStoring objects as XML documents

61 61 Languages Based on XML zResource Description Framework (RDF) standard for metadata exchange, enables better content searching on the Web zSynchronized Multimedia Integration Language (SMIL) enables simple authoring of TV-like multimedia presentations such as training courses on the Web zScalable Vector Graphics (SVG) - a language for describing two-dimensional graphics in XML

62 62 Evolution of XML zMany XML languages, optimised for different roles yMathML -- for mathematics ySMIL -- for synchronised multimedia yRDF -- for describing “things” yXUL -- for describing the Navigator 5 user interface

63 63 The XML Family Tree SGML XML HTMLTEI...... XHTML SMIL MathML SpeechML RDF XUL

64 64 SMIL - General Information zSynchronized Multimedia Integration Language - allows integrating a set of multimedia objects into a synchronized multimedia presentation zSMIL provides mechanisms for ydescription of temporal behavior of the presentation ydescription of layout of the presentation on the screen yassociation of hyperlinks with media objects

65 65 SMIL - Basic Concepts zLayout: the layout of the visual clips can be defined, the clips can be assigned to the predefined separate regions zClip playback: the clips can be played from various sources and in different modes zClip temporal dependency: the clips can be played in parallel or sequential manner zHyperlinks: a clip can be connected to another clip or presentation

66 66 SMIL Example

67 67 SMIL - Example zThe example shows a presentation created using SMIL zThe presentation is displayed using RealPlayer from RealNetworks zThe file with presentation is available from: uction.html

68 68 MathML zDesigned to express semantics of maths zAlso can express layout zCut & paste into Maple, Mathematica z x 2 + 4x + 4 =0 x 2 + 4 &invisibletimes; x + 4 = 0

69 69 XHTML: NextGen HTML Title of text XHTML Document Heading of Page here is a paragraph of text. I will include inside this paragraph a bunch of wonky text so that it looks fancy. Here is another paragraph with inline emphasized text, and absolutely no sense of humor. And another paragraph, this one with an waste of time image, and a line break.

70 70 XHTML zJust like HTML, but based on XML rules zWill support integration of different data into a single document

71 71 Other Use: Data Abstraction zXML as a universal format for data interchange yMachines exchange data as XML-format messages yEliminates proprietary data formats yLots of XML processing software available

72 72 XML Messaging Factory Supplier Place order Response

73 73 Example Message Gold sprockel grommets, with matching hamster 12 …. Order something else …..

74 74 XML Tools zParsers, used for accessing, analyzing and transforming XML documents (Sun, Microsoft, IBM) yValidating parsers yNon-validating parsers zEditors, used for creating XML content (Microsoft, IBM) zJava APIs for XML (Sun)

75 75 XML Editor - Example zThe example demonstrates abilities of XML Notepad zThe XML Notepad is available from:

76 76 XML Parsers zValidation zApplication Programming Interfaces yDocument Object Model (DOM) - tree based xcompiles an XML document into an internal tree structure xallows application to navigate the tree ySimple API for XML (SAX) - event based xreports parsing events directly to the application through callbacks xdoes not usually build an internal tree

77 77 SAX Parser ErrorHandler DocumentHandler

78 78 Browsing Document with SAX Parser import org.xml.sax.HandlerBase; import org.xml.sax.AttributeList; public class MyHandler extends HandlerBase { String tag = "outside"; int indent = 0; public void startElement (String name, AttributeList atts) { int i; indent = indent + 2; for(i = 0; i < indent; i ++) { System.out.print(" "); } System.out.println("Start element: " + name); tag = "inside"; }

79 79 Browsing Document with SAX Parser import org.xml.sax.Parser; import org.xml.sax.DocumentHandler; import org.xml.sax.helpers.ParserFactory; public class XMLContent { static final String parserClass = ""; public static void main (String args[]) throws Exception { Parser parser = ParserFactory.makeParser(parserClass); DocumentHandler handler = new MyHandler(); parser.setDocumentHandler(handler); for (int i = 0; i < args.length; i++) { parser.parse(args[i]); }

80 80 Browsing Document with SAX Parser - Results Start element: doc Start element: publication Start element: title Content: Collaborative Virtual Workspace End element: title Start element: author Start element: lastname Content: Spellman End element: lastname Start element: firstname Content: Peter End element: firstname End element: author Start element: date Content: 1997 End element: date Start element: keywords Start element: keyword Content: collaboration framework

81 81 DOM Parser Document Element publication Element doc Element publication Element authorElement title TextElement lastnameElement firstname Text

82 82 Browsing Document with DOM Parser import org.w3c.dom.*; import; public class DOMAccess { static final String parserClass = ""; public static void main (String args[]) throws Exception { DOMParser parser = new DOMParser(); Document document; Element root; NodeList publications; Element publication; Element author; Element lastname; Text nameString; parser.parse(args[0]);

83 83 Browsing Document with DOM Parser document = parser.getDocument(); root = document.getDocumentElement(); System.out.println("Node name: " + root.getNodeName()); publications = root.getElementsByTagName("publication"); publication = (Element) publications.item(0); System.out.println("Node name: "+publication.getNodeName()); author = (Element) (publication.getElementsByTagName("author")).item(0); System.out.println("Node name: " + author.getNodeName()); lastname = (Element) (author.getElementsByTagName("lastname")).item(0); System.out.println("Node name: " + lastname.getNodeName()); nameString = (Text) lastname.getFirstChild(); System.out.println("Last Name: " + nameString.getData());

84 84 Browsing Document with DOM Parser - Results Node name: doc Node name: publication Node name: author Node name: lastname Last Name: Spellman

85 85 Validation of Document import org.xml.sax.Parser; import org.xml.sax.ErrorHandler; import org.xml.sax.helpers.ParserFactory; public class Validator { static final String parserClass = ""; public static void main (String args[]) throws Exception { Parser parser = ParserFactory.makeParser(parserClass); ErrorHandler handler = new ErrorReport(); parser.setErrorHandler(handler); parser.parse(args[0]); }

86 86 Validation of Document import org.xml.sax.ErrorHandler; import org.xml.sax.SAXException; import org.xml.sax.SAXParseException; public class ErrorReport implements ErrorHandler { /** Warning. */ public void warning(SAXParseException ex) { System.err.println("[Warning] "+ getLocationString(ex)+": "+ ex.getMessage()); } /** Error. */ public void error(SAXParseException ex) { System.err.println("[Error] "+ getLocationString(ex)+": "+ ex.getMessage()); }

87 87 Validation of Document - Modification Something More Interesting Spellman Peter 1997 collaboration framework virtual environments.

88 88 Validation of Document - Results D:\docs\cis\domexample>java Validator..\publications.xml [Error] publications.xml:15:24: Attribute, "number", is not declared in element, "publications". [Error] publications.xml:26:18: Element, "publications" is not declared in the DTD [Error] publications.xml:56:7: Element " " is not valid because it does not follow the rule, "(publication)*".)

89 89 Objects as XML Documents zCustomizing Java serialization mechanism   readObject( ywriteObject( zSolution for JavaBeans yuse of information gathered from BeanInfo class yuse of ‘set’ methods for each object field

90 90 XSL Stylesheets zOverview zProcess zResult tree construction zXSL template element zXSL patterns zImportant XSL elements zDisplaying XML data in Web browsers

91 91 Extensible Stylesheet Language (XSL) Overview zEnables display of XML by transforming XML into structure suitable for display, for example HTML zXSL transformations can be executed on the server to provide HTML documents for older browsers zProvides mechanisms for transformation of XML data from one schema to another zEnables converting XML documents through querying, sorting, and filtering zAssociation with XML document:

92 92 XSL - Style Sheets zContain a template of the desired result structure zIdentify data in the source document to insert into the template Example: Fragments of XSL document define how elements of XML document should be transformed into HTML document

93 93 XSL Process Source TreeResult Tree Interpretation of result tree

94 94 Process zConstruction of source tree from XML document zTransformation of source tree to result tree using stylesheet in XSL document zApplication of style rules to each node of result tree zDisplay of document by user agent using appropriate styling on a display, on paper or some other medium

95 95 XSL - Example zThe example illustrates how the XSL document is applied to XML document and displayed in the Web browser zThe example must be viewed using Internet Explorer 5.0 zURL: viewer/transform-viewer.htm

96 96 XSL Patterns zSimple query language for identifying nodes in an XML document zIdentify nodes depending on: ytype, name, and values yrelationship of the node to other nodes in the document clip clip/title clip/* clip/priceinfo/regprice

97 97 XSL Patterns - Example zThe example shows how the parts of the XML document can be identified using XSL patterns zThe example must be displayed in Internet Explorer 5.0 zURL: patterns.htm

98 98 Result Tree Construction zStylesheet - set of template rules zTemplate rule ypattern - identifies the source node to which the processing is applied ytemplate - the fragment to be instantiated to form a part of the result tree zCreation of result tree: finding the template rule for the root node and instantiating its template

99 99 XSL Template Element zDescribes template rule  match attribute - source node to which the rule applies zContent - the template, may contain XSL formatting vocabulary zConflict resolution ymost specific rule will be applied ypriorities (priority attribute of the rule) zNamespaces used to distinguish XSL instructions from other template content

100 100 XSL Patterns zMatching by name zMatching by ancestry zMatching several names zMatching the root zWildcard matches

101 101 XSL Patterns zMatching by ID zMatching by attribute zMatching by child zMatching by position

102 102 Other Important Elements zApplies template rules to the children of the node zExtracts value of element pattern zPerforms operation for each element described by pattern  Used in apply-template or for-each element, sorts children according to the key

103 103 XSL Stylesheet - Translation to HTML Publications

104 104 XSL Stylesheet - Translation to HTML Title: Author: Date: Keywords:

105 105 XSL Stylesheet - Translation to HTML

106 106 XML Before XSL X-form Collaborative Virtual Workspace Spellman Peter 1997 collaboration framework virtual environments

107 107 XSL Stylesheet - Translation to HTML

108 108 XML Benefits z21st Century ASCII zValidation zGood representation of tree data zMultiple views thru XSL

109 109 XML Benefits zSupported by all major vendors, including Microsoft, IBM, Netscape, Sun zEasy Client-side manipulation yDesigned to be easy to parse y26K of Java code (Aelfred) y5K of JavaScript zFree XML parsers available, even for commercial use

110 110 Resources zMicrosoft Website y y y zW3C XML Web Site y zXML Resources: zBooks: y“The XML Handbook” Charles Goldfarb (Prentice Hall) y“XML Applications” (Wrox Press)

111 111 Resources zMore tutorials and informations for developers: zResources at NPAC: /

112 112 References zXML Applications by Frank Boumphrey et al. zXML Complete by Steven Holtzner zExtensible Stylesheet Language (XSL) Version 1.0, W3C Working Draft 16-Dec-98 zExtensible Markup Language (XML) 1.0, W3C Recommendation 10-Feb-98 zSAX - zDocument Object Model (DOM) Level 1 Specification, Verson 1.0, W3C Recommendation 1-Oct-98 zVarious Info - zParsers and Tools -

Download ppt "XML Overview Andy Scholand 2/29/00 For Georgia Tech courses: ME 6754 et al. 11/13/00 - minor updates (R. Peak)"

Similar presentations

Ads by Google