CHAPTER 5 EXTENSIBLE MARKUP LANGUAGE (XML)


1 CHAPTER 5 EXTENSIBLE MARKUP LANGUAGE (XML)

2 WHAT IS XML? XML is a subset of the Standard Generalized Markup Language (SGML), which is defined in ISO standard 8879:1986. The World Wide Web Consortium (W3C) designed XML to make it easy to interchange structured documents over the Internet. XML files always clearly mark where each logical part (called an element) of an interchanged document starts and ends. XML also defines how Internet Uniform Resource Locators can be used to identify component parts of XML data streams.

3 DIFFERENCE BETWEEN XML AND HTML
HTML: HTML includes over 100 predefined tags that let the author specify how each piece of content should be presented to the end user, for example <b></b>. XML: XML lets you create your own tags to describe the data between them. XML is primarily used for data storage and transfer, not for presentation.

4 HISTORY Problems with HTML: fixed set of tags and attributes
- Users cannot define new tags or attributes. One solution to the first of these problems: let each group of users define their own tags (with implied meanings), i.e., design their own "HTML"s using SGML. Problem with using SGML: it is too large and complex to use, and it is very difficult to build a parser for it. A better solution: define a light version of SGML.

5 WHY XML? XML is not a replacement for HTML.
HTML is a markup language used to describe the layout of any kind of information. XML is a meta-markup language that can be used to define markup languages, which in turn can define the meaning of specific kinds of information. XML is a very simple and universal way of storing and transferring data of any kind. XML does not predefine any tags. XML has no hidden specifications. All documents described with an XML-derived markup language can be parsed with a single parser.

6 FAMILY OF TECHNOLOGIES
First, let’s compare SGML to HTML. Just as SGML was designed for publishers, HTML was designed for information-rich documents, to be displayed via Web browsers. It has of course evolved beyond that, which led to the creation of XML, then XHTML. There really is no direct relationship to be made between XML and HTML, except that the real world use of HTML exposed the technical complications of SGML, and led to its simplification, as XML. Personally, I think a lot of the confusion about the nature of the relationship between HTML and SGML is the “ML” at the end. If HTML were called “Web Browser Document Format” (WBDF) there might be less confusion in attempting to understand how HTML and SGML are related.

7 WHAT IS XML USED FOR? Separate data from HTML: your data is stored outside your HTML. Exchange data: data can be exchanged between incompatible systems. XML and B2B: financial information can be exchanged over the Internet. Share and store data: plain text files can be used to share and store data. Make your data more useful: data is available to more users. Create new languages: WAP and WML.


9 XML Application: Web portal
Banner ad selected from database; layout is a simple HTML table template file; Yahoo service promotion; stock market info from markets around the world; login form communicates with user authentication application; photo and news copy from wire services; weather information syndicated from Weather.com.

10 XML CONTENT COMPONENTS
There are 3 components for XML content: the XML document, the DTD (Document Type Definition), and XSL (Extensible Stylesheet Language). The DTD and XSL do not need to be present in all cases.

11 THE SYNTAX RULES OF XML
The syntax rules of XML are very simple and logical.
1. All XML elements must have a closing tag.
   Incorrect: <p>This is a paragraph
   Correct: <p>This is another paragraph</p>
2. XML tags are case sensitive.
   Incorrect: <Message>This is incorrect</message>
   Correct: <message>This is correct</message>
3. XML elements must be properly nested.
   Incorrect: <b><i>This text is not properly nested</b></i>
   Correct: <b><i>This text is properly nested</i></b>

12 THE SYNTAX RULES OF XML
4. XML documents must have a root element.
   <root>
     <child>
       <subchild>.....</subchild>
     </child>
   </root>
5. XML attribute values must be quoted.
   Correct:
   <note date="12/11/2007">
     <to>Tove</to>
     <from>Jani</from>
   </note>
   Incorrect:
   <note date=12/11/2007>
     <to>Tove</to>
     <from>Jani</from>
   </note>
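The five rules above can be checked mechanically, because a conforming parser rejects any document that breaks them. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the shortened snippets are illustrative and not part of the original slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One (incorrect, correct) pair per rule, adapted from the slides.
examples = {
    "closing tag": ("<p>This is a paragraph", "<p>This is a paragraph</p>"),
    "case sensitivity": ("<Message>text</message>", "<message>text</message>"),
    "proper nesting": ("<b><i>text</b></i>", "<b><i>text</i></b>"),
    "single root": ("<a/><b/>", "<root><a/><b/></root>"),
    "quoted attributes": ("<note date=12/11/2007/>", '<note date="12/11/2007"/>'),
}

# For every rule, record whether the parser accepted each variant.
results = {
    rule: (is_well_formed(bad), is_well_formed(good))
    for rule, (bad, good) in examples.items()
}

for rule, (bad_accepted, good_accepted) in results.items():
    print(f"{rule}: incorrect accepted={bad_accepted}, correct accepted={good_accepted}")
```

Each incorrect variant raises a parse error and each correct variant parses, which is what "well-formed" means in practice.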

13 Structure of XML Documents
An XML document comprises the following basic units: Element: includes the start-tag, the enclosed character data and/or nested elements, and the end-tag. Attribute: defined in the start-tag to provide extra information about the element, in the form attribute_name="attribute_value". Entity references: in the form &name;, e.g., &lt; (<), &gt; (>), &amp; (&), &quot; ("), and &apos; (').

14 Structure of XML Documents
Character references: in the form &#decimal-number; or &#xhex-code;, standing for any Unicode character, e.g., both &#169; and &#xA9; can be used for the copyright symbol ©. PCDATA (Parsed Character Data): text between start-tag and end-tag that WILL be examined by the parser for entity references and nested elements. CDATA (Character Data): text between start-tag and end-tag that will NOT be examined by the parser for entity references and nested tags.
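To see how a parser treats entity references, character references, and CDATA, here is a short sketch assuming Python's standard xml.etree.ElementTree module. The element name t and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Parsed character data: the parser replaces entity and character references.
parsed = ET.fromstring(
    "<t>&lt;b&gt; &amp; &quot;x&quot; &apos;y&apos; &#169; &#xA9;</t>"
)
print(parsed.text)

# A CDATA section: markup characters inside it are kept as literal text,
# so no <b> child element is created.
cdata = ET.fromstring("<t><![CDATA[<b> & </b>]]></t>")
print(cdata.text)
print(len(cdata))  # number of child elements
```

The decimal and hexadecimal references both yield the same copyright character, and the CDATA content comes back unchanged with no child elements.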

15 XML EXAMPLE An example XML document
<?xml version="1.0" encoding="ISO "?>
<!-- Information -->
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
The first line in the document, the XML declaration, defines the XML version and the character encoding used in the document. After the comment, the next line describes the root element of the document. The next 4 lines describe 4 child elements of the root (to, from, heading, and body).

16 This is an example of an XML document for a bookstore:
An XML document is well-formed if its structure meets the XML specification, i.e., it is syntactically correct. A well-formed XML document exhibits a tree-like structure and can be processed by an XML processor. An example is the tree structure of "bookstore.xml".

17 XML ELEMENTS Element Naming
XML elements must follow these naming rules: Names can contain letters, numbers, and other characters. Avoid "-" and "." in names. For example??? Names must not start with a number or punctuation character. Names must not start with the letters xml (or XML or Xml ..). Names cannot contain spaces. Names should be short and simple, like this: <book_title>, not like this: <the_title_of_the_book>. The ":" should not be used in element names because it is reserved for namespaces (more later).

18 XML ELEMENTS XML Elements are Extensible
XML documents can be extended to carry more information.
<note>
  <to>Tove</to>
  <from>Jani</from>
  <body>Don't forget me this weekend!</body>
</note>
What is the output of the above XML? Imagine that the author of the XML document added some extra information to it:
<date> </date>
<heading>Reminder</heading>

19 XML ELEMENTS XML Elements have Relationships
Elements are related as parents and children.
<book>
  <title>My First XML</title>
  <prod id="33-657" media="paper"></prod>
  <chapter>Introduction to XML
    <para>What is HTML</para>
    <para>What is XML</para>
  </chapter>
  <chapter>XML Syntax
    <para>Elements must have a closing tag</para>
    <para>Elements must be properly nested</para>
  </chapter>
</book>
Book is the root element. Title, prod, and chapter are child elements of book. Book is the parent element of title, prod, and chapter. Title, prod, and chapter are siblings (or sister elements) because they have the same parent.
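The parent, child, and sibling relationships can be read directly off the parsed tree. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module and a well-formed copy of the book document; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<book>
  <title>My First XML</title>
  <prod id="33-657" media="paper"></prod>
  <chapter>Introduction to XML
    <para>What is HTML</para>
    <para>What is XML</para>
  </chapter>
  <chapter>XML Syntax
    <para>Elements must have a closing tag</para>
    <para>Elements must be properly nested</para>
  </chapter>
</book>"""

root = ET.fromstring(BOOK_XML)

# Children of the root element, in document order.
children = [child.tag for child in root]

# ElementTree elements keep no parent pointer, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}

# Siblings of <title> are the other children of its parent.
title = root.find("title")
siblings_of_title = [el.tag for el in parent_of[title] if el is not title]

first_para = root.find("chapter/para")

print("root:", root.tag)
print("children of root:", children)
print("siblings of title:", siblings_of_title)
print("parent of first para:", parent_of[first_para].tag)
```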

20 XML ELEMENTS Elements have Content
Elements can have different content types. An XML element is everything from (including) the element's start tag to (including) the element's end tag, example??? An element can have element content, mixed content, simple content, or empty content. An element can also have attributes. In the example before, book has element content, because it contains other elements. Chapter has mixed content because it contains both text and other elements. Para has simple content (or text content) because it contains only text. Prod has empty content, because it carries no information. In the example before only the prod element has attributes. The attribute named id has the value "33-657". The attribute named media has the value "paper". 
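The four content types can be told apart by looking at an element's children and text. The sketch below assumes Python's standard xml.etree.ElementTree module. The function content_type is a hypothetical helper, and treating whitespace-only text as "no text" is an assumption made for this illustration.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<book>
  <title>My First XML</title>
  <prod id="33-657" media="paper"></prod>
  <chapter>Introduction to XML
    <para>What is HTML</para>
    <para>What is XML</para>
  </chapter>
</book>"""


def content_type(element):
    """Classify an element as element, mixed, simple, or empty content."""
    has_children = len(element) > 0
    # Text directly inside the element: before the first child (text)
    # and after each child (that child's tail).
    pieces = [element.text] + [child.tail for child in element]
    has_text = any(piece and piece.strip() for piece in pieces)
    if has_children and has_text:
        return "mixed"
    if has_children:
        return "element"
    if has_text:
        return "simple"
    return "empty"


root = ET.fromstring(BOOK_XML)
kinds = {el.tag: content_type(el) for el in root.iter()}
attributes_of_prod = dict(root.find("prod").attrib)

for tag, kind in kinds.items():
    print(f"{tag}: {kind} content")
print("prod attributes:", attributes_of_prod)
```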

21 WRITING XML SCHEMAS To define a custom XML language, you first identify its elements and their attributes. This information is called a schema. There are two principal systems for writing schemas: DTD (older, widely used, limited syntax) and XML Schema (the successor of DTDs).

22 DOCUMENT TYPE DEFINITION
A Document Type Definition (DTD) is used to define the structure of an XML document. It describes the objects (such as elements, attributes, and entities) and the relationships between them. It specifies a set of constraints and establishes the trees that are acceptable in an XML document. This example demonstrates what a DTD could look like:
<!ELEMENT tutorials (tutorial)+>
<!ELEMENT tutorial (name,url)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT url (#PCDATA)>
<!ATTLIST tutorials type CDATA #REQUIRED>
PCDATA = Parsed Character Data

23 DOCTYPE Syntax To use a DTD within your XML document, you need to declare it. The DTD can be either 1. internal (written into the same document that it is being used in) or 2. external (located in another document). The basic syntax is:
Internal: <!DOCTYPE rootname [DTD]>
External: <!DOCTYPE rootname SYSTEM URL>

24 EXAMPLE INTERNAL DTD
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE tutorials [
<!ELEMENT tutorials (tutorial)+>
<!ELEMENT tutorial (name,url)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT url (#PCDATA)>
<!ATTLIST tutorials type CDATA #REQUIRED>
]>
<tutorials>
  <tutorial>
    <name>XML Tutorial</name>
    <url>...</url>
  </tutorial>
  <tutorial>
    <name>HTML Tutorial</name>
    <url>...</url>
  </tutorial>
</tutorials>

25 EXAMPLE EXTERNAL DTD
<?xml version="1.0" standalone="no"?>
<!DOCTYPE tutorials SYSTEM "tutorials.dtd">
<tutorials>
  <tutorial>
    <name>XML Tutorial</name>
    <url>...</url>
  </tutorial>
  <tutorial>
    <name>HTML Tutorial</name>
    <url>...</url>
  </tutorial>
</tutorials>
tutorials.dtd:
<!ELEMENT tutorials (tutorial)+>
<!ELEMENT tutorial (name,url)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT url (#PCDATA)>
<!ATTLIST tutorials type CDATA #REQUIRED>
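To make the constraints in tutorials.dtd concrete, the sketch below re-states them as hand-written checks in Python. This is not a DTD validator: the parser in Python's standard xml.etree.ElementTree module is non-validating, so the function check_tutorials is a hypothetical stand-in that mirrors the five declarations by hand. The type value and the example.com URLs are placeholders, not values from the slides.

```python
import xml.etree.ElementTree as ET


def check_tutorials(root):
    """Return a list of problems; an empty list means the structure matches."""
    problems = []
    if root.tag != "tutorials":
        problems.append("root element must be <tutorials>")
    # <!ATTLIST tutorials type CDATA #REQUIRED>
    if "type" not in root.attrib:
        problems.append("<tutorials> requires a type attribute")
    # <!ELEMENT tutorials (tutorial)+> : one or more tutorial children
    if len(root) == 0:
        problems.append("<tutorials> needs at least one <tutorial>")
    for child in root:
        if child.tag != "tutorial":
            problems.append(f"unexpected <{child.tag}> inside <tutorials>")
        # <!ELEMENT tutorial (name,url)> : exactly name, then url
        elif [grandchild.tag for grandchild in child] != ["name", "url"]:
            problems.append("<tutorial> must contain <name> followed by <url>")
    return problems


# Placeholder attribute value and URLs, chosen only for this illustration.
good = ET.fromstring(
    '<tutorials type="web">'
    "<tutorial><name>XML Tutorial</name><url>http://example.com/xml</url></tutorial>"
    "<tutorial><name>HTML Tutorial</name><url>http://example.com/html</url></tutorial>"
    "</tutorials>"
)
bad = ET.fromstring(
    "<tutorials>"
    "<tutorial><url>http://example.com/xml</url><name>XML Tutorial</name></tutorial>"
    "</tutorials>"
)

good_problems = check_tutorials(good)
bad_problems = check_tutorials(bad)
print("good:", good_problems)
print("bad:", bad_problems)
```

A validating parser would derive these checks from the DTD itself, which is the point of declaring the DTD in the document.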

26 USAGE & LIMITATION OF DTD
Usage of DTD: A DTD defines the structure of a certain type of XML document, which facilitates exchanging documents electronically between computer systems. It also helps in standardizing a certain class of documents. Limitations of DTD: DTD has its own syntax (inherited from SGML DTD) and requires a dedicated processing tool; it does not use XML syntax and cannot be handled by an XML processor. DTD does not support object-oriented concepts such as hierarchies and inheritance. DTD's data types are limited to text strings; it does not support other data types such as number or date. DTD does not support namespaces. DTD's occurrence indicators are limited to 0, 1, and many; they cannot express a specific number such as 8.

27 XML SCHEMA XML Schema was developed by the W3C to overcome the limitations of DTD and is meant to replace it. In brief, an XML Schema: is a well-formed XML document, which uses XML syntax; is object-oriented, supporting concepts like inheritance; supports namespaces; supports more data types; supports more element occurrence indicators. The XML Schema language is also referred to as XML Schema Definition (XSD).

28 XML SCHEMAS Problems with DTDs:
Syntax is different from XML, so DTDs cannot be parsed with an XML parser. It is confusing to deal with two different syntactic forms. DTDs do not allow specification of particular kinds of data. XML Schema is one of the alternatives to DTD. Two purposes: specify the structure of its instance XML documents; specify the data type of every element and attribute of its instance XML documents. Schemas are written using a namespace. Every XML schema has a single root, schema. The schema element must specify the namespace for schemas as its xmlns:xsd attribute.

29 XML SCHEMAS Defining an instance document
The root element must specify the namespaces it uses: the default namespace; the standard namespace for instances (XMLSchema-instance); and the location where the default namespace is defined, using the schemaLocation attribute, which is assigned two values. <planes xmlns = “ xmlns:xsi = “ xsi:schemaLocation = " planes.xsd" > Data Type Categories: Simple (strings only, no attributes and no nested elements). Complex (can have attributes and nested elements).

30 XML SCHEMAS XMLS defines over 40 data types
Primitive: string, boolean, float, decimal, … Derived: byte, integer, positiveInteger, … User-defined (derived) data types specify constraints on an existing type (the base type). Constraints are given in terms of facets (totalDigits, maxInclusive, etc.). Both simple and complex types can be either named or anonymous. DTDs define global elements (context is irrelevant). With XMLS, context is essential, and elements can be either: local, which appears inside an element that is a child of schema, or global, which appears as a child of schema.

31 XML SCHEMAS Defining a simple type
Use the element tag and set the name and type attributes:
<xsd:element name = "bird" type = "xsd:string" />
An instance could have:
<bird> Yellow-bellied sap sucker </bird>
Element values can be constant, specified with the fixed attribute:
fixed = "three-toed"
User-Defined Types: defined in a simpleType element, using facets specified in the content of a restriction element. Facet values are specified with the value attribute.
<xsd:simpleType name = "middleName" >
  <xsd:restriction base = "xsd:string" >
    <xsd:maxLength value = "20" />
  </xsd:restriction>
</xsd:simpleType>

32 XML SCHEMAS Categories of Complex Types
Element-only elements. Text-only elements. Mixed-content elements. Empty elements. Complex types are defined with the complexType element. Use the sequence tag for nested elements that must be in a particular order. Use the all tag if the order is not important.
<xsd:complexType name = "sports_car" >
  <xsd:sequence>
    <xsd:element name = "make" type = "xsd:string" />
    <xsd:element name = "model" type = "xsd:string" />
    <xsd:element name = "engine" type = "xsd:string" />
    <xsd:element name = "year" type = "xsd:string" />
  </xsd:sequence>
</xsd:complexType>

33 XML SCHEMAS Nested elements can include attributes that give the allowed number of occurrences (minOccurs, maxOccurs, unbounded). SHOW planes.xsd and planes.xml. We can define nested elements elsewhere:
<xsd:element name = "year" >
  <xsd:simpleType>
    <xsd:restriction base = "xsd:decimal" >
      <xsd:minInclusive value = "1990" />
      <xsd:maxInclusive value = "2003" />
    </xsd:restriction>
  </xsd:simpleType>
</xsd:element>
The global element can be referenced in the complex type with the ref attribute:
<xsd:element ref = "year" />
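Because an XML Schema is itself an XML document, an ordinary XML parser can read it. The sketch below assumes Python's standard xml.etree.ElementTree module: it parses the year declaration, pulls out the minInclusive and maxInclusive facets, and applies them by hand. This is not schema validation. The function year_is_valid is a hypothetical helper, and the wrapping xsd:schema element with the standard XML Schema namespace URI is added so that the fragment parses on its own.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal, InvalidOperation

# Prefix-to-URI map for the standard XML Schema namespace.
XSD_NS = {"xsd": "http://www.w3.org/2001/XMLSchema"}

YEAR_XSD = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="year">
    <xsd:simpleType>
      <xsd:restriction base="xsd:decimal">
        <xsd:minInclusive value="1990"/>
        <xsd:maxInclusive value="2003"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:element>
</xsd:schema>"""

schema = ET.fromstring(YEAR_XSD)
restriction = schema.find(
    "xsd:element[@name='year']/xsd:simpleType/xsd:restriction", XSD_NS
)
base_type = restriction.get("base")
low = Decimal(restriction.find("xsd:minInclusive", XSD_NS).get("value"))
high = Decimal(restriction.find("xsd:maxInclusive", XSD_NS).get("value"))


def year_is_valid(text):
    """Check a <year> value against the facets read from the schema."""
    try:
        return low <= Decimal(text.strip()) <= high
    except InvalidOperation:
        # Not a decimal number at all (or not comparable).
        return False


print("base type:", base_type)
print("allowed range:", low, "to", high)
for candidate in ("1995", " 2003 ", "1977", "abc"):
    print(repr(candidate), "->", year_is_valid(candidate))
```

Note that a year of 1977 falls outside the 1990 to 2003 range declared on this slide.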

34 XML SCHEMAS Validating Instances of XML Schemas
Can be done with several different tools. One of them is xsv, which is available from: Note: If the schema is incorrect (bad format), xsv reports that it cannot find the schema.

35 DISPLAYING RAW XML DOCUMENTS
There is no presentation information in an XML document. An XML browser should have a default style sheet for an XML document that does not specify one. You get a stylized listing of the XML. SHOW planes.xml on browser.

36 DISPLAYING XML DOCUMENTS WITH CSS
A CSS style sheet for an XML document is just a list of its tags and associated styles. The connection between an XML document and its style sheet is made through an xml-stylesheet processing instruction:
<?xml-stylesheet type = "text/css" href = "mydoc.css"?>
SHOW planesCSS.xml and planes.css

37 XSLT STYLE SHEETS XSL began as a standard for presentations of XML documents. It was split into two parts: XSLT (Transformations) and XSL-FO (Formatting Objects). XSLT uses style sheets to specify transformations. An XSLT processor merges an XML document into an XSLT style sheet. This merging is a template-driven process. An XSLT style sheet can specify page layout, page orientation, writing direction, margins, page numbering, etc. The processing instruction we used for connecting a CSS style sheet to an XML document is also used to connect an XSLT style sheet to an XML document:
<?xml-stylesheet type = "text/xsl" href = "XSLT style sheet"?>

38 XSLT STYLE SHEETS An example:
<?xml version = "1.0"?>
<!-- xslplane.xml -->
<?xml-stylesheet type = "text/xsl" href = "xslplane.xsl" ?>
<plane>
  <year> 1977 </year>
  <make> Cessna </make>
  <model> Skyhawk </model>
  <color> Light blue and white </color>
</plane>
An XSLT style sheet is an XML document with a single element, stylesheet, which defines namespaces:
<xsl:stylesheet xmlns:xsl = "

39 XSLT STYLE SHEETS If a style sheet matches the root element of the XML document, it is matched with the template:
<xsl:template match = "/">
A template can match any element, just by naming it (in place of /). XSLT elements include two different kinds of elements: those with content and those for which the content will be merged from the XML doc. Elements with content often represent HTML elements:
<span style = "font-size: 14"> Happy Easter! </span>
XSLT elements that represent HTML elements are simply copied to the merged document.

40 XSLT STYLE SHEETS The XSLT value-of element
Has no content. Uses a select attribute to specify part of the XML data to be merged into the XSLT document:
<xsl:value-of select = "CAR/ENGINE" />
The value of select can be any branch of the document tree. SHOW xslplane.xsl. The XSLT for-each element: used when an XML document has a sequence of the same elements. SHOW xslplanes.xml. SHOW xslplanes.xsl.
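The idea behind value-of and for-each can be imitated with path lookups on a parsed tree. The sketch below is only an analogy, not XSLT: Python's standard library has no XSLT processor, so findtext stands in for value-of and a loop over findall stands in for for-each. The planes document here is an assumption; its first plane reuses the values from the earlier plane example, and the second plane is invented for illustration.

```python
import xml.etree.ElementTree as ET

# The second <plane> is hypothetical sample data, not from the slides.
PLANES_XML = """<planes>
  <plane>
    <year> 1977 </year><make> Cessna </make><model> Skyhawk </model>
  </plane>
  <plane>
    <year> 1999 </year><make> ExampleMake </make><model> ExampleModel </model>
  </plane>
</planes>"""

root = ET.fromstring(PLANES_XML)

# Rough analogue of <xsl:value-of select = "plane/make" />:
# take the text of the first node on that branch of the tree.
first_make = root.findtext("plane/make").strip()

# Rough analogue of <xsl:for-each select = "plane">:
# repeat the same output template once per matching element.
rows = []
for plane in root.findall("plane"):
    year = plane.findtext("year").strip()
    make = plane.findtext("make").strip()
    model = plane.findtext("model").strip()
    rows.append(f"<li>{year} {make} {model}</li>")

html = "<ul>" + "".join(rows) + "</ul>"
print(first_make)
print(html)
```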

41 End of lecture.

