Presentation is loading. Please wait.

Presentation is loading. Please wait.

XML Extensible Markup Language. What is XML? o XML stands for EXtensible Markup Language o XML is a markup language much like HTML o XML was designed.

Similar presentations


Presentation on theme: "XML Extensible Markup Language. What is XML? o XML stands for EXtensible Markup Language o XML is a markup language much like HTML o XML was designed."— Presentation transcript:

1 XML Extensible Markup Language

2 What is XML? o XML stands for EXtensible Markup Language o XML is a markup language much like HTML o XML was designed to describe data o XML tags are not predefined in XML. You must define your own tags. o XML uses a Document Type Definition (DTD) or XML Schema to describe the data o XML with a DTD or XML Schema is designed to be self-descriptive

3 XML Vs HTML XML was designed to carry data. XML is not a replacement for HTML. XML and HTML were designed with different goals:  XML was designed to describe data and to focus what data is.  HTML was designed to display data and to focus on how data looks. HTML is about displaying information, XML is about describing information.

4 XML was designed NOT to do anything XML was created to structure, store, and to send information. Example of a note to Tove from Jani, stored as XML: Example of a note to Tove from Jani, stored as XML: <to>Tove</to><from>Jani</from><heading>Remainder</heading> Don’t forget me this weekend! Don’t forget me this weekend! </note> The note has a header, and a message body. It also has sender and receiver information. But still, this XML document does not do anything. It’s just pure information wrapped in XML tags. Someone must write a piece of software to send, receive or display it.

5 XML is free and extensible XML tags are not predefined. You must invent your own tags. The tags used to markup HTML documents and the structure of HTML documents are predefined. The author of HTML documents can only use tags that are defined in HTML standard (like,, etc.). XML allows the author to define his own tags and his own document structure. The tags like and in the example above are not defined in any XML standard. These tags are invented by the author of the XML document.

6 XML is a complement to HTML XML is not a replacement for HTML. In the future Web development, it is most likely that XML will be used to describe the data, while HTML will be used to format and display the same data. XML is a cross-platform, software and hardware independent tool for transmitting information. XML is expected to be as important to the future of the Web as HTML has been to the foundation of the Web and to be the most common tool for all data manipulation and data transmition.

7 XML was not designed to display data. It is important to understand that XML was designed to store, carry, and exchange data.

8 XML Features XML is used to Exchange Data with XML, data can be exchanged between incompatible systems. XML can be used to Share Data with XML, plain text files can be used to share data. XML can be used to Store Data with XML, plain text files can be used to store data. XML can make your Data more Useful with XML, your data is available to more users. XML can be used to Create new Languages XML is the mother of WAP and WML The Wireless Markup Language (WML), used to markup Internet applications for handheld devices like mobile phones, is written in XML.

9 An example XML document XML documents use a self-describing and simple syntax. XML documents use a self-describing and simple syntax. <to>Tove</to><from>Jani</from><heading>Remainder</heading> Don’t forget me this weekend! Don’t forget me this weekend! </note> The first line in the document- describes the root element of the document (like: “this document is a note”).

10 An example XML document (Cont.) The next 4 lines describe 4 child elements of the root (to, from, heading, and body): Tove (to, from, heading, and body): Tove <from>Jani</from><heading>Remainder</heading> Don’t forget me this weekend! And finally the last line defines the end of the root Don’t forget me this weekend! And finally the last line defines the end of the root element:. From the example, don’t you agree that XML is pretty self- descriptive?

11 Tags All XML elements must have a closing tag With XML it is illegal to omit the closing tag. For example: this is a paragraph this is another paragraph For example: this is a paragraph this is another paragraph XML Tags are case sensitive Unlike HTML, XML tags are case sensitive. The tag is different from the tag : This is incorrect This is incorrect This is correct This is correct

12 Properly Nested All XML element must be properly nested This is not properly nested All XML element must be properly nested This is not properly nested This is properly nested This is properly nested Improper nesting of tags make no sense to XML. All XML document must have a root tag The first tag in an XML document is a root tag. All XML documents must contain a single tag pair to define the root element. All other elements must be nested within the root element.

13 Properly Nested (Cont.) All elements can have sub elements (children). Sub elements must be correctly nested within their parent element: parent element: <subchild>......</subchild> </root>

14 Attribute values must always be quoted With XML, it is illegal to omit quotation marks around attribute values. For example: For example: <to>Tove</to><from>Jani</from><heading>Remainder</heading> Don’t forget me this weekend! Don’t forget me this weekend! </note> This is incorrect! The date attribute in note element is not quoted

15 Attribute values must always be quoted (Cont.) <to>Tove</to><from>Jani</from><heading>Remainder</heading> Don’t forget me this weekend! Don’t forget me this weekend! </note> This is correct! This is correct: date=“12/11/02” This is incorrect: date=12/11/02

16 Comments in XML The syntax for writing comments in XML is similar to that of HTML:

17 XML Elements are Extensible XML documents can be extended to carry more information. E.g.<note><to>Tove</to><from>Jani</from><heading>Remainder</heading> Don’t forget me this weekend! Don’t forget me this weekend! </note> Imagine if later on the author decided to add some extra information to it:

18 XML Elements are Extensible (Cont.) <note><date>2002-10-12</date><to>Tove</to><from>Jani</from><heading>Remainder</heading> Don’t forget me this weekend! Don’t forget me this weekend! </note> Should the application break or crash? No. The application should still be able to find the,, and elements in the XML document and produce the same output. XML documents are Extensible!

19 XML Elements have Relationships Elements are related as parents and children. To understand XML terminology, you have to know how relationships between XML elements are named, and how element content is described.

20 XML Elements have Relationships (Cont.) For Instance, this is the description of a book: Book title: My First XML Chapter 1: Introduction to XML What is HTML What is XML Chapter 2: XML Syntax Elements must have a closing tags Elements must be properly nested

21 XML Elements have Relationships (Cont.) Then, this is the XML document that describes the book: <book> My First XML My First XML Introduction to XML Introduction to XML What is HTML What is HTML What is XML What is XML </chapter> XML Syntax XML Syntax Elements must have a closing tags Elements must have a closing tags Elements must be properly nested Elements must be properly nested </chapter></book>

22 Explanation Book is the root element Book is the root element Title and chapter are child elements of book Title and chapter are child elements of book Book is the parent element of the title and chapter Book is the parent element of the title and chapter Title and chapter are siblings (or sister elements) because they have the same parent. Title and chapter are siblings (or sister elements) because they have the same parent.

23 Elements have contents Elements can have different content types An XML element is everything from (including) the element’s start tag to (including) element’s end tag. An element can have element content, mixed contend, simple content, or empty content. An element can also have attributes. In the previous example, book has element content, because it contains other elements. Chapter has mixed contents because it contains both text and other elements. Para has simple content (or text content) because it contains only text. Prod has empty content, because it carries no information.

24 Elements have contents (Cont.) In that example, only the prod element has attributes. The attribute named id has the value “33-657”. The attribute named media has the value “paper”.

25 Element Naming XML elements must follow these naming rules: Names can contain letters, numbers, and other characters. Names can contain letters, numbers, and other characters. Names must not start with a number or punctuation character. Names must not start with a number or punctuation character. Names must not start with the letters xml (or XML or Xml …) Names must not start with the letters xml (or XML or Xml …) Any name can be used, no words are reserved, but the idea is to make names Descriptive-Names with an underscore separator are nice. Examples:,.

26 XML Attributes XML Elements can have attributes. Attributes are used to provide additional information about the elements. Attributes often provide information that is not a part of the data. For example: computer.gif computer.gif In the example, the file type is irrelevant to the data, but important to the software that wants to manipulate the element.

27 Quote Styles Attribute values must always be enclosed in quotes, but either double or single quote can be used. E.g. or Double quotes are the most common, but sometimes (if the attribute value itself contains quotes) it is necessary to use single quotes.

28 Elements Vs Attributes E.g.1: Anna Smith E.g.1: Anna Smith E.g.2: female Anna Smith E.g.2: female Anna Smith In the first example, sex is an attribute. In the last, sex is a child element. Both examples provide the same information. There are no rules about when to use attributes and when to use elements.

29 Should we avoid using attributes? Some problem of using attributes are: - Attributes can’t contain multiple values (child element can) - Attributes are not easily extendable (for future changes) - Attributes can’t describe structure (child elements can) - Attributes are more difficult to manipulate by program code - Attribute values are not easy to test against a DTD Use attributes only to provide information that is not relevant to the data! Don’t end up like this: relevant to the data! Don’t end up like this:

30 Metadata Metadata (or data about data) should be stored as attributes, and the data itself should be stored as elements. For example: Tove Jani Remainder Don’t forget me this weekend! Jani Tove Re: Remainder I will not! Tove Jani Remainder Don’t forget me this weekend! Jani Tove Re: Remainder I will not! </message>

31 Metadata (Cont.) The ID in the previous example is just a counter, or a unique identifier, to identify the different notes in the XML file, and not a part of the note data.

32 XML DTD A DTD defines the legal elements of an XML elements. A DTD defines the legal elements of an XML elements. The purpose of a DTD is to define the legal building blocks of an XML document. It defines the document structure with a list of legal elements. The purpose of a DTD is to define the legal building blocks of an XML document. It defines the document structure with a list of legal elements. A DTD can be declared inline in your XML document, or as an external reference. A DTD can be declared inline in your XML document, or as an external reference.

33 Internal DOCTYPE Declaration If the DTD is included in your XML source file, it should be wrapped in a DOCTYPE definition with the following syntax: wrapped in a DOCTYPE definition with the following syntax: Example XML document with a DTD: ]> Tove Jani Remainder Don’t forget me this weekend Example XML document with a DTD: ]> Tove Jani Remainder Don’t forget me this weekend

34 DTD Interpretation !DOCTYPE note (in line 2) defines that this is a document of the type note. !ELEMENT note (in line 3) defines the note element as having four elements: “to, from, heading, body”. !ELEMENT to (in line 4) defines the to element to be of type “#PCDATA”. !ELEMENT from (in line 5) defines the from element to be of type “PCDATA”. And so on...

35 External DOCTYPE declaration If the DTD is external to your XML source file, it should be wrapped in a DOCTYPE definition with the following syntax: following syntax: E.g. : Tove Jani Remainder Don’t forget me this weekend! E.g. : Tove Jani Remainder Don’t forget me this weekend!

36 “note.dtd” And then the copy of the file “note.dtd” will be like this:

37 Why use a DTD? With DTD, each of your XML file can carry a description of its own format with it. With a DTD, independent groups of people can agree to use a common DTD for interchanging data. Your application can use a standard DTD to verify that the data you received from the outside world is valid. You can also use DTD to verify your own data.

38 DTD-XML building blocks The building blocks of XML documents Seen from a DTD point of view, all XML documents are made up by following simple building blocks: - Elements - Tags - Attributes - Entities - PCDATA - CDATA

39 DTD-Elements In a DTD, XML elements are declared with a DTD element declaration. Declaring an element An element declaration has the following syntax: OR Declaring an element An element declaration has the following syntax: OR Empty elements Empty elements are declared with the category keyword EMPTY: Empty elements are declared with the category keyword EMPTY: example: example: in XML: in XML:

40 DTD-Elements (Cont.) Elements with only character data Elements with only character data are declared with #PCDATA inside parentheses: example: Elements with only character data Elements with only character data are declared with #PCDATA inside parentheses: example: Elements with any contents Elements declared with category keyword ANY, can contain any combination of parsable data: Elements with any contents Elements declared with category keyword ANY, can contain any combination of parsable data: example: example:

41 DTD-Elements (Cont.) Elements with children (sequences) Elements with one or more children are defined with the name of the children elements inside parentheses: When children are declared in a sequence separated by commas, the children must appear in the same sequence in the document. In a full declaration, the children must also be declared, and the children can also have children.

42 DTD-Elements (Cont.) The full declaration of the “note” element will be Declaring only one occurrence of the same element example: example: The + sign in the example declares that the child element message can occur zero or more times inside the “note” element.

43 DTD-Elements (Cont.) Declaring zero or more occurrences of the same element example: element example: The * sign in the example above declares that the child element message can occur zero or more times inside the element “note” element.

44 DTD-Elements (Cont.) Declaring zero or one occurrences of the same element example: The ? sign in the example above declares that the child element message can occur zero or one times inside the “note” element.

45 DTD-Elements (Cont.) Declaring either/or content Example: Example: The example declares that the “note” element must content a “to” element, a “from” element, a “header” element, and either a “message” or a “body” element.

46 DTD-Attributes In a DTD, Attributes are declared with an ATTLIST declaration. Declaring Attributes An attribute declaration has the following syntax: An attribute declaration has the following syntax: DTD example: DTD example: XML example: XML example:

47 DTD-Attributes (Cont.) The attribute-type can have the following values: ValueExplanation CDATAThe value is character data (en1|en2|..)The value must be one from an enumerated list IDThe value is an unique ID IDREFThe value is the ID of another element IDREFSThe value is a list of other ids NMTOKENThe value is a valid XML name NMTOKENSThe value is a list of valid XML names ENTITYThe value is an entity ENTITIESThe value is a list of entities NOTATIONThe value is a name of a notation xml:The value is a predefined xml value

48 DTD-Attributes (Cont.) The default-value can have the following values: ValueExplanation valueThe default value of the attribute #DEFAULT valueThe default value of the attribute #REQUIREDThe attribute value must be included in the element #IMPLIEDThe attribute does not have to be included #FIXED valueThe attribute value is fixed

49 Attribute Declaration Example DTD example: DTD example: XML example: XML example: The square element is defined to be an empty element with a “width” attribute of type CDATA. If no width specified, it has a default value of 0.

50 Default Attribute Value Syntax: Syntax: DTD example: DTD example: XML example: XML example: Specifying a default value for an attribute ensures that the attribute will get a value even if the author of the XML document does not include it.

51 Implied Attribute Syntax: Syntax: DTD example: DTD example: XML example: XML example: Use implied attribute if you don’t want to force the author to include an attribute, and you don’t have an option for default value.

52 Required Attribute Syntax: Syntax: DTD example: DTD example: XML example: XML example: Use a required attribute if you don’t have an option for a default value, but still want to force the attribute to be present.

53 Fixed Attribute Value Syntax: Syntax: DTD example: DTD example: XML example: Use a fixed attribute value when you want an attribute to have a fixed value without allowing the author to change it. If an author includes another value, the XML parser will return an error.

54 Enumerated Attribute Values Syntax: Syntax: DTD example: DTD example: XML example: XML example: or Use enumerated attribute values when you want the attribute values to be one of a fixed set of legal values.

55 DTD-Entities Entities are variables used to define shortcuts to common text. o Entity references are references to entities. o Entity can be declared internal or external. Internal Entity Declaration Syntax: Syntax: DTD example: <!ENTITY copyright “Copyright W3Schools.” XML example: &writer;&copyright; XML example: &writer;&copyright;

56 DTD-Entities (Cont.) External Entity Declaration Syntax: Syntax: DTD example: <!ENTITY copyright SYSTEM “http://www.w3schools.com/entities/entities.dtd”> XML example: <author>&writer;&copyright;</author>

57 XML Schema Example in DTD: ]> Can be re-written in XML Schema:

58 XML Schema (Cont.)

59 XML Schema (Cont.) </xsd:element> </xsd:complexType></xsd:schema>

60 XML Schema (Cont.) The benefit of XMLSchema over DTDs are: * It allows user-defined types to be created. * It allows the text that appears in elements to be constrained to specific types, such as numeric types in specific formats or even more complicated types such as lists or union. * It allows types to be restricted to create specialized types, for instance by specifying max and min values. * It allows complex types to be extended by using a form of inheritance. * It is a superset of DTDs. * It allows uniqueness and foreign key constraints. However, the price for these features is that XMLSchema is significantly more complicated than DTDs.

61 Querying and Transformation Tools for querying and transformation of XML data are essential to extract information from large bodies of XML data, and to convert data between different representations (schemas) in XML. Several languages provide increasing degrees of querying and transformation capabilities: - XPath - XSLT - XQuery

62 XPATH XPATH addresses part of an XML document by means of path expressions. A path expression in XPATH is a sequence of location steps separated by “/”. The result of a path expression is a set of values. For example: /bank-2/customer/name would return: Joe Lisa Mary would return: Joe Lisa Mary The expression: /bank-2/customer/name/text( ) would return the same names, but w/out the enclosing tags.

63 XPATH (Cont.) XPATH supports a number of other features: /bank-2/account[balance > 400] returns account elements with a balance value greater than 400, while /bank-2/account[balance >400]/@account-number returns the account numbers of those accounts. /bank-2/account/[customer/count( )>2] returns accounts with more than 2 customers. /bank-2/account/id(@owner) returns all customers referred to from the owners attribute account elements.

64 XPATH (Cont.) /bank-2/account/id(@owner)|/bank-2/loan/id(@borrower) gives customers with either accounts or loans. However, the | operator cannot be nested inside other operators. /bank-2//name finds any name element anywhere under the /bank-2 element, regardless the element in which it is contained. Note: the “//” described above is a short form for specifying “all descendants”, while “..” specifies parent.

65 XSLT A style sheet is a representation of formatting option for a document, usually stored outside the document itself. XML Stylesheet Language (XSL) was originally designed for generating HTML from XML. It includes a general-purpose transformation mechanism, called XSL Transformations (XSLT), which can be used to transform one XML document into another XML document, or to other formats such as HTML. XSLT transformations are expressed as a series of recursive rules, called templates.

66 XSLT (Cont.) A simple template for XSLT consists of a match part and a select part. For instance: For instance: The xls:template match statement contains an XPath expression that selects one or more nodes. The first template matches customer elements that occur as children of the bank-2 root element. The xsl:value-of statement enclosed in the match statement outputs values from the nodes in the result of the XPath expression. The first template outputs the value of the customer-name subelement; note that the value does not contain the element tag.

67 XSLT (Cont.) The second template matches all nodes. This is required because the default behavior of XSLT on subtrees of the input document that do not match any template is to copy the subtrees to the output document. Structural recursion is a key part of XSLT-When the template matches an element in the tree structure, XSLT can use structural recursion to apply template rules recursively by the xls:apply-templates directive, which appears inside other templates.

68 XQuery

69 The Application Program Interface

70 Storage of XML Data


Download ppt "XML Extensible Markup Language. What is XML? o XML stands for EXtensible Markup Language o XML is a markup language much like HTML o XML was designed."

Similar presentations


Ads by Google