Presentation is loading. Please wait.

Presentation is loading. Please wait.

Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 1 XML Schemas Roger L.

Similar presentations


Presentation on theme: "Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 1 XML Schemas Roger L."— Presentation transcript:

1 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 1 XML Schemas http://www.w3.org/TR/xmlschema-1/ http://www.w3.org/TR/xmlschema-2/ Roger L. Costello XML Technologies

2 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 2 Thanks Special thanks to the following people for their help in answering my endless barrage of questions: –Henry Thompson –Andrew Layman –Noah Mendelsohn –David Beech –Rick Jelliffe Many thanks the following people who have carefully reviewed this tutorial and notified me of bugs, typos, etc. –Oliver Becker –Kit Lueder –David Wang

3 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 3 Motivation People are dissatisfied with DTDs –It's not XML So, you write your XML document in one language and you specify the grammar of that document using another language (DTD) --> bad, inconsistent –Limited datatype capability DTDs support a very limited capability for specifying datatypes. You can't, for example, say "I want this element to be of datatype "int" Desire a set of datatypes compatible with those found in databases –DTD supports 10 datatypes; XML Schemas supports 37+ datatypes

4 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 4 Highlights of XML Schemas XML Schemas are a tremendous advancement over DTDs: –Enhanced datatypes 37+ versus 10 Can create your own datatypes Can define the lexical representation –Example "This element can contain strings of this form: ddd-dddd, where 'd' represents a 'digit'". –Written in XML enables use of XML tools –Object-oriented Can extend or restrict a type (derive new type definitions on the basis of old ones) –Can express sets - the child elements may occur in any order –Can specify element content as being unique (keys on content) and uniqueness within a region –Can define multiple elements with the same name but different content –Can define elements with null content –Can create equivalent elements - e.g., the "subway" element is equivalent to the "train" element.

5 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 5 Problem Convert BookCatalogue.dtd to the XML Schema notation –for this first example we will make a straight, one-to-one conversion, i.e., Title, Author, Date, ISBN, and Publisher will hold strings, just like is done in the DTD (next slide) –We will gradually modify the example to give more custom types to these elements

6 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 6 BookCatalogue.dtd

7 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 7 <!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/BookCatalogue" xmlns:cat="http://www.somewhere.org/BookCatalogue"> A book catalogue contains zero or more books A Book has a Title, Author, Date, ISBN, and a Publisher BookCatalogue1.xsd xsd = Xml-Schema Definition (explanations on succeeding pages)

8 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 8 <!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/BookCatalogue" xmlns:cat="http://www.somewhere.org/BookCatalogue"> A book catalogue contains zero or more books A Book has a Title, Author, Date, ISBN, and a Publisher BookCatalogue1.xsd This (default) namespace declaration says that all these elements come from this namespace (the XML Schema namespace). By virtue of the fact that they are associated with an element that comes from the XML Schema namespace, these attributes also come from the XML Schema namespace. (Recall that a default namespace doesn’t apply to attributes.)

9 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 9 <!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/BookCatalogue" xmlns:cat="http://www.somewhere.org/BookCatalogue"> A book catalogue contains zero or more books A Book has a Title, Author, Date, ISBN, and a Publisher BookCatalogue1.xsd Says that the elements and types defined in this schema are in this namespace. This will be used in instance documents to indicate that the elements contained in the instance document come from this namespace.

10 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 10 <!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/BookCatalogue" xmlns:cat="http://www.somewhere.org/BookCatalogue"> A book catalogue contains zero or more books A Book has a Title, Author, Date, ISBN, and a Publisher BookCatalogue1.xsd Here we are referencing a Book element. Where is that Book element defined? In what namespace? The cat: prefix indicates what namespace this element is defined in. cat: has been set to be the same as the targetNamespace.

11 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 11 xmlns:cat The schema element has an attribute xmlns:cat. The cat: namespace prefix, as we noted, is used for one element referencing another element within the schema. Clearly, this attribute will be schema-dependent, e.g., if I write a schema for cars I might have xmlns:car. Obviously, there is no way for xml-schema.dtd to account for all the possibilities. Thus, in the schema we supplement xml-schema.dtd with the internal subset declaration: <!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]>

12 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 12 <!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/BookCatalogue" xmlns:cat="http://www.somewhere.org/BookCatalogue"> A book catalogue contains zero or more books A Book has a Title, Author, Date, ISBN, and a Publisher BookCatalogue1.xsd The annotation/ info elements are optional. Their content is intended for human consumption, i.e., it's a comment.

13 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 13 Referencing a schema in an XML instance document <BookCatalogue xmlns ="http://www.somewhere.org/BookCatalogue" xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance" xsi:schemaLocation="http://www.somewhere.org/BookCatalogue http://www.somewhere.org/BookCatalogue/BookCatalogue1.xsd"> My Life and Times Paul McCartney July, 1998 94303-12021-43892 McMillin Publishing... In the BookCatalogue element (the root element) I declare that the schemaLocation attribute comes from the XML Schema Instance namespace (xsi). The value of schemaLocation is a pair of values - a namespace and the URI to a schema. When the XML parser processes this XML document it will use the schemaLocation pair of values to determine the XML Schema that it conforms to. It will retrieve the schema at the URI specified in schemaLocation (in this example, BookCatalogue.xsd) and then it will open up this schema document to confirm that its targetNamespace value matches the namespace value shown in schemaLocation. In this case it does. I declare (using a default namespace) that all the elements in this XML instance document comes from the same namespace.

14 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 14 Referencing a schema in an XML instance document BookCatalogue.xml BookCatalogue.xsd targetNamespace="A" schemaLocation="A A/BookCatalogue.xsd" - defines elements for namespace A - uses elements from namespace A

15 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 15 Note about schemaLocation schemaLocation is just a hint to the XML Parser "The choice of which schema to use ultimately lies with the consumer. If you as a consumer wish to rely on the schemaLocation idiom, then you should purchase/use processors that will honor that for you. The reason that some other processors might not provide that service to you is that they are designed to run in environments where it is impractical or undesirable to allow the document author to force reference to and use of some particular schema document." (Noah Mendelsohn) For this tutorial I will assume that we are using an XML Parser which uses the schemaLocation idiom.

16 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 16 Note multiple levels of checking BookCatalogue.xmlBookCatalogue1.xsd xml-schema.dtd Does the xml document conform to the rules laid out in the xml-schema? Is the xml-schema a valid xml document, i.e., does it conform to the rules laid out in the xml-schema DTD? Do Lab1

17 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 17 Alternate Schema On the following slide is an alternate (equivalent) way of representing the schema shown previously.

18 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 18 <!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/BookCatalogue" xmlns:cat="http://www.somewhere.org/BookCatalogue"> BookCatalogue2.xsd

19 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 19 <!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/BookCatalogue" xmlns:cat="http://www.somewhere.org/BookCatalogue"> BookCatalogue2.xsd Anonymous type (no name) Do Lab 2

20 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 20 Named Types The following slide shows an alternate (equivalent) schema which uses named types.

21 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 21 <!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/BookCatalogue" xmlns:cat="http://www.somewhere.org/BookCatalogue"> BookCatalogue3.xsd

22 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 22 Problem Defining the Date element to be of type string is unsatisfactory (it allows any string value to be input as the content of the Date element, including non-date strings). We would like to constrain the allowable content that Date can have. Modify the XML-Schema to restrict the content of the Date element to just date values. Similarly, constrain the content of the ISBN element to content of this form: ddddd-ddddd- ddddd or d-ddd-ddddd-d or d-dd-dddddd-d, where 'd' stands for 'digit'

23 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 23 <!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/BookCatalogue" xmlns:cat="http://www.somewhere.org/BookCatalogue"> BookCatalogue4.xsd

24 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 24 "Elements of this datatype will have string content. The string must conform to one of the three patterns listed. - The first pattern is: 5 digits followed by a dash followed by 5 digits followed by another dash followed by 5 more digits. - The second pattern is: 1 digit followed by a dash followed by 3 digits followed by another dash followed by 5 digits followed by another dash followed by 1 more digit. - The third pattern is: 1 digit followed by a dash followed by 2 digits followed by another dash followed by 6 digits followed by another dash followed by 1 more digit." These patterns are specified using Regular Expressions. In a few slides we will see more of the Regular Expression syntax.

25 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 25 Built-in Datatypes Primitive Datatypes –string –boolean –float –double –decimal –timeInstant –timeDuration –recurringInstant –binary –uri Atomic, built-in –"Hello World" –{true, false} – 12.56E3, 12, 12560, 0, -0, INF, -INF, NAN –7.08 –1999-12-04T20:12:00.000 – 1Y2M3DT10H30M12.3S –same format as timeInstant –010010111001 –http://www.somewhere.org Note: 'T' is the date/time separator INF = infinity NAN = not-a-number

26 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 26 Built-in Datatypes (cont.) Generated datatypes –language –NMTOKEN –NMTOKENS –Name –NCName –QNAME –ID –IDREF –IDREFS –ENTITY –ENTITIES –NOTATION –integer –non-negative-integer –positive-integer –non-positive-integer –negative-integer –date –time Subtype of primitive datatype – any valid xml:lang value, e.g., EN, FR,... –"house" –"house barn yard" –"hello-there" –part (no namespace qualifier) –book:part –Token, must be unique –Token which matches an ID –List of IDREF –456 –zero to infinity –one to infinity –negative infinity to zero –negative infinity to negative one –1999-05-21 –13:20:00.000 Do Lab 3

27 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 27 Creating your own Datatypes We can create new datatypes from existing datatypes (called source or basetype) by specifying values for one or more of the optional facets Example. The string primitive datatype has nine optional facets - pattern, enumeration, length, maxlength, maxInclusive, maxExclusive, minlength, minInclusive and minExclusive.

28 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 28 Example This creates a new datatype called 'TelephoneNumber'. Elements of this type can hold string values, but the string length must be exactly 8 characters long and the string must follow the pattern: ddd-dddd, where 'd' represents a 'digit'.

29 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 29 Facets of the Integer Datatype Facets: –maxInclusive –maxExclusive –minInclusive –minExclusive

30 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 30 Example This creates a new datatype called 'EarthSurfaceElevation'. Elements of this type can hold an integer. However, the integer must have a value between -1246 and 29028, inclusive.

31 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 31 General Form of Datatype/Facet Usage... Facets: - minInclusive - maxInclusive - minExclusive - maxExclusive - length - minlength - maxlength - pattern - enumeration... Sources: - string - boolean - float - double - decimal - timeInstant - timeDuration - recurringInstant - binary - uri... Do Lab 4

32 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 32 Regular Expressions Recall that the string datatype has a pattern facet. The value of a pattern facet is a regular expression. Below are some examples of regular expressions: Regular Expression - Chapter \d - a*b - [xyz]b - a?b - a+b - [a-c]x Example - Chapter 1 - b, ab, aab, aaab, … - xb, yb, zb - b, ab - ab, aab, aaab, … - ax, bx, cx

33 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 33 Regular Expressions (cont.) Regular Expression –[a-c]x –[-ac]x –[ac-]x –[^0-9]x –\Dx –Chapter\s\d –(ho){2} there –(ho\s){2} there –.abc –(a|b)+x Example –ax, bx, cx –-x, ax, cx –ax, cx, -x – any non-digit char followed by x – Chapter followed by a blank followed by a digit –hoho there – any char followed by abc –ax, bx, aax, bbx, abx, bax,...

34 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 34 Regular Expressions (cont.) a{1,3}x a{2,}x \w\s\w ax, aax, aaax aax, aaax, aaaax, … word (alphanumeric plus dash) followed by a space followed by a word

35 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 35 Example R.E. [1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5] 0 to 99 100 to 199 200 to 249250 to 255 This regular expression restricts a string to have values between 0 and 255. … Such a R.E. might be useful in describing an IP address...

36 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 36 IP Datatype Definition <pattern value="(([1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\.){3} ([1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])"> Datatype for representing IP addresses. Examples, 129.83.64.255, 64.128.2.71, etc. This datatype restricts each field of the IP address to have a value between zero and 255, i.e., [0-255].[0-255].[0-255].[0-255] Note: in the value attribute (above) the regular expression has been split over two lines. This is for readability purposes only. In practice the R.E. would all be on one line.

37 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 37 Regular Expression Parser Want to test your skill in writing regular expressions? Go to: http://www.xfront.org/xml-schema/ –Dan Potter has created a nice tool which allows you to enter a regular expression and then enter a string. The parser will then determine if your string conforms to your regular expression. Do Lab 5

38 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 38 Derived Types We can do a form of subclassing the type definitions. We call this "derived types" –derive by extension: extend the parent type with more elements –derive by restriction: constrain the parent type by constraining some of the elements to have a more restricted range of values, or a more restricted number of occurrences.

39 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 39 <!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/BookCatalogue" xmlns:cat="http://www.somewhere.org/BookCatalogue"> BookCatalogue5.xsd

40 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 40 Elements of type Book will have 5 child elements - Title, Author, Date, ISBN, and Publisher.

41 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 41 Using Derived Types If we declare an element to be of type Publication then in the XML instance document that element's content can be either a Publication or a Book (since Book is a Publication).

42 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 42 <!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/Catalogue" xmlns:cat="http://www.somewhere.org/Catalogue"> BookCatalogue6.xsd

43 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 43 <Catalogue xmlns ="http://www.somewhere.org/Catalogue" xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance" xsi:schemaLocation="http://www.somewhere.org/Catalogue http://www.somewhere.org/Catalogue/BookCatalogue6.xsd"> Staying Young Forever Karin Granstrom Jordan, M.D. December, 1999 Illusions The Adventures of a Reluctant Messiah Richard Bach 1977 0-440-34319-4 Dell Publishing Co. The First and Last Freedom J. Krishnamurti 1954 0-06-064831-7 Harper & Row BookCatalogue2.xml

44 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 44 Illusions The Adventures of a Reluctant Messiah Richard Bach 1977 0-440-34319-4 Dell Publishing Co. "The CatalogueEntry element is declared to be of type Publication. Book is derived from Publication. Therefore, Book is a Publication. Thus, the content of CatalogueEntry can be a Book. However, to indicate that the content is not the source type, but rather a derived type, we need to specify the derived type that is being used. The attribute 'type' comes from the XML Schema Instance (xsi) namespace."

45 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 45 Why is xsi:type Needed? Why, in an instance document, do we need to indicate the derived type being used? Suppose that there are several types derived (by extension) from Publication, and some of the extended element declarations are optional. In the instance document, if xsi:type was not used, it might get very difficult for an XML Parser to determine which derived type is being used. Do Lab 6

46 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 46 Derive by Restriction Elements of type SingleAuthorPublication will have 3 child elements - Title, Author, and Date. There must be exactly one Author element.

47 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 47 Restricting Derivations Sometimes we may want to create a type and disallow all derivations of it, or just disallow extensions of it, or restrictions of it. This type cannot be extended nor restricted This type cannot be restricted This type cannot be extended

48 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 48 exact Types If you define a type to be exact, then other types may derive from it. However, in the instance document derived types may not be used in its stead.

49 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 49 Illusions The Adventures of a Reluctant Messiah Richard Bach 1977 0-440-34319-4 Dell Publishing Co. Schema: Instance doc: This permits Publication elements, as well as types derived from Publication to be used as a child of Catalogue, e.g., Book

50 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 50 Illusions The Adventures of a Reluctant Messiah Richard Bach 1977 0-440-34319-4 Dell Publishing Co. Schema: Instance doc: This prohibits the use of types derived by extension to be used as children of CatalogueEntry, e.g., this is not allowed

51 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 51 exact Types exact="extension" –Prohibits types derived by extension to be used in its stead in instance documents exact="restriction" –Prohibits types derived by restriction to be used in its stead in instance documents exact="#all" –Prohibits all derived types to be used in its stead in instance documents

52 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 52 Equivalence Oftentimes in daily conversation there are several ways to express something. –In Boston we use the words "T" and "subway" interchangeably. For example, "we took the T into town", or "we took the subway into town". Thus, "T" and "subway" are equivalent. Which one is used may depend upon what part of the state you live in, what mood you're in, or any number of factors. We would like to be able to express this "equivalence" capability in XML Schemas. –We would like to be able to declare in the schema an element called "subway", an element called "T", and state that "T" is equivalent to "subway". Thus, instance documents can use either "subway" or "T", depending on their preference.

53 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 53 equivClass We can create an element (called the exemplar) and then create other elements which state that they are equivalent to the exemplar. subway is the exemplar T is equivalent. So what's the big deal? - Where ever the exemplar can be used in an instance document, any equivalent element can also be used!

54 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 54 Red Line Schema: Instance doc: Note: the type of every element of an equivalence class must be the same as or derived from the type of the exemplar element. Red Line Alternative Instance doc:

55 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 55 Enabling Interoperability One very hard problem in today's world is how to enable different domains to communicate effectively when they don't all speak the same language. The equivClass is a good step towards enabling interoperability. –It makes explicit the equivalence of one element with another element.

56 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 56 Enabling Interoperability Imagine that you create an application. The application is designed to input XML documents. Suppose that within the XML document the application expects to see a element. Now suppose that it receives an XML document, but it contains a element. The application can go to the XML Schema and see that is equivalent to. Thus, it accepts and understands the XML document, even though it is using a different vocabulary than what the application was written to. Wow! Further, over time new element declarations can be added to the schema. For example, Without any change to the application, the application will be able to process XML documents with a element by simply consulting the schema. Mike Los coined the term "interoperability schema" to describe schemas used in the fashion described above. Do Lab 7

57 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 57 Abstract Elements You can declare an element to be abstract –Example. An abstract element is a kind of template/placeholder. If an element is abstract then in the XML instance document that element may not appear. –Example. may not appear in an instance document. However, elements that are equivalent to the abstract type may appear in its place.

58 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 58 <!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/Catalogue" xmlns:cat="http://www.somewhere.org/Catalogue"> BookCatalogue7.xsd Since the Publication element is abstract, only equivalent elements can appear as children of Catalogue. The Book and Magazine elements are equivalent to the Publication element.

59 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 59 <Catalogue xmlns ="http://www.somewhere.org/Catalogue" xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance" xsi:schemaLocation="http://www.somewhere.org/Catalogue http://www.somewhere.org/Catalogue/BookCatalogue7.xsd"> Natural Health December, 1999 Illusions The Adventures of a Reluctant Messiah Richard Bach 1977 0-440-34319-4 Dell Publishing Co. The First and Last Freedom J. Krishnamurti 1954 0-06-064831-7 Harper & Row BookCatalogue3.xml An XML Instance Document of BookCatalogue7.xsd

60 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 60 Attributes On the next slide I show a version of the BookCatalogue DTD that uses attributes. On the following slide I show how this is implemented using XML Schemas.

61 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 61 <!-- A Book has three attributes - Category, InStock, and Reviewer. Category must be either "autobiography", "non-fiction", or "fiction". A value must be supplied for this attribute whenever a Book element is used within a document. InStock can be either ”true" or ”false". If no value is supplied it defaults to ”false". Reviewer contains the name of the reviewer. It defaults to "" if no value is supplied --> <!ATTLIST Book Category (autobiography | non-fiction | fiction) #REQUIRED InStock (true | false) ”false" Reviewer CDATA ""> BookCatalogue2.dtd

62 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 62 <!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/BookCatalogue" xmlns:cat="http://www.somewhere.org/BookCatalogue"> A book catalogue contains zero or more books A Book has a Title, one or more Authors, a Date, an ISBN, and a Publisher cont. --->

63 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 63 A Book has three attributes - Category, InStock, and Reviewer. Category must be either "autobiography", "non-fiction", or "fiction". A value must be supplied for this attribute whenever a Book element is used within a document. InStock can be either "yes" or "no". If no value is supplied it defaults to "no". Reviewer contains the name of the reviewer. It defaults to "" if no value is supplied. BookCatalogue8.xsd

64 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 64 "The default value for maxOccurs is 1. Thus, this attribute is REQUIRED."

65 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 65 Alternate Schema On the next slide is another way of expressing the last example.

66 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 66 <!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/BookCatalogue" xmlns:cat="http://www.somewhere.org/BookCatalogue"> BookCatalogue9.xsd

67 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 67 Note about Attributes The attribute declarations always come last, after the element declarations. Do Lab 8

68 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 68 group Element The group element enables you to group together element declarations. Note: the group element is just for grouping together element declarations, no attribute declarations allowed.

69 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 69 <!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/BookCatalogue" xmlns:cat="http://www.somewhere.org/BookCatalogue"> BookCatalogue10.xsd

70 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 70 Expressing Alternates <!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/Examples" xmlns:sig="http://www.somewhere.org/Examples"> DTD: XML Schema:

71 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 71 Expressing Repeats <!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/Examples" xmlns:bi="http://www.somewhere.org/Examples"> DTD: XML Schema:

72 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 72 Expressing Any Order <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/Examples"> XML Schema: Problem: create an element, Book, which contains Author, Title, Date, ISBN, and Publisher, in any order (Note: this cannot be done with DTDs). order="all" means that Book must contain all five child elements, but they may occur in any order. Note: minOccurs, maxOccurs are both fixed to a value of "1". Do Lab 9

73 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 73 Empty Element <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/Examples"> Schema: Instance doc: Do Lab 10

74 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 74 Uniqueness DTDs provided the ID attribute datatype for uniqueness (i.e., an ID value must be unique throughout the entire document, and the XML parser enforces this). XML Schema has much enhanced uniqueness capabilities: –enables you to define element content to be unique. –enables you to define non-ID attributes to be unique. –enables you to define a combination of element content and attributes to be unique. –enables you to distinguish between unique and key. –enables you to declare the range of the document over which something is unique

75 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 75 unique vs key Key: an element/attribute (or combination thereof) which is defined to be a key must –always be present (minOccurs must be greater than zero) –be non-nullable (i.e., nullable="false") –be unique Key implies unique, but unique does not imply key

76 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 76 Illusions The Adventures of a Reluctant Messiah Richard Bach 1977 0-440-34319-4 Dell Publishing Co. The First and Last Freedom J. Krishnamurti 1954 0-06-064831-7 Harper & Row Roger Costello <Book titleRef="Illusions The Adventures of a Reluctant Messiah" categoryRef="fiction"/> + = key + = keyref

77 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 77 <!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/Library" xmlns:lib="http://www.somewhere.org/Library">

78 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 78 Library/BookCatalogue/Book Title @Category Library/CheckoutRegister/Book @titleRef @categoryRef BookCatalogue11.xsd

79 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 79 Defining a Key Library/BookCatalogue/Book Title @Category "I define that the combination of the contents of Title plus the value of the attribute Category is to be unique throughout an XML instance document. The Title element and Category attribute lie within the XPath expression shown in the selector element." Note: you can have one or more field elements. The key will be the combination of all the fields.

80 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 80 Defining a Reference to a Key Library/CheckoutRegister/Book @titleRef @categoryRef "I define that the values in the two attributes, titleRef and categoryRef, combined, are a reference to a key defined by bookKey. The two attributes are located at the XPath expression found in the selector element." Note: if there are 2 fields in the key, then there must be 2 fields in the keyref, if there are 3 fields in the key, then there must be 3 fields in the keyref, etc. Further, the fields in the keyref must match in type and position to the key.

81 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 81 Defining Uniqueness Library/BookCatalogue/Book Title @Category "I define that the combination of the contents of Title plus the value of the attribute Category is to be unique throughout an XML instance document. The Title element and Category attribute lie within the XPath expression shown in the selector element."

82 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 82 Specifying scope of uniqueness in XML Schemas? - The key/keyref/unique elements may be placed anywhere in your schema. - Where you place them determines the scope of the uniqueness. - In our example we placed the key/keyref elements at the top-level (direct child of the schema element). Thus, we are stating that in an instance document the uniqueness is with respect to the entire document. - On the other hand, if we were to place the key/keyref elements as a child of the Book element then the uniqueness will have a scope of just the Book element. Thus, over the entire instance document there may be repeats, but within any Book element it will be unique.

83 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 83 any Element The any element allows any well-formed XML. The free-form element can contain any well-formed XML (WFXML)." This is great!

84 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 84 anyAttribute The anyAttribute allows any attribute The free-form element can have any attribute.

85 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 85 (Almost) any Element/Attribute allows any well-formed XML element, provided the element is in another namespace than the one we're defining. allows any well-formed XML element, provided it's from the specified namespace. allows any well-formed XML element, provided it's from the namespace that we're defining. allows any attribute, provided the attribute is in another namespace than the one we're defining. allows any attribute, provided it's from the specified namespace. allows any attribute, provided it's from the namespace that we're defining.

86 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 86 Validating an XML Instance Document Validation can apply to the entire XML instance document, or to a single element.

87 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 87 <Library xmlns:book="http://www.somewhere.org/Examples" xmlns:person="http://www.somewhere-else.org/Person" xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance" xsi:schemaLocation= "http://www.somewhere.org/Examples http://www.somewhere.org/ Examples/Book.xsd http://www.somewhere-else.org/Person http://www.somewhere-else.org/Person/Person.xsd"> Illusions The Adventures of a Reluctant Messiah Richard Bach 1977 0-440-34319-4 Dell Publishing Co. The First and Last Freedom J. Krishnamurti 1954 0-06-064831-7 Harper & Row John Doe 123-45-6789 Sally Smith 000-11-2345 Library.xml Validating using two schemas

88 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 88 Assembling a Schema from Multiple Schema Documents The include element allows you to bring in schema definitions from other schema –They must all have the same namespace –The net effect of include is as though you had typed all the definitions in directly into the containing schema … Book.xsd Person.xsd Library.xsd

89 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 89 <!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/Examples" xmlns:lib="http://www.somewhere.org/Examples"> Library.xsd

90 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 90 import Element The import element allows you to reference elements in another namespace <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/Examples" xmlns:html="http://www.w3.org/1999/XHTML"> <import namespace="http://www.w3.org/1999/XHTML" schemaLocation="http://www.w3.org/XHTML/xhtml.xsd"/>... …

91 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 91 Declaration vs Definition You make declarations of things that will be used in an XML instance document. You make definitions of things that are just used in the schema document; a definition creates a new type Declarations: - element declarations - attribute declarations Definitions: - type definitions - attribute group definitions - model group definitions

92 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 92 Symbol Space Each type definition creates a new symbol space Top-level element declarations are in the top-level symbol space

93 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 93 <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/Examples"> 3 different titles

94 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 94 Version Management The schema element has an optional attribute, version, which you may use to indicate the version of your schema (for private version documentation of the schema) <schema xmlns="http://www.w3.org/1999/XMLSchema" targetNamespace="http://www.somewhere.org/Examples" version="1.0">...

95 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 95 or ? When do you use the type element and when do you use the datatype element? –Use the type element when there are elements and/or attributes –Use the datatype element when it is a primitive type (string, integer, etc)

96 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 96 null Content You can indicate in a schema that an element may be null in the instance document. Empty content vs null: –Empty: an element with a type of content="empty" is constrained to have no content. –null: an instance document element may indicate no value is available by setting an attribute - xsi:null - equal to 'true' John Doe XML Schema: XML instance document: The content of middle can be a NMTOKEN value or, we can indicate that its content is undefined.

97 Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 97 ur-type The ur-type is the source for all types which do not specify a value for the source attribute. It is the type for all elements which do not specify a type. –Example:


Download ppt "Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 1 XML Schemas Roger L."

Similar presentations


Ads by Google