Copyright © [2002]. Roger L. Costello. All Rights Reserved. 1 XML Schemas (Primer)


1 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 1 XML Schemas http://www.w3.org/TR/xmlschema-0/ (Primer) http://www.w3.org/TR/xmlschema-1/ (Structures) http://www.w3.org/TR/xmlschema-2/ (Datatypes) Roger L. Costello XML Technologies Course With changes by Thomas Krichel

2 Roger's acknowledgements. Special thanks to the following people for their help in answering my unending questions and/or for finding errors and making suggestions: –Henry Thompson –Robert Melskens –Jonathan Rich –Francis Norton –Rick Jelliffe –Curt Arnold

3 Thomas's acknowledgments: Roger Costello, Mitre Corp.

4 What is XML Schema? XML Schema is a vocabulary for expressing constraints on the validity of an XML document. A piece of XML is valid if it satisfies the constraints expressed in another XML file, the schema file. The idea is to check whether the XML file is fit for a certain purpose.

5 Example: a location with latitude 32.904237, longitude 73.620290 and uncertainty 2. To be valid, this XML snippet must meet all the following constraints:
1. The location must consist of a latitude, followed by a longitude, followed by an indication of the uncertainty of the lat/lon measurements.
2. The latitude must be a decimal with a value between -90 and +90.
3. The longitude must be a decimal with a value between -180 and +180.
4. For both latitude and longitude there must be exactly six digits to the right of the decimal point.
5. The value of uncertainty must be a non-negative integer.
6. The uncertainty units must be either meters or feet.
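The six constraints above can be sketched as a schema. This is an illustration, not the slide's original code (which was lost from this transcript): the element names location, latitude, longitude and uncertainty and the units attribute are inferred from the constraint list, the type names are hypothetical, and making units required is an assumption. The instance it is meant to validate:

```xml
<?xml version="1.0"?>
<location>
  <latitude>32.904237</latitude>
  <longitude>73.620290</longitude>
  <uncertainty units="meters">2</uncertainty>
</location>
```

A possible schema:

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Constraints 2 and 4: decimal in [-90, 90] with exactly six fraction digits.
       A pattern is used because fractionDigits only sets an upper limit. -->
  <xsd:simpleType name="LatitudeType">
    <xsd:restriction base="xsd:decimal">
      <xsd:minInclusive value="-90"/>
      <xsd:maxInclusive value="90"/>
      <xsd:pattern value="[+\-]?\d+\.\d{6}"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- Constraints 3 and 4 -->
  <xsd:simpleType name="LongitudeType">
    <xsd:restriction base="xsd:decimal">
      <xsd:minInclusive value="-180"/>
      <xsd:maxInclusive value="180"/>
      <xsd:pattern value="[+\-]?\d+\.\d{6}"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- Constraint 1: latitude, then longitude, then uncertainty -->
  <xsd:element name="location">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="latitude" type="LatitudeType"/>
        <xsd:element name="longitude" type="LongitudeType"/>
        <xsd:element name="uncertainty">
          <xsd:complexType>
            <xsd:simpleContent>
              <!-- Constraint 5: non-negative integer content -->
              <xsd:extension base="xsd:nonNegativeInteger">
                <!-- Constraint 6: units is meters or feet -->
                <xsd:attribute name="units" use="required">
                  <xsd:simpleType>
                    <xsd:restriction base="xsd:string">
                      <xsd:enumeration value="meters"/>
                      <xsd:enumeration value="feet"/>
                    </xsd:restriction>
                  </xsd:simpleType>
                </xsd:attribute>
              </xsd:extension>
            </xsd:simpleContent>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```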

6 Validating your data. An XML Schema validator (software) takes two inputs, the XML instance and the XML Schema file, and runs the checks for you: that the latitude is between -90 and +90, that the longitude is between -180 and +180, that the number of fraction digits is 6, and so on. If every check passes, it reports that the data is OK.

7 History of Schema. Once upon a time, there was SGML. SGML has a schema language called a DTD. It is crap: –Different syntax than SGML –Main focus on presence and absence of elements –Very limited capabilities to check contents of elements (datatypes)

8 XML Schemas can constrain –the structure of instance documents: "this element contains these elements, which contain these other elements, etc." –the datatype of each element/attribute: "this element shall hold an integer with the range 0 to 12,000"

9 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 9 Highlights of XML Schemas 44 built-in datatypes Can create your own datatypes by extending or restricting existing datatypes Written in the same syntax as instance documents Can express sets, i.e., can define the child elements to occur in any order Can specify element content as being unique (keys on content) and uniqueness within a region Can define multiple elements with the same name but different content Can define elements with nil content Can define substitutable elements

10 Important schema concepts. Simple types: types that cannot have child elements or attributes. They are used for –elements that have only text content and no attributes –attributes. Complex type: the type of any element that can have child elements and/or attributes.

11 Important schema concepts. Global declarations are direct children of the root schema element. They are visible everywhere. All other declarations are local and are limited in scope to the element that they appear within.

12 Important schema concepts. Value space: the set of values that the type can take. Lexical space: the set of literals that represent those values. Set of facets: the defining properties of a type. –Fundamental facets include equality, order, bounds, cardinality, numeric/non-numeric –Constraining facets include ranges for numbers, string lengths, or regular expressions

13 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 13 Namespaces XML Schema file mixes vocabulary from the XML Schema language with own vocabulary to be created. Has to keep both separate using namespaces. Namespaces associate a URI with names.

14 Two vocabularies, two namespaces. The namespace http://www.w3.org/2001/XMLSchema holds element, complexType, schema, sequence, string, integer, boolean: this is the vocabulary that XML Schemas provide to define your new vocabulary. The namespace http://www.books.org (the targetNamespace) holds BookStore, Book, Title, Author, Date, ISBN, Publisher: this is the vocabulary for our book store XML description.

15 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 15 BookStore.xsd (see example01) xsd = Xml-Schema Definition (explanations on succeeding pages)

17 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 17 All XML Schemas have "schema" as the root element.

18 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 18 The elements and datatypes that are used to construct schemas - schema - element - complexType - sequence - string come from the http://…/XMLSchema namespace

19 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 19 element complexType schema sequence http://www.w3.org/2001/XMLSchema XMLSchema Namespace string integer boolean

20 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 20 Says that the elements defined by this schema - BookStore - Book - Title - Author - Date - ISBN - Publisher are to go in this namespace

21 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 21 BookStore Book Title Author Date ISBN Publisher http://www.books.org (targetNamespace) Book Namespace (targetNamespace)

22 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 22 This is referencing a Book element declaration. The Book in what namespace? The default namespace is http://www.books.org which is the targetNamespace!

23 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 23 This is a directive to any instance documents which conform to this schema: Any elements that are defined in this schema must be namespace-qualified when used in instance documents.
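The schema that slides 17 to 23 annotate did not survive in this transcript. A minimal sketch consistent with those annotations might look like the following; the exact layout and the occurrence counts are assumptions, while the namespaces, element names and the use of ref come from the slides.

```xml
<?xml version="1.0"?>
<!-- xmlns:xsd        : the XML Schema vocabulary (schema, element, complexType, ...)
     targetNamespace  : the namespace the newly declared elements go into
     xmlns            : default namespace, so ref="Book" means the Book in http://www.books.org
     elementFormDefault="qualified" : instance documents must namespace-qualify the elements -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.books.org"
            xmlns="http://www.books.org"
            elementFormDefault="qualified">
  <xsd:element name="BookStore">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Book" minOccurs="1" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Book">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Title" minOccurs="1" maxOccurs="1"/>
        <xsd:element ref="Author" minOccurs="1" maxOccurs="1"/>
        <xsd:element ref="Date" minOccurs="1" maxOccurs="1"/>
        <xsd:element ref="ISBN" minOccurs="1" maxOccurs="1"/>
        <xsd:element ref="Publisher" minOccurs="1" maxOccurs="1"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Title" type="xsd:string"/>
  <xsd:element name="Author" type="xsd:string"/>
  <xsd:element name="Date" type="xsd:string"/>
  <xsd:element name="ISBN" type="xsd:string"/>
  <xsd:element name="Publisher" type="xsd:string"/>
</xsd:schema>
```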

24 Referencing a schema in an XML instance document. The instance describes a Book with Title "My Life and Times", Author "Paul McCartney", Date "July, 1998", ISBN "94303-12021-43892" and Publisher "McMillin Publishing". Its root element does three things:
1. First, using a default namespace declaration, tell the schema-validator that all of the elements used in this instance document come from the http://www.books.org namespace.
2. Second, with schemaLocation tell the schema-validator that the http://www.books.org namespace is defined by BookStore.xsd (i.e., schemaLocation contains a pair of values).
3. Third, tell the schema-validator that the schemaLocation attribute we are using is the one in the XML Schema-instance namespace.
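The three steps can be sketched as an instance document. The element values are the ones listed on the slide; the assumption is that a file named BookStore.xsd with targetNamespace http://www.books.org sits next to this document.

```xml
<?xml version="1.0"?>
<!-- Step 1: xmlns declares http://www.books.org as the default namespace.
     Step 2: xsi:schemaLocation pairs that namespace with BookStore.xsd.
     Step 3: xmlns:xsi binds the prefix used by the schemaLocation attribute. -->
<BookStore xmlns="http://www.books.org"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.books.org BookStore.xsd">
  <Book>
    <Title>My Life and Times</Title>
    <Author>Paul McCartney</Author>
    <Date>July, 1998</Date>
    <ISBN>94303-12021-43892</ISBN>
    <Publisher>McMillin Publishing</Publisher>
  </Book>
</BookStore>
```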

25 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 25 schemaLocation type noNamespaceSchemaLocation http://www.w3.org/2001/XMLSchema-instance XMLSchema-instance Namespace nil

26 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 26 Referencing a schema in an XML instance document BookStore.xml BookStore.xsd targetNamespace="http://www.books.org" schemaLocation="http://www.books.org BookStore.xsd" - defines elements in namespace http://www.books.org - uses elements from namespace http://www.books.org A schema defines a new vocabulary. Instance documents use that new vocabulary.

27 Note multiple levels of checking. BookStore.xml is checked against BookStore.xsd: validate that the XML document conforms to the rules described in BookStore.xsd. BookStore.xsd is in turn checked against XMLSchema.xsd (the schema-for-schemas): validate that BookStore.xsd is a valid schema document, i.e., it conforms to the rules described in the schema-for-schemas.

28 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 28 Default Value for minOccurs and maxOccurs The default value for minOccurs is "1" The default value for maxOccurs is "1" Equivalent! Do Lab1
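The equivalence can be shown with two declarations. This is a fragment meant to sit inside an xsd:sequence; the Title element and the xsd prefix are assumed to be declared elsewhere in the schema.

```xml
<!-- Explicit occurrence constraints ... -->
<xsd:element ref="Title" minOccurs="1" maxOccurs="1"/>

<!-- ... mean the same as leaving both attributes out,
     because each defaults to 1. -->
<xsd:element ref="Title"/>
```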

29 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 29 Qualify XMLSchema, Default targetNamespace In the first example, we explicitly qualified all elements from the XML Schema namespace. The targetNamespace was the default namespace. BookStore Book Title Author Date ISBN Publisher http://www.books.org (targetNamespace) http://www.w3.org/2001/XMLSchema element complexType schema sequence string integer boolean

30 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 30 Default XMLSchema, Qualify targetNamespace Alternatively (equivalently), we can design our schema so that XMLSchema is the default namespace. BookStore Book Title Author Date ISBN Publisher http://www.books.org (targetNamespace) http://www.w3.org/2001/XMLSchema element complexType schema sequence string integer boolean

31 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 31 (see example02) Note that http://…/XMLSchema is the default namespace. Consequently, there are no namespace qualifiers on - schema - element - complexType - sequence - string

32 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 32 Here we are referencing a Book element. Where is that Book element defined? In what namespace?.

33 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 33 "bk:" References the targetNamespace BookStore Book Title Author Date ISBN Publisher http://www.books.org (targetNamespace) http://www.w3.org/2001/XMLSchema bk Do Lab1.1 element complexType schema sequence string integer boolean Consequently, bk:Book refers to the Book element in the targetNamespace.

34 Inlining Element Declarations. In the previous examples we declared an element and then referred to that element declaration with ref. Alternatively, we can inline the element declarations. On the following slide is an alternate (equivalent) way of representing the schema shown previously, using inlined element declarations.

35 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 35 (see example03) Note that we have moved all the element declarations inline, and we are no longer ref'ing to the element declarations. This results in a much more compact schema!
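A sketch of what the inlined form might look like for the book store vocabulary (the slide's own example03 is not reproduced in this transcript, so details such as maxOccurs="unbounded" on Book are assumptions):

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.books.org"
            xmlns="http://www.books.org"
            elementFormDefault="qualified">
  <xsd:element name="BookStore">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Book is declared right where it is used, with an anonymous type -->
        <xsd:element name="Book" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="Title" type="xsd:string"/>
              <xsd:element name="Author" type="xsd:string"/>
              <xsd:element name="Date" type="xsd:string"/>
              <xsd:element name="ISBN" type="xsd:string"/>
              <xsd:element name="Publisher" type="xsd:string"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Only BookStore is a global declaration here; Book and its children are local.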

36 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 36 Do Lab 2 (see example03) Anonymous types (no name)

37 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 37 Named Types The following slide shows an alternate (equivalent) schema which uses a named complexType.

38 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 38 (see example04) Named type The advantage of splitting out Book's element declarations and wrapping them in a named type is that now this type can be reused by other elements.
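A sketch of the named-type variant. The type name BookPublication is the one used later in the deck; the rest of the layout is an assumption, since example04 itself is missing from this transcript.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.books.org"
            xmlns="http://www.books.org"
            elementFormDefault="qualified">
  <xsd:element name="BookStore">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Book now points at a named type instead of carrying its own -->
        <xsd:element name="Book" type="BookPublication" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <!-- Named, global type: any other element may also declare type="BookPublication" -->
  <xsd:complexType name="BookPublication">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string"/>
      <xsd:element name="Author" type="xsd:string"/>
      <xsd:element name="Date" type="xsd:string"/>
      <xsd:element name="ISBN" type="xsd:string"/>
      <xsd:element name="Publisher" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
```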

39 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 39 Please note that: is equivalent to: Element A references the complexType foo. Element A has the complexType definition inlined in the element declaration.

40 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 40 type Attribute or complexType Child Element, but not Both! An element declaration can have a type attribute, or a complexType child element, but it cannot have both a type attribute and a complexType child element. …

41 Summary of Declaring Elements (two ways to do it). 1. Give the element declaration a name and a type attribute; the type is a simple type (e.g., xsd:string) or the name of a complexType (e.g., BookPublication). 2. Give the element declaration a name and an inlined complexType child element. In both forms, minOccurs takes a nonnegative integer and maxOccurs takes a nonnegative integer or "unbounded". Note: minOccurs and maxOccurs can only be used in nested (local) element declarations.

42 complexType and simpleType. Use the complexType element when you want to define child elements and/or attributes of an element. Use the simpleType element when you want to create a new type that is a refinement of a built-in type (string, date, gYear, etc).

43 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 43 Refining our data Defining the Date element to be of type string is unsatisfactory (it allows any string value to be input as the content of the Date element, including non-date strings). –Done in the next two slides Similarly, constrain the content of the ISBN element to content of this form: d-ddddd-ddd-d or d-ddd-ddddd-d or d-dd-dddddd-d, where 'd' stands for 'digit'

44 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 44 The gYear built-in Datatype A built-in datatype (Gregorian calendar year) Elements declared to be of type gYear must follow this form: CCYY –range for CC is: 00-99 –range for YY is: 00-99 –Example: 1999 indicates the gYear 1999

45 The date built-in Datatype. Elements declared to be of type date must follow this form: CCYY-MM-DD –range for CC and YY is: 00-99 –range for MM is: 01-12 –range for DD is: 01-28 if month is 2; 01-29 if month is 2 and the year is a leap year; 01-30 if month is 4, 6, 9, or 11; 01-31 if month is 1, 3, 5, 7, 8, 10, or 12 –Example: 1999-05-31 represents May 31, 1999

46 Creating your own Datatypes. A new datatype can be defined from an existing datatype (called the "base" type) by specifying values for one or more of the optional facets for the base type. Example. The string primitive datatype has six optional facets: –length –minLength –maxLength –pattern –enumeration –whiteSpace (legal values: preserve, replace, collapse)

47 Primitive Built-in Datatypes (datatype: example value or lexical format)
–string: "Hello World"
–boolean: {true, false, 1, 0}
–decimal: 7.08
–float, double: 12.56E3, 12, 12560, 0, -0, INF, -INF, NaN
–duration: P1Y2M3DT10H30M12.3S
–dateTime: format CCYY-MM-DDThh:mm:ss
–time: format hh:mm:ss.sss
–date: format CCYY-MM-DD
–gYearMonth: format CCYY-MM
–gYear: format CCYY
–gMonthDay: format --MM-DD
Note: 'T' is the date/time separator; INF = infinity; NaN = not-a-number.

48 Primitive Built-in Datatypes (cont.). All of these are atomic, built-in primitive datatypes.
–gDay: format ---DD (note the 3 dashes)
–gMonth: format --MM--
–hexBinary: a hex string
–base64Binary: a base64 string
–anyURI: http://www.xfront.com
–QName: a namespace qualified name
–NOTATION: a NOTATION from the XML spec

49 Derived Built-in Datatypes
–normalizedString: a string without tabs, line feeds, or carriage returns
–token: a string without tabs, line feeds, leading/trailing spaces, or consecutive spaces
–language: any valid xml:lang value, e.g., EN, FR,...
–IDREFS, ENTITIES, NMTOKEN, NMTOKENS: must be used only with attributes
–Name
–NCName: part (no namespace qualifier)
–ID, IDREF, ENTITY: must be used only with attributes
–integer: 456
–nonPositiveInteger: negative infinity to 0

50 Built-in Datatypes (cont.). Derived types; each is a subtype of a primitive datatype.
–negativeInteger: negative infinity to -1
–long: -9223372036854775808 to 9223372036854775807
–int: -2147483648 to 2147483647
–short: -32768 to 32767
–byte: -128 to 127
–nonNegativeInteger: 0 to infinity
–unsignedLong: 0 to 18446744073709551615
–unsignedInt: 0 to 4294967295
–unsignedShort: 0 to 65535
–unsignedByte: 0 to 255
–positiveInteger: 1 to infinity
Do Lab 3

51 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 51 (see example05) Here we are defining a new (user-defined) data- type, called ISBNType. Declaring Date to be of type gYear, and ISBN to be of type ISBNType (defined above)
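A sketch of the two pieces this slide points at, written as a standalone schema without a targetNamespace for brevity. The three patterns follow the ISBN forms given on the "Refining our data" slide (d-ddddd-ddd-d, d-ddd-ddddd-d, d-dd-dddddd-d); the exact pattern spelling in the original example05 is not known from this transcript.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- A new user-defined datatype: a string matching any one of three layouts -->
  <xsd:simpleType name="ISBNType">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\d{1}-\d{5}-\d{3}-\d{1}"/>
      <xsd:pattern value="\d{1}-\d{3}-\d{5}-\d{1}"/>
      <xsd:pattern value="\d{1}-\d{2}-\d{6}-\d{1}"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- Date uses the built-in gYear, ISBN uses the type defined above -->
  <xsd:element name="Date" type="xsd:gYear"/>
  <xsd:element name="ISBN" type="ISBNType"/>
</xsd:schema>
```

With these declarations an ISBN such as 0-440-34319-4 is accepted through the second pattern, and a Date such as 1977 is accepted as a gYear.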

52 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 52 Equivalent Expressions The vertical bar means "or"

53 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 53 Defining a new type through regular expressions. The contents of the element may satisfy any of the expressions that are enumerated.

54 Example of Creating a New Datatype by Specifying Facet Values. 1. This creates a new datatype called 'TelephoneNumber'. 2. Elements of this type can hold string values, 3. But the string length must be exactly 8 characters long and 4. The string must follow the pattern: ddd-dddd, where 'd' represents a 'digit'.
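The four numbered points map onto a definition along these lines (a sketch; the surrounding schema element is added only to make the fragment well-formed):

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="TelephoneNumber">      <!-- 1: the new datatype -->
    <xsd:restriction base="xsd:string">        <!-- 2: holds string values -->
      <xsd:length value="8"/>                  <!-- 3: exactly 8 characters -->
      <xsd:pattern value="\d{3}-\d{4}"/>       <!-- 4: ddd-dddd -->
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```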

55 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 55 Another Example This creates a new type called shape. An element declared to be of this type must have either the value circle, or triangle, or square.
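A sketch of the shape type using the enumeration facet, with the three values named on the slide:

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="shape">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="circle"/>
      <xsd:enumeration value="triangle"/>
      <xsd:enumeration value="square"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```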

56 General Form of Creating a New Datatype by Specifying Facet Values. A simpleType is given a name, a restriction names the source (base) type, and one or more facets are listed inside the restriction. Facets: - length - minLength - maxLength - pattern - enumeration - minInclusive - maxInclusive - minExclusive - maxExclusive... Sources: - string - boolean - decimal - float - double - duration - dateTime - time...

57 Facets of the integer Datatype. The integer datatype has 8 optional facets: –totalDigits –pattern –whiteSpace –enumeration –maxInclusive –maxExclusive –minInclusive –minExclusive

58 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 58 Example This creates a new datatype called 'EarthSurfaceElevation'. Elements declared to be of this type can hold an integer. However, the integer is restricted to have a value between -1290 and 29035, inclusive.
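A sketch of this datatype, using the inclusive bounds stated on the slide:

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="EarthSurfaceElevation">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="-1290"/>
      <xsd:maxInclusive value="29035"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```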

59 Multiple Facets - "and" them together, or "or" them together? An element declared to be of type TelephoneNumber must be a string of length=8 and the string must follow the pattern: 3 digits, dash, 4 digits. An element declared to be of type shape must be a string with a value of either circle, or triangle, or square. Patterns, enumerations => "or" them together; all other facets => "and" them together.

60 Creating a simpleType from another simpleType. Thus far we have created a simpleType using one of the built-in datatypes as our base type. However, we can create a simpleType that uses another simpleType as the base.

62 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 62 Fixing a Facet Value Sometimes when we define a simpleType we want to require that one (or more) facet have an unchanging value. That is, we want to make the facet a constant. simpleTypes which derive from this simpleType may not change this facet.

63 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 63 Error! Cannot change the value of a fixed facet!
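Both ideas, deriving a simpleType from a user-defined simpleType and fixing a facet, can be sketched together. The type names ClassSize, BiologyClassSize and BadClassSize and all the numbers are hypothetical; the slide's own example is not in this transcript.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- minInclusive is fixed: derived types may not change it -->
  <xsd:simpleType name="ClassSize">
    <xsd:restriction base="xsd:nonNegativeInteger">
      <xsd:minInclusive value="10" fixed="true"/>
      <xsd:maxInclusive value="60"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Allowed: the base is a user-defined simpleType, and only the
       non-fixed facet maxInclusive is tightened -->
  <xsd:simpleType name="BiologyClassSize">
    <xsd:restriction base="ClassSize">
      <xsd:maxInclusive value="30"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Error if uncommented: tries to change the fixed minInclusive facet
  <xsd:simpleType name="BadClassSize">
    <xsd:restriction base="ClassSize">
      <xsd:minInclusive value="15"/>
    </xsd:restriction>
  </xsd:simpleType>
  -->
</xsd:schema>
```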

64 Element Containing a User-Defined Simple Type. Example. Create a schema element declaration for an elevation element. Declare the elevation element to be an integer with a range -1290 to 29035, so that an elevation of, say, 5240 is valid. Here's one way of declaring the elevation element: define a named simpleType and reference it from the element's type attribute.

65 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 65 Element Containing a User- Defined Simple Type (cont.) Here's an alternative method for declaring elevation: The simpleType definition is defined inline, it is an anonymous simpleType definition. The disadvantage of this approach is that this simpleType may not be reused by other elements.
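The two ways of declaring elevation can be sketched side by side. The type name EarthSurfaceElevation is reused from the earlier slide; putting the second variant in a comment is only to keep the fragment a valid single schema, since a schema cannot declare two global elements with the same name.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Way 1: a named simpleType, referenced by the element; reusable -->
  <xsd:simpleType name="EarthSurfaceElevation">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="-1290"/>
      <xsd:maxInclusive value="29035"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:element name="elevation" type="EarthSurfaceElevation"/>

  <!-- Way 2: an anonymous simpleType inlined in the element; not reusable
  <xsd:element name="elevation">
    <xsd:simpleType>
      <xsd:restriction base="xsd:integer">
        <xsd:minInclusive value="-1290"/>
        <xsd:maxInclusive value="29035"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:element>
  -->
</xsd:schema>
```

Either declaration accepts an instance such as an elevation element containing 5240.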

66 Summary of Declaring Elements (three ways to do it). 1. An element declaration with a name and a type attribute. 2. An element declaration with a name and an inlined complexType child element. 3. An element declaration with a name and an inlined simpleType child element.

67 Annotating Schemas. The annotation element is used for documenting the schema, both for humans and for programs. –Use documentation for providing a comment to humans –Use appinfo for providing a comment to programs. The content of documentation and appinfo is any well-formed XML. Note that annotations have no effect on schema validation. Example: the following constraint is not expressible with XML Schema: the value of element A should be greater than the value of element B. So, we need to use a separate tool (e.g., Schematron) to check this constraint. We can record the constraint ("A should be greater than B") in the appinfo section of an annotation.
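A sketch of such an annotation. The assert element inside appinfo is only illustrative of a Schematron-style rule; its name, its test attribute and the Demo wrapper element are assumptions, and a schema validator ignores the whole annotation.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:annotation>
    <!-- For human readers -->
    <xsd:documentation>
      The value of element A should be greater than the value of element B.
    </xsd:documentation>
    <!-- For programs, e.g. a separate rule-checking tool -->
    <xsd:appinfo>
      <assert test="A &gt; B">A should be greater than B</assert>
    </xsd:appinfo>
  </xsd:annotation>
  <xsd:element name="Demo">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="A" type="xsd:integer"/>
        <xsd:element name="B" type="xsd:integer"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```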

68 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 68 Where Can You Put Annotations? You cannot put annotations at just any random location in the schema. Here are the rules for where an annotation element can go: –annotations may occur before and after any global component –annotations may occur only at the beginning of non-global components

69 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 69 Can put annotations only at these locations Suppose that you want to annotate, say, the Date element declaration. What do we do? See next page...

70 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 70 This is how to annotate the Date element! Inline the annotation within the Date element declaration.
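Inlining the annotation within the Date element declaration might look like this (a sketch; the documentation text is made up):

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Date" type="xsd:string">
    <!-- In a non-global position the annotation must come first,
         here as the first (and only) child of the element declaration -->
    <xsd:annotation>
      <xsd:documentation>The publication date of the book.</xsd:documentation>
    </xsd:annotation>
  </xsd:element>
</xsd:schema>
```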

71 Two Optional Attributes for the documentation Element. In the previous example we showed documentation with no attributes. Actually, it can have two attributes: –source: this attribute contains a URL to a file which contains supplemental information –xml:lang: this attribute specifies the language that the documentation was written in

72 One Optional Attribute for the appinfo Element. In the previous example we showed appinfo with no attributes. Actually, it can have one attribute: –source: this attribute contains a URL to a file which contains supplemental information

73 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 73 Up for a Breath Wow! We have really been into the depths of XML Schemas. Let's back up for a moment and look at XML Schemas from a "big picture" point of view.

74 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 74 Code to check the structure and content of the data Code to actually do the work If your data is structured as XML, and there is a schema, then you can hand the data-checking task off to a schema validator. In a typical program, 60% of code is used to check the input.

75 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 75 Classic use of XML Schemas (Trading Partners - B2B) Supplier Consumer P.O. Schema Validator P.O. Schema Software to Process P.O. "P.O. is okay" P.O. (Schema at third-party, neutral web site)

76 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 76 Other aspects of XML schemas Organizations agree to structure their XML documents in conformance with an XML Schema. Thus, the XML Schema acts as a contract between the organizations. An XML Schema document contains lots of data about the data in the XML instance documents, such as the datatype of the data, the data's range of values, how the data is related to another piece of data (parent/child, sibling relationship), i.e., XML Schemas contain metadata

77 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 77 XML Schema Validate XML documents Automatic GUI generation Automatic API generation Semantic Web??? Smart Editor And more …

78 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 78 Describing Metadata using Schemas XML Schema Strategy - two documents are used to provide metadata: –a schema document specifies the properties (metadata) for a class of resources (objects); –each instance document provides specific values for the properties.

79 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 79 XML Schema: Specifies the Properties for a Class of Resources "For the class of Book resources, we identify five properties - Title, Author, Date, ISBN, and Publisher"

80 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 80 XML Instance Document: Specifies Values for the Properties Illusions The Adventures of a Reluctant Messiah Richard Bach 1977 0-440-34319-4 Dell Publishing Co. "For a specific instance of a Book resource, here are the values for the properties. Use schemaLocation to identify the companion document (i.e., the schema) which defines the Book class of resources." Do Lab 4

81 Regular Expressions. Recall that the string datatype has a pattern facet. The value of a pattern facet is a regular expression. Below are some examples of regular expressions (expression: strings it matches):
–Chapter \d: Chapter 1
–a*b: b, ab, aab, aaab, …
–[xyz]b: xb, yb, zb
–a?b: b, ab
–a+b: ab, aab, aaab, …
–[a-c]x: ax, bx, cx

82 Regular Expressions (cont.) (expression: strings it matches)
–[a-c]x: ax, bx, cx
–[-ac]x: -x, ax, cx
–[ac-]x: ax, cx, -x
–[^0-9]x and \Dx: any non-digit char followed by x
–Chapter\s\d: Chapter followed by a whitespace character followed by a digit
–(ho){2} there: hoho there
–(ho\s){2} there: "ho" plus a whitespace character, twice, followed by " there"
–.abc: any (one) char followed by abc
–(a|b)+x: ax, bx, aax, bbx, abx, bax,...

83 Regular Expressions (cont.)
–a{1,3}x: ax, aax, aaax
–a{2,}x: aax, aaax, aaaax, …
–\w\s\w: a word character followed by a whitespace character followed by a word character. In XML Schema, \w matches any character that is not punctuation, a separator, or a control/other character, so letters and digits qualify but a dash does not.
–[a-zA-Z-[Ol]]*: a string comprised of any lower and upper case letters, except "O" and "l"
–\.: the period "." (without the backward slash the period means "any character")

84 Regular Expressions (cont.) Escapes:
–\n: linefeed
–\r: carriage return
–\t: tab
–\\: the backward slash \
–\|: the vertical bar |
–\-: the hyphen -
–\^: the caret ^
–\?: the question mark ?
–\*: the asterisk *
–\+: the plus sign +
–\{ and \}: the open and close curly braces { }
–\( and \): the open and close parentheses ( )
–\[ and \]: the open and close square brackets [ ]

85 Regular Expressions (concluded). Unicode category escapes:
–\p{L}: a letter, from any language
–\p{Lu}: an uppercase letter, from any language
–\p{Ll}: a lowercase letter, from any language
–\p{N}: a number - Roman, fractions, etc
–\p{Nd}: a digit from any language
–\p{P}: a punctuation symbol
–\p{Sc}: a currency sign, from any language
Example: "currency sign from any language, followed by one or more digits from any language, optionally followed by a period and two digits from any language" can be written as \p{Sc}\p{Nd}+(\.\p{Nd}\p{Nd})? and matches $45.99 and ¥300.

86 Example R.E.: [1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]. The four alternatives match, in order, 0 to 99, 100 to 199, 200 to 249, and 250 to 255. This regular expression restricts a string to have values between 0 and 255. Such a R.E. might be useful in describing an IP address...

87 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 87 IP Datatype Definition Datatype for representing IP addresses. Note: in the value attribute (above) the regular expression has been split over two lines. This is for readability purposes only. In practice the R.E. would all be on one line.
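A sketch of such an IP datatype, built from the 0 to 255 expression on the previous slide: three groups of "number followed by a period", then one final number. The type name IP and the exact grouping are assumptions, since the slide's code is missing here. The pattern is kept on one line.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="IP">
    <xsd:restriction base="xsd:string">
      <!-- (0-255 followed by ".") three times, then 0-255 once more -->
      <xsd:pattern value="(([1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\.){3}([1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```

XML Schema patterns are implicitly anchored, so the whole string must match; no ^ or $ is needed.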

88 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 88 Regular Expression Parser Want to test your skill in writing regular expressions? Go to: http://www.xfront.org/xml-schema/ –Dan Potter has created a nice Web page which allows you to type in a regular expression and then type in a string. Dan's parser will then determine if your string conforms to your regular expression. Do Lab 5

89 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 89 Derived Types derive by extension: extend the parent complexType with more elements derive by restriction: create a type which is a subset of the base type. –redefine a base type element to have a restricted range of values, or –redefine a base type element to have a more restricted number of occurrences.

90 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 90 Title Author Date Publication ISBN Publisher BookPublication Derived Types by extension

91 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 91 Note that BookPublication extends the Publication type, i.e., we are doing Derive by Extension (see example06)

92 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 92 Elements declared to be of type BookPublication will have 5 child elements - Title, Author, Date, ISBN, and Publisher. Note that the elements in the derived type are appended to the elements in the base type. Do lab 6
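Derivation by extension can be sketched as follows. The type and element names come from the slides; the occurrence counts and the simple types of the children are assumptions.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Base type: Title, Author, Date -->
  <xsd:complexType name="Publication">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="Author" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="Date" type="xsd:gYear"/>
    </xsd:sequence>
  </xsd:complexType>
  <!-- Derived type: ISBN and Publisher are appended after the base content -->
  <xsd:complexType name="BookPublication">
    <xsd:complexContent>
      <xsd:extension base="Publication">
        <xsd:sequence>
          <xsd:element name="ISBN" type="xsd:string"/>
          <xsd:element name="Publisher" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
  <xsd:element name="Book" type="BookPublication"/>
</xsd:schema>
```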

93 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 93 Derive by Restriction Elements of type SingleAuthorPublication will have 3 child elements - Title, Author, and Date. However, there must be exactly one Author element. Note that in the restriction type you must repeat all the declarations from the base type (except when the base type has an element with minOccurs="0" and the subtype wishes to delete it. See next slide).
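A sketch of derivation by restriction. The base type is written out so the example stands alone; the occurrence counts in the base are assumptions. Note how the restriction repeats every declaration from the base and only narrows Author to exactly one occurrence.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="Publication">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="Author" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="Date" type="xsd:gYear"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="SingleAuthorPublication">
    <xsd:complexContent>
      <xsd:restriction base="Publication">
        <xsd:sequence>
          <xsd:element name="Title" type="xsd:string" maxOccurs="unbounded"/>
          <!-- minOccurs and maxOccurs default to 1: exactly one Author -->
          <xsd:element name="Author" type="xsd:string"/>
          <xsd:element name="Date" type="xsd:gYear"/>
        </xsd:sequence>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>
```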

94 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 94 Deleting an element in the base type Note that in this subtype we have eliminated the Author element, i.e., the subtype is just comprised of an unbounded number of Title elements followed by a single Date element. If the base type has an element with minOccurs="0", and the subtype wishes to not have that element, then it can simply leave it out.

95 Derive by Restriction (cont.) You might (legitimately) ask: –Why do I have to repeat all the declarations from the base type? Why can't I simply show the delta (i.e., show those declarations that are changed)? –What's the advantage of doing derive by restriction if I have to repeat everything? I'm certainly not saving on typing. Answer: –Even though you have to retype everything in the base type, there are advantages to explicitly associating a type with a base type. Later we will see that an element's content model may be substituted by the content model of derived types. Thus, the content of an element that has been declared to be of type Publication can be substituted with SingleAuthorPublication content, since SingleAuthorPublication derives from Publication. We will discuss this type substitutability in detail later.

96 Prohibiting Derivations. The final attribute on a complexType definition controls which derivations are allowed: –final="#all": Publication cannot be extended nor restricted –final="restriction": Publication cannot be restricted –final="extension": Publication cannot be extended
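A sketch of a type that prohibits derivation, using the final attribute of complexType. The content of Publication is an assumption; the alternative values are shown in the comment.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- final="#all"        : no type may extend or restrict Publication
       final="restriction" : Publication cannot be restricted
       final="extension"   : Publication cannot be extended -->
  <xsd:complexType name="Publication" final="#all">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="Author" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="Date" type="xsd:gYear"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
```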

97 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 97 Terminology: Declaration vs Definition You declare elements and attributes. Schema components that are declared are those that have a representation in an XML instance document. You define components that are used just within the schema document(s). Schema components that are defined are those that have no representation in an XML instance document. Declarations: - element declarations - attribute declarations Definitions: - type (simple, complex) definitions - attribute group definitions - model group definitions

98 Copyright © [2002]. Roger L. Costello. All Rights Reserved. 98 Creating Lists There are times when you will want an element to contain a list of values, e.g., "The contents of the Numbers element is a list of numbers". Example: For a document containing a Lottery drawing we might have 12 49 37 99 20 67 How do we declare the element Numbers... (1) To contain a list of integers, and (2) Each integer is restricted to be between 1 and 99, and (3) The total number of integers in the list is exactly six.

99 Lottery.xml (element markup not preserved in this transcript). Each drawing pairs a week with six numbers:
July 1: 21 3 67 8 90 12
July 8: 55 31 4 57 98 22
July 15: 70 77 19 35 44 11

100 Lottery.xsd

101 LotteryNumbers Need Stronger Datatyping
The list in the previous schema has two problems:
–It allows <Numbers> to contain an arbitrarily long list
–The numbers in the list may be any positiveInteger
We need to:
–Restrict the list to <xsd:length value="6"/>
–Restrict the numbers to <xsd:maxInclusive value="99"/>

102 lottery.xsd snippet for the LotteryNumbers
NumbersList is a list where the type of each item is OneToNintyNine. LotteryNumbers restricts NumbersList to a length of six (i.e., an element declared to be of type LotteryNumbers must hold a list of numbers, between 1 and 99, and the length of the list must be exactly six).
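The three types described above can be sketched as follows. The type names come from the slide text; the Numbers element declaration and the choice of xsd:positiveInteger as the base are inferred from the neighbouring slides.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Each item: a positive integer no larger than 99 -->
  <xsd:simpleType name="OneToNintyNine">
    <xsd:restriction base="xsd:positiveInteger">
      <xsd:maxInclusive value="99"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- A whitespace-separated list of such items, of any length -->
  <xsd:simpleType name="NumbersList">
    <xsd:list itemType="OneToNintyNine"/>
  </xsd:simpleType>
  <!-- The same list, restricted to exactly six items -->
  <xsd:simpleType name="LotteryNumbers">
    <xsd:restriction base="NumbersList">
      <xsd:length value="6"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:element name="Numbers" type="LotteryNumbers"/>
</xsd:schema>
```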

103 Alternatively, the list type can be nested directly inside the restriction. This is read as: "We are creating a new type called LotteryNumbers. It is a restriction. At this point we can either use the base attribute or a simpleType child element to indicate the type that we are restricting (you cannot use both the base attribute and the simpleType child element). We want to restrict the type that is a list of OneToNintyNine. We will restrict that type to a length of 6."
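A sketch of this alternative form, under the assumption that OneToNintyNine is defined as on the previous slide (a positiveInteger capped at 99):

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="OneToNintyNine">
    <xsd:restriction base="xsd:positiveInteger">
      <xsd:maxInclusive value="99"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="LotteryNumbers">
    <!-- No base attribute: the restricted type is the anonymous
         simpleType child instead -->
    <xsd:restriction>
      <xsd:simpleType>
        <xsd:list itemType="OneToNintyNine"/>
      </xsd:simpleType>
      <xsd:length value="6"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```

The trade-off is that the intermediate list type has no name here, so it cannot be reused elsewhere.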

104 Limits of list type
–You cannot create a list of lists
–You cannot create a list of complexTypes
–In the instance document, you must separate the items in a list with white space (blank space, tab, or carriage return)
–The only facets that you may use with a list type are:
 length: use this to specify the exact number of items in the list
 minLength: use this to specify the minimum number of items in the list
 maxLength: use this to specify the maximum number of items in the list
 enumeration: use this to specify the list values that are allowed
 pattern: use this to constrain the lexical form of the list with a regular expression
Do Lab 11.d

105 Creating a simpleType that is a Union of Types
The value space of the union type is the value space of simpleType 1 plus the value space of simpleType 2.
Note: you can create a union of more than just two simpleTypes


108 Y2KFamilyReunion.xsd (see example 20)

109 Y2KFamilyReunion.xml (see example 20)
Text content of the instance document (element markup not preserved in this transcript): Mary Pat Patti Christopher Elizabeth Judy Peter Tom Cheryl Marc Joe Roger

110 Alternative: Version 2 of Y2KFamilyReunion.xsd (see example 21)
A union of anonymous simpleTypes. The disadvantage of creating the union type in this manner is that none of the simpleTypes are reusable.

111 Review of Union simpleType
A union is written with a memberTypes attribute that lists the member simpleTypes. Alternatively, the member types are given as anonymous simpleType children of <xsd:union>.

112 "maxOccurs" is a Union type!
The value space for maxOccurs is the union of the value space for nonNegativeInteger with the value space of a simpleType which contains only one enumeration value: "unbounded". See the next slide for how maxOccurs is defined in the schema-for-schemas (not exactly how it's defined there, but it gives you the idea of how the schema-for-schemas might implement it).
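A sketch of such a union, in the spirit of the description above rather than the actual schema-for-schemas text. UnboundedType is the name used later in the summary slides; maxOccursType is a hypothetical name chosen for this example.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- A simpleType whose only allowed value is "unbounded" -->
  <xsd:simpleType name="UnboundedType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="unbounded"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- Hypothetical name: any nonNegativeInteger, or "unbounded" -->
  <xsd:simpleType name="maxOccursType">
    <xsd:union memberTypes="xsd:nonNegativeInteger UnboundedType"/>
  </xsd:simpleType>
</xsd:schema>
```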

113 (see example22)

114 Summary of Declaring simpleTypes
1. simpleType that uses a built-in base type
2. simpleType that uses another simpleType as the base type

115 Summary of Declaring simpleTypes
3. simpleType that declares a list type, where the item datatype OneToNintyNine is declared separately
4. An alternate form of the above, where the list's datatype is specified using an inlined simpleType

116 Summary: Declaring simpleTypes
5. simpleType that declares a union type, where the datatype UnboundedType is declared separately
6. An alternate form of the above, where the datatype UnboundedType is specified using an inline simpleType

117 Terminology: Global versus Local
Global element declarations, global type definitions:
–These are element declarations/type definitions that are immediate children of <schema>
Local element declarations, local type definitions:
–These are element declarations/type definitions that are nested within other elements/types.

118 Callouts on the example schema:
–Global type definition
–Global element declaration
–Local element declarations
–Local type definition

119 Global vs Local: What's the Big Deal?
So what if an element or type is global or local. What practical impact does it have?
–Answer: only global elements/types can be referenced (i.e., reused).
–Thus, if an element/type is local then it is effectively invisible to the rest of the schema (and to other schemas).
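The distinction can be sketched with a small schema. All names below (Publication, Book, BookStore, Title, Author) are assumed for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Global type definition: an immediate child of <schema> -->
  <xsd:complexType name="Publication">
    <xsd:sequence>
      <!-- Local element declarations: nested, so not referenceable -->
      <xsd:element name="Title" type="xsd:string"/>
      <xsd:element name="Author" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <!-- Global element declaration, reusing the global type by name -->
  <xsd:element name="Book" type="Publication"/>
  <xsd:element name="BookStore">
    <!-- Local (anonymous) type definition -->
    <xsd:complexType>
      <xsd:sequence>
        <!-- Only a global element can be the target of ref -->
        <xsd:element ref="Book" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```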

120 Element Substitution
Oftentimes in daily conversation there are several ways to express something.
–In Boston we use the words "T" and "subway" interchangeably. For example, "we took the T into town", or "we took the subway into town".
We would like to be able to express this substitutability in XML Schemas.
–That is, we would like to be able to declare in a schema an element called "subway", an element called "T", and state that "T" may be substituted for "subway". Instance documents can then use either <subway> or <T>, depending on their preference.

121 substitutionGroup
We can define a group of substitutable elements (called a substitutionGroup) by declaring an element (called the head) and then declaring other elements which state that they are substitutable for the head element.
–subway is the head element
–T is substitutable for subway
So what's the big deal? Anywhere a head element can be used in an instance document, any member of the substitutionGroup can be substituted!
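A sketch of the subway/T group. The element names subway and T come from the slide; the transportation wrapper and the xsd:string type are assumptions made so the example is complete.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Head element -->
  <xsd:element name="subway" type="xsd:string"/>
  <!-- T declares itself substitutable for subway -->
  <xsd:element name="T" substitutionGroup="subway" type="xsd:string"/>
  <!-- Assumed wrapper: wherever subway is referenced, T may appear -->
  <xsd:element name="transportation">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="subway"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

With this schema, a transportation element may contain either a subway child or a T child.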

122 Going multilingual
Schema: (markup not preserved in this transcript)
Instance doc: Red Line
Alternative instance doc (customized for our Spanish clients): Linea Roja

123 using substitutionGroup
–The elements that are declared to be in the substitution group (e.g., subway and T) must be declared as global elements.
–If the type of a substitutionGroup element is the same as the type of the head element, then you can omit it (the type).

124 using substitutionGroup (cont.)
The type of every element in the substitutionGroup must be the same as, or derived from, the type of the head element. That is, if the head element has type "xxx", the type of each member must be the same as "xxx" or it must be derived from "xxx".

125 Element Substitution with Derived Types

126 BookType and MagazineType Derive from PublicationType
PublicationType is the base type; BookType and MagazineType are derived from it.
In order for Book and Magazine to be in a substitutionGroup with Publication, their types (BookType and MagazineType, respectively) must be the same as, or derived from, Publication's type (PublicationType).
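One way this hierarchy might look is sketched below. The type and element names come from the slide; the child elements, the choice of extension for BookType and restriction for MagazineType, and the BookStore wrapper are assumptions for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="PublicationType">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string"/>
      <xsd:element name="Author" type="xsd:string"
                   minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="Date" type="xsd:gYear"/>
    </xsd:sequence>
  </xsd:complexType>
  <!-- Derived by extension: adds ISBN and Publisher -->
  <xsd:complexType name="BookType">
    <xsd:complexContent>
      <xsd:extension base="PublicationType">
        <xsd:sequence>
          <xsd:element name="ISBN" type="xsd:string"/>
          <xsd:element name="Publisher" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
  <!-- Derived by restriction: the optional Author is dropped -->
  <xsd:complexType name="MagazineType">
    <xsd:complexContent>
      <xsd:restriction base="PublicationType">
        <xsd:sequence>
          <xsd:element name="Title" type="xsd:string"/>
          <xsd:element name="Date" type="xsd:gYear"/>
        </xsd:sequence>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>
  <xsd:element name="Publication" type="PublicationType"/>
  <xsd:element name="Book" substitutionGroup="Publication" type="BookType"/>
  <xsd:element name="Magazine" substitutionGroup="Publication" type="MagazineType"/>
  <!-- Assumed wrapper: accepts Publication, Book, or Magazine children -->
  <xsd:element name="BookStore">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Publication" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```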


128 The instance document holds three publications (element markup not preserved in this transcript):
–Illusions The Adventures of a Reluctant Messiah; Richard Bach; 1977; 0-440-34319-4; Dell Publishing Co.
–Natural Health; 1999
–The First and Last Freedom; J. Krishnamurti; 1954; 0-06-064831-7; Harper & Row
The containing element can contain any element in the substitutionGroup with Publication!

129 Blocking Element Substitution
An element may wish to block other elements from substituting with it. This is achieved by adding a block attribute.

130 Schema: (markup not preserved in this transcript)
Instance doc: Red Line
Instance doc using the substitute element: Red Line (Not allowed!)
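A sketch of blocking, reusing the subway/T names from the earlier slides; the transportation wrapper and the xsd:string type are assumed. The block="substitution" value on the head element is what rules out the substitute in instance documents.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- The head element refuses substitutes -->
  <xsd:element name="subway" type="xsd:string" block="substitution"/>
  <!-- T may still be declared as a member of the group... -->
  <xsd:element name="T" substitutionGroup="subway" type="xsd:string"/>
  <xsd:element name="transportation">
    <xsd:complexType>
      <xsd:sequence>
        <!-- ...but only subway itself is accepted here -->
        <xsd:element ref="subway"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```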

131 One more Note about substitutionGroup
1. Transitive: if element A can substitute for element B, and element B can substitute for element C, then element A can substitute for element C. (A --> B --> C, then A --> C)
2. Non-symmetric: if element A can substitute for element B, it is not the case that element B can substitute for element A.
Do Lab 7

132 Declaring Attributes
–Declare a required attribute Category that can take the values autobiography, fiction, and non-fiction.
–Declare an optional attribute InStock that can take the values true or false, true by default.
–Declare an optional attribute Reviewer.

133 (see example07)
Callouts on the schema: InStock, Reviewer, Category
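The three attributes described on the previous slide might be declared as sketched here. The attribute names and constraints follow the slide text; the attributeGroup name BookAttributes and the Title child are assumptions.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Book">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Title" type="xsd:string"/>
      </xsd:sequence>
      <!-- Attribute declarations come after the element content -->
      <xsd:attributeGroup ref="BookAttributes"/>
    </xsd:complexType>
  </xsd:element>
  <xsd:attributeGroup name="BookAttributes">
    <!-- Required, limited to three values -->
    <xsd:attribute name="Category" use="required">
      <xsd:simpleType>
        <xsd:restriction base="xsd:string">
          <xsd:enumeration value="autobiography"/>
          <xsd:enumeration value="fiction"/>
          <xsd:enumeration value="non-fiction"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:attribute>
    <!-- Optional, true when absent -->
    <xsd:attribute name="InStock" type="xsd:boolean" default="true"/>
    <!-- Optional (use defaults to "optional") -->
    <xsd:attribute name="Reviewer" type="xsd:string"/>
  </xsd:attributeGroup>
</xsd:schema>
```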

134 Attributes always have simple types, because an attribute value cannot contain child elements.

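135 Summary of Declaring Attributes (two ways to do it)
1. Name the attribute's type with the type attribute (xsd:string, xsd:integer, xsd:boolean, ...).
2. Nest an anonymous simpleType definition inside the attribute declaration.
In both forms the "use" attribute takes one of: required, optional, prohibited.
Do not use the "use" attribute if you use either default or fixed. (With default, a "use" that is present must be optional, which is already what you get when it is omitted.)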

136 use --> use it only with Local Attribute Declarations
The "use" attribute only makes sense in the context of an element declaration. Example: "for each Book element, the Category attribute is required".
When declaring a global attribute do not specify a "use".

137 Callouts on the example schema:
–Local attribute declaration. Use the "use" attribute here.
–Global attribute declaration. Must NOT have a "use" ("use" only makes sense in the context of an element).

138 Inlining Attributes
On the next slide is another way of expressing the last example: the attributes are inlined within the Book declaration rather than being separately defined in an attributeGroup. (I only show a portion of the schema, the Book element declaration.)

139 (see example08)

140 Notes about Attributes
–The attribute declarations always come last, after the element declarations.
–The attributes are always with respect to the element that they are defined (nested) within.
Caption on the example: "bar and boo are attributes of foo"

141 These attributes apply to the element they are nested within (Book). That is, Book has three attributes: Category, InStock, and Reviewer.
Do Lab 8.a

142 Element with Simple Content and Attributes
Example. Consider this: <elevation units="feet">5440</elevation>
The elevation element has these two constraints:
- it has a simple (integer) content
- it has an attribute called units
How do we declare elevation?

143 Reading the declaration of elevation step by step:
1. elevation contains an attribute; therefore, we must use <xsd:complexType>.
2. However, elevation does not contain child elements (which is what we generally use <complexType> to indicate). Instead, elevation contains simpleContent.
3. We wish to extend the simpleContent (an integer)...
4. ...with an attribute.
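The four steps above can be sketched as the following declaration. Marking units as required is an assumption; the slide only says that elevation has the attribute.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="elevation">
    <!-- 1. An attribute is present, so a complexType is needed -->
    <xsd:complexType>
      <!-- 2. No child elements: the content is simple -->
      <xsd:simpleContent>
        <!-- 3. Extend the simple content (an integer)... -->
        <xsd:extension base="xsd:integer">
          <!-- 4. ...with an attribute -->
          <xsd:attribute name="units" type="xsd:string" use="required"/>
        </xsd:extension>
      </xsd:simpleContent>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```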

144 elevation - use Stronger Datatype
In the declaration for elevation we allowed it to hold any integer. Further, we allowed the units attribute to hold any string. Let's restrict elevation to hold an integer with a range 0 - 12,000 and let's restrict units to hold either the string "feet" or the string "meters".
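A sketch of the tightened declaration. The ranges and enumeration values follow the slide; the type names elevationType and unitsType are hypothetical.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical name: an integer from 0 to 12000 -->
  <xsd:simpleType name="elevationType">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="0"/>
      <xsd:maxInclusive value="12000"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- Hypothetical name: either "feet" or "meters" -->
  <xsd:simpleType name="unitsType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="feet"/>
      <xsd:enumeration value="meters"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:element name="elevation">
    <xsd:complexType>
      <xsd:simpleContent>
        <xsd:extension base="elevationType">
          <xsd:attribute name="units" type="unitsType" use="required"/>
        </xsd:extension>
      </xsd:simpleContent>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```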


146 Summary: declaring elements (on the next five slides)

147 1. Elements with simple contents
–Declaring an element using a built-in type
–Declaring an element using a user-defined simpleType
–An alternative formulation of the above shapes example is to inline the simpleType definition

148 2. Elements that have children
–Defining the child elements inline
–An alternate formulation of the above Person example

149 3. Extending another complexType

150 4. Restricting another complexType

151 5. Simple content with attributes
Example content: Large, green, sour

152 complexContent / simpleContent
–With complexContent you extend or restrict a complexType: the base type X must be a complexType.
–With simpleContent you extend or restrict a simpleType: the base type Y must be a simpleType (or a complexType that itself has simple content).
Do Lab 8.b, 8.c

153 xsd:group Element
The xsd:group element enables you to group together element declarations. Note: the xsd:group element is just for grouping together element declarations, no attribute declarations allowed!

154 (see example09)

155 Another example showing the use of the <group> element

156 Note about group
Group definitions must be global. You cannot inline the group definition inside a type; instead, you must use a ref there and define the group globally.
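A sketch of a global group and a reference to it. The names PublicationElements, Book, and the child elements are assumed for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- The group definition is global: a child of <schema> -->
  <xsd:group name="PublicationElements">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string"/>
      <xsd:element name="Author" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="Date" type="xsd:string"/>
    </xsd:sequence>
  </xsd:group>
  <xsd:element name="Book">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Inside a type, the group is used by reference -->
        <xsd:group ref="PublicationElements"/>
        <xsd:element name="ISBN" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```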

157 Expressing Alternates
Example: build an element transportation that contains either train, plane or car.
Note: the choice is an exclusive-or, that is, transportation can contain only one element.
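A sketch of the transportation example; the element names come from the slide and the xsd:string types are assumed.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="transportation">
    <xsd:complexType>
      <!-- Exactly one of the three children may appear -->
      <xsd:choice>
        <xsd:element name="train" type="xsd:string"/>
        <xsd:element name="plane" type="xsd:string"/>
        <xsd:element name="car" type="xsd:string"/>
      </xsd:choice>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```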

158 Expressing Repeatable Choice
Example: define the element binary-string as a repetition of elements zero and one. (see example 11)
Notes:
1. An element can fix its value, using the fixed attribute.
2. When you don't specify a value for minOccurs, it defaults to "1". Same for maxOccurs. See the last example (transportation) where we used a <choice> element with no minOccurs or maxOccurs.
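A sketch of the binary-string example. The element names and the use of fixed follow the slide; the xsd:unsignedByte type and minOccurs="0" (allowing an empty string) are assumptions.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="binary-string">
    <xsd:complexType>
      <!-- The choice itself repeats: any mix of zero and one -->
      <xsd:choice minOccurs="0" maxOccurs="unbounded">
        <xsd:element name="zero" type="xsd:unsignedByte" fixed="0"/>
        <xsd:element name="one" type="xsd:unsignedByte" fixed="1"/>
      </xsd:choice>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```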

159 fixed/default Element Values
When you declare an element you can give it a fixed or default value.
–Then, in the instance document, you can leave the element empty.
For an element with fixed="0" (such as zero), the instance may contain <zero>0</zero> or equivalently <zero/>. Likewise, an element declared with default="red" may appear with the content red or, equivalently, empty.

160 xsd:sequence and xsd:choice
Sequences and choices can be nested. Example: repeat work followed by eat; then, once, either work or play; the whole thing any number of times.

161 Expressing Any Order
Problem: create an element, Book, which contains Author, Title,…, in any order. (see example 12)
<all> means that Book must contain all five child elements, but they may occur in any order.
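A sketch of the Book declaration using all. Author and Title are named on the slide; Date, ISBN, and Publisher are assumed to be the other three children, based on the book data shown in other examples.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Book">
    <xsd:complexType>
      <!-- All five children are required, in any order -->
      <xsd:all>
        <xsd:element name="Title" type="xsd:string"/>
        <xsd:element name="Author" type="xsd:string"/>
        <xsd:element name="Date" type="xsd:string"/>
        <xsd:element name="ISBN" type="xsd:string"/>
        <xsd:element name="Publisher" type="xsd:string"/>
      </xsd:all>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```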

162 Constraints on using <all>
–Elements declared within <all> must have a maxOccurs value of "1" (minOccurs can be either "0" or "1")
–If a complexType uses <all> and it extends another type, then that parent type must have empty content.
–The <all> element cannot be nested within either <sequence>, <choice>, or another <all>
–The contents of <all> must be just elements. It cannot contain <sequence> or <choice>
Do Lab 9

163 Empty Element
Problem: declare an empty element with an attribute (see example 13)
Do Lab 10
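A sketch of one common way to do this: a complexType with no content model and a single attribute. The names image and href are hypothetical.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- No sequence/choice/all: the element must be empty -->
  <xsd:element name="image">
    <xsd:complexType>
      <xsd:attribute name="href" type="xsd:anyURI" use="required"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

An instance would then look like <image href="photo.jpg"/>, with no text or child elements allowed inside.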

164 No targetNamespace (noNamespaceSchemaLocation)
Sometimes you may wish to create a schema but without putting the elements within a namespace. The targetNamespace attribute is actually an optional attribute of <schema>. Thus, if you don't want to specify a namespace for your schema then simply don't use the targetNamespace attribute.
Consequences of having no namespace:
–1. In the instance document don't namespace-qualify the elements.
–2. In the instance document, instead of using schemaLocation use noNamespaceSchemaLocation.

165 (see example14)

166 Instance document (see example14; element markup not preserved in this transcript):
My Life and Times; Paul McCartney; 1998; 1-56592-235-2; McMillin Publishing
…
1. Note that there is no default namespace declaration. So, none of the elements are associated with a namespace.
2. Note that we do not use xsi:schemaLocation (since it requires a pair of values: a namespace and a URL to the schema for that namespace). Instead, we use xsi:noNamespaceSchemaLocation.

167 Assembling an Instance Document from Multiple Schema Documents
An instance document may be composed of elements from multiple schemas. Validation can apply to the entire XML instance document, or to a single element.

168 Validating against two schemas: Library.xml (see example 15)
Content of the instance document (element markup not preserved in this transcript):
Books:
–My Life and Times; Paul McCartney; 1998; 1-56592-235-2; Macmillan Publishing
–Illusions The Adventures of a Reluctant Messiah; Richard Bach; 1977; 0-440-34319-4; Dell Publishing Co.
–The First and Last Freedom; J. Krishnamurti; 1954; 0-06-064831-7; Harper & Row
Employees:
–John Doe; 123-45-6789
–Sally Smith; 000-11-2345
The <Book> elements are defined in Book.xsd, and the <Employee> elements are defined in Employee.xsd. The <Library>, <Books>, and <Employees> elements are not defined in any schema!
1. A schema validator will validate each Book element against Book.xsd.
2. It will validate each Employee element against Employee.xsd.
3. It will not validate the other elements.

169 Lax Validation vs Strict Validation
On the previous slide there were elements (Library, Books, and Employees) for which there was no schema to validate against.
–Lax validation is where the schema validator skips over elements for which no schema is available.
–Strict validation is where the schema validator requires validation of every element.
xsv performs lax validation. Thus, it will accept the instance document on the previous slide (but it will note validation="lax" in its output). All the other validators do strict validation. Consequently, they will reject the instance document on the previous slide.

170 Assembling a Schema from Multiple Schema Documents
The include element allows you to access components in other schemas.
–All the schemas you include must have the same namespace as your schema (i.e., the schema that is doing the include)
–The net effect of include is as though you had typed all the definitions directly into the containing schema
In the example, Library.xsd includes LibraryBook.xsd and LibraryEmployee.xsd.
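A sketch of what Library.xsd might look like. The three file names come from the slide; the namespace URI http://www.library.org and the assumption that the included files declare global Book and Employee elements are illustrative.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.library.org"
            xmlns="http://www.library.org"
            elementFormDefault="qualified">
  <!-- Both included schemas must use the same targetNamespace -->
  <xsd:include schemaLocation="LibraryBook.xsd"/>
  <xsd:include schemaLocation="LibraryEmployee.xsd"/>
  <xsd:element name="Library">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Book and Employee are assumed to be declared globally
             in the included schema documents -->
        <xsd:element ref="Book" maxOccurs="unbounded"/>
        <xsd:element ref="Employee" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```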

171 Library.xsd (see example 16)
These are referencing element declarations in the other schemas. Nice!

172 Assembling a Schema from a Schema with no targetNamespace
A schema can <include> another schema which has no targetNamespace. The included components take on the targetNamespace of the schema that is doing the <include>. This is called the Chameleon Effect. The components in the no-namespace schema are called Chameleon components.

173 Product.xsd (see example17)
Note that this schema has no targetNamespace!

174 Company.xsd (see example17)
This schema <include>s Product.xsd. Thus, the components in Product.xsd are namespace-coerced to the company targetNamespace. Consequently, we can reference those components just as though they had originally been declared in a schema with the same targetNamespace.

175 Assembling a Schema from Multiple Schema Documents with Different Namespaces
The import element allows you to access elements and types in a different namespace.
In the diagram, C.xsd imports A.xsd (Namespace A) and B.xsd (Namespace B).

176 Camera Schema
Camera.xsd imports Nikon.xsd, Olympus.xsd, and Pentax.xsd.

177 Camera.xsd (see example 18)
–These import elements give us access to the components in these other schemas.
–Here I am using the body_type that is defined in the Nikon namespace.
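A sketch of how Camera.xsd might use import. The file names and body_type come from the slides; every namespace URI, the element names, and the types lens_type and manual_adapter_type are assumptions for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.camera.org"
            xmlns:nikon="http://www.nikon.com"
            xmlns:olympus="http://www.olympus.com"
            xmlns:pentax="http://www.pentax.com"
            elementFormDefault="qualified">
  <!-- Each import names a foreign namespace and where to find it -->
  <xsd:import namespace="http://www.nikon.com" schemaLocation="Nikon.xsd"/>
  <xsd:import namespace="http://www.olympus.com" schemaLocation="Olympus.xsd"/>
  <xsd:import namespace="http://www.pentax.com" schemaLocation="Pentax.xsd"/>
  <xsd:element name="camera">
    <xsd:complexType>
      <xsd:sequence>
        <!-- body_type is defined in the Nikon namespace -->
        <xsd:element name="body" type="nikon:body_type"/>
        <!-- Assumed type names in the other two namespaces -->
        <xsd:element name="lens" type="olympus:lens_type"/>
        <xsd:element name="manual_adapter" type="pentax:manual_adapter_type"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Unlike include, import requires a namespace prefix on every reference to a foreign component, because those components keep their own namespace.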

178 Nikon.xsd Olympus.xsd Pentax.xsd

179 Olympus.xsd

180 Nikon.xsd

181 Pentax.xsd

182 Camera.xml
Content of the instance document (element markup not preserved in this transcript): Ergonomically designed casing for easy handling; 300mm; 1.2; 1/10,000 sec to 100 sec

183 Constraints on using Include and Import
The <include> and <import> elements must come before any element declarations or type definitions.
Do Labs 11.a, 11.b, 11.c

184 any Element
The <any> element enables the instance document author to extend his/her document with elements not specified by the schema.
Now an instance document author can optionally extend (after <Publisher>) the content of <Book> elements with any element.
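A minimal sketch of a content model that ends with an optional wildcard. The Book children shown are assumed, and only two of them are listed to keep the example short.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Book">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Title" type="xsd:string"/>
        <xsd:element name="Publisher" type="xsd:string"/>
        <!-- Optional extension point at the end of the content.
             By default namespace="##any" and processContents="strict",
             so the validator must be able to find a declaration for
             the extension element. -->
        <xsd:any minOccurs="0"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Setting processContents to "lax" or "skip" on the wildcard relaxes the requirement that the extension element be declared somewhere.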

185 SchemaRepository.xsd (see example23)
Suppose that the instance document author discovers this schema repository, and wants to extend his/her <Book> elements with a <Reviewer> element. He/she can do so! Thus, the instance document will be extended with an element never anticipated by the schema author. Wow!

186 Content of the instance document (element markup not preserved in this transcript):
–My Life and Times; Paul McCartney; 1998; 94303-12021-43892; McMillin Publishing; Roger Costello
–Illusions The Adventures of a Reluctant Messiah; Richard Bach; 1977; 0-440-34319-4; Dell Publishing Co.
This instance document uses components from two different schemas.

187 Extensible Instance Documents
The <any> element enables instance document authors to create instance documents containing elements above and beyond what was specified by the schema. The instance documents are said to be extensible. Contrast this schema with previous schemas, where the content of every element was fixed and static.
We are empowering the instance document author with the ability to define what data makes sense to him/her!

188 Specifying the Namespace of Extension Elements
–<any namespace="##other"/> allows the instance document to contain a new element, provided the element comes from a namespace other than the one the schema is defining (i.e., targetNamespace).
–<any namespace="(a namespace URI)"/> allows a new element, provided it's from the specified namespace. Note: you can specify a list of namespaces, separated by a blank space. One of the namespaces can be ##targetNamespace (see next).
–<any namespace="##targetNamespace"/> allows a new element, provided it's from the namespace that the schema is defining.
–<any namespace="##any"/> allows an element from any namespace. This is the default.
–<any namespace="##local"/> means the new element must come from no namespace.

189 anyAttribute
The <anyAttribute> element enables the instance document author to extend his/her document with attributes not specified by the schema.
Now an instance document author can add any number of attributes onto a <Book> element (as well as extend the element content).
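A minimal sketch combining both wildcards on one element. The Book children shown are assumed for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Book">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Title" type="xsd:string"/>
        <xsd:element name="Publisher" type="xsd:string"/>
        <!-- Element extension point -->
        <xsd:any minOccurs="0"/>
      </xsd:sequence>
      <!-- Attribute extension point; comes after the content model.
           As with any, processContents defaults to "strict". -->
      <xsd:anyAttribute/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```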

190 SchemaRepository.xsd (see example24)
Suppose that the instance document author discovers this schema, and wants to extend his/her <Book> elements with an id attribute. He/she can do so! Thus, the instance document will be extended with an attribute never anticipated by the schema author. Wow!

191 BookStore.xml (see example24)
Content of the instance document (element and attribute markup not preserved in this transcript):
–My Life and Times; Paul McCartney; 1998; 1-56592-235-2; McMillin Publishing; Roger Costello
–Illusions The Adventures of a Reluctant Messiah; Richard Bach; 1977; 0-440-34319-4; Dell Publishing Co.

192 Extensible Instance Documents
The <anyAttribute> element enables instance document authors to create instance documents which contain attributes above and beyond what was specified by the schema. The instance documents are said to be extensible. Contrast this schema with previous schemas, where the content of every element was fixed and static.
We are empowering the instance document author with the ability to define what data makes sense to him/her!

193 Specifying the Namespace of Extension Attributes
–<anyAttribute namespace="##other"/> allows the instance document to contain new attributes, provided the attributes come from a namespace other than the one the schema is defining (i.e., targetNamespace).
–<anyAttribute namespace="(a namespace URI)"/> allows new attributes, provided they're from the specified namespace. Note: you can specify a list of namespaces, separated by a blank space. One of the namespaces can be ##targetNamespace (see next).
–<anyAttribute namespace="##targetNamespace"/> allows new attributes, provided they're from the namespace that the schema is defining.
–<anyAttribute namespace="##any"/> allows any attributes. This is the default.
–<anyAttribute namespace="##local"/> allows any unqualified attributes (i.e., the attributes come from no namespace).

194 Smart Enough to Know you're not Smart Enough
With the <any> and <anyAttribute> elements we can design our schemas with the recognition that, as schema designers, we can never anticipate all the different kinds of data instance document authors will want to use in the instance document.
That is, we are smart enough to know that we're not smart enough to know all the different data instance document authors will require.

195 Open Content
Definition: an open content schema is one that allows instance documents to contain additional elements beyond what is declared in the schema. This is achieved by using the <any> and <anyAttribute> elements in the schema.
Sprinkling <any> and <anyAttribute> elements liberally throughout your schema will yield benefits in terms of how evolvable your schema is.
–See later slides for how open content enables the rapid evolution of schemas that is required in today's marketplace.

196 Global Openness
There is a range of openness that a schema may support: anywhere from having instance documents where new elements can be inserted anywhere (global openness), to instance documents where new elements can be inserted only at specific locations (localized openness).
This schema is allowing expansion before and after every element. Further, it is allowing for attribute expansion on every element. Truly, this is the ultimate in openness!

197 Localized Openness
With localized openness we design our schema to allow instance documents to extend only at specific points in the document.
With this schema we are allowing instance documents to extend only at the end of Book's content model.

198 Dynamic Schema Evolution using Open Content
In today's rapidly changing market static schemas will be less commonplace, as the market pushes schemas to quickly support new capabilities. For example, consider the cellphone industry. Clearly, this is a rapidly evolving market. Any schema that the cellphone community creates will soon become obsolete as hardware/software changes extend the cellphone capabilities. For the cellphone community rapid evolution of a cellphone schema is not just a nicety, the market demands it!

Suppose that the cellphone community gets together and creates a schema, cellphone.xsd. Imagine that every week NOKIA sends out to the various vendors an instance document (conforming to cellphone.xsd), detailing its current product set. Now suppose that a few months after cellphone.xsd is agreed upon NOKIA makes some breakthroughs in their cellphones: they create new memory, call, and display features, none of which are supported by cellphone.xsd. To gain a market advantage NOKIA will want to get information about these new capabilities to its vendors ASAP. Further, they will have little motivation to wait for the next meeting of the cellphone community to consider upgrades to cellphone.xsd. They need results NOW.

How does open content help? Suppose that the cellphone schema is declared "open". Immediately NOKIA can extend its instance documents to incorporate data about the new features. How does this change impact the vendor applications that receive the instance documents? The answer is: not at all. In the worst case, the vendor's application will simply skip over the new elements. More likely, however, the vendors are showing the cellphone features in a list box and these new features will be automatically captured with the other features.

Let's stop and think about what has just been described. Without modifying the cellphone schema and without touching the vendor's applications, information about the new NOKIA features has been instantly disseminated to the marketplace! Open content in the cellphone schema is the enabler for this rapid dissemination.

199 Dynamic Schema Evolution using Open Content (cont.)
Clearly some types of instance document extensions may require modification to the vendor's applications. Recognize, however, that the vendors are free to upgrade their applications in their own time. The applications do not need to be upgraded before changes can be introduced into instance documents. At the very worst, the vendor's applications will simply skip over the extensions. And, of course, those vendors do not need to upgrade in lock-step.

To wrap up this example: suppose that several months later the cellphone community reconvenes to discuss enhancements to the schema. The new features that NOKIA first introduced into the marketplace are then officially added into the schema. This completes the cycle. Changes to the instance documents have driven the evolution of the schema.
Do Lab 12

200 Strategy for Defining Semantics of your XML Elements (by Mary Pulvermacher)
Capture the semantics in the XML Schema
–Describe the semantics within the <annotation> element
–Adopt the convention that every element and attribute has an annotation which provides information on its meaning
Advantages:
–The XML Schema will capture the data structure, meta-data, and relationships between the elements
–Use of strong typing will capture much of the data content
–The annotations can capture definitions and other explanatory information
–The structure of the "definitions" will always be consistent with the structure used in the schema since they are linked
–Since the schema itself is an XML document, we can use XSLT to extract the annotations and transform the "semantic" information into a format suitable for human consumption
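A sketch of the annotation convention described above. The elevation element, its units attribute, and the documentation wording are assumed for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="elevation">
    <!-- The meaning of the element travels with its declaration -->
    <xsd:annotation>
      <xsd:documentation xml:lang="en">
        Height of a location above sea level.
      </xsd:documentation>
    </xsd:annotation>
    <xsd:complexType>
      <xsd:simpleContent>
        <xsd:extension base="xsd:integer">
          <xsd:attribute name="units" type="xsd:string">
            <xsd:annotation>
              <xsd:documentation xml:lang="en">
                Unit of measure for the elevation value.
              </xsd:documentation>
            </xsd:annotation>
          </xsd:attribute>
        </xsd:extension>
      </xsd:simpleContent>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Because each documentation element sits inside the declaration it describes, an XSLT stylesheet can walk the schema and emit a glossary that stays in step with the structure.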

201 Strategy for Defining Semantics of your XML Elements (by Mary Pulvermacher)

