THE DATATYPES OF XML SCHEMA A Practical Introduction


1 THE DATATYPES OF XML SCHEMA A Practical Introduction
John Cowan Reuters Health Information

2 Copyright John Cowan 2001, 2002; licensed under GNU GPL
Licensed under the GNU General Public License
ABSOLUTELY NO WARRANTIES; USE AT YOUR OWN RISK
Black and white for readability
11/29/2018

3 Abstract
This is a brief description, useful for RELAX NG and XML Schema users, of the simple datatypes of XML Schema and their associated facets. A brief summary of the XML Schema regular expression language is also given.

4 Roadmap
Types (11 slides)
Facets (7 slides)
Regular expression language (6 slides)

5 XML Schema Datatypes
A type is a named set of values
An XML Schema datatype provides a standardized, machine-checkable representation of a type
XML Schema types can be grouped: numeric, date, boolean, string, misc.

6 Numeric Types
Decimal types
Floating-point types

7 Decimal Types
decimal
integer
nonPositiveInteger
negativeInteger
nonNegativeInteger
positiveInteger
unsigned{Long, Int, Short, Byte}
long, int, short, byte

8 Decimal Types
long, int, short, and byte are the same as in Java: 64, 32, 16, and 8 bits
unsignedLong, unsignedInt, unsignedShort, and unsignedByte are the obvious unsigned analogues
All other decimal types are unbounded
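
As a rough illustration of the bounds above, the following Python sketch derives each bounded type's range from its bit width. The names `BIT_WIDTHS`, `signed_range`, `unsigned_range`, and `in_range` are made up for this example and do not come from any XML Schema library.

```python
# Bit widths of the bounded integer types, as stated on the slide.
BIT_WIDTHS = {"long": 64, "int": 32, "short": 16, "byte": 8}


def signed_range(bits):
    """Return (min, max) for a two's-complement integer of the given width."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def unsigned_range(bits):
    """Return (min, max) for an unsigned integer of the given width."""
    return 0, 2 ** bits - 1


def in_range(type_name, value):
    """Check a value against a bounded type such as 'short' or 'unsignedByte'."""
    if type_name.startswith("unsigned"):
        base = type_name[len("unsigned"):].lower()
        low, high = unsigned_range(BIT_WIDTHS[base])
    else:
        low, high = signed_range(BIT_WIDTHS[type_name])
    return low <= value <= high


print(signed_range(BIT_WIDTHS["byte"]))  # (-128, 127)
print(in_range("unsignedByte", 255))     # True
print(in_range("short", 40000))          # False
```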

9 Floating-point Types
Only two floating-point types:
  float
  double
IEEE ranges (same as Java, all modern hardware)

10 Date Types
duration
date, time, dateTime
gYear, gMonth, gDay, gYearMonth, gMonthDay

11 Date Types
Duration: duration
Single time interval: dateTime, date, gYear, gYearMonth
Recurring time interval: time, gMonth, gDay, gMonthDay

12 Date Type Examples
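
The examples from this slide did not survive in the transcript. The Python sketch below is not a reconstruction of them; it only lists one plausible lexical form per date type and checks it against a deliberately simplified pattern. The table `EXAMPLES` and the function `looks_like` are hypothetical names, and the patterns ignore time zones, fractional seconds, negative years, and range checks on months and days.

```python
import re

# type name -> (simplified pattern, example lexical form)
# The duration pattern is especially loose: it would also accept a bare "P".
EXAMPLES = {
    "duration":   (r"P(\d+Y)?(\d+M)?(\d+D)?(T(\d+H)?(\d+M)?(\d+S)?)?", "P1Y2M3DT4H5M6S"),
    "dateTime":   (r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}", "2002-05-30T13:20:00"),
    "date":       (r"\d{4}-\d{2}-\d{2}", "2002-05-30"),
    "time":       (r"\d{2}:\d{2}:\d{2}", "13:20:00"),
    "gYear":      (r"\d{4}", "2002"),
    "gYearMonth": (r"\d{4}-\d{2}", "2002-05"),
    "gMonth":     (r"--\d{2}", "--05"),
    "gDay":       (r"---\d{2}", "---30"),
    "gMonthDay":  (r"--\d{2}-\d{2}", "--05-30"),
}


def looks_like(type_name, lexical):
    """Return True if lexical matches the simplified pattern for type_name."""
    pattern, _example = EXAMPLES[type_name]
    return re.fullmatch(pattern, lexical) is not None


for name, (_pattern, example) in EXAMPLES.items():
    print(f"{name:<11} {example:<20} {looks_like(name, example)}")
```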

13 Boolean Type
Only two values are legal:
  true (which can also be written 1)
  false (which can also be written 0)
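
A minimal Python sketch of that rule follows. The function name `parse_xsd_boolean` is invented for this example, and trimming surrounding XML white space before the comparison is an assumption about how a validator would prepare the value.

```python
def parse_xsd_boolean(text):
    """Map the four legal lexical forms to a Python bool; reject anything else."""
    # Assumption: surrounding XML white space is trimmed before the check.
    lexical = text.strip(" \t\r\n")
    if lexical in ("true", "1"):
        return True
    if lexical in ("false", "0"):
        return False
    raise ValueError(f"not a legal boolean: {text!r}")


print(parse_xsd_boolean("true"))  # True
print(parse_xsd_boolean("0"))     # False
```

Note that the comparison is case-sensitive, so a spelling such as `True` is rejected.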

14 String Types
string
normalizedString
token
language
NMTOKEN(S)
Name
NCName
  ID, IDREF(S), ENTITY(IES)

15 Miscellaneous Types
Raw octet types:
  hexBinary
  base64Binary
anyURI
QName
NOTATION

16 Facets
Allow the creation of new datatypes by restricting the existing ones in one or more ways
Called params in RELAX NG
Facets can be grouped into families applicable to datatype families: length, value, pattern, enumeration, whiteSpace

17 Length Facets
Applicable to string and miscellaneous types
length facet gives exact length
minLength and maxLength facets set limits; either or both may be used
Lengths of hexBinary and base64Binary types are measured in octets, not characters
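
The octets-versus-characters point can be sketched in Python by decoding the lexical form and counting bytes. The helper names below are made up for this example; the decoding calls are from the standard `binascii` and `base64` modules.

```python
import base64
import binascii


def hex_binary_length(lexical):
    """Length in octets of a hexBinary value (two hex digits per octet)."""
    return len(binascii.unhexlify(lexical))


def base64_binary_length(lexical):
    """Length in octets of a base64Binary value."""
    return len(base64.b64decode(lexical, validate=True))


print(hex_binary_length("0FB7"))         # 2 octets from 4 characters
print(base64_binary_length("SGVsbG8="))  # 5 octets from 8 characters
```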

18 Value Facets
Applicable to numeric and date types
minExclusive and minInclusive specify a lower bound; either but not both may be used
maxExclusive and maxInclusive specify an upper bound; either but not both may be used
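
A small Python sketch of these bound checks, under the assumption that values have already been parsed into comparable objects such as `Decimal`. The function `within_bounds` and its parameter names are hypothetical.

```python
from decimal import Decimal


def within_bounds(value, min_inclusive=None, min_exclusive=None,
                  max_inclusive=None, max_exclusive=None):
    """Return True if value satisfies the given bounds."""
    # At most one lower bound and at most one upper bound may be given.
    if min_inclusive is not None and min_exclusive is not None:
        raise ValueError("use minInclusive or minExclusive, not both")
    if max_inclusive is not None and max_exclusive is not None:
        raise ValueError("use maxInclusive or maxExclusive, not both")
    if min_inclusive is not None and value < min_inclusive:
        return False
    if min_exclusive is not None and value <= min_exclusive:
        return False
    if max_inclusive is not None and value > max_inclusive:
        return False
    if max_exclusive is not None and value >= max_exclusive:
        return False
    return True


# A percentage-like restriction: 0 <= value < 100.
low, high = Decimal("0"), Decimal("100")
print(within_bounds(Decimal("0"), min_inclusive=low, max_exclusive=high))    # True
print(within_bounds(Decimal("100"), min_inclusive=low, max_exclusive=high))  # False
```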

19 Value Facets
totalDigits specifies the maximum total number of significant digits in a decimal, integer, (non)PositiveInteger, or (non)NegativeInteger value
fractionDigits specifies the maximum number of fractional digits in a decimal value
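
The following Python sketch counts digits with the standard `decimal` module, reading both facets as upper limits and ignoring insignificant leading and trailing zeros. The helpers `digit_counts` and `satisfies` are invented for this example. Note that `Decimal` also accepts forms such as `1e3` or `NaN` that are not legal decimal lexical forms; the sketch assumes ordinary input.

```python
from decimal import Decimal


def digit_counts(lexical):
    """Return (total, fraction) digit counts of a decimal value."""
    _sign, digits, exponent = Decimal(lexical).normalize().as_tuple()
    fraction = max(0, -exponent)
    # Digits before the decimal point include zeros implied by a positive exponent.
    total = max(len(digits) + max(exponent, 0), fraction)
    return total, fraction


def satisfies(lexical, total_digits=None, fraction_digits=None):
    """Check a value against optional totalDigits and fractionDigits limits."""
    total, fraction = digit_counts(lexical)
    if total_digits is not None and total > total_digits:
        return False
    if fraction_digits is not None and fraction > fraction_digits:
        return False
    return True


print(digit_counts("123.45"))                                   # (5, 2)
print(digit_counts("0123.450"))                                 # (5, 2)
print(satisfies("123.45", total_digits=5, fraction_digits=2))   # True
print(satisfies("123.456", total_digits=5, fraction_digits=2))  # False
```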

20 Pattern Facet
Applicable to any type
Specifies a regular expression that the data must match
XML Schema: If multiple pattern facets are present in the same restriction, the data must match at least one of them
RELAX NG: If multiple pattern params are present, the data must match all of them
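
The difference between the two rules can be sketched in Python. Using Python's `re` module as a stand-in for an XML Schema regular expression engine is an assumption that only holds for simple patterns like the ones below; the function names are hypothetical.

```python
import re


def matches_any(patterns, data):
    """XML Schema style: one matching pattern is enough."""
    return any(re.fullmatch(p, data) for p in patterns)


def matches_all(patterns, data):
    """RELAX NG style: every pattern must match."""
    return all(re.fullmatch(p, data) for p in patterns)


patterns = [r"[a-z]+", r"[0-9]+"]
print(matches_any(patterns, "abc"))  # True: the first pattern matches
print(matches_all(patterns, "abc"))  # False: the second pattern does not
```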

21 Enumeration Facet
XML Schema only
Applicable to any type except boolean
The instances of the enumeration facet specify individual values
The data must be equal (according to the rules for the type) to one of the specified values
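
"Equal according to the rules for the type" means that values are compared, not their spellings. A minimal Python sketch, where the hypothetical `to_value` argument stands in for the type's lexical-to-value mapping:

```python
from decimal import Decimal


def in_enumeration(lexical, allowed, to_value):
    """Return True if lexical equals one of the allowed values after conversion."""
    value = to_value(lexical)
    return any(value == to_value(item) for item in allowed)


allowed = ["1.5", "2.5"]
print(in_enumeration("1.50", allowed, Decimal))  # True: equal as decimals
print(in_enumeration("1.50", allowed, str))      # False: different as strings
```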

22 whiteSpace Facet
XML Schema only
Applicable to string types
Legal values are:
  preserve: leave white space alone
  replace: tabs, newlines, and carriage returns become spaces
  collapse: replace, then remove leading, trailing, and multiple spaces
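
The three modes can be sketched in a few lines of Python. The function name `apply_white_space` is made up for this example; carriage returns are treated like tabs and newlines.

```python
import re


def apply_white_space(text, mode):
    """Normalize text according to a whiteSpace facet value."""
    if mode == "preserve":
        return text
    # replace: every tab, newline, and carriage return becomes one space.
    replaced = re.sub(r"[\t\n\r]", " ", text)
    if mode == "replace":
        return replaced
    if mode == "collapse":
        # collapse: squeeze runs of spaces, then trim both ends.
        return re.sub(r" +", " ", replaced).strip(" ")
    raise ValueError(f"unknown whiteSpace value: {mode!r}")


raw = "  a\tb\n\nc "
print(repr(apply_white_space(raw, "replace")))   # '  a b  c '
print(repr(apply_white_space(raw, "collapse")))  # 'a b c'
```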

23 XSD Regular Expressions
A subset of Perl regular expressions
Supported constructs:
  choice
  quantifiers
  character classes
  parentheses for grouping
All matches are anchored to both ends of the data (so no ^ or $ needed)
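
In Python terms, the implicit anchoring corresponds to `re.fullmatch` rather than `re.search`. This is only an analogy: Python's `re` is not an XML Schema regular expression engine, although simple patterns like this one behave the same way.

```python
import re

pattern = r"abc|def"

# An unanchored search finds abc inside a longer string.
print(bool(re.search(pattern, "xabcx")))     # True
# An anchored match must consume the whole string.
print(bool(re.fullmatch(pattern, "xabcx")))  # False
print(bool(re.fullmatch(pattern, "def")))    # True
```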

24 Choice
abc|def matches either abc or def
Use parentheses to specify the scope of a choice
Example: abc(d|e) matches either abcd or abce

25 Quantifiers
(abc){2,4} matches abcabc or abcabcabc or abcabcabcabc
(abc){2,} matches 2 or more consecutive abc sequences
(abc)* matches 0 or more sequences
(abc)+ matches 1 or more sequences
(abc)? matches 0 or 1 sequences
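
These quantifiers can be tried out in Python, whose `re` module happens to share this part of the syntax. The helper `matching_repeat_counts` is invented for this example; it reports how many repetitions of abc, from 0 up to an arbitrary limit of 6, each pattern accepts.

```python
import re


def matching_repeat_counts(pattern, unit="abc", max_count=6):
    """Return the repeat counts n (0..max_count) for which unit * n matches."""
    return [n for n in range(max_count + 1) if re.fullmatch(pattern, unit * n)]


print(matching_repeat_counts(r"(abc){2,4}"))  # [2, 3, 4]
print(matching_repeat_counts(r"(abc){2,}"))   # [2, 3, 4, 5, 6]
print(matching_repeat_counts(r"(abc)*"))      # [0, 1, 2, 3, 4, 5, 6]
print(matching_repeat_counts(r"(abc)+"))      # [1, 2, 3, 4, 5, 6]
print(matching_repeat_counts(r"(abc)?"))      # [0, 1]
```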

26 Character Classes
Character classes always match exactly one character, no matter how complex they look
[abc] matches a or b or c
[^abc] matches anything but a or b or c
[a-z] matches any character between a and z inclusive
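
A quick Python check of these classes, again using `re.fullmatch` as an approximation of XML Schema matching. The helper `matches` is a hypothetical convenience wrapper.

```python
import re


def matches(pattern, data):
    """Return True if the whole of data matches pattern."""
    return re.fullmatch(pattern, data) is not None


print(matches(r"[abc]", "b"))   # True
print(matches(r"[^abc]", "b"))  # False
print(matches(r"[a-z]", "q"))   # True
# A class matches exactly one character, so two characters never match.
print(matches(r"[a-z]", "qq"))  # False
```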

27 Single-Letter Classes
\n, \r, \t - newline, return, tab
. - anything but newline or return
\s, \S - whitespace, non-whitespace
\i, \I - name initial, non-name initial
\c, \C - name char, non-name char
\d, \D - decimal digit, non-decimal digit
\w, \W - word char, non-word char

28 Unicode Classes
\p{Xx}, \P{Xx} - matches anything in (not in) a Unicode General Category
  Example: \p{Ll} matches any lowercase letter
\p{IsXxxxx}, \P{IsXxxxx} - matches anything in (not in) a Unicode block
  Example: \P{IsCyrillic} matches any non-Cyrillic character
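
Python's built-in `re` module has no \p{...} escapes, so the sketch below approximates the two kinds of class with the standard `unicodedata` module and a code point range. The function names are hypothetical, and the Cyrillic block is taken to be U+0400 through U+04FF.

```python
import unicodedata

# Code points of the Cyrillic block, U+0400 through U+04FF.
CYRILLIC_BLOCK = range(0x0400, 0x0500)


def in_general_category(ch, category):
    """Approximate \\p{Xx}: 'Ll' tests one category, 'L' tests all letter categories."""
    return unicodedata.category(ch).startswith(category)


def in_cyrillic_block(ch):
    """Approximate \\p{IsCyrillic}; negate the result for \\P{IsCyrillic}."""
    return ord(ch) in CYRILLIC_BLOCK


print(in_general_category("a", "Ll"))  # True
print(in_general_category("A", "Ll"))  # False
print(in_cyrillic_block("я"))          # True
print(not in_cyrillic_block("a"))      # True: a is a non-Cyrillic character
```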

