XML Schema – Part 1 1.Introduction to XML-Schema 2.Schema basics 3.Mechanisms (strategies) for Designing Schema 4.Creating your own Datatypes.


1 XML Schema – Part 1 1.Introduction to XML-Schema 2.Schema basics 3.Mechanisms (strategies) for Designing Schema 4.Creating your own Datatypes

2 Simple XML Alice Smith 123 Maple Street Mill Valley CA 90952 Lawnmower 1 148.95 Confirm this is electric Baby Monitor 1 39.98 1999-05-21
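The element tags on this slide were lost in extraction; only the text values survive. A plausible reconstruction is sketched below. The element names are an assumption, borrowed from the purchase-order example in the W3C XML Schema Primer, which uses the same values.

```xml
<?xml version="1.0"?>
<!-- Element names are assumed; only the text values come from the slide. -->
<purchaseOrder>
  <shipTo>
    <name>Alice Smith</name>
    <street>123 Maple Street</street>
    <city>Mill Valley</city>
    <state>CA</state>
    <zip>90952</zip>
  </shipTo>
  <items>
    <item>
      <productName>Lawnmower</productName>
      <quantity>1</quantity>
      <USPrice>148.95</USPrice>
      <comment>Confirm this is electric</comment>
    </item>
    <item>
      <productName>Baby Monitor</productName>
      <quantity>1</quantity>
      <USPrice>39.98</USPrice>
      <shipDate>1999-05-21</shipDate>
    </item>
  </items>
</purchaseOrder>
```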

3 XML 1. XML is for structuring data: spreadsheets, address books, configuration parameters, financial transactions, and technical drawings. XML is a text format for representing structured data. XML makes it easy for a computer to generate data, read data, and ensure that the data structure is unambiguous. XML is extensible and platform-independent, and it supports internationalization and localization. 2. XML looks a bit like HTML. Like HTML, XML makes use of tags (words bracketed by '<' and '>') and attributes (of the form name="value"). While HTML specifies what each tag and attribute means, and often how the text between them will look in a browser, XML uses the tags only to delimit pieces of data, and leaves the interpretation of the data completely to the application that reads it. If you see "<p>" in an XML file, do not assume it is a paragraph. Depending on the context, it may be a price, a parameter, a person, a p... (and who says it has to be a word with a "p"?). XML can keep data separated from your HTML.

4 3. XML is a group of technologies. XML 1.0 is the specification that defines what "tags" and "attributes" are. Beyond XML 1.0, "the XML family" is a growing set of modules that offer useful services to accomplish important and frequently demanded tasks. XPointer and XFragments are syntaxes in development for pointing to parts of an XML document. An XPointer is a bit like a URL, but instead of pointing to documents on the Web, it points to pieces of data inside an XML file. XSLT is a transformation language used for rearranging, adding and deleting tags and attributes. The DOM is a standard set of function calls for manipulating XML (and HTML) files from a programming language. XML Schemas help developers to precisely define the structures of their own XML-based formats. There are several more modules and tools available or under development. Keep an eye on W3C's technical reports page.

5 Purpose of XML Schemas (and DTDs) Specify: –the structure of instance documents "this element contains these elements, which contains these other elements, etc" –the datatype of each element/attribute "this element shall hold an integer with the range 0 to 12,000" (DTDs don't do too well with specifying datatypes like this)
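The datatype constraint quoted above ("an integer with the range 0 to 12,000") can be sketched in XML Schema as a restriction on xsd:integer. The element name elevation is hypothetical and chosen only for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- "elevation" is a hypothetical element name, not from the slides -->
  <xsd:element name="elevation">
    <xsd:simpleType>
      <xsd:restriction base="xsd:integer">
        <xsd:minInclusive value="0"/>
        <xsd:maxInclusive value="12000"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:element>
</xsd:schema>
```

With this declaration a validator would accept `<elevation>5280</elevation>` and reject `<elevation>15000</elevation>` or `<elevation>high</elevation>`.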

6 What is a Schema? A piece of information marked up by the presence of tags is called an element. Elements may be further enriched by attaching name-value pairs called attributes. Like Document Type Definitions (DTDs), schemas define the document's structure: element and attribute definitions, empty or text-content elements, default values for attributes and elements. Schemas are more powerful and flexible than DTDs, and they are written in XML syntax. An agreed-upon schema supports exchanging XML data: the receiver verifies the received data against the schema, checking that it is valid as well as well-formed.

7 <xs:element name="name" type="xs:string"/> Example 1

8 Structure of the Data BOOK title author character name dob isbn We want to define this structure in the schema Pets M.Cat Snoopy 1966

9 Pets M. Cat Snoopy 1950 Patty 1966 XML Schema Example 1
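The tags of this instance document did not survive extraction. One way it might have looked is sketched below. The tag names come from the structure diagram on the previous slide; the nesting of name and dob inside character, and isbn being an attribute, are assumptions. The isbn value is not shown on the slide, so it is left elided.

```xml
<?xml version="1.0"?>
<!-- Nesting and the attribute/element split are assumptions; isbn value is not in the source. -->
<book isbn="...">
  <title>Pets</title>
  <author>M. Cat</author>
  <character>
    <name>Snoopy</name>
    <dob>1950</dob>
  </character>
  <character>
    <name>Patty</name>
    <dob>1966</dob>
  </character>
</book>
```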

10 Explanations: The type (xsd:string) is prefixed by the namespace prefix associated with XML Schema, indicating a predefined XML Schema datatype. Cardinality is specified with minOccurs and maxOccurs; maxOccurs may take the value unbounded, and both default to one. They may appear only in local definitions. Facets …/… Attributes are declared after the element declarations. Compositors: sequence (ordered sequence), all (no order, but all), choice.

11 Simple/Complex Data type SimpleType is reserved for data types holding only values and no attribute or element sub-nodes. ComplexType - data types which aren’t simple (Book element has attributes and children elements)

12 Compositors: Using <sequence> and <choice> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.binary.org" xmlns="http://www.binary.org" elementFormDefault="qualified">

13 Compositors: Expressing Any Order <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> ……. Problem: create an element, Book, which contains Author, Title, Date, ISBN, and Publisher, in any order (Note: this is very difficult and ugly with DTDs).
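The body of this schema was elided on the slide. A minimal sketch of how the any-order requirement could be met with the all compositor follows; the xsd:string types are an assumption, since the slide names the child elements but not their datatypes.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.books.org"
            xmlns="http://www.books.org"
            elementFormDefault="qualified">
  <xsd:element name="Book">
    <xsd:complexType>
      <!-- all: every child must appear, in any order (types are assumed) -->
      <xsd:all>
        <xsd:element name="Author" type="xsd:string"/>
        <xsd:element name="Title" type="xsd:string"/>
        <xsd:element name="Date" type="xsd:string"/>
        <xsd:element name="ISBN" type="xsd:string"/>
        <xsd:element name="Publisher" type="xsd:string"/>
      </xsd:all>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

In XML Schema 1.0 each child of an all group may occur at most once, so this compositor fits records with a fixed set of fields rather than repeating content.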

14 Namespaces A namespace is a collection of names used as element or attribute names in an XML document. Namespaces qualify element names to make them unique, avoiding conflicts between elements with the same name. xmlns is the keyword for a namespace declaration. The idea is that when you are dealing with XML documents from ten different (external) sources, name collisions can occur. If you use namespaces, you distinguish one element from another based on the namespace its prefix is bound to: myNameSpace:Employee is not the same as yourNameSpace:Employee. When you declare a namespace you also give it a unique URI (Uniform Resource Identifier). The declaration can be explicit or default.

15 Explicit and Default namespace declaration Explicit declaration (bk is the qualifier): Tourist guide 22.95 Default declaration: Tourist guide 22.95 A namespace is identified by a Uniform Resource Identifier (URI), which may be a Uniform Resource Locator (URL). It doesn't matter what the URI points to. URIs are used because they are globally unique across the Internet.
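The two declaration styles can be sketched as follows. The markup on the slide was lost, so the element names (book, title, price) and the namespace URI are hypothetical; only the bk prefix and the values come from the slide. With an explicit declaration, every element carries the prefix:

```xml
<!-- Explicit declaration: the bk prefix is bound to the namespace (URI is hypothetical) -->
<bk:book xmlns:bk="http://www.example.org/books">
  <bk:title>Tourist guide</bk:title>
  <bk:price>22.95</bk:price>
</bk:book>
```

With a default declaration, unprefixed elements belong to the namespace:

```xml
<!-- Default declaration: no prefix needed on the elements -->
<book xmlns="http://www.example.org/books">
  <title>Tourist guide</title>
  <price>22.95</price>
</book>
```

To a namespace-aware parser the two documents are equivalent, because each element has the same local name and the same namespace URI in both.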

16 The elements and datatypes that are used to construct schemas - schema - element - complexType - sequence - string come from the http://…/XMLSchema namespace Example 1 Book.xsd

17 targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> The default namespace is http://www.books.org which is the targetNamespace! Example 1 "qualified": This is a directive to any instance documents which conform to this schema: Any elements used by the instance document which were declared in this schema must be namespace qualified. The Book in what namespace? Since there is no namespace qualifier, it is referencing the Book element in the default namespace, which is the targetNamespace! Thus, this is a reference to the Book element declaration in this schema.

18 XML Schema Namespace element complexType schema sequence http://www.w3.org/2001/XMLSchema string integer boolean (schema-for-schemas)

19 Book Namespace (target namespace) ISBN Book Character Title Author name http://www.books.org dob

20 XMLSchema and XML instance document <Book xmlns="http://www.books.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.books.org Book.xsd"> … My Life and Times Paul McCartney... 1. First, using a default namespace declaration, tell the schema-validator that all of the elements used in this instance document come from the http://www.books.org namespace. 2. Second, with schemaLocation tell the schema-validator that the http://www.books.org namespace is defined by Book.xsd (i.e., schemaLocation contains a pair of values). 3. Third, tell the schema-validator that the schemaLocation attribute we are using is the one in the XML Schema-instance namespace.

21 Referencing a schema in an XML instance document Book.xml: schemaLocation="http://www.books.org Book.xsd" - uses elements from namespace http://www.books.org Book.xsd: targetNamespace="http://www.books.org" - defines elements in namespace http://www.books.org A schema defines a new vocabulary. Instance documents use that new vocabulary.

22 Note multiple levels of checking: Book.xml, Book.xsd, XMLSchema.xsd (schema-for-schemas). Validate that the xml document conforms to the rules described in Book.xsd. Validate that Book.xsd is a valid schema document, i.e., it conforms to the rules described in the schema-for-schemas.

23 Russian Doll Design Example 1
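The schema shown on this slide was lost in extraction. A minimal sketch of the Russian Doll style for the book structure follows, in which each element is declared locally, inside the declaration of its parent. The nesting, the xs:string types, and isbn being an attribute are assumptions.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- One global element; everything else is nested inside it (types are assumed) -->
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="author" type="xs:string"/>
        <xs:element name="character" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="name" type="xs:string"/>
              <xs:element name="dob" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <!-- Attribute declarations come after the element declarations -->
      <xs:attribute name="isbn" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The schema mirrors the shape of the instance document, which is what gives the design its name.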

24 Flat Catalog <xs:element ref="character" minOccurs="0" maxOccurs="unbounded"/> Example 2 First, definition of the simple types. These are global definitions. Next, definition of attributes. Next, definition of complex types. The definition of the cardinality is done when the elements are referenced.
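Only one line of this example survived. A sketch of the whole Flat Catalog schema, following the order the slide describes (simple types, then attributes, then complex types), might look like this. The xs:string types and the placement of name and dob under character are assumptions.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- 1. Elements with simple types, declared globally (types are assumed) -->
  <xs:element name="title" type="xs:string"/>
  <xs:element name="author" type="xs:string"/>
  <xs:element name="name" type="xs:string"/>
  <xs:element name="dob" type="xs:string"/>

  <!-- 2. Attributes -->
  <xs:attribute name="isbn" type="xs:string"/>

  <!-- 3. Elements with complex types, built from references -->
  <xs:element name="character">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name"/>
        <xs:element ref="dob"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="title"/>
        <xs:element ref="author"/>
        <!-- Cardinality is given on the reference, not on the global declaration -->
        <xs:element ref="character" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute ref="isbn"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```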

25 Summary Mechanisms of definitions Russian Doll Design: Tight structure. Multiple occurrences of the same element name with different definitions. Depth in the embedded definitions. Hard to read and difficult to maintain when documents are complex. Flat Catalog: Catalog of all the elements. References to element and attribute definitions that need to be within the scope of the referencer. Using a reference to an element or an attribute is somewhat comparable to cloning an object. The element or attribute is defined first, and it can be duplicated at another place in the document structure by the reference mechanism, in the same way an object can be cloned. The two elements (or attributes) are then two instances of the same class.

26 Anonymous types (no name) <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> Russian Doll Design Example 3

27 Named Types <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> …/….. …/.. Named type The advantage of splitting out Book's element declarations and wrapping them in a named type is that now this type can be reused by other elements. Example 4

28 Please note that: is equivalent to: Element A instantiates complexType foo. Element A has the complexType definition inlined in the element declaration.
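The two forms being compared were stripped from the slide. They can be sketched as below; the names A and foo come from the slide, while the child elements B and C are hypothetical. First, the named-type form, where element A instantiates complexType foo:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="A" type="foo"/>
  <xsd:complexType name="foo">
    <xsd:sequence>
      <!-- B and C are hypothetical children -->
      <xsd:element name="B" type="xsd:string"/>
      <xsd:element name="C" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
```

Second, the anonymous-type form, where the complexType definition is inlined in the element declaration:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="A">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="B" type="xsd:string"/>
        <xsd:element name="C" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Both accept the same instance documents. The difference is that only the named type can be reused by other element declarations or serve as a base for derivation.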

29 Summary Definition Mechanisms Russian Doll Design: Tightly follows the structure of the instance document. Defines each element and attribute within its context, allowing multiple occurrences of the same element name to carry different definitions. Hard to read and difficult to maintain when documents are complex. Flat Catalog: Catalog of all the elements available in the instance document and, for each of them, lists of child elements and attributes. Uses references to element and attribute definitions that need to be within the scope of the referencer. Somewhat comparable to cloning an object. The element or attribute is defined first, and it can be duplicated at another place in the document structure by the reference mechanism, in the same way an object can be cloned. The two elements (or attributes) are then two instances of the same class. Named Types: Give a name to the simpleType and complexType elements. Comparable to defining a class and using it to create an object.

30 Annotating Schemas Capture the semantics in the XML Schema. The <annotation> element is used for documenting the schema, both for humans and for programs. –Use <documentation> for providing a comment to humans –Use <appinfo> for providing a comment to programs The content is any well-formed XML. Note that annotations have no effect on schema validation.

31 The following constraint is not expressible with XML Schema: The value of element A should be greater than the value of element B. So, we need to use a separate tool (e.g., Schematron) to check this constraint. We will express this constraint in the appinfo section (below). A should be greater than B
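The appinfo section this slide refers to did not survive extraction. A sketch of what it could look like follows. The element name demo, the xsd:integer types, and the use of the ISO Schematron namespace are assumptions; the assertion message is the one on the slide.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <!-- "demo" is a hypothetical parent element for A and B -->
  <xsd:element name="demo">
    <xsd:annotation>
      <xsd:documentation>Holds two numbers, A and B.</xsd:documentation>
      <xsd:appinfo>
        <!-- Ignored by the schema validator; intended for a Schematron processor -->
        <sch:pattern>
          <sch:rule context="demo">
            <sch:assert test="A &gt; B">A should be greater than B</sch:assert>
          </sch:rule>
        </sch:pattern>
      </xsd:appinfo>
    </xsd:annotation>
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="A" type="xsd:integer"/>
        <xsd:element name="B" type="xsd:integer"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

The schema validator checks only structure and datatypes here. A separate step would have to extract the rule and run it to enforce the comparison between A and B.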

32 Code to check the structure and content (datatype) of the data Code to actually do the work "In a typical program, up to 60% of the code is spent checking the data!" Save time and money using XML Schemas Continued -->

33 Code to check the structure and content of the data Code to actually do the work If your data is structured as XML, and there is a schema, then you can hand the data-checking task off to a schema validator. Thus, your code is reduced by up to 60%!!! Big $$ savings! Save time and money using XML Schemas (cont.)

34 Classic use of XML Schemas (Trading Partners ) Supplier Consumer XML data Schema Validator XML Schema Software to Process D. “D. is okay" D (Schema at third-party, neutral web site)

35 XML Schema --> GUI Schema GUI Builder HTML Supplier Web Server

36 XML Schema --> Smart Editor Schema Smart Editor (e.g., XML Spy) Helps you build your instance documents. For example, it pops up a menu showing you what is valid next. It knows this by looking at the XML Schema!

37 Element Substitution We can define a group of substitutable elements (called a substitutionGroup) by declaring an element (called the head) and then declaring other elements which state that they are substitutable for the head element. <xsd:element name="T" substitutionGroup="subway" type="xsd:string"/> subway is the head element T is substitutable for subway

38 Schema: <xsd:element name="T" substitutionGroup="subway" type="xsd:string"/> Instance doc: Red Line Alternative instance doc (substitute T for subway): Red Line This example shows the <subway> element being substituted with the <T> element.

39 Substitution Groups (Example) Schema: <xsd:element name="metro" substitutionGroup="subway" type="xsd:string"/> Instance doc: Red Line Alternative instance doc (customized for Spanish clients): Linea Roja Remarks: Substitution is transitive (if A can substitute for B, and B for C, then A can substitute for C) but not symmetric. Blocking substitution:
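Putting the substitution group slides together, a complete schema might look like the sketch below. The declarations of T and metro come from the slides; the head declaration and the transportation wrapper element are assumptions, since their markup was lost.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Head element of the substitution group -->
  <xsd:element name="subway" type="xsd:string"/>

  <!-- Members: each may appear wherever subway is referenced -->
  <xsd:element name="T" substitutionGroup="subway" type="xsd:string"/>
  <xsd:element name="metro" substitutionGroup="subway" type="xsd:string"/>

  <!-- "transportation" is a hypothetical container that references the head -->
  <xsd:element name="transportation">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="subway"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Under this schema a transportation element may contain `<subway>Red Line</subway>`, `<T>Red Line</T>`, or `<metro>Linea Roja</metro>`. Substitution can be blocked by adding block="substitution" to the declaration of the head element, after which only subway itself is accepted.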

40 Creating your own Datatypes Simple Types Derivation: for example, you want to create your own "special" string with a length of 10, and so on. Complex Types Derivation

41 Creating your own Datatypes (Simple datatypes) Restriction, using xs:restriction elements. The different kinds of restrictions that can be applied to a datatype are called facets. Union of datatypes. Whitespace-separated lists.

42 Creating your own Datatypes <xsd:simpleType name="TelephoneNumber">, a restriction of xsd:string 1. This creates a new datatype called 'TelephoneNumber'. 2. Elements of this type can hold string values, 3. But the string length must be exactly 10 characters long and 4. The string must follow the pattern: dd-ddddddd, where 'd' represents a 'digit'. (Obviously, in this example the regular expression makes the length facet redundant.) Patterns are specified using Regular Expressions. 5. In general we use: restriction, facet, value. 6. Restriction with Complex Type.

43 An element declared to be of type TelephoneNumber must be a string of length=10 and the string must follow the pattern: 2 digits, dash, 7 digits. An element declared to be of type shape must be a string with a value of either circle, or triangle, or square. Multiple Facets - "and" them together, or "or" them together?
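The two types described here can be sketched as follows. The type names, the length, the pattern shape, and the enumeration values come from the slides; the exact regular expression is a reconstruction of "2 digits, dash, 7 digits".

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- length AND pattern: a value must satisfy both facets -->
  <xsd:simpleType name="TelephoneNumber">
    <xsd:restriction base="xsd:string">
      <xsd:length value="10"/>
      <xsd:pattern value="\d{2}-\d{7}"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- enumeration values are OR'ed: a value must match one of them -->
  <xsd:simpleType name="shape">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="circle"/>
      <xsd:enumeration value="triangle"/>
      <xsd:enumeration value="square"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```

This answers the question in the slide title: repeated pattern facets and repeated enumeration facets within one restriction are alternatives ("or"), while different kinds of facets, such as length and pattern, must all hold ("and").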

44 General Form of Creating a Simple Datatype by Specifying Facet Values … Facets: - length - minLength - maxLength - pattern - enumeration - minInclusive - maxInclusive - minExclusive - maxExclusive... Sources: - string - boolean - decimal - float - double - duration - dateTime - time... The different kinds of restrictions that can be applied to a datatype are called facets.

45 Creating your own Datatypes A new datatype can be defined from an existing datatype (called the "base" type) by specifying values for one or more of the optional facets for the base type. Example. The string primitive datatype has six optional facets: –length –minLength –maxLength –pattern –enumeration –whiteSpace (legal values: preserve, replace, collapse)

46 Creating a simpleType from another simpleType Thus far we have created a simpleType using one of the built-in datatypes as our base type. However, we can create a simpleType that uses another simpleType as the base.

47 Example: creating a simpleType from another simpleType 1. simpleType that uses a built-in base type: 2. simpleType that uses another simpleType as the base type:
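Both definitions were stripped from the slide. A sketch with hypothetical type names and ranges (Percentage and PassingScore are not from the source) shows the two steps:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- 1. Uses a built-in base type (names and ranges are hypothetical) -->
  <xsd:simpleType name="Percentage">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="0"/>
      <xsd:maxInclusive value="100"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- 2. Uses the simpleType above as its base and narrows it further -->
  <xsd:simpleType name="PassingScore">
    <xsd:restriction base="Percentage">
      <xsd:minInclusive value="50"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```

A derived type can only narrow its base: PassingScore inherits the upper bound of 100 and could not raise it.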

48 Creating Simple Datatypes (list) 02-1234567 03-9876543 xs:list defines a whitespace-separated list of values. The item type is TelephoneNumber, from the previous example. You cannot create list types from existing list types, nor from complex types. Facets can be applied to list types, such as length, minLength, maxLength. The list may contain arbitrarily many TelephoneNumbers.
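A sketch of the list type described on this slide follows. TelephoneNumber is the type from the earlier slides; the names TelephoneNumberList and phones are hypothetical.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="TelephoneNumber">
    <xsd:restriction base="xsd:string">
      <xsd:length value="10"/>
      <xsd:pattern value="\d{2}-\d{7}"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- A whitespace-separated list whose items are TelephoneNumbers -->
  <xsd:simpleType name="TelephoneNumberList">
    <xsd:list itemType="TelephoneNumber"/>
  </xsd:simpleType>

  <!-- "phones" is a hypothetical element using the list type -->
  <xsd:element name="phones" type="TelephoneNumberList"/>
</xsd:schema>
```

An instance such as `<phones>02-1234567 03-9876543</phones>` would then be valid. For a list type, the length facets count items, not characters.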

49 Creating simpleType - Union.../…../… <xsd:union memberTypes="TomsFamily RogersFamily "/>

50 Creating simpleType - Union Alternatively, … … …
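The member type definitions were elided on the slides. A sketch of a union built with memberTypes follows; the type names TomsFamily and RogersFamily come from the slide, while the enumeration values and the name Families are hypothetical.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Enumeration values below are hypothetical placeholders -->
  <xsd:simpleType name="TomsFamily">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="Tom"/>
      <xsd:enumeration value="Anna"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:simpleType name="RogersFamily">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="Roger"/>
      <xsd:enumeration value="Beth"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- A value is valid if it is valid against either member type -->
  <xsd:simpleType name="Families">
    <xsd:union memberTypes="TomsFamily RogersFamily"/>
  </xsd:simpleType>
</xsd:schema>
```

The alternative form mentioned on the slide omits memberTypes and instead nests anonymous simpleType definitions directly inside the union element.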

51 Creating Simple Datatypes (union) NMTOKEN is a simple type (like string, etc.) intended for use only on attributes. US, Brésil Now isbnType may receive the union of simple types NMTOKEN or string

52 Complex Datatype Derivation We can do a form of subclassing Complex Type definitions: "derived types". –derive by restriction: create a type which is a subset of the base type. There are two ways to subset the elements: redefine a base type element to have a restricted range of values, or redefine a base type element to have a more restricted number of occurrences. –derive by extension: extend the parent complexType with more elements

53

54 Title Author Date Publication ISBN Publisher BookPublication
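The diagram above appears to show BookPublication extending Publication with two more elements. Derivation by extension could be sketched as below; the xsd:string types and the occurrence constraint on Author are assumptions.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Base type (element types and Author cardinality are assumed) -->
  <xsd:complexType name="Publication">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string"/>
      <xsd:element name="Author" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="Date" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Derived type: the base content followed by the new elements -->
  <xsd:complexType name="BookPublication">
    <xsd:complexContent>
      <xsd:extension base="Publication">
        <xsd:sequence>
          <xsd:element name="ISBN" type="xsd:string"/>
          <xsd:element name="Publisher" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>
```

Elements of type BookPublication then contain Title, Author, Date, ISBN, and Publisher, in that order: the added elements are appended after the inherited ones.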

55 Derive by Restriction Elements of type SingleAuthorPublication will have 3 child elements - Title, Author, and Date. However, there must be exactly one Author element. Note that in the restriction type you must repeat all the declarations from the base type (except when the base type has an element with minOccurs="0" and the subtype wishes to delete it. ).
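The type definition this slide describes was lost. A sketch of derivation by restriction follows, assuming a Publication base type with an unbounded Author element; the xsd:string types are also assumptions.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Assumed base type: one or more Author elements -->
  <xsd:complexType name="Publication">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string"/>
      <xsd:element name="Author" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="Date" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Restriction: every declaration is repeated; Author is narrowed to exactly one -->
  <xsd:complexType name="SingleAuthorPublication">
    <xsd:complexContent>
      <xsd:restriction base="Publication">
        <xsd:sequence>
          <xsd:element name="Title" type="xsd:string"/>
          <xsd:element name="Author" type="xsd:string"/>
          <xsd:element name="Date" type="xsd:string"/>
        </xsd:sequence>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>
```

Author has no maxOccurs in the derived type, so the default of one applies. Every instance valid against SingleAuthorPublication is also valid against Publication, which is what makes this a legal restriction.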

56 Prohibiting Derivations Sometimes we may want to create a type and disallow all derivations of it, or just disallow extension derivations, or disallow restriction derivations. –Rationale: "For example, I may create a complexType and make it publicly available for others to use. However, I don't want them to extend it with their proprietary extensions or subset it to remove, say, copyright information." Publication cannot be extended nor restricted Publication cannot be restricted Publication cannot be extended
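The three declarations captioned on this slide were stripped. They most likely differed only in the value of the final attribute, which can be sketched as follows; the content model of Publication is an assumption.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- final="#all":        Publication cannot be extended nor restricted
       final="restriction": Publication cannot be restricted
       final="extension":   Publication cannot be extended -->
  <xsd:complexType name="Publication" final="#all">
    <xsd:sequence>
      <!-- Content model is assumed for illustration -->
      <xsd:element name="Title" type="xsd:string"/>
      <xsd:element name="Author" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="Date" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
```

With final="#all" in place, a schema processor should reject any complexType that names Publication as the base of an extension or a restriction.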

