XML Validation III Schemas + RELAX NG Robin Burke ECT 360.

1 XML Validation III Schemas + RELAX NG Robin Burke ECT 360

2 Outline Types Built-in Named Anonymous Type Derivation Schema Organization Break RELAX NG

3 Built-in types Part of the schema language Base types 19 fundamental types Examples: string, decimal Derived types 25 more types that use the base types Examples: ID, positiveInteger

4 Built-in types, cont'd

5 User-defined types Any use of complexType can be turned into a user-defined type usually called "standalone" Simple types can be derived from the built-in types

6 Standalone types A type can stand outside of an element definition must have a name Used in element definition
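
A minimal sketch of a standalone type, assuming hypothetical names (`personType`, `student`, `instructor`) that do not come from the slides. The type is declared once at the top level and then referenced by name from element definitions:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Standalone (named) complex type, defined outside any element -->
  <xs:complexType name="personType">
    <xs:sequence>
      <xs:element name="firstName" type="xs:string"/>
      <xs:element name="lastName" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Two element definitions reuse the same type by name -->
  <xs:element name="student" type="personType"/>
  <xs:element name="instructor" type="personType"/>
</xs:schema>
```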

7 Mixed content Can specify that an element has mixed content

8 Mixed content, cont'd Schema cannot control where the text appears If this is legal text here thud grunt So is this thud more text grunt still more
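
A sketch of a mixed-content declaration. The child element names `thud` and `grunt` follow the slide; the parent name `note` is an assumption, since the slide's own markup is not in the transcript. With `mixed="true"`, text may appear before, between, or after the children, and the schema has no say over where:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <!-- mixed="true" allows character data anywhere among the children -->
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element name="thud" type="xs:string"/>
        <xs:element name="grunt" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <!-- Both of these instances are valid against the declaration above:
       <note>text here <thud/> <grunt/></note>
       <note><thud/> more text <grunt/> still more</note>
  -->
</xs:schema>
```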

9 Deriving types DTDs do not allow type restrictions beyond enumeration, CDATA, token for attributes PCDATA for content Schemas have built-in types also the capability to create your own

10 Derivation operations list sequence of values union combine two types allowing either restriction placing limits on the legal values

11 List PN334-04 PN223-89 PQ1112-03 Must be separated by spaces probably more useful to do this with document structure partList -> partNo*
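
A sketch of a list type for the part numbers above. The names `partNoList` and `parts` are hypothetical, and `xs:token` is an assumed item type:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- A list type: whitespace-separated items, each of the item type -->
  <xs:simpleType name="partNoList">
    <xs:list itemType="xs:token"/>
  </xs:simpleType>

  <xs:element name="parts" type="partNoList"/>
  <!-- Valid instance: <parts>PN334-04 PN223-89 PQ1112-03</parts> -->
</xs:schema>
```

The structural alternative mentioned on the slide would instead declare a `partList` element containing zero or more `partNo` children.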

12 Union Allows data of either type to be used Example Database situation null is a possible value
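
A sketch of the database case, assuming hypothetical type names (`nullLiteral`, `nullableDecimal`). The union accepts either a decimal or the literal string `null`:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- A one-value type that matches only the string "null" -->
  <xs:simpleType name="nullLiteral">
    <xs:restriction base="xs:string">
      <xs:enumeration value="null"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Union: a value is valid if it matches either member type -->
  <xs:simpleType name="nullableDecimal">
    <xs:union memberTypes="xs:decimal nullLiteral"/>
  </xs:simpleType>

  <xs:element name="balance" type="nullableDecimal"/>
  <!-- Valid: <balance>12.50</balance> and <balance>null</balance> -->
</xs:schema>
```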

13 Restriction Most useful Allows the designer to state exactly what values are legal prices must be non-negative SSN must follow a certain pattern in-stock must be yes or no etc.

14 Restriction, cont'd Restrict a base type according to "facets" Different facets available for different data types

15 Facets

16 Example: enumeration
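
The slide's own example is not in the transcript. As a stand-in, here is a sketch of the yes/no case mentioned earlier, with a hypothetical type name `yesNo`:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Enumeration facet: only the listed values are legal -->
  <xs:simpleType name="yesNo">
    <xs:restriction base="xs:string">
      <xs:enumeration value="yes"/>
      <xs:enumeration value="no"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="in-stock" type="yesNo"/>
</xs:schema>
```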

17 Example: numeric
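
The slide's own example is not in the transcript. As a stand-in, here is a sketch of a non-negative price with a hypothetical type name `priceType`. The two-decimal limit is an added assumption:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="priceType">
    <xs:restriction base="xs:decimal">
      <!-- Lower bound: zero or more -->
      <xs:minInclusive value="0"/>
      <!-- At most two digits after the decimal point (assumption) -->
      <xs:fractionDigits value="2"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="price" type="priceType"/>
</xs:schema>
```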

18 Example: pattern Regular expressions again, derived from Perl
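
The slide's own example is not in the transcript. As a stand-in, here is a sketch of the SSN case mentioned earlier, assuming the usual `ddd-dd-dddd` layout and a hypothetical type name `ssnType`. A schema pattern must match the whole value, so no `^` or `$` anchors are needed:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="ssnType">
    <xs:restriction base="xs:string">
      <!-- Three digits, dash, two digits, dash, four digits -->
      <xs:pattern value="\d{3}-\d{2}-\d{4}"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="ssn" type="ssnType"/>
</xs:schema>
```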

19 Inheritance Facet restrictions are inherited new type derivations must honor them they can restrict inherited facets further, never loosen them they can also set other facets For example monetary type fractionDigits facet = 2 loan amount type monetary type + maxInclusive = 100000 car loan amount loan amount type + maxInclusive = 30000
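
The three-step derivation can be sketched as follows. The type names are hypothetical, and the upper bound is written with the standard `maxInclusive` facet:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Base: any decimal with at most two fraction digits -->
  <xs:simpleType name="monetary">
    <xs:restriction base="xs:decimal">
      <xs:fractionDigits value="2"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Inherits fractionDigits = 2 and adds an upper bound -->
  <xs:simpleType name="loanAmount">
    <xs:restriction base="monetary">
      <xs:maxInclusive value="100000"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Tightens the inherited bound; raising it would be an error -->
  <xs:simpleType name="carLoanAmount">
    <xs:restriction base="loanAmount">
      <xs:maxInclusive value="30000"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```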

20 Fixed Facets Possible to prevent users from changing a certain facet in any way fixed="true" in facet declaration similar to "final" keyword in Java Example minInclusive cannot be changed when inherited lower would be illegal anyway the "fixed" attribute means it cannot be altered upward
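
A sketch of a fixed facet, assuming hypothetical type names. Types derived from `nonNegativeAmount` can add other facets but cannot give `minInclusive` a different value:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="nonNegativeAmount">
    <xs:restriction base="xs:decimal">
      <!-- fixed="true": derived types may not change this facet -->
      <xs:minInclusive value="0" fixed="true"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Legal: leaves minInclusive alone and restricts a different facet -->
  <xs:simpleType name="smallAmount">
    <xs:restriction base="nonNegativeAmount">
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```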

21 Complex Types (not discussed in book) Possible to derive from complex types i.e. elements Use complexContent Possibilities extension restriction elements attributes

22 Complex Type Extension can add elements to existing complex type only at the end
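
A sketch of extension with hypothetical names (`personType`, `studentType`). The derived content model is the base sequence followed by the new elements, which is why additions land only at the end:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="personType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- studentType = personType content, then major, plus an id attribute -->
  <xs:complexType name="studentType">
    <xs:complexContent>
      <xs:extension base="personType">
        <xs:sequence>
          <xs:element name="major" type="xs:string"/>
        </xs:sequence>
        <xs:attribute name="id" type="xs:ID"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:element name="student" type="studentType"/>
</xs:schema>
```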

23 Complex Type Restriction Adding additional attributes Odd syntax entire element definition must be repeated Not much benefit to inheritance validation checks for consistency with supertype

24 Example grades schema

25 Schema design Questions to ask what kind of document? narrative data-centric what kind of processing? web page output complex queries

26 Document modeling Get examples Get style guides / rules For each data element ask how many ask what legal values ask about sub-parts ask about exceptions

27 Design decisions Attribute vs element Level of granularity Naming Schema structure

28 Attribute vs element Some specific rules ID must be attribute General principle data vs metadata Element for document content Attribute for information about content Not always easy to tell!

29 Element Consists of document content Will be shown to a human user Contains substructure Sequence may be important Could be very long Presence depends on other values

30 Attribute (Opposite of above) Must be from an enumeration of values Also consistency

31 Level of granularity How detailed to model the data? Very detailed more work to mark up more detail in expressing the schema exceptions must be handled Less detailed easier to mark up easier to schematize document contents less accessible

32 Element content granularity Fine grained model salutation, first name, middle name, last name, appellation Coarse grained model name Tradeoff search / sort / organized document creation

33 Levels vs recursion Named levels  Recursion  Tradeoff ability to rearrange transparency of markup

34 Naming Case convention uppercase is bad lowercase better Multiple words CapCase camelCase Underline_Convention

35 Structure Nested "russian doll" schema looks like the document small schema only Flat elements defined at global level references used in complex type definitions Type-based "venetian blind" all schema complexity in type definitions one global element
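
A sketch of the "venetian blind" style for a grades document. The element names match the RELAX NG examples later in the deck; the type names (`gradeType`, `gradesType`) are hypothetical. All structure lives in named types, and only `grades` is declared globally:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="gradeType">
    <xs:sequence>
      <xs:element name="student" type="xs:string"/>
      <xs:element name="assigned-grade" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="gradesType">
    <xs:sequence>
      <xs:element name="grade" type="gradeType"
                  minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <!-- The single global element -->
  <xs:element name="grades" type="gradesType"/>
</xs:schema>
```

In the nested "russian doll" style, the same content would be written as anonymous complex types inside the `grades` declaration. In the flat style, `grade`, `student`, and `assigned-grade` would each be global elements pulled in with `ref`.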

36 Break

37 RELAX NG XML Schemas are big a lot of the page consists of / repeated element names RELAX NG created as an alternate validation language compact, non-XML syntax also XML syntax

38 Example element grades { element grade { element student { text }, element assigned-grade { text } }* } Equivalent to
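
The XML Schema version from the slide is not in the transcript. The following is one schema that accepts the same documents, written as a sketch in the nested style and treating RELAX NG `text` as `xs:string`:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="grades">
    <xs:complexType>
      <xs:sequence>
        <!-- The trailing * in RELAX NG: zero or more grade elements -->
        <xs:element name="grade" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="student" type="xs:string"/>
              <xs:element name="assigned-grade" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```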

39 Attributes element grades { element grade { element student { text, attribute id { text } }, element assigned-grade { text } }*, attribute assignment { text } }

40 Types instead of { text } use appropriate built-in data type attribute age { xsd:positiveInteger } facets qualify with name / value pair attribute drinkingAge { xsd:positiveInteger { minInclusive="21" } }

41 What does this one say? element grade { element student...., ( element assigned-grade { xsd:string { pattern = "([A-D](\+|\-)?|F)" } } | ( element assigned-grade { "I" }, element reason { text } ) ) }

42 The point A schema language has two purposes lets the language designer state a design lets the system validate documents against that design Any language that serves these purposes can be used

43 Validation languages DTD SGML holdover ugly fairly simple to express Schema complete extensible baroque unreadable RELAX NG readable esp. compact syntax more expressive than Schema fewer tools

44 Next week Presentations

