
1 Project Course XML Validation In making up the slides for this lecture, I borrowed material from several very nice sources: “Data on the Web” by Abiteboul, Buneman and Suciu; “XML in a Nutshell” by Harold and Means; and “The XML Companion” by Bradley. The validation examples were originally tested with an older parser, so the specific outputs may differ from those shown.

2 Project Course XML Validation A batch validating process involves comparing the DTD against a complete document instance and producing a report containing any errors or warnings. Software developers should consider batch validation to be analogous to program compilation, with similar errors detected. Interactive validation involves constant comparison of the DTD against a document as it is being created.

3 Project Course XML Validation The benefits of validating documents against a DTD include: Programmers can write extraction and manipulation filters without fear of their software ever processing unexpected input. Using an XML-aware word processor, authors and editors can be guided and constrained to produce conforming documents.

4 Project Course XML Validation Examples XML elements may contain further, embedded elements, and the entire document must be enclosed by a single document element. The degree to which an element’s content is organized into child elements is often termed its granularity. Some hierarchical structures are recursive. The Document Type Definition (DTD) contains rules for each element allowed within a specific class of documents.

5 Project Course Things the DTD does not do: Tell us the document root. Specify the number of instances of each kind of element. Describe the character data inside an element (precise syntax and semantics). The XML Schema language is new and may replace the use of DTDs.

6 Project Course We’ll run this program against several XML files with DTDs. We’ll study the code soon. This slide shows the imported classes.

// Validate.java using Xerces
import java.io.*;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;

7 Project Course Here we check if the command line is correct.

public class Validate {

    public static boolean valid = true;

    public static void main (String argv []) {
        if (argv.length != 1) {
            System.err.println ("Usage: java Validate filename.xml");
            System.exit (1);
        }

8 Project Course
try {
    // get a parser
    XMLReader reader = XMLReaderFactory.createXMLReader(
        "org.apache.xerces.parsers.SAXParser");
    // request validation
    reader.setFeature("http://xml.org/sax/features/validation", true);
    // associate an InputSource object with the file name
    InputSource inputSource = new InputSource(argv[0]);
    // go ahead and parse
    reader.parse(inputSource);
}

9 Project Course
catch(org.xml.sax.SAXException e) {
    System.out.println("Error in parsing " + e);
    valid = false;
}
catch(java.io.IOException e) {
    System.out.println("Error in I/O " + e);
    System.exit(1); // non-zero status: the run failed
}
System.out.println("Valid Document is " + valid);
}

// Catch any errors or fatal errors here.
// The parser will handle simple warnings.
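As written, the catch blocks run only for exceptions the parser throws, which in SAX means fatal errors. Validity errors are delivered to a registered ErrorHandler, and the error() method inherited from DefaultHandler does nothing, so a validating program normally registers a handler that records them. A minimal sketch of that step, assuming the JDK's built-in JAXP parser instead of the Xerces class name used on the slides; the class name ValidateSketch and its helper methods are made up for illustration:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class ValidateSketch {

    // Records every validity error the parser reports.
    static class CollectingHandler extends DefaultHandler {
        final List<String> problems = new ArrayList<>();

        @Override
        public void error(SAXParseException e) {
            problems.add("line " + e.getLineNumber() + ": " + e.getMessage());
        }
        // fatalError() is inherited from DefaultHandler, which rethrows
        // the exception and so ends the parse.
    }

    // Returns the list of problems; an empty list means the document is valid.
    static List<String> validate(InputSource source) {
        CollectingHandler handler = new CollectingHandler();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true); // request DTD validation
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setErrorHandler(handler);
            reader.parse(source);
        } catch (Exception e) { // fatal error, I/O failure or parser setup failure
            handler.problems.add("fatal: " + e.getMessage());
        }
        return handler.problems;
    }

    // Convenience wrapper for documents held in a string.
    static List<String> validateString(String xml) {
        return validate(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) {
        String dtd = "<!DOCTYPE note [<!ELEMENT note (#PCDATA)>]>";
        System.out.println("Valid document is "
                + validateString(dtd + "<note>ok</note>").isEmpty());
        System.out.println("Valid document is "
                + validateString(dtd + "<note><b/></note>").isEmpty());
    }
}
```

To check a file the way Validate does, pass `new InputSource(fileName)` to `validate`. The exact error messages depend on the parser, so the sketch only reports whether any were collected.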

10 Project Course 100 5 3 6 XML Document DTD Valid document is true

11 Project Course 100 5 3 6 XML Document DTD on the Web? VERY NICE Valid document is true

12 Project Course <!DOCTYPE FixedFloatSwap [ ]> 100 5 3 6 XML Document with an internal subset Valid document is true

13 Project Course 100 5 3 6 XML Document DTD Valid document is false

14 Project Course 100 5 3 6 100 5 3 6 XML Document

15 Project Course DTD Quantity Indicators
?  0 or 1 time
+  1 or more times
*  0 or more times
C:\McCarthy\www\examples\sax>java Validate FixedFloatSwap.xml
Valid document is true
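The three indicators can be tried out on a small content model. A minimal sketch, assuming a hypothetical person DTD (the element names nickname and hobby are invented for illustration) and the JDK's built-in validating SAX parser:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class QuantityDemo {

    // Hypothetical DTD: exactly one name, at most one nickname,
    // at least one profession, any number of hobbies.
    static final String DTD =
          "<!DOCTYPE person ["
        + "<!ELEMENT person (name, nickname?, profession+, hobby*)>"
        + "<!ELEMENT name (#PCDATA)>"
        + "<!ELEMENT nickname (#PCDATA)>"
        + "<!ELEMENT profession (#PCDATA)>"
        + "<!ELEMENT hobby (#PCDATA)>"
        + "]>";

    // Parses with validation on and returns every reported problem.
    static List<String> errors(String xml) {
        List<String> found = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    found.add(e.getMessage());
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            found.add("fatal: " + e.getMessage());
        }
        return found;
    }

    // body is the content placed between <person> and </person>.
    static boolean isValid(String body) {
        return errors(DTD + "<person>" + body + "</person>").isEmpty();
    }

    public static void main(String[] args) {
        String name = "<name>Alan Turing</name>";
        String job = "<profession>cryptographer</profession>";
        System.out.println(isValid(name + job));        // true
        System.out.println(isValid(name));              // false: profession+ needs one
        System.out.println(isValid(name + job + job));  // true: + allows repeats
        System.out.println(isValid(name
                + "<nickname>a</nickname><nickname>b</nickname>" + job)); // false: ? allows one
    }
}
```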

16 Project Course Is this a valid document? <!DOCTYPE person [ ]> Alan Turing computer scientist cryptographer Sure!

17 Project Course The locations where document text data is allowed are indicated by the keyword ‘#PCDATA’ (Parsed Character Data). 100 5 2000 2002 6 XML Document

18 Project Course DTD Output
C:\McCarthy\www\46-928\examples\sax>java Validate FixedFloatSwap.xml
org.xml.sax.SAXParseException: Element "NumYears" does not allow "StartYear" -- (#PCDATA)
org.xml.sax.SAXParseException: Element type "StartYear" is not declared.
org.xml.sax.SAXParseException: Element "NumYears" does not allow "EndYear" -- (#PCDATA)
org.xml.sax.SAXParseException: Element type "EndYear" is not declared.
Valid document is false

19 Project Course Mixed Content There are strict rules that must be applied when an element is allowed to contain both text and child elements. The #PCDATA keyword must be the first token in the group, and the group must be a choice group (using “|”, not “,”). The group must be optional and repeatable. This is known as a mixed content model.
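These rules can be checked directly. A minimal sketch, assuming a hypothetical formula element that mixes text with sub children, validated with the JDK's built-in SAX parser; the third case declares the group as a sequence, which the parser rejects:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class MixedContentDemo {

    // A well-formed mixed content model: #PCDATA first, "|" separators, ")*" at the end.
    static final String GOOD_DTD =
          "<!DOCTYPE formula ["
        + "<!ELEMENT formula (#PCDATA | sub)*>"
        + "<!ELEMENT sub (#PCDATA)>"
        + "]>";

    // Breaks the rules on purpose: a sequence group ("," instead of "|").
    static final String BAD_DTD =
          "<!DOCTYPE formula ["
        + "<!ELEMENT formula (#PCDATA, sub)*>"
        + "<!ELEMENT sub (#PCDATA)>"
        + "]>";

    // Parses with validation on and returns every reported problem.
    static List<String> errors(String xml) {
        List<String> found = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    found.add(e.getMessage());
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            found.add("fatal: " + e.getMessage());
        }
        return found;
    }

    static boolean isValid(String document) {
        return errors(document).isEmpty();
    }

    public static void main(String[] args) {
        String water = "<formula>H<sub>2</sub>O is water.</formula>";
        System.out.println(isValid(GOOD_DTD + water));                               // true
        System.out.println(isValid(GOOD_DTD + "<formula>H<sup>2</sup>O</formula>")); // false
        System.out.println(isValid(BAD_DTD + water));                                // false
    }
}
```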

20 Project Course DTD H 2 O is water. XML Document Valid document is true

21 Project Course Is this a valid document? <!DOCTYPE page [ ]> Alan Turing broke codes during World War II. He very precisely defined the notion of "algorithm". And so he had several professions: computer scientist cryptographer And mathematician Sure!

22 Project Course How about this one? java Validate mixed.xml org.xml.sax.SAXParseException: The content of element type "page" must match "(paragraph)+". Valid document is false <!DOCTYPE page [ ]> The following is a paragraph marked up in XML. Alan Turing broke codes during World War II. He very precisely defined the notion of "algorithm". And so he had several professions: computer scientist cryptographer And mathematician

23 Project Course 100 5 3 6 will not be parsed for markup]]> <!ELEMENT FixedFloatSwap ( Notional, Fixed_Rate, NumYears, NumPayments, Note ) > XML Document DTD CDATA Section

24 Project Course Recursion <!DOCTYPE tree [ ]> A DTD is a context-free grammar java Validate recursive1.xml Valid document is true

25 Project Course How about this one? <!DOCTYPE tree [ ]> Alan Turing would like this Alan Turing would like this java Validate recursive1.xml org.xml.sax.SAXParseException: The content of element type "tree" must match "(node)". Valid document is false
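A recursive content model can be exercised the same way. A minimal sketch, assuming a binary-tree DTD in which a node is either a leaf or a pair of nodes (the declarations are an assumption, chosen to be consistent with the "(node)" message above), validated with the JDK's built-in SAX parser:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class RecursionDemo {

    // Assumed DTD: node refers to itself, so trees of any depth are allowed.
    static final String DTD =
          "<!DOCTYPE tree ["
        + "<!ELEMENT tree (node)>"
        + "<!ELEMENT node (leaf | (node, node))>"
        + "<!ELEMENT leaf (#PCDATA)>"
        + "]>";

    // Parses with validation on and returns every reported problem.
    static List<String> errors(String xml) {
        List<String> found = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    found.add(e.getMessage());
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            found.add("fatal: " + e.getMessage());
        }
        return found;
    }

    // treeBody is the content placed between <tree> and </tree>.
    static boolean isValid(String treeBody) {
        return errors(DTD + "<tree>" + treeBody + "</tree>").isEmpty();
    }

    static String leafNode(String text) {
        return "<node><leaf>" + text + "</leaf></node>";
    }

    public static void main(String[] args) {
        String pair = leafNode("a") + leafNode("b");
        // One root node holding two child nodes: matches tree -> (node).
        System.out.println(isValid("<node>" + pair + "</node>")); // true
        // Two nodes directly under tree: tree allows exactly one.
        System.out.println(isValid(pair));                        // false
    }
}
```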

26 Project Course Relational Databases and XML Consider the relational database r1(a,b,c), r2(c,d)

r1: a   b   c
    a1  b1  c1
    a2  b2  c2

r2: c   d
    c2  d2
    c3  d3
    c4  d4

How can we represent this database with an XML DTD?

27 Project Course Relations <!DOCTYPE db [ ]> a1 b1 c1 c2 d2 c3 d3 c4 d4 java Validate Db.xml Valid document is true There is a small problem….

28 Project Course Relations <!DOCTYPE db [ ]> a1 b1 c1 c2 d2 c3 d3 c4 d4 The order of the relations should not count and neither should the order of columns within rows.

29 Project Course Attributes An attribute is associated with a particular element by the DTD and is assigned an attribute type. The attribute type can restrict the range of values it can hold. Example attribute types include:
CDATA indicates a simple string of characters
NMTOKEN indicates a word or token
A named token group such as (left | center | right)
ID is an element id that holds a unique value (among the other element IDs in the document)
IDREF attributes refer to an ID
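How an attribute type restricts values can be seen with a required attribute declared as a named token group. A minimal sketch, assuming a hypothetical currency attribute on a Notional element with three allowed tokens, validated with the JDK's built-in SAX parser:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class AttributeDemo {

    // Assumed declarations: currency must be present and must be one of three tokens.
    static final String DTD =
          "<!DOCTYPE Notional ["
        + "<!ELEMENT Notional (#PCDATA)>"
        + "<!ATTLIST Notional currency (USD | GBP | EUR) #REQUIRED>"
        + "]>";

    // Parses with validation on and returns every reported problem.
    static List<String> errors(String xml) {
        List<String> found = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    found.add(e.getMessage());
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            found.add("fatal: " + e.getMessage());
        }
        return found;
    }

    // rootElement is the complete <Notional> element.
    static boolean isValid(String rootElement) {
        return errors(DTD + rootElement).isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isValid("<Notional currency='USD'>100</Notional>")); // true
        System.out.println(isValid("<Notional>100</Notional>"));                // false: missing
        System.out.println(isValid("<Notional currency='XYZ'>100</Notional>")); // false: not in group
    }
}
```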

30 Project Course DTD 100 5 3 6 XML Document C:\McCarthy\www\46-928\examples\sax>java Validate FixedFloatSwap.xml org.xml.sax.SAXParseException: Attribute value for "currency" is #REQUIRED. Valid document is false

31 Project Course DTD 100 5 3 6 XML Document Valid document is true

32 Project Course DTD 100 5 3 6 XML Document Valid document is true #IMPLIED means optional

33 Project Course DTD 100 5 3 6 XML Document Valid document is true

34 Project Course DTD 100 5 3 6 XML Document Valid document is true #IMPLIED means optional

35 Project Course DTD 100 5 3 6 XML Document Valid document is true

36 Project Course ID and IDREF Attributes We can represent complex relationships within an XML document using ID and IDREF attributes.

37 Project Course An Undirected Graph (figure: vertices u, v, w, x, y, z joined by edges)

38 Project Course A Directed Graph (figure: vertices u, v, w, x, y)

39 Project Course (figure: course graph with nodes Math 100, Geom100, Calc100, Calc200, Calc300, Philo45, CS1, CS2) This is called a DAG (Directed Acyclic Graph)

40 Project Course Algebra I Students in this course study introductory algebra. This course has an ID But no prerequisites

41 Project Course Geometry I Students in this course study how to prove several theorems in geometry. The DTD will force this to be unique.

42 Project Course Calculus I Students in this course study the derivative. These are references to ID’s. (IDREFS)

43 Project Course Calculus II Students in this course study the integral. The DTD requires that this name be a unique id defined within this document. Otherwise, the document is invalid.

44 Project Course Calculus II Students in this course study the derivative and the integral (in 3-space). Prerequisites is an EMPTY element. It’s used only for its attributes.

45 Project Course Introduction to Computer Science I In this course we study Turing machines. IDREF to ID: a one-to-one link

46 Project Course Introduction to Computer Science II In this course we study basic data structures. IDREFS to ID: one-to-many links

47 Project Course Ethical Implications of Information Technology TBA

48 Project Course The Course_Descriptions.dtd
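A minimal sketch of how ID and IDREFS constraints behave, using a hypothetical catalog DTD. The element and attribute names (Catalog, Course, Course_ID, Prerequisites, Courses) are assumptions for illustration and are not the contents of Course_Descriptions.dtd. The JDK's built-in validating SAX parser is assumed:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class IdRefDemo {

    // Hypothetical DTD: each Course carries a unique ID; the EMPTY
    // Prerequisites element lists other courses through an IDREFS attribute.
    static final String DTD =
          "<!DOCTYPE Catalog ["
        + "<!ELEMENT Catalog (Course+)>"
        + "<!ELEMENT Course (Title, Prerequisites?)>"
        + "<!ATTLIST Course Course_ID ID #REQUIRED>"
        + "<!ELEMENT Title (#PCDATA)>"
        + "<!ELEMENT Prerequisites EMPTY>"
        + "<!ATTLIST Prerequisites Courses IDREFS #REQUIRED>"
        + "]>";

    // Parses with validation on and returns every reported problem.
    static List<String> errors(String xml) {
        List<String> found = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    found.add(e.getMessage());
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            found.add("fatal: " + e.getMessage());
        }
        return found;
    }

    // Builds one Course element; prereqs may be null for "no prerequisites".
    static String course(String id, String prereqs) {
        String refs = prereqs == null ? "" : "<Prerequisites Courses='" + prereqs + "'/>";
        return "<Course Course_ID='" + id + "'><Title>" + id + "</Title>" + refs + "</Course>";
    }

    static boolean isValid(String courses) {
        return errors(DTD + "<Catalog>" + courses + "</Catalog>").isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isValid(course("Math100", null) + course("Calc100", "Math100"))); // true
        System.out.println(isValid(course("Calc100", "Math999")));  // false: no such ID
        System.out.println(isValid(course("Math100", null) + course("Math100", null))); // false: duplicate ID
    }
}
```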

49 Project Course General Entities (&) General entities are used to place text into the XML document. They may be declared in the DTD and referenced in the document. They may also be declared in the DTD as residing in a file. They may then be referenced in the document.
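A minimal sketch of the first case, an entity declared in the internal subset and referenced in the document. The entity name bankname and its text follow the slides; the DOM-based helper and its class name are assumptions for illustration, using the JDK's built-in parser:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class GeneralEntityDemo {

    // The entity is declared once in the DTD and referenced as &bankname; in the content.
    static final String XML =
          "<!DOCTYPE FixedFloatSwap ["
        + "<!ENTITY bankname 'Mellon National Bank and Trust'>"
        + "]>"
        + "<FixedFloatSwap><Bank>&bankname;</Bank><Notional>100</Notional></FixedFloatSwap>";

    // Returns the text of the first element with the given name,
    // after the parser has replaced entity references.
    static String textOf(String xml, String elementName) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(elementName).item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(textOf(XML, "Bank")); // Mellon National Bank and Trust
    }
}
```

The application never sees the reference itself, only the replacement text, which is why a later processing stage such as XSLT receives the expanded value.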

50 Project Course <!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd" [ ] > &bankname; 100 5 3 6 <!ELEMENT FixedFloatSwap (Bank,Notional, Fixed_Rate, NumYears, NumPayments ) > DTD Document using a General Entity Validate is true

51 Project Course <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"> XSLT Program The general entity is replaced before xslt sees it.

52 Project Course XSLT OUTPUT
C:\McCarthy\www\46-928\examples\sax>java -Dcom.jclark.xsl.sax.parser=com.jclark.xml.sax.CommentDriver com.jclark.xsl.sax.Driver FixedFloatSwap.xml FixedFloatSwap.xsl FixedFloatSwap.wml
C:\McCarthy\www\46-928\examples\sax>type FixedFloatSwap.wml
Mellon National Bank and Trust

53 Project Course <!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd" [ ] > &bankname; 100 5 3 6 An external text entity

54 Project Course Mellon Bank And Trust Corporation Pittsburgh PA XSLT Output Mellon Bank And Trust Corporation Pittsburgh PA JustAFile.dat

55 Project Course Parameter Entities (%) While general entities are used to place text into the XML document, parameter entities are used to modify the DTD. We want to build modular DTDs so that we can create new DTDs using existing ones. We’ll look at a slide from www.fpml.org and then see some examples.

56 Project Course FpML is a Complete Description of the Trade Pool of modular components grouped into separate namespaces Date Schedule Product Rate Adjustable Period Notional Party Trade Trade ID Product Rate Adjustable Period Notional Party Vanilla Swap Vanilla Fixed Float Swap Cancellable Swaption FX Spot FX Outright FX Swap Forward Rate Agreement... Money Date

57 Project Course <!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments ) > XML Document DTD Internal Parameter Entities 100 5 3 6
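A minimal sketch of a parameter entity that contributes declarations to a DTD. In an internal subset a parameter entity reference may only stand where a whole declaration could stand, so the hypothetical entity leaves below holds complete element declarations. The entity name and the two-element content model are assumptions for illustration; the JDK's built-in validating SAX parser is assumed:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class ParameterEntityDemo {

    static final String BODY =
        "<FixedFloatSwap><Notional>100</Notional><Fixed_Rate>5</Fixed_Rate></FixedFloatSwap>";

    // Builds the DOCTYPE; the %leaves; reference is included only on request.
    static String doctype(boolean useEntity) {
        return "<!DOCTYPE FixedFloatSwap ["
             + "<!ENTITY % leaves "
             + "'<!ELEMENT Notional (#PCDATA)><!ELEMENT Fixed_Rate (#PCDATA)>'>"
             + "<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate)>"
             + (useEntity ? "%leaves;" : "")
             + "]>";
    }

    // Parses with validation on and returns every reported problem.
    static List<String> errors(String xml) {
        List<String> found = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    found.add(e.getMessage());
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            found.add("fatal: " + e.getMessage());
        }
        return found;
    }

    static boolean isValid(boolean useEntity) {
        return errors(doctype(useEntity) + BODY).isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isValid(true));  // true: %leaves; supplies the two declarations
        System.out.println(isValid(false)); // false: Notional and Fixed_Rate are undeclared
    }
}
```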

58 Project Course External Parameter Entities and DTD Components Kevin Dick 123 Anywhere Lane Apt 1b Palo Alto CA 94303 USA Order.xml

59 Project Course Kevin Dick 123 Not The Same Lane Work Place Palo Alto CA 94300 USA An order may have more than one address.

60 Project Course 440BX Motherboard 1 200 128 MB PC-100 DIMM 2 175 40x CD-ROM 1 50 Several products may be purchased.

61 Project Course Kevin S. Dick 11111-22222-33333 01/01 The payment is with a Visa card. We want this document to be validated.

62 Project Course order.dtd <!ATTLIST ORDER SOURCE (web | phone | retail) #REQUIRED CUSTOMERTYPE (consumer | business) "consumer" CURRENCY CDATA "USD" > Define an order based on other elements.

63 Project Course %anAddress; %aLineItem; %aPayment; External parameter entity declaration % External parameter entity reference %

64 Project Course address.dtd <!ELEMENT address (firstname, middlename?, lastname, street+, city, state,postal,country)> <!ATTLIST address ADDTYPE (bill | ship | billship) "billship"> <!ATTLIST street ORDER CDATA #IMPLIED>

65 Project Course lineitem.dtd <!ATTLIST lineitem ID ID #REQUIRED> <!ATTLIST product CAT (CDROM|MBoard|RAM) #REQUIRED>

66 Project Course <!ATTLIST card CARDTYPE (VISA|MasterCard|Amex) #REQUIRED> payment.dtd

67 Project Course XML Schemas are Coming “XML Schema” is the official name XSDL (XML Schema Definition Language) is the language used to create schema definitions Can be used to more tightly constrain a document instance Supports namespaces Permits type derivation
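A minimal sketch of "more tightly constrain": a schema can restrict the character data of an element, which a DTD cannot. The quantity element and its xs:positiveInteger type are assumptions for illustration; the code uses the standard javax.xml.validation API from the JDK:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class SchemaDemo {

    // Hypothetical schema: one element whose text must be a positive integer.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='quantity' type='xs:positiveInteger'/>"
        + "</xs:schema>";

    // True when xml validates against xsd. A malformed schema also yields false.
    static boolean isValid(String xsd, String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            // With no error handler set, validate() throws on the first error.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(XSD, "<quantity>3</quantity>"));     // true
        System.out.println(isValid(XSD, "<quantity>three</quantity>")); // false
        System.out.println(isValid(XSD, "<quantity>0</quantity>"));     // false: not positive
    }
}
```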

68 Project Course A Simple Purchase Order <purchaseOrder orderDate="07.23.2001" xmlns="http://www.cds-r-us.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.cds-r-us.com po.xsd" >

69 Project Course Dennis Scannel 175 Perry Lea Side Road Waterbury VT 15216

70 Project Course Purchase Order XSDL <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://www.cds-r-us.com" targetNamespace="http://www.cds-r-us.com" >


75 Project Course FpML These notes were made from the “FpML Architecture V1.0 Recommendation”. See http://www.fpml.org

76 Project Course FpML FpML 1.0 is a DTD based standard. XML Schema will probably be incorporated in the future. FpML uses a content model based on Object-Oriented Design.

77 Project Course FpML Consider a Java-like class and its FpML representation:

class Money {
    String currency;
    decimal amount;
}

78 Project Course FpML DTD <!-- Member list entity (used for inheritance and instantiation) -->

79 Project Course FpML DTD

80 Project Course FpML Document GBP 1000000.00

81 Project Course In the future - XML Schema

82 Project Course Inheritance

// A Java-like class hierarchy
class Shape {
    decimal x;
    decimal y;
}
class Square extends Shape {
    decimal width;
}
class Circle {
    decimal radius;
}

83 Project Course Inheritance in FpML DTD

84 Project Course Inheritance in FpML DTD <!-- Class type element --> <!ATTLIST mySquare type NMTOKEN #FIXED "Square" base NMTOKEN #FIXED "Shape"> The FpML Document 4.5 2.3 100 Other OOD Structures defined in the specification

85 Project Course Enumerated Types Supported in XML with …. FpML has no business data in attributes. Schemes are used instead.

86 Project Course Schemes 1.1 Schemes In the case of an element whose "legal" values are restricted to those of a specific domain, values in that domain should be handled as follows:
• Elements containing domain values should be of string type and will hold a single valid domain value identifier.
• A defaulted attribute, ‘SchemeDefault’, will be provided on the FpML document’s root element to define the default Scheme URI. This will be used if no overriding URI is provided.
• An optional attribute, ‘Scheme’, will be provided on each domain valued element to override the URI reference for the domain.

87 Project Course <FpML currencySchemeDefault="http://www.fpml.org/ext/iso4217" businessCenterSchemeDefault= "http://www.fpml.org/spec/2000/business-center-1-0" > CHF <currency2 currencyScheme= "http://www.chase.com/ext/realmoney" > UK_Pence …
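The lookup rule for schemes can be sketched as: use the element's own Scheme attribute if present, otherwise fall back to the matching SchemeDefault attribute on the root. A minimal sketch using the JDK's DOM parser; the element name currency1 and the helper names are assumptions for illustration, while currency2 and the two URIs come from the fragment above:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class SchemeDemo {

    static final String XML =
          "<FpML currencySchemeDefault='http://www.fpml.org/ext/iso4217'>"
        + "<currency1>CHF</currency1>"
        + "<currency2 currencyScheme='http://www.chase.com/ext/realmoney'>UK_Pence</currency2>"
        + "</FpML>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    // The element's own attribute wins; otherwise use "<name>Default" on the root.
    static String effectiveScheme(Element element, String schemeAttribute) {
        if (element.hasAttribute(schemeAttribute)) {
            return element.getAttribute(schemeAttribute);
        }
        Element root = element.getOwnerDocument().getDocumentElement();
        return root.getAttribute(schemeAttribute + "Default");
    }

    // Looks up the currency scheme that applies to the first element with this name.
    static String currencySchemeOf(String xml, String elementName) {
        Element element = (Element) parse(xml).getElementsByTagName(elementName).item(0);
        return effectiveScheme(element, "currencyScheme");
    }

    public static void main(String[] args) {
        System.out.println(currencySchemeOf(XML, "currency1")); // http://www.fpml.org/ext/iso4217
        System.out.println(currencySchemeOf(XML, "currency2")); // http://www.chase.com/ext/realmoney
    }
}
```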

88 Project Course The DTD

89 Project Course Fixed/Float Swap Example <!-- DOCTYPE FpML PUBLIC "-//FpML//DTD Financial product Markup Language 2-0//EN" "" > --> <FpML version="2-0" businessCenterSchemeDefault= "http://www.fpml.org/spec/2000/business-center-1-0" businessDayConventionSchemeDefault=" http://www.fpml.org/spec/2000/business-day-convention-1-0"

90 Project Course currencySchemeDefault= "http://www.fpml.org/ext/iso4217" dateRelativeToSchemeDefault= "http://www.fpml.org/spec/2001/date-relative-to-2-0" dayCountFractionSchemeDefault= "http://www.fpml.org/spec/2000/day-count-fraction-1-0" dayTypeSchemeDefault= "http://www.fpml.org/spec/2000/day-type-1-0" floatingRateIndexSchemeDefault= "http://www.fpml.org/ext/isda-1991-definitions" partyIdSchemeDefault= "http://www.fpml.org/ext/iso9362"

91 Project Course payRelativeToSchemeDefault= "http://www.fpml.org/spec/2000/pay-relative-to-1-0" periodSchemeDefault= "http://www.fpml.org/spec/2000/period-1-0" resetRelativeToSchemeDefault= "http://www.fpml.org/spec/2000/reset-relative-to-1-0" rollConventionSchemeDefault= "http://www.fpml.org/spec/2000/roll-convention-1-0">

92 Project Course <tradeId tradeIdScheme= "http://www.chase.com/swaps/trade-id"> TW9235 <tradeId tradeIdScheme= "http://www.barclays.com/swaps/trade-id"> SW2000 1994-12-12

93 Project Course <!-- Chase pays the floating rate every 6 months, based on 6M DEM-LIBOR-BBA, on an ACT/360 basis --> 1994-12-14 NONE See Rest of Document on the course web page

