XML Study-Session: Part II Validating XML Documents.
Objectives
By completing this study-session, you should be able to:
- Validate XML documents against a DTD.
- Understand basic DTD syntax.
- Create simple DTDs of your own.
What is a DTD?
Document Type Definition:
- Standard originally developed for SGML.
- Provides a description of the XML document’s structure, and serves as a grammar that specifies which tags and attributes are valid in an XML document and in what context they are valid.
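A minimal sketch of what a single DTD statement can look like, embedded in a tiny document so that a validating parser could check it. The element name Title and its text are assumptions chosen for illustration; they are not the slide's own example.

```xml
<?xml version="1.0"?>
<!DOCTYPE Title [
  <!-- One DTD statement: a Title element may contain character data only -->
  <!ELEMENT Title (#PCDATA)>
]>
<Title>Moby Dick</Title>
```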
Why use a DTD?
A DTD acts as a specification that lets an application construct valid XML conforming to it. Also:
- Self-documentation
- Portability
- Provides defaults for attributes
- Entity declaration
Using a DTD in an XML document
An XML document may do any of the following:
- Refer to a DTD, using its URI.
- Include a DTD inline as part of the XML document.
- Omit a DTD altogether. Without a DTD, an XML document can be checked for well-formedness, but not for validity.
The DTD used by the XML document may be internal or external. An external DTD is stored as a plain-text .dtd file.
Example: Using a DTD inline
<!DOCTYPE Book [
  <!ATTLIST Book
    ISBN CDATA #REQUIRED
    section (fiction|nonfiction) 'fiction'>
]>
To Kill a Mockingbird Harper Lee &Description;
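The element tags and most declarations of this example did not survive in the transcript. A complete document along the same lines might look like the sketch below. Only the ATTLIST, the two text values and the &Description; reference come from the slide; the child element names (Title, Author, Summary), the entity text and the ISBN value are assumptions.

```xml
<?xml version="1.0"?>
<!DOCTYPE Book [
  <!-- Assumed element declarations; the slide's own are not preserved -->
  <!ELEMENT Book (Title, Author, Summary)>
  <!ELEMENT Title (#PCDATA)>
  <!ELEMENT Author (#PCDATA)>
  <!ELEMENT Summary (#PCDATA)>
  <!-- From the slide: ISBN is mandatory, section defaults to 'fiction' -->
  <!ATTLIST Book
    ISBN    CDATA                #REQUIRED
    section (fiction|nonfiction) 'fiction'>
  <!-- Assumed entity text (placeholder) -->
  <!ENTITY Description "Placeholder description text.">
]>
<!-- The ISBN value is a placeholder; section is omitted, so the default applies -->
<Book ISBN="0-00-000000-0">
  <Title>To Kill a Mockingbird</Title>
  <Author>Harper Lee</Author>
  <Summary>&Description;</Summary>
</Book>
```

Removing the ISBN attribute, or setting section to a value outside the enumeration, leaves the document well-formed but no longer valid.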
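Doctype declaration
The Document Type (Doctype) declaration is used to indicate the DTD used for the document. Syntax may be in any of the following forms:
- <!DOCTYPE rootElement SYSTEM "URI">
- <!DOCTYPE rootElement PUBLIC "publicIdentifier" "URI">
- <!DOCTYPE rootElement [ internal declarations ]>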
Example: External DTD
The following is an example of an XML document that uses an external DTD:
Moby Dick Herman Melville
When the system identifier is a bare file name with no path, the external DTD must be located in the same directory as the XML document.
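A sketch of how such a document could be written, assuming a hypothetical DTD file named book.dtd and hypothetical child elements Title and Author (the slide's markup is not preserved). The assumed contents of book.dtd are shown inside the comment.

```xml
<?xml version="1.0"?>
<!-- book.dtd is a separate plain-text file, assumed to contain:
       <!ELEMENT Book (Title, Author)>
       <!ELEMENT Title (#PCDATA)>
       <!ELEMENT Author (#PCDATA)>
     The relative name "book.dtd" is resolved against this document's location. -->
<!DOCTYPE Book SYSTEM "book.dtd">
<Book>
  <Title>Moby Dick</Title>
  <Author>Herman Melville</Author>
</Book>
```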
Example: Using DTDs with URLs
The following is an example of an XML document that references an external DTD with a URL:
Moby Dick Herman Melville
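A sketch of the same idea with an absolute URL as the system identifier. The host www.example.com, the path, and the element names are placeholders, not taken from the slide.

```xml
<?xml version="1.0"?>
<!-- The DTD at this (placeholder) address is assumed to declare Book, Title and Author -->
<!DOCTYPE Book SYSTEM "http://www.example.com/dtds/book.dtd">
<Book>
  <Title>Moby Dick</Title>
  <Author>Herman Melville</Author>
</Book>
```

With an absolute URL, the DTD no longer has to sit next to the document, but a validating parser needs to be able to fetch it.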
Specifying Elements
- In the DTD, an element is declared with the notation <!ELEMENT elemName elemDefinitionOrType>, where elemName is the actual element name, and elemDefinitionOrType indicates whether the content of the element is pure data or a compound of data and other elements.
Some Element Types
- The element type keyword ANY allows the element to contain textual data, nested elements, or any legal XML combination of the two.
- The element type keyword #PCDATA (parsed character data) indicates textual data, and is used for regular character data that we want the XML parser to handle normally.
- The element type keyword EMPTY indicates that the element is always empty.
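The three keywords can be sketched side by side as follows. The element names Note, Text and Break are hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE Note [
  <!ELEMENT Note  ANY>        <!-- any mix of text and declared elements -->
  <!ELEMENT Text  (#PCDATA)>  <!-- character data only -->
  <!ELEMENT Break EMPTY>      <!-- never has content -->
]>
<Note>Free text, then <Text>character data</Text> and a <Break/> element.</Note>
```

Note that #PCDATA is written inside parentheses in an element declaration, while ANY and EMPTY are written bare.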
Nesting elements
- To define the allowed nestings within a DTD, the parent element's declaration lists its child elements in sequence: <!ELEMENT parentElem (childElem1, childElem2, ...)>. The order of the child elements is enforced as a validity constraint within an XML document.
- By default, an element can appear exactly once when specified without any modifiers in the DTD.
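A minimal sketch of an enforced nesting order, assuming hypothetical elements Book, Title and Author:

```xml
<?xml version="1.0"?>
<!DOCTYPE Book [
  <!-- Book must contain exactly one Title followed by exactly one Author -->
  <!ELEMENT Book (Title, Author)>
  <!ELEMENT Title (#PCDATA)>
  <!ELEMENT Author (#PCDATA)>
]>
<Book>
  <Title>Moby Dick</Title>
  <Author>Herman Melville</Author>
</Book>
```

Swapping the two children, repeating one of them, or leaving one out keeps the document well-formed but makes it invalid against this DTD.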
Recurrence Operators
Recurrence operators can be used to indicate how many times an element must appear in an XML document:
Operator    Description
[Default]   Must appear exactly one time.
?           Must appear once or not at all.
+           Must appear at least once (1..N times).
*           May appear any number of times (0..N times).
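A sketch that uses all four cases in one content model. The element names and text values are hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE Book [
  <!-- Title: exactly once; Subtitle: zero or one; Author: one or more; Review: zero or more -->
  <!ELEMENT Book (Title, Subtitle?, Author+, Review*)>
  <!ELEMENT Title (#PCDATA)>
  <!ELEMENT Subtitle (#PCDATA)>
  <!ELEMENT Author (#PCDATA)>
  <!ELEMENT Review (#PCDATA)>
]>
<!-- Valid: Subtitle and Review are omitted, Author appears twice -->
<Book>
  <Title>A Sample Book</Title>
  <Author>First Author</Author>
  <Author>Second Author</Author>
</Book>
```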
Grouping elements
- Often, recurrence applies to a block or group of elements rather than to a single element.
- To signify a group, enclose a set of elements within parentheses. Nested parentheses are acceptable.
- In this way, a recurrence operator can then be applied to the group.
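As a sketch, a recurrence operator applied to a group might look like this. The element names Catalog, Title and Author are assumptions.

```xml
<?xml version="1.0"?>
<!DOCTYPE Catalog [
  <!-- The (Title, Author) pair as a whole must appear one or more times -->
  <!ELEMENT Catalog ((Title, Author)+)>
  <!ELEMENT Title (#PCDATA)>
  <!ELEMENT Author (#PCDATA)>
]>
<Catalog>
  <Title>Moby Dick</Title>
  <Author>Herman Melville</Author>
  <Title>To Kill a Mockingbird</Title>
  <Author>Harper Lee</Author>
</Catalog>
```

Because the operator is attached to the group, a Title without a following Author would make the document invalid.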
Either Or
- In the DTD, an “OR” operator is signified by using |. This allows one thing or the other to occur, and can be used in conjunction with groupings.
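A sketch of the | operator combined with a group and a recurrence operator. The element names and values are hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE Contact [
  <!-- After Name, one or more entries follow, each either an Email or a Phone -->
  <!ELEMENT Contact (Name, (Email | Phone)+)>
  <!ELEMENT Name (#PCDATA)>
  <!ELEMENT Email (#PCDATA)>
  <!ELEMENT Phone (#PCDATA)>
]>
<Contact>
  <Name>Reader One</Name>
  <Phone>555-0100</Phone>
  <Email>reader@example.com</Email>
</Contact>
```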
Defining Attributes
- Attribute definitions are in the following form: <!ATTLIST elemName attributeName attributeType defaultOrModifier>
- The attributeType keyword CDATA allows an attribute to take on any value, and may represent a comment or additional information about an element.
- Another attribute type is an enumeration, where any of the specified values may be used, but any other value for the attribute results in an invalid document.
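A sketch showing one CDATA attribute and one enumerated attribute on the same element. The attribute names note and format, and their values, are assumptions.

```xml
<?xml version="1.0"?>
<!DOCTYPE Book [
  <!ELEMENT Book (#PCDATA)>
  <!ATTLIST Book
    note   CDATA                 #IMPLIED
    format (hardcover|paperback) "paperback">
]>
<!-- note accepts any text; format must be one of the two listed values -->
<Book note="signed copy" format="hardcover">Moby Dick</Book>
```

A value such as format="audio" would still be well-formed, but a validating parser would reject it.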
Attribute Modifiers
- We can indicate in the attribute definition whether the attribute is required within an element.
- The three modifier keywords are: #IMPLIED, #REQUIRED, and #FIXED.
- An implied attribute may be given a value, or left unspecified.
- A required attribute must be given a value.
- A fixed attribute has a specified value that can never change. The notation for this is: <!ATTLIST elemName attributeName attributeType #FIXED "value">
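The three modifiers could be sketched together as follows. The attribute names edition and language and the ISBN value are placeholders.

```xml
<?xml version="1.0"?>
<!DOCTYPE Book [
  <!ELEMENT Book (#PCDATA)>
  <!ATTLIST Book
    ISBN     CDATA #REQUIRED
    edition  CDATA #IMPLIED
    language CDATA #FIXED "en">
]>
<!-- ISBN must be present; edition may be omitted; language, if written, must be "en" -->
<Book ISBN="0-00-000000-0">Moby Dick</Book>
```

When language is left out, a validating parser supplies the fixed value "en" to the application.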
Parameter Entities in DTDs
- Parameter entities are entities that can only be used in the DTD.
- A simple internal parameter entity has the format: <!ENTITY % entityName "replacement text">
- E.g.
<!DOCTYPE Book [
  <!ENTITY % sum "…">
  %sum;
]>
…
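A runnable sketch of an internal parameter entity. The entity name common and the element names are assumptions. In the internal subset, a parameter entity reference may only appear between declarations, so the replacement text here consists of complete declarations.

```xml
<?xml version="1.0"?>
<!DOCTYPE Book [
  <!-- %common; expands to two complete element declarations -->
  <!ENTITY % common "<!ELEMENT Title (#PCDATA)> <!ELEMENT Author (#PCDATA)>">
  <!ELEMENT Book (Title, Author)>
  %common;
]>
<Book>
  <Title>Moby Dick</Title>
  <Author>Herman Melville</Author>
</Book>
```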
Parameter Entities in DTDs (contd.)
- External parameter entities can be declared using the following: <!ENTITY % entityName SYSTEM "URI"> or <!ENTITY % entityName PUBLIC "publicIdentifier" "URI">
- E.g. The following ‘orders.dtd’ file could be created:
%XHTML1-t.dtd
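A sketch of an external DTD file that pulls in declarations from a second file through an external parameter entity. The file name common.dtd, the entity name common, and the Order and Item elements are assumptions; the slide's own orders.dtd contents are not preserved.

```xml
<!-- orders.dtd (hypothetical contents) -->

<!-- Declare an external parameter entity that points at another DTD file -->
<!ENTITY % common SYSTEM "common.dtd">

<!-- Insert every declaration from common.dtd at this point;
     common.dtd is assumed to declare the Item element -->
%common;

<!ELEMENT Order (Item+)>
```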
Using INCLUDE and IGNORE
- We can customize our DTDs using the INCLUDE and IGNORE statements, which have the following syntax: <![INCLUDE[ declarations ]]> and <![IGNORE[ declarations ]]>
- E.g. In the ‘orders.dtd’ file, add the following lines:
…(same as before)…
<![%includer;[ ]]>
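A sketch of conditional sections switched by parameter entities, written as the contents of an external DTD file (conditional sections are only permitted in external DTD content, not in the internal subset). The entity names draft and final and the element names are assumptions.

```xml
<!-- orders.dtd (hypothetical contents) -->

<!-- Switches: exactly one of the two sections below is active -->
<!ENTITY % draft "INCLUDE">
<!ENTITY % final "IGNORE">

<![%draft;[
  <!-- Draft orders may carry free-form comments -->
  <!ELEMENT Order (Item+, Comment*)>
]]>
<![%final;[
  <!-- Final orders may not -->
  <!ELEMENT Order (Item+)>
]]>

<!ELEMENT Item (#PCDATA)>
<!ELEMENT Comment (#PCDATA)>
```

A document can flip the switches without editing the file by redeclaring the entities in its internal subset, for example <!DOCTYPE Order SYSTEM "orders.dtd" [ <!ENTITY % draft "IGNORE"> <!ENTITY % final "INCLUDE"> ]>, because declarations in the internal subset take precedence over those in the external DTD.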
Example: Using the XHTML 1.1 DTD
- The XHTML 1.1 DTD is a DTD driver which includes various XHTML 1.1 modules (i.e. DTD sections) using parameter entities.
- E.g.
<![%xhtml-table.module;[
  %xhtml-table.mod;]]>
The above allows us to customize the XHTML 1.1 DTD to include/exclude support for tables.
Next session: Parsing XML Documents
- Parsing techniques
- Writing your own XML applications