XML Module 1 Creating an XML Document

Objectives
Session 1.1
– Describe the history of XML and the uses of XML documents
– Understand XML vocabularies
– Define well-formed and valid XML documents, and describe the basic structure of an XML document
– Create an XML declaration
– Work with XML comments
– Work with XML parsers and understand how web browsers work with XML documents
© 2015 Out of Bounds Technology

Objectives (continued)
Session 1.2
– Create XML elements and attributes
– Work with character and entity references
– Describe how XML handles parsed character data, character data, and white space
– Create an XML processing instruction to apply a style sheet to an XML document
– Declare a default namespace for an XML vocabulary and apply the namespace to an element

Introducing XML
XML stands for Extensible Markup Language
– It is a markup language that can be extended and modified to match the needs of the document author and the data being recorded
– XML has some advantages in presenting structured content
– Because it is extensible, XML can be used to create a wide variety of document types

Introducing XML (continued)
XML has its roots in Standard Generalized Markup Language (SGML), which was introduced in the 1980s
SGML is device-independent and system-independent
SGML is difficult to learn and to apply because of its power, scope, and flexibility
XML is a language used to create vocabularies for other markup languages but does not have SGML’s complexity

XML Today
XML was originally created to structure, store, and transport information
XML has become the most common tool for data transmission among various applications
XML is used across a variety of industries
XML is used in all major websites
Many software applications (Excel and Word) and server languages (Java, .NET, Perl, PHP) can read and create XML files

XML Today (continued)
All major databases can read and create XML files
On web pages, the structure of XML closely matches the structure used to display the same information in HTML
Mobile device platforms (Google’s Android and Apple’s iOS) use XML in a variety of ways

Well-Formed and Valid XML Documents
An XML document is well-formed if it contains no syntax errors and satisfies the general specifications for XML code as laid out by the W3C
A well-formed XML document that satisfies the rules of a DTD or schema is said to be a valid document

The Structure of an XML Document, Part 1
XML documents consist of three parts
– The prolog
– The document body
– The epilog
The prolog provides information about the document itself
– XML declaration
– Processing instructions
– Comment lines
– Document type declaration (DOCTYPE)

The Structure of an XML Document, Part 3
The document body contains the document’s content in a hierarchical tree structure
The epilog is optional and contains any final comments or processing instructions

The XML Declaration
The XML declaration is always the first part of the prolog in an XML document; it signals to the program reading the file that the document is written in XML, and it provides information about how that code is to be interpreted by the program
The syntax is: <?xml version="version number" encoding="encoding type" standalone="yes | no" ?>
A sample declaration: <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>

Inserting Comments
Comments can appear anywhere in the prolog after the XML declaration
Comments provide additional information about what the document will be used for and how it was created
The syntax for comments is: <!-- comment -->
This is the same syntax used for HTML comments

XML Parsers
A parser (or processor) is a program that reads and interprets an XML document
A parser interprets a document’s code and verifies that it satisfies all the XML specifications for document structure and syntax
Parsers are strict
All major web browsers include an XML parser
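
The strictness described above can be sketched in Python, assuming the standard-library xml.etree.ElementTree parser (a non-validating parser, so it checks only well-formedness). The sample documents and the is_well_formed helper are hypothetical and not part of the slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents for illustration.
good = '<?xml version="1.0"?><products><product>Dog Shirt</product></products>'
bad_case = "<product>Dog Shirt</Product>"  # closing tag differs in case
bad_roots = "<product/><product/>"         # more than one root element

print(is_well_formed(good))
print(is_well_formed(bad_case))
print(is_well_formed(bad_roots))
```

A single syntax error is enough for the parser to reject the whole document, which is the behavior the slide refers to.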

Working with Elements
Elements are the basic building blocks of XML
An element can have text content and child element content
The content is stored between an opening tag and a closing tag, just as in HTML
The syntax of an XML element with text: <element>content</element>
Example: SJB Pet Boutique

Working with Elements (continued)
Element names are case sensitive
Element names must begin with a letter or the underscore and cannot contain blank spaces
The element’s name in the closing tag must exactly match the name in the opening tag
An empty element with a single tag: <element />
An empty element with a pair of tags: <element></element>

Nesting Elements
A nested element is an element contained within another element
Nested elements are also called child elements
Child elements must be enclosed within their parent elements
Example: Dog Shirt Gift Basket SJB Pet Boutique Something for every day of the week 35.99 1200, 1201, 1202, 1203, 1204

The Element Hierarchy
All elements in the body are children of a single element called the root or document element
There can be only one root element
The familial relationship of parent, child, and sibling extends throughout the entire document body
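
A minimal sketch of walking the element hierarchy, assuming Python’s standard-library xml.etree.ElementTree. The element names and the second product are hypothetical; only some text values echo the nesting example.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names, chosen for illustration only.
doc = """<products>
  <product>
    <name>Dog Shirt Gift Basket</name>
    <price>35.99</price>
  </product>
  <product>
    <name>Cat Collar</name>
    <price>12.50</price>
  </product>
</products>"""

root = ET.fromstring(doc)  # the single root (document) element


def child_names(element):
    """List the tag names of an element's direct children."""
    return [child.tag for child in element]


print(root.tag)              # name of the root element
print(child_names(root))     # sibling product elements
print(child_names(root[0]))  # children of the first product
```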

Working with Attributes
An attribute describes a feature or characteristic of an element
Every element can contain one or more attributes
Attributes are text strings and must be placed in single or double quotes
The syntax is: <element attribute="value"> … </element> or <element attribute="value" />
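
A short sketch of reading attribute values, assuming Python’s standard-library xml.etree.ElementTree. The attribute names id and category are hypothetical; the sample mixes single and double quotes to show that both are accepted.

```python
import xml.etree.ElementTree as ET

# Hypothetical attributes: one in single quotes, one in double quotes.
element = ET.fromstring("<product id='p1200' category=\"apparel\">Dog Shirt</product>")

print(element.attrib)                        # all attributes as a dictionary
print(element.get("id"))                     # one attribute value
print(element.get("color", "unspecified"))   # fallback for a missing attribute
```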

Using Character and Entity References, Part 1
Special characters, such as the € symbol, can be inserted into your XML document by using a character reference; the syntax is: &#nnn;
Some symbols also can be identified using an entity reference; the syntax is: &entity;

Using Character and Entity References, Part 2
nnn is a character reference number or name from the ISO/IEC character set
entity is the name assigned to the symbol
The ISO/IEC character set is an international numbering system for referencing characters from virtually any language
Character references in XML are the same as in HTML
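
A minimal sketch of how a parser resolves references, assuming Python’s standard-library xml.etree.ElementTree. The price element is hypothetical; 8364 (hexadecimal 20AC) is the code point of the € symbol.

```python
import xml.etree.ElementTree as ET

# Decimal and hexadecimal character references plus one built-in entity reference.
price = ET.fromstring("<price>&#8364;35.99 &#x20AC;40 &amp; up</price>")

# The parser hands back the characters themselves, not the references.
print(price.text)
```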

Parsed Character Data
Parsed character data (PCDATA) consists of all those characters that XML treats as parts of the code of an XML document
– The XML declaration
– The opening and closing tags of an element
– Empty element tags
– Character or entity references
– Comments

Character Data and White Space
Character data is not processed, but instead is treated as pure data content
White space refers to nonprintable characters such as spaces (created by pressing the Spacebar), new line characters (created by pressing the Enter key), or tab characters (created by pressing the Tab key)
HTML applies white space stripping, in which consecutive occurrences of white space are treated as a single space

Creating a CDATA Section
A CDATA section is a block of text that XML treats as character data only
The syntax to create a CDATA section is: <![CDATA[ character data ]]>
A CDATA section may contain most markup characters, such as <, >, and &

Creating a CDATA Section (continued)
This example shows an element named htmlcode that contains a CDATA section, which is used to store several HTML tags:
<htmlcode><![CDATA[ SJB Pet Boutique Fashion for Pets and Their Humans ]]></htmlcode>
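
A sketch of how a CDATA section reaches an application, assuming Python’s standard-library xml.etree.ElementTree. The h1 and h2 tags inside the section are an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# The HTML tags inside the CDATA section are assumed for illustration.
doc = (
    "<htmlcode><![CDATA[<h1>SJB Pet Boutique</h1>"
    "<h2>Fashion for Pets and Their Humans</h2>]]></htmlcode>"
)

element = ET.fromstring(doc)

# The markup inside the section is returned as plain text, not as child elements.
print(len(element))   # number of child elements
print(element.text)
```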

Formatting XML Data with CSS
XML documents do not include any information about how they should be rendered
Rendering is determined solely by the parser
Link the XML document to a style sheet to format the document
The XML parser will combine the style sheet with the XML document and will render a single formatted document

Applying a Style to an Element
Cascading Style Sheets (CSS) is a style sheet language
To apply a style to an element, use the following style declaration: selector { attribute1: value1; attribute2: value2; … }
selector is an element (or set of elements, separated by commas) from the XML document
attribute and value are the style attributes and attribute values to be applied to the element

Applying a Style to an Element (continued)
For example: author { color: red; font-weight: bold; }
This displays the text of the author element in red boldface type

Inserting a Processing Instruction
The link from the XML document to a style sheet is created using a processing instruction
A processing instruction is a command that tells an XML parser how to process the document
Processing instruction syntax: <?target instruction ?>
target identifies the program (or object) to which the processing instruction is directed, and instruction is information that the document passes on to the parser for processing
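
A minimal sketch of a style sheet processing instruction and how a parser exposes its target and instruction, assuming Python’s standard-library xml.dom.minidom. The file name styles.css and the products root element are hypothetical.

```python
from xml.dom import Node, minidom

# Hypothetical document that links to a CSS file named styles.css.
doc = minidom.parseString(
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/css" href="styles.css"?>'
    "<products/>"
)

# Collect the processing instructions that appear in the prolog.
instructions = [
    node for node in doc.childNodes
    if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
]

for pi in instructions:
    print(pi.target)  # the program the instruction is directed to
    print(pi.data)    # the information passed along for processing
```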

Working with Namespaces
A namespace is a defined collection of element and attribute names
Applying a namespace to an XML document involves two steps:
1. Declare the namespace
2. Identify the elements and attributes within the document that belong to that namespace

Declaring a Namespace
Syntax: <element xmlns:prefix="uri"> ... </element>
– element is the element in which the namespace is declared
– prefix is a string of characters that you’ll add to element and attribute names to associate them with the declared namespace
– uri is a Uniform Resource Identifier (URI), a text string that uniquely identifies a resource
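
A sketch of both a default namespace and a prefixed namespace, assuming Python’s standard-library xml.etree.ElementTree. The URIs, the rv prefix, and the element names are hypothetical. ElementTree reports each element name together with its namespace URI in braces.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs and prefix, chosen for illustration only.
doc = """<products xmlns="http://example.com/products"
          xmlns:rv="http://example.com/reviews">
  <product>Dog Shirt</product>
  <rv:rating>5</rv:rating>
</products>"""

root = ET.fromstring(doc)
tags = [child.tag for child in root]

print(root.tag)   # root element belongs to the default namespace
print(tags)       # unprefixed child uses the default, rv: child uses the second
```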

XML Module 2 Validating Documents with DTDs

Objectives
Session 2.1
– Review the principles of data validation
– Create a DOCTYPE
– Declare XML elements and define their content
– Define the structure of child elements
Session 2.2
– Declare attributes
– Set rules for attribute content
– Define optional and required attributes
– Validate an XML document

Objectives (continued)
Session 2.3
– Place internal and external content in an entity
– Create entity references
– Understand how to store code in parameter entities
– Create comments in a DTD
– Understand how to create conditional sections
– Understand how to create entities for non-character data
– Understand how to validate standard vocabularies

Creating a Valid Document
You validate documents to make certain that necessary elements are never omitted
For example, each customer order should include a customer name, address, and phone number; some elements and attributes may be optional (e.g., an e-mail address)
The document must be not only well-formed, but also valid
An XML document can be validated using DTDs (Document Type Definitions)

Declaring a DTD, Part 1
A DTD is a collection of rules that define the content and structure of an XML document
A DTD can be used to:
– Ensure that all required elements are present in the document
– Prevent undefined elements from being used in the document
– Enforce a specific data structure on document contents
– Specify the use of element attributes and define their permissible values
– Define default values for attributes
– Describe how parsers should access non-XML or nontextual content

Declaring a DTD, Part 2
A DTD is attached to an XML document by using a statement called a document type declaration, which is more simply referred to as a DOCTYPE
Each XML document can have only one DOCTYPE
You can divide a DOCTYPE into two parts: an internal subset and an external subset

Declaring a DTD, Part 3
The internal subset contains the rules and declarations of the DTD placed directly into the document, using the form: <!DOCTYPE root [declarations]>
An external subset indicates the location of the DTD file, using the form: <!DOCTYPE root SYSTEM "uri">

Declaring a DTD, Part 4
The public identifier, which is optional, provides XML parsers with information about the DTD, including the owner or author of the DTD and the language in which the DTD is written
The syntax of a DOCTYPE that has only an external subset and involves a public identifier is: <!DOCTYPE root PUBLIC "id" "uri">

Declaring a DTD, Part 5
In the DOCTYPE, root is the document’s root element, id is the public identifier, and uri is the system location of the DTD
A DOCTYPE that combines both internal and external subsets and references a system identifier has the following form: <!DOCTYPE root SYSTEM "uri" [declarations]>

Declaring a DTD, Part 6
If the DTD has a public identifier, then the DOCTYPE has the following form: <!DOCTYPE root PUBLIC "id" "uri" [declarations]>
When a DOCTYPE contains both an internal and an external subset, the internal subset takes precedence over the external subset when a conflict arises between the two

Declaring a DTD, Part 7
An XML environment composed of several documents and vocabularies might use both internal and external DTDs
– The external subset would define some basic rules for all of the documents
– The internal subset would define rules that are specific to each document

Declaring Document Elements
Every element must be declared in the DTD
An element type declaration specifies an element’s name and indicates what content the element can contain
The syntax of an element declaration: <!ELEMENT element content-model>

Types of Element Content, Part 1
The content-model specifies what type of content the element contains:
– ANY: The element can store any type of content or no content at all
– EMPTY: The element cannot store any content
– #PCDATA: The element can contain only parsed character data
– Sequence: The element can contain only child elements
– #PCDATA with sequence: The element can store both parsed character data and child elements

Types of Element Content, Part 2
ANY content
– Allows an element to store any type of content
– The syntax is: <!ELEMENT element ANY>
EMPTY content
– This model is reserved for elements that store no content
– The syntax is: <!ELEMENT element EMPTY>
Parsed character data, or PCDATA, is text that is parsed by a parser

Types of Element Content, Part 3
#PCDATA content
– Value is reserved for elements that can store parsed character data
– Does not allow for child elements
– The syntax is: <!ELEMENT element (#PCDATA)>
– Example: the declaration permits an element whose content is the text John Michael

Working with Child Elements
The syntax for declaring an element that contains only child elements is: <!ELEMENT element (children)>
element is the parent element and children is a listing of its child elements
Example: <!ELEMENT customer (phone)> indicates that the customer element can contain only a single child element named phone

Specifying an Element Sequence
A sequence is a list of elements that follow a defined order; the syntax is: <!ELEMENT element (child1, child2, ...)>
child1, child2, and so on, represent the sequence of child elements within the parent
The order of the child elements in an XML document must match the order defined in the element declaration

Specifying an Element Choice
The element declaration can define a choice of possible elements; the syntax is: <!ELEMENT element (child1 | child2 | ...)>
Example: <!ELEMENT customer (name | company)>
This allows the customer element to contain either the name element or the company element
The choice model allows only one of the child elements
A sequence and a choice can be combined

Modifying Symbols
Modifying symbols are symbols appended to the content model to indicate the number of occurrences of each element
There are three modifying symbols:
– A question mark (?) indicates that an element occurs zero times or one time
– A plus sign (+) indicates that an element occurs at least once
– An asterisk (*) indicates that an element occurs zero times or more

Modifying Symbols (continued)
Example: <!ELEMENT customers (customer+)>
– This allows one or more customer elements to be placed within the customers element
The three modifying symbols can also modify entire element sequences or choices
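
One way to see what the modifying symbols mean is to treat a content model as a regular expression over child names. The Python sketch below does this by hand for one hypothetical model, <!ELEMENT customer (name, phone+, email?)>; it is an illustration only and not how a validating parser is implemented.

```python
import re

# Hypothetical content model: <!ELEMENT customer (name, phone+, email?)>
# Each child name is followed by a comma so the model maps onto a regular expression.
CUSTOMER_MODEL = re.compile(r"name,(phone,)+(email,)?")


def matches_model(child_names):
    """Check a list of child element names against the content model above."""
    sequence = "".join(name + "," for name in child_names)
    return CUSTOMER_MODEL.fullmatch(sequence) is not None


print(matches_model(["name", "phone"]))                    # phone once, no email
print(matches_model(["name", "phone", "phone", "email"]))  # phone repeated
print(matches_model(["name"]))                             # phone+ needs at least one
print(matches_model(["phone", "name"]))                    # wrong order
```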

DTDs and Mixed Content
If an element contains both parsed character data and child elements, its content is known as mixed content; the syntax is: <!ELEMENT element (#PCDATA | child1 | child2 | ...)*>
This declaration applies the * modifying symbol to a choice of parsed character data or child elements
Because the * symbol is used with a choice list, the element can:
– Contain any number of occurrences of child elements or text strings of parsed character data
– Contain no content at all

Mixed Content
Example: <!ELEMENT title (#PCDATA | subtitle)*> <!ELEMENT subtitle (#PCDATA)>
– This declaration allows the title element to contain any number of text strings of parsed character data interspersed by subtitle elements
– The subtitle elements themselves can contain only parsed character data
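
A sketch of what mixed content looks like to an application, assuming Python’s standard-library xml.etree.ElementTree (which does not validate against a DTD). The title and subtitle text is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical title element mixing text with a subtitle child element.
doc = "<title>Grooming Basics<subtitle>A Starter Guide</subtitle> for new owners</title>"

title = ET.fromstring(doc)
subtitle = title[0]

print(title.text)                  # text before the child element
print(subtitle.text)               # text inside the child element
print(subtitle.tail)               # text after the child element
print("".join(title.itertext()))   # all text in document order
```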

Declaring Attributes, Part 1
To enforce attribute properties, you must add an attribute-list declaration to the document’s DTD for each element that includes attributes
An attribute-list declaration:
– Lists the names of all the attributes associated with a specific element
– Specifies the data type of each attribute
– Indicates whether each attribute is required or optional
– Provides a default value for each attribute, if necessary

Declaring Attributes, Part 2
The syntax for declaring a list of attributes is: <!ATTLIST element attribute1 type1 default1 attribute2 type2 default2 ...>
– element is the name of the element associated with the attributes
– attribute1, attribute2, etc., are the names of attributes
– type1, type2, etc., are the attributes’ data types
– default1, default2, etc., indicate whether each attribute is required and whether it has a default value

Character Data
Attribute values specified as character data (CDATA) can contain almost any data except characters reserved by XML for other purposes, such as <, >, and &; the syntax is: <!ATTLIST element attribute CDATA default>

Enumerated Types
An enumerated type is an attribute type that specifies a limited set of possible values; the syntax is: attribute (value1 | value2 | value3 | ...)
where value1, value2, and so on, are allowed values for the specified attribute

Tokenized Types
Tokenized types are character strings that follow certain specified rules for format and content; these rules are known as tokens
DTDs support four kinds of tokens:
– ID
– ID reference
– Name token
– Entity

ID Token
An ID token is used when an attribute value must be unique within a document
When an ID value is declared in a document, other attribute values can reference it
An attribute declared using the IDREF token must have a value equal to the value of an ID attribute located somewhere in the same document
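
The two ID rules above can be mimicked in plain Python. This is a sketch of the checks a validating parser performs, not a real DTD validation: the standard-library xml.etree.ElementTree parser used here does not validate, and the custID and orderedBy attribute names are hypothetical stand-ins for an ID and an IDREF attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: custID plays the role of an ID attribute,
# orderedBy the role of an IDREF attribute.
doc = """<orders>
  <customer custID="c1"/>
  <customer custID="c2"/>
  <order orderedBy="c1"/>
  <order orderedBy="c9"/>
</orders>"""

root = ET.fromstring(doc)

# Rule 1: every ID value must be unique within the document.
ids = [customer.get("custID") for customer in root.iter("customer")]
ids_unique = len(ids) == len(set(ids))

# Rule 2: every IDREF value must match some ID value in the same document.
dangling = [
    order.get("orderedBy")
    for order in root.iter("order")
    if order.get("orderedBy") not in ids
]

print(ids_unique)
print(dangling)   # references that point at no declared ID
```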

Name Token
The NMTOKEN data type is used with character data whose values must meet almost all the qualifications for valid XML names
NMTOKEN data types can contain:
– Letters and numbers
– The underscore ( _ ), hyphen ( - ), period ( . ), and colon ( : ) symbols
NMTOKEN data types cannot contain white space characters such as blank spaces or line returns
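
The NMTOKEN rules listed above can be approximated with a regular expression. This Python sketch is deliberately simplified to ASCII letters and digits; the XML specification allows many additional Unicode name characters, and the is_nmtoken helper is hypothetical.

```python
import re

# Simplified to ASCII; the XML specification permits many more name characters.
NMTOKEN = re.compile(r"[A-Za-z0-9_.:-]+")


def is_nmtoken(value):
    """Return True if the value consists only of the characters listed above."""
    return NMTOKEN.fullmatch(value) is not None


print(is_nmtoken("ship.to:home"))
print(is_nmtoken("2015-01"))    # unlike an XML name, a name token may start with a digit
print(is_nmtoken("two words"))  # white space is not allowed
```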

Working with Attribute Defaults
The final part of an attribute declaration is the attribute default, which defines whether an attribute value is required, optional, assigned a default, or fixed
Figure 2-14 Attribute defaults

Validating an XML Document
To test for validity, an XML parser must be able to compare the XML document with the rules established in the DTD
The web has many excellent sources for validating parsers, including websites to which you can upload an XML document for free to have it validated against an internal or external DTD

Introducing Entities
Entities are storage units for a document’s content
XML supports the following five built-in entities:
– &amp; for the & character
– &lt; for the < character
– &gt; for the > character
– &apos; for the ' character
– &quot; for the " character
When an XML parser encounters these entities, it can display the corresponding character symbol
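
A short sketch of converting reserved characters to the five built-in entities and back, assuming the escape and unescape helpers from Python’s standard-library xml.sax.saxutils module. By default these helpers handle &, <, and >; the quote characters are supplied through the optional mapping. The sample text is hypothetical.

```python
from xml.sax.saxutils import escape, unescape

raw = 'Tom & Jerry <"pets">'

# escape() covers &, <, and > by default; the mapping adds the two quote entities.
encoded = escape(raw, {'"': "&quot;", "'": "&apos;"})

# unescape() reverses the process with the inverse mapping.
decoded = unescape(encoded, {"&quot;": '"', "&apos;": "'"})

print(encoded)
print(decoded)
```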

Working with General Entities
To create a customized entity, you add it to the document’s DTD
An entity is classified based on three factors:
– Where it will be applied
– Where its content is located
– What type of content it references
Entities that are used within an XML document are known as general entities

More about Entities
A parameter entity is used within a DTD
Entities can reference content found either in an external file or within the DTD itself
An entity that references content found in an external file is called an external entity
An entity whose content is found within the DTD is known as an internal entity
An entity that references content that either is nontextual or cannot be interpreted by an XML parser is an unparsed entity

Creating Parsed Entities
To create a parsed internal entity, you add the entity declaration <!ENTITY entity "value"> to the DTD, where entity is the name assigned to the entity and value is the text string associated with the entity

Referencing a General Entity
After a general entity is declared in a DTD, it can be referenced anywhere within the body of the XML document
The syntax for referencing a general entity is the same as for referencing one of the five built-in XML entities, namely &entity; where entity is the entity’s name as declared in the DTD
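
A minimal sketch of declaring and referencing a general entity, assuming Python’s standard-library xml.etree.ElementTree, whose underlying parser expands internal entities declared in the internal DTD subset. The entity name maker and its value are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical entity name and value, declared in an internal DTD subset.
doc = (
    '<!DOCTYPE product [<!ENTITY maker "SJB Pet Boutique">]>'
    "<product>Made by &maker;</product>"
)

product = ET.fromstring(doc)

# The reference &maker; is replaced by the entity's value during parsing.
print(product.text)
```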

Working with Parameter Entities, Part 1
Use a parameter entity when you want to insert content into the DTD itself
You can use parameter entities to break a DTD into smaller chunks, or modules, that are placed in different files
For internal parameter entities, the syntax is: <!ENTITY % entity "value">
where entity is the name of the parameter entity and value is a text string of the entity’s value

Working with Parameter Entities, Part 2
For external parameter entities, the syntax is: <!ENTITY % entity SYSTEM "uri">
where entity is the name assigned to the parameter entity and uri is the location of the external file

Working with Parameter Entities, Part 3
Parameter entity references can only be placed where a declaration would normally occur, such as an internal or external DTD
Parameter entities used with an internal DTD do not offer any time or effort savings
An external parameter entity can allow XML to use more than one DTD per document by combining declarations from multiple DTDs

Creating Conditional Sections
A conditional section is a section of the DTD that is processed only in certain situations
The syntax for creating a conditional section is: <![keyword[ declarations ]]>
where keyword is either:
– INCLUDE (for a section of declarations that you want parsers to interpret)
– IGNORE (for the declarations that you want parsers to pass over)

Working with Unparsed Data, Part 1
For a DTD to validate either binary data, such as images or video clips, or character data that is not well-formed, you need to work with unparsed entities
The first step is to declare a notation, which identifies the data type of the unparsed data
A notation must supply a name for the data type and provide clues about how applications should handle the data

Working with Unparsed Data, Part 2
Notations must reference external content and you must specify an external location
There are two options:
– Use a system location: <!NOTATION notation SYSTEM "uri"> where notation is the notation’s name and uri is a system location
– Specify a public location: <!NOTATION notation PUBLIC "id" "uri"> where id is a public identifier recognized by XML parsers

Working with Unparsed Data, Part 3
After a notation is declared, you can create an unparsed entity that references specific items that use that notation
The syntax to declare an unparsed entity is: <!ENTITY entity SYSTEM "uri" NDATA notation>
– entity is the name of the entity referencing the notation
– uri is the URI of the unparsed data
– notation is the name of the notation that defines the data type for the XML parser

Working with Unparsed Data, Part 4
You can also provide a public location for the unparsed data if an XML parser supports it, using the following form: <!ENTITY entity PUBLIC "id" "uri" NDATA notation>
The following declaration creates an unparsed entity named WM100PLIMG that references the graphic image file WM100PL.png: <!ENTITY WM100PLIMG SYSTEM "WM100PL.png" NDATA png>
This declaration references a notation named png to provide the data type

XML Module 3 Validating Documents with Schemas

Objectives
Session 3.1
– Compare schemas and DTDs
– Explore different schema vocabularies
– Declare simple type elements and attributes
– Declare complex type elements
– Apply a schema to an instance document

Objectives (continued)
Session 3.2
– Work with XML Schema data types
– Derive new data types for text strings, numeric values, and dates
– Create data types for patterned data using regular expressions

The Limits of DTDs
DTDs are commonly used for validation largely because of XML’s origins as an offshoot of SGML
One complaint about DTDs is their lack of data types
DTDs also do not recognize namespaces, so they are not well suited to compound documents in which content from several vocabularies needs to be validated
DTDs employ a syntax called Extended Backus–Naur Form (EBNF), which is different from the syntax used for XML

Schemas and DTDs
A schema is an XML document that contains validation rules for an XML vocabulary
When a schema is applied to a specific XML file, the document to be validated is called the instance document
Figure 3-2 Comparison of schemas and DTDs

Schema Vocabularies
A single standard does not exist for schemas
A schema vocabulary is simply an XML vocabulary created for the purpose of describing schema content
Support for a particular schema depends solely on the XML parser being used for validation

Starting a Schema File
A schema is always placed in an external XML file
XML Schema filenames end with the .xsd file extension
The root element in any XML Schema document is the schema element
The general structure of an XML Schema file is: <?xml version="1.0" ?> <schema xmlns="http://www.w3.org/2001/XMLSchema"> content </schema>

Starting a Schema File (continued)
By convention, the namespace prefix xsd or xs is assigned to the XML Schema namespace to identify elements and attributes that belong to the XML Schema vocabulary
The usual form of an XML Schema document is: <?xml version="1.0" ?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> content </xs:schema>

Understanding Simple and Complex Types, Part 1
XML Schema supports two types of content: simple and complex
A simple type contains only text and no nested elements
A complex type contains two or more values or elements placed within a defined structure

Defining a Simple Type Element
An element in the instance document containing only text and no attributes or child elements is defined in XML Schema using the tag: <xs:element name="name" type="type" />
– name is the name of the element in the instance document
– type is the type of data stored in the element
If you use a different namespace prefix or declare XML Schema as the default namespace for the document, the prefix will be different

Data Types
The data type can be:
– one of XML Schema’s built-in data types
– defined by the schema author, or user data type
The most commonly used data type in XML Schema is string, which allows an element to contain any text string
Another popular data type in XML Schema is decimal, which allows an element to contain a decimal number
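
Because a schema is itself an XML document, it can be read like any other XML file. The Python sketch below, assuming the standard-library xml.etree.ElementTree, parses a small hypothetical schema and lists the simple type elements it declares; the element names name and gpa are illustrative, and no validation is performed.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema content; the declared element names are illustrative.
schema_text = f"""<xs:schema xmlns:xs="{XS}">
  <xs:element name="name" type="xs:string"/>
  <xs:element name="gpa" type="xs:decimal"/>
</xs:schema>"""

schema = ET.fromstring(schema_text)

# Map each declared element name to its data type.
declared = {
    element.get("name"): element.get("type")
    for element in schema.findall("xs:element", {"xs": XS})
}

print(schema.tag)
print(declared)
```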

Defining an Attribute
To define an attribute in XML Schema, you use the tag: <xs:attribute name="name" type="type" default="default" fixed="fixed" />
Here name is the name of the attribute, type is the data type, default is the attribute’s default value, and fixed is a fixed value for the attribute
The default and fixed attributes are optional

Defining a Complex Type Element
The basic structure for defining a complex type element with XML Schema is: <xs:element name="name"> <xs:complexType> declarations </xs:complexType> </xs:element>
– name is the name of the element
– declarations represents declarations of the type of content within the element

Defining a Complex Type Element (continued)
This content could include nested child elements, basic text, attributes, or any combination of the three:
– An empty element containing only attributes
– An element containing text content and attributes but no child elements
– An element containing child elements but no attributes
– An element containing both child elements and attributes

Defining an Element Containing Only Attributes
The code to define the attributes of an empty element is: <xs:element name="name"> <xs:complexType> attributes </xs:complexType> </xs:element>
– name is the name of the empty element
– attributes is the set of simple type elements that define the attributes of the empty element

Defining an Element Containing Attributes and Basic Text
The definition needs to indicate that the element contains simple content and a collection of one or more attributes
The structure of the element definition is: <xs:element name="name"> <xs:complexType> <xs:simpleContent> <xs:extension base="type"> attributes </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element>

Defining an Element Containing Attributes and Basic Text (continued)
In the example, the base attribute in the <xs:extension> element sets the data type for the gpa element; the example also sets the data type of the degree attribute to xs:string

Referencing an Element or Attribute Definition
XML Schema allows for a great deal of flexibility in writing complex types
Rather than repeating an earlier element or attribute declaration, you can create a reference to it
A reference to an element definition is <xs:element ref="elemName" /> where elemName is the name used in the element definition
A reference to an attribute definition is <xs:attribute ref="attName" /> where attName is the name used in the attribute definition

Defining an Element with Nested Children
Complex elements that contain nested child elements but no attributes or text are defined with: <xs:element name="name"> <xs:complexType> <xs:compositor> elements </xs:compositor> </xs:complexType> </xs:element>
– name is the name of the element
– compositor is a value that defines how the child elements appear in the document
– elements is a list of the nested child elements

Defining an Element with Nested Children (continued)
The following compositors are supported:
– sequence: requires the child elements to appear in the order listed in the schema
– choice: allows any one of the child elements listed to appear in the instance document
– all: allows any of the child elements to appear in any order in the instance document; however, each may appear only once, or not at all

Defining an Element Containing Nested Elements and Attributes
The code for a complex type element that contains both child elements and attributes is: <xs:element name="name"> <xs:complexType> <xs:compositor> elements </xs:compositor> attributes </xs:complexType> </xs:element>
– name is the name of the element
– compositor is either sequence, choice, or all
– elements is a list of nested child elements
– attributes is a list of attribute definitions

XP Specifying Mixed Content An element is said to have mixed content when it contains both a text string and child elements. When the mixed attribute of the complexType element is set to true, XML Schema assumes that the element contains both text and child elements. The structure of the child elements can be defined with the conventional method. Example content: student Cynthia Berstein is enrolled in an IT degree program and has completed 12 credits since 01/01/2012.
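
As a hedged sketch, a mixed-content definition might look like the following. The element names summary, name, degree, credits, and date are assumptions; the transcript preserves only the text of the example, not its markup.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="summary">
    <!-- mixed="true" allows text to appear between the child elements -->
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element name="name" type="xs:string" />
        <xs:element name="degree" type="xs:string" />
        <xs:element name="credits" type="xs:integer" />
        <!-- xs:string because 01/01/2012 is not in xs:date format -->
        <xs:element name="date" type="xs:string" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With this definition, the instance <summary>student <name>Cynthia Berstein</name> is enrolled in an <degree>IT</degree> degree program and has completed <credits>12</credits> credits since <date>01/01/2012</date>.</summary> would be valid.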

XP Indicating Required Attributes To indicate whether an attribute is required, the use attribute can be added to the statement that assigns the attribute to an element: <xs:complexType> element content <xs:attribute name="name" type="type" use="use" /> </xs:complexType>

XP Indicating Required Attributes (continued) use is one of the following three values: – required: the attribute must always appear with the element – optional: the use of the attribute is optional with the element – prohibited: the attribute cannot be used with the element
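
A short sketch of the use attribute in context follows. The gpa element and degree attribute echo the earlier slides; the scale attribute and the xs:decimal types are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="gpa">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:decimal">
          <!-- required: every gpa element must carry a degree attribute -->
          <xs:attribute name="degree" type="xs:string" use="required" />
          <!-- optional (the default when use is omitted) -->
          <xs:attribute name="scale" type="xs:decimal" use="optional" />
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```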

XP Specifying the Number of Child Elements To specify the number of times an element appears in the instance document, you can apply the minOccurs and maxOccurs attributes to the element definition: – The value of the minOccurs attribute defines the minimum number of times the element can occur – The value of the maxOccurs attribute defines the maximum number of times the element can occur 156© 2015 Out of Bounds Technology
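
The occurrence attributes can be sketched as below. The students, student, and advisor element names are hypothetical. Both attributes default to 1 when omitted, and maxOccurs accepts the value unbounded to lift the upper limit.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="students">
    <xs:complexType>
      <xs:sequence>
        <!-- at least one student, with no upper limit -->
        <xs:element name="student" type="xs:string"
                    minOccurs="1" maxOccurs="unbounded" />
        <!-- zero, one, or two advisor elements -->
        <xs:element name="advisor" type="xs:string"
                    minOccurs="0" maxOccurs="2" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```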

XP Applying a Schema to an Instance Document To attach a schema to an instance document, you: – Declare the XML Schema instance namespace in the instance document – Specify the location of the schema file. To declare the XML Schema instance namespace, you add the following attribute to the root element of the instance document: xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

XP Applying a Schema to an Instance Document (continued) You add a second attribute to the root element to specify the location of the schema file. The attribute you use depends on whether the instance document is associated with a namespace. If the document is not associated with a namespace, you add the attribute: xsi:noNamespaceSchemaLocation="schema" to the root element, where schema is the location and name of the schema file.
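
Put together, the two attributes might appear on a root element as in this sketch. The students vocabulary and the students.xsd file name are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The root element declares the instance namespace and points at the schema -->
<students xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="students.xsd">
  <student>Cynthia Berstein</student>
</students>
```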

XP Validating with Built-In Data Types XML Schema divides its built-in data types into two classes—primitive and derived – A primitive data type, also called a base type, is one of 19 fundamental data types that are not defined in terms of other types – A derived data type is one of 25 data types that are developed from one of the base types 162© 2015 Out of Bounds Technology

XP Deriving Customized Data Types, Part 1 The code to derive a new data type is: <xs:simpleType name="name"> rules </xs:simpleType> – name is the name of the user-defined data type – rules is the list of statements that define the properties of that data type. This structure is also known as a named simple type. You can also create a simple type without a name, which is known as an anonymous simple type.

XP Deriving Customized Data Types, Part 2 The following three components are involved in deriving any new data type: – Value space—the set of values that correspond to the data type – Lexical space—the set of textual representations of the value space – Facets—the properties that distinguish one data type from another 168© 2015 Out of Bounds Technology

XP Deriving Customized Data Types, Part 3 New data types are created by manipulating the properties of value space, lexical space, and facets. This can be done by: 1. Creating a list based on preexisting data types 2. Creating a union of one or more of the preexisting data types 3. Restricting the values of a preexisting data type

XP Deriving a List Data Type A list data type is a list of values separated by white space, in which each item in the list is derived from an established data type. The syntax for deriving a customized list data type is: <xs:simpleType name="name"> <xs:list itemType="type" /> </xs:simpleType> – name is the name assigned to the list data type – type is the data type from which each item in the list is derived
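
A minimal sketch of a list type follows; the scoreList type and scores element names are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Each white-space-separated item must be a valid xs:decimal -->
  <xs:simpleType name="scoreList">
    <xs:list itemType="xs:decimal" />
  </xs:simpleType>

  <xs:element name="scores" type="scoreList" />
</xs:schema>
```

An instance such as <scores>3.2 3.5 4.0</scores> would validate against this type, while <scores>3.2 high</scores> would not.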

XP Deriving a Union Data Type A union data type is based on the value and/or lexical spaces from two or more preexisting data types. Each base data type is known as a member data type; the syntax is: <xs:simpleType name="name"> <xs:union memberTypes="type1 type2 type3 ..." /> </xs:simpleType> where type1, type2, type3, etc., are the member types that constitute the union.

XP Deriving a Union Data Type (continued) XML Schema also allows unions to be created from nested simple types; the syntax is: <xs:simpleType name="name"> <xs:union> <xs:simpleType> rules1 </xs:simpleType> <xs:simpleType> rules2 </xs:simpleType> ... </xs:union> </xs:simpleType> where rules1, rules2, etc., are rules for creating different user-derived data types.
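
Both forms of union can be sketched as follows. The type names dateOrYear and creditType, and the audit keyword, are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Form 1: union of existing types listed in memberTypes -->
  <xs:simpleType name="dateOrYear">
    <xs:union memberTypes="xs:date xs:gYear" />
  </xs:simpleType>

  <!-- Form 2: union of nested anonymous simple types -->
  <xs:simpleType name="creditType">
    <xs:union>
      <!-- either a non-negative integer ... -->
      <xs:simpleType>
        <xs:restriction base="xs:integer">
          <xs:minInclusive value="0" />
        </xs:restriction>
      </xs:simpleType>
      <!-- ... or the literal keyword audit -->
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="audit" />
        </xs:restriction>
      </xs:simpleType>
    </xs:union>
  </xs:simpleType>
</xs:schema>
```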

XP Constraining Facets Constraining facets are applied to a base type using the structure: <xs:simpleType name="name"> <xs:restriction base="type"> <xs:facet1 value="value1" /> <xs:facet2 value="value2" /> ... </xs:restriction> </xs:simpleType> – type is the data type on which the restricted data type is based – facet1, facet2, etc., are constraining facets – value1, value2, etc., are values for the constraining facets
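
As an illustration, the sketch below restricts xs:decimal to a range with the minInclusive and maxInclusive facets. The gpaType name and the 0 to 4 range are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- A decimal value limited to the range 0 through 4, inclusive -->
  <xs:simpleType name="gpaType">
    <xs:restriction base="xs:decimal">
      <xs:minInclusive value="0" />
      <xs:maxInclusive value="4" />
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="gpa" type="gpaType" />
</xs:schema>
```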

XP Deriving Data Types Using Regular Expressions A regular expression is a text string that defines a character pattern Regular expressions can be created to define patterns for many types of data, including phone numbers, postal address codes, and e-mail addresses 175© 2015 Out of Bounds Technology

XP Deriving Data Types Using Regular Expressions (continued) To apply a regular expression in a data type, you create the simple type: <xs:simpleType name="name"> <xs:restriction base="type"> <xs:pattern value="regex" /> </xs:restriction> </xs:simpleType> where regex is a regular expression pattern.
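
A hedged example of a pattern facet follows. The idType name and the ID format (two uppercase letters followed by four digits) are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Matches values such as SI1234: two uppercase letters, then four digits -->
  <xs:simpleType name="idType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[A-Z]{2}\d{4}" />
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="stuID" type="idType" />
</xs:schema>
```

In XML Schema the pattern must match the entire value, so no start or end anchors are needed.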

XP Regular Expression Character Types Character types are representations of different kinds of characters The general form of a character type is: \char Figure 3-33 Regular expression character types 177© 2015 Out of Bounds Technology

XP Common Regular Expression Character Sets Characters can also be grouped into lists called character sets that specify exactly what characters or ranges of characters are allowed in the pattern; the syntax of a character set is: [chars] Figure 3-34 Common regular expression character sets 178© 2015 Out of Bounds Technology

XP Regular Expression Quantifiers To specify the number of occurrences for a particular character or group of characters, a quantifier can be appended to a character type or set Figure 3-35 Regular expression quantifiers 179© 2015 Out of Bounds Technology

XML Module 4 Working with Advanced Schemas

XP Objectives Session 4.1 – Explore the Flat Catalog schema design – Explore the Russian Doll schema design – Explore the Venetian Blind schema design Session 4.2 – Attach a schema to a namespace – Apply a namespace to an instance document – Import one schema file into another – Reference objects from other schemas 182© 2015 Out of Bounds Technology

XP Objectives (continued) Session 4.3 – Declare a default namespace in a style sheet – Specify qualified elements by default in a schema – Integrate a schema and a style sheet with an instance document 183© 2015 Out of Bounds Technology

XP Designing a Schema, Part 1 The building blocks of any schema are the XML elements that define the structure; these are known collectively as objects The way you design the layout of your schema file can impact how that schema is interpreted and applied to the instance document One important issue in schema design is determining the scope of the different objects declared within the schema 186© 2015 Out of Bounds Technology

XP Designing a Schema, Part 2 XML Schema recognizes two types of scope— global and local Objects with global scope are direct children of the root schema element and can be referenced throughout the schema document Objects with local scope can be referenced only within the object in which they are defined 187© 2015 Out of Bounds Technology

XP Flat Catalog Design, Part 1 Sometimes referred to as a Salami Slice design All element and attribute definitions have global scope Every element and attribute definition is a direct child of the root schema element and thus has been defined globally The developer can then use references to the set of global objects to build the schema 189© 2015 Out of Bounds Technology

XP Russian Doll Design, Part 1 A Russian Doll design has only one global element with everything else nested inside of it The root element of the instance document becomes the top element declaration in the schema All child elements within the root element are similarly nested in the schema 192© 2015 Out of Bounds Technology

XP Venetian Blind Design, Part 1 A Venetian Blind design is similar to a Flat Catalog except that instead of declaring objects globally, it creates: – Named types – Named element groups – Named attribute groups A Venetian Blind design references the types within a single global element A Venetian Blind design represents a compromise between Flat Catalogs and Russian Dolls 195© 2015 Out of Bounds Technology
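
A Venetian Blind layout can be sketched as below: named types are declared globally, and a single global element pulls them together. All names in this sketch (gpaType, studentType, students, and so on) are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Named simple type, reusable anywhere in the schema -->
  <xs:simpleType name="gpaType">
    <xs:restriction base="xs:decimal">
      <xs:minInclusive value="0" />
      <xs:maxInclusive value="4" />
    </xs:restriction>
  </xs:simpleType>

  <!-- Named complex type; its child elements have local scope -->
  <xs:complexType name="studentType">
    <xs:sequence>
      <xs:element name="lastName" type="xs:string" />
      <xs:element name="gpa" type="gpaType" />
    </xs:sequence>
    <xs:attribute name="stuID" type="xs:string" use="required" />
  </xs:complexType>

  <!-- The single global element references the named types -->
  <xs:element name="students">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="student" type="studentType"
                    maxOccurs="unbounded" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```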

XP Understanding Name Collision (continued) The duplication of element names on the previous slide is an example of name collision, which occurs when the same element name from different XML vocabularies is used within a compound document Name collisions are often unavoidable A benefit of XML vocabularies is the ability to use simple element names to describe data Creating complex element names to avoid name collisions eliminates this benefit 205© 2015 Out of Bounds Technology

XP Working with Namespaces in an Instance Document A namespace is a defined collection of element and attribute names Applying a namespace to an XML document involves two steps: – Declare the namespace – Identify the elements and attributes within the document that belong to that namespace 206© 2015 Out of Bounds Technology

XP Declaring and Applying a Namespace to a Document To declare and apply a namespace to a document, you add the attributes xmlns="uri" xsi:schemaLocation="uri schema" to an element in the document, where uri is the URI (Uniform Resource Identifier) of the namespace and schema is the location and name of the schema file. The number of namespace attributes that can be declared within an element is unlimited.

XP Applying a Namespace to an Element, Part 1 In an instance document containing elements from more than one namespace, after you declare the namespaces, you must indicate which elements in the document belong to each namespace This process involves two steps: – Associate the namespace declaration with a prefix – Add the prefix to the tags for each element in the namespace 209© 2015 Out of Bounds Technology

XP Applying a Namespace to an Element, Part 2 To apply an XML namespace to an element, you qualify the element's name. A qualified name, or qname, is an element name consisting of two parts: – The namespace prefix that identifies the namespace – The local part or local name that identifies the element or attribute within that namespace. The general form is: <prefix:element> ... </prefix:element> where prefix is the namespace prefix and element is the local part.

XP Applying a Namespace to an Element, Part 3 An element name without a namespace prefix is referred to as an unqualified name. Namespaces have a scope associated with them. The scope of a namespace declaration extends from the beginning of the opening tag to the end of the corresponding closing tag. The namespace declared in a parent element is connected with the defined prefix for that element as well as for all of its child elements.
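
The sketch below shows prefixes and their scope in a compound document. The stu prefix appears later in this deck; the crs prefix, both namespace URIs, and the element names are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Both prefixes are declared on the root, so they are in scope
     for the root element and all of its descendants -->
<stu:students xmlns:stu="http://example.com/students"
              xmlns:crs="http://example.com/courses">
  <stu:student>
    <!-- stu:name and crs:name share a local name without colliding -->
    <stu:name>Cynthia Berstein</stu:name>
    <crs:course>
      <crs:name>Introductory XML</crs:name>
    </crs:course>
  </stu:student>
</stu:students>
```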

XP Working with Attributes Like an element name, an attribute can be qualified by adding a namespace prefix. The syntax to qualify an attribute is <element prefix:attribute="value"> ... </element> where prefix is the namespace prefix and attribute is the attribute name. Unlike element names, there is no default namespace for attribute names: default namespaces apply to elements, but not to attributes.

XP Associating a Schema with a Namespace, Part 1 Declare the namespace of the instance document in the schema element and then make that namespace the target of the schema using the targetNamespace attribute. The code to set the schema namespace is: <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:prefix="uri" targetNamespace="uri"> ... </xs:schema> where prefix is the prefix of the namespace and uri is the URI of the namespace.
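
A minimal sketch of a schema bound to a target namespace follows. The http://example.com/students URI and the gpaType and gpa names are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:stu="http://example.com/students"
           targetNamespace="http://example.com/students">
  <!-- Global objects now belong to the target namespace -->
  <xs:simpleType name="gpaType">
    <xs:restriction base="xs:decimal">
      <xs:minInclusive value="0" />
      <xs:maxInclusive value="4" />
    </xs:restriction>
  </xs:simpleType>

  <!-- References to those objects must therefore use the stu prefix -->
  <xs:element name="gpa" type="stu:gpaType" />
</xs:schema>
```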

XP Including Schemas You include a schema file when you want to combine schema files from the same namespace. To include a schema, you add the element: <xs:include schemaLocation="schema" /> as a child of the root schema element, where schema is the name of the schema file to be included.

XP Importing Schemas The other way to combine schemas is through importing, which is used when the schemas come from different namespaces. The syntax of the import element is: <xs:import namespace="uri" schemaLocation="schema" /> where uri is the URI of the namespace for the imported schema and schema is the name of the schema file.

XP Referencing Objects from Other Schemas After a schema is imported into another schema file, any objects it contains with global scope can be referenced in that file To reference an object from an imported schema, you must declare the namespace of the imported schema in the schema element You can then reference the object using the ref attribute or the type attribute for customized simple and complex types 219© 2015 Out of Bounds Technology
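
The steps above can be sketched as follows. This assumes a hypothetical courses.xsd file whose target namespace is http://example.com/courses and which declares a global course element and a global levelType type; none of these names come from the slides.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:stu="http://example.com/students"
           xmlns:crs="http://example.com/courses"
           targetNamespace="http://example.com/students">
  <!-- Import the schema that belongs to the other namespace -->
  <xs:import namespace="http://example.com/courses"
             schemaLocation="courses.xsd" />

  <xs:element name="student">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string" />
        <!-- ref points at a global element of the imported schema -->
        <xs:element ref="crs:course" />
        <!-- type points at a global type of the imported schema -->
        <xs:element name="level" type="crs:levelType" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```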

XP Combining Standard Vocabularies The standard vocabularies that are shared throughout the world, such as XHTML, RSS, and MathML, can also be combined within a single compound document Figure 4-20 Namespace URIs for standard vocabularies 221© 2015 Out of Bounds Technology

XP Declaring a Namespace in a Style Sheet To declare a namespace in a style sheet, you add the rule: @namespace prefix "uri"; to the CSS style sheet, where prefix is the namespace prefix and uri is the URI of the namespace. The URI must match the namespace URI used in the XML document; the prefix is local to the style sheet, although using the same prefix as the XML document keeps the two easier to read together.

XP Applying a Namespace to a Selector After you have declared a namespace in a style sheet, you can associate selectors with that namespace by adding the namespace prefix to each selector name separated with the | symbol, as follows: prefix|selector {attribute1: value1; attribute2: value2;...} Example: Style declaration: stu|lastname {width: 150px} applies a width value of 150 px to all lastname elements that belong to the stu namespace 225© 2015 Out of Bounds Technology

XP Qualifying Elements and Attributes, Part 1 You can force all elements and attributes to be qualified, regardless of their scope, by adding the elementFormDefault and attributeFormDefault attributes: <xs:schema elementFormDefault="qualify" attributeFormDefault="qualify" ...> to the root schema element in the schema file, where qualify is either qualified or unqualified.

XP Qualifying Elements and Attributes, Part 2 The default value of both of these attributes is unqualified, except for globally defined elements and attributes, which must always be qualified. To require all elements to be qualified but not all attributes, you enter the following code into the schema element: <xs:schema elementFormDefault="qualified" attributeFormDefault="unqualified" ...>

XP Qualifying Elements and Attributes, Part 3 You can set the qualification for individual elements or attributes by applying the form attribute: form="qualify" to the definitions in the schema, where qualify is either qualified or unqualified.

XML Module 5 Transforming XML with XSLT and XPath

XP Objectives Session 5.1 – Learn the history and theory of XSLT – Understand XPath and examine a node tree – Create and attach an XSLT style sheet – Create a root template – Generate a result document from an XSLT style sheet 230© 2015 Out of Bounds Technology

XP Objectives (continued) Session 5.2 – Create and apply templates to different nodes – Extract and display the value of an element – Extract and display the value of an attribute – Explore XSLT’s built-in templates Session 5.3 – Set the value of an attribute in a result document – Create conditional output using the if and choose elements – Create an XPath expression using predicates – Use XSLT to generate elements and attributes 231© 2015 Out of Bounds Technology

XP Introducing XSL and XSLT, Part 1 Extensible Stylesheet Language or XSL is one way of presenting data in an easily readable format XSL is used to transform the contents of a source XML document containing data into a result document written in a new format 234© 2015 Out of Bounds Technology

XP Introducing XSL and XSLT, Part 2 XSL is organized into two languages: – XSL-FO (Extensible Stylesheet Language – Formatting Objects) is used for the layout of paginated documents – XSLT (Extensible Stylesheet Language Transformations) is used to transform the contents of an XML document into another document format 235© 2015 Out of Bounds Technology

XP Introducing XSL and XSLT, Part 3 Once a style sheet is written, an XSLT processor is used to transform the contents of the source document into a new format which appears as the result document In a server-side transformation, a server receives a request from a client to generate the result document In a client-side transformation, a client requests retrieval of both a source document and a style sheet from the server 236© 2015 Out of Bounds Technology

XP Introducing XSL and XSLT, Part 5 An XSLT style sheet is attached to an XML document by adding the following processing instruction near the top of the XML document prior to the root element: <?xml-stylesheet type="text/xsl" href="url" ?> where url is the URL pointing to the location of the XSLT style sheet file.

XP Introducing XSL and XSLT, Part 6 Because XSLT style sheets are XML documents, XSLT documents start with an xml declaration and a root element named stylesheet The stylesheet element needs to be placed in the http://www.w3.org/1999/XSL/Transform namespace 239© 2015 Out of Bounds Technology

XP Introducing XSL and XSLT, Part 7 Every XSLT stylesheet has the following basic structure: <?xml version="1.0" encoding="UTF-8" ?> <xsl:stylesheet version="value" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> style sheet contents </xsl:stylesheet> – value is the XSLT version – style sheet contents are the elements and attributes specific to the style sheet
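
A minimal skeleton of an XSLT 1.0 style sheet is sketched below; the html output method is an assumption about the intended result format.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumed here: the result document is written as HTML -->
  <xsl:output method="html" />

  <!-- Templates and other style sheet contents go here -->
</xsl:stylesheet>
```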

XP Introducing XPath, Part 1 The XPath language is used to access and navigate the contents of an XML data tree XPath operates by expressing the contents of the source document in terms of nodes A node is any item within the tree structure of the document A collection of nodes is called a node set 242© 2015 Out of Bounds Technology

XP Introducing XPath, Part 2 An element node refers to an element from the source document An attribute node refers to an element’s attribute The various nodes from the source document are organized into a node tree, with the root node or document node at the top of the tree 243© 2015 Out of Bounds Technology

XP Introducing XPath, Part 4 A node that contains other nodes is called a parent node, and the nodes contained in a parent node are called child nodes Nodes that share a common parent are called sibling nodes Any node found at a level below another node is referred to as a descendant of that node The node at the top of the branch is referred to as the ancestor of all nodes that lie beneath it 245© 2015 Out of Bounds Technology

XP Introducing XPath, Part 5 One of the functions of XPath is to translate an XML hierarchical structure into an expression called a location path that references a specific node or node set from the source document The location path can be written in either absolute or relative terms 246© 2015 Out of Bounds Technology

XP Introducing XPath, Part 6 An absolute path is a path that always starts from the root node and descends down through the node tree to a particular node or node set An absolute path has the general form: /child1/child2/child3/... where child1, child2, child3, and so forth are the descendants of the root node 247© 2015 Out of Bounds Technology

XP Introducing XPath, Part 7 Most locations are written using relative paths in which the location path starts from a particular node (not necessarily the root node) called the context node Figure 5-8 XPath expressions for relative paths 248© 2015 Out of Bounds Technology

XP Introducing XSLT Templates, Part 1 The basic building block of an XSLT style sheet is the template A template is a collection of styles that are applied to a specific node set within the source document 250© 2015 Out of Bounds Technology

XP Introducing XSLT Templates, Part 2 The general syntax of an XSLT template is <xsl:template match="node set"> styles </xsl:template> – node set is an XPath expression that references a node set from the source document – styles are the XSLT styles applied to those nodes

XP Introducing XSLT Templates, Part 3 The fundamental template in the XSLT style sheet is the root template, which defines styles for the source document's root node. The syntax for the root template is <xsl:template match="/"> styles </xsl:template>

XP Introducing XSLT Templates, Part 4 Content is written to the result document through the use of XSLT elements and literal result elements An XSLT element is any element that is part of the XSLT vocabulary A literal result element is any element that is not part of the XSLT vocabulary but is sent directly into the result document as raw text 253© 2015 Out of Bounds Technology
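
The sketch below combines a root template with literal result elements. The html, head, title, body, and h1 tags are literal result elements copied to the result document; the page title text is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" />

  <!-- Root template: matches the root node of the source document -->
  <xsl:template match="/">
    <!-- Everything outside the xsl namespace is a literal result element -->
    <html>
      <head>
        <title>Stock Portfolio</title>
      </head>
      <body>
        <h1>Stock Portfolio</h1>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```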

XP Transforming a Document, Part 1 The simplest way to view a web page generated by an XSLT 1.0 style sheet is to open the source document in your web browser. Another way to view the result document is to generate the document as a separate file using an XSLT processor.

XP Transforming a Document, Part 2 The Saxon XSLT processor is a commonly used XSLT processor You can apply a transformation in Saxon Java command line mode by running the following command within a command prompt window: java net.sf.saxon.Transform -s:source -xsl:style -o:output – source is the XML source file – style is the XSLT style sheet file – output is the result file 257© 2015 Out of Bounds Technology

XP Extracting Element Values, Part 1 To display a data value from a node in the source document, XSLT employs the following value-of element: <xsl:value-of select="node" /> where node is a location path that references a node from the source document's node tree.

XP Extracting Element Values, Part 3 If there are multiple nodes that match the location path, you can create a style for each matching node using the following for-each instruction: <xsl:for-each select="node set"> styles </xsl:for-each> – node set is a location path that returns a set of one or more nodes – styles are the XSLT styles applied to each node in the node set
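
The two instructions can be sketched together as follows. The portfolio, date, stock, and sName names come from location paths used elsewhere in this deck; the surrounding HTML is an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" />

  <xsl:template match="/">
    <html>
      <body>
        <!-- value-of writes the text value of a single node -->
        <p>Date: <xsl:value-of select="portfolio/date" /></p>

        <!-- for-each repeats its contents once per stock element -->
        <xsl:for-each select="portfolio/stock">
          <!-- inside the loop, paths are relative to the current stock -->
          <h2><xsl:value-of select="sName" /></h2>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```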

XP Working with Templates, Part 1 Templates can be defined for any node set specified by an XPath expression. A template that displays the value of the sName element could be entered as: <xsl:template match="sName"> <xsl:value-of select="." /> </xsl:template>

XP Working with Templates, Part 2 To apply a template, use the following apply-templates instruction: <xsl:apply-templates select="node set" /> where node set is a location path that references a node set in the source document. The XSLT processor then searches the XSLT style sheet for a template matching that node set.
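
A sketch of templates handing off to one another is shown below. It assumes the portfolio/stock/sName structure referenced in this deck; the HTML wrapper is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" />

  <!-- Root template hands each stock element to a matching template -->
  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="portfolio/stock" />
      </body>
    </html>
  </xsl:template>

  <!-- Applied once for every stock element selected above -->
  <xsl:template match="stock">
    <h2><xsl:apply-templates select="sName" /></h2>
  </xsl:template>

  <!-- Displays the value of the sName element -->
  <xsl:template match="sName">
    <xsl:value-of select="." />
  </xsl:template>
</xsl:stylesheet>
```

Splitting the work this way keeps each template small; the root template does not need to know how a stock or its name is rendered.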

XP Displaying Attribute Values Attributes can be included in a location path using the XPath expression: node/@attribute – node is an element node – attribute is the name of an attribute for that node. For example, the sName element has a single attribute named symbol. The absolute reference to this attribute is: /portfolio/stock/sName/@symbol

XP Combining Node Sets, Part 1 Multiple node sets can be combined into a single location path using the union ( | ) operator. For example, the expression /portfolio/date | /portfolio/time defines a location path that matches both the date and time elements nested within the portfolio element. Similarly, the expression @open|@high|@low|@current|@vol matches the open, high, low, current, and vol attributes.

XP Combining Node Sets, Part 3 XSLT supports several built-in templates that specify how the values of different nodes are displayed by default. For example, the following built-in template defines how the values of all text nodes and all attribute nodes from the source document are displayed: <xsl:template match="text()|@*"> <xsl:value-of select="." /> </xsl:template>

XP Inserting a Value into an Attribute You can use XSLT to write values in the attributes of elements by enclosing an XPath expression within a set of curly braces using the general form: <element attribute="{expression}"> – element is the name of the element written to the result document – attribute is the element's attribute – expression is an XPath expression that sets the attribute's value
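
As a hedged illustration, the template below writes attribute values with curly braces. The img element and the convention that each stock has an image file named after its symbol are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" />

  <xsl:template match="stock">
    <!-- Each {…} is replaced by the value of the XPath expression inside it -->
    <img src="{sName/@symbol}.png" alt="{sName}" />
  </xsl:template>
</xsl:stylesheet>
```

For a stock whose sName element has the symbol BA, the src attribute in the result would be BA.png.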

XP Sorting Node Sets, Part 1 Nodes are displayed in the result document in the same order in which they appear in the source document's node tree. To sort the nodes in a different order, you can apply the following sort instruction in the style sheet: <xsl:sort select="node set" data-type="text|number|qname" order="ascending|descending" case-order="upper-first|lower-first" lang="language" />

XP Sorting Node Sets, Part 2 In the previous example: – node set is an XPath expression that returns a set of nodes – the data-type attribute specifies the type of data to be sorted (text, number, or qname for qualified XML names) – the order attribute defines whether to sort in ascending or descending order – the case-order attribute specifies whether uppercase or lowercase characters are to be sorted first – and the lang attribute defines the language used to determine sort order 278© 2015 Out of Bounds Technology

XP Sorting Node Sets, Part 3 The sort instruction is always used within an <xsl:for-each> or <xsl:apply-templates> tag. When sorting is applied to a template, the <xsl:apply-templates> tag is entered as a two-sided tag. If you do not include a select attribute with the sort instruction, the XSLT processor will assume that you want to sort based on the value of the context node.
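
Both placements of the sort instruction can be sketched as below, again assuming the portfolio/stock/sName structure; the HTML list markup is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" />

  <xsl:template match="/">
    <html>
      <body>
        <!-- Two-sided apply-templates tag so that it can hold the sort -->
        <xsl:apply-templates select="portfolio/stock">
          <xsl:sort select="sName" order="ascending" />
        </xsl:apply-templates>

        <ul>
          <!-- Inside for-each, the sort must come before any other content -->
          <xsl:for-each select="portfolio/stock">
            <xsl:sort select="sName/@symbol" order="descending" />
            <li><xsl:value-of select="sName/@symbol" /></li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="stock">
    <h2><xsl:value-of select="sName" /></h2>
  </xsl:template>
</xsl:stylesheet>
```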

XP Conditional Processing, Part 1 Conditional processing is a programming technique that applies different styles based on the values from the source document. One way of accomplishing this is with the following if instruction: <xsl:if test="expression"> styles </xsl:if> – expression is an XPath expression that is either true or false – styles are XSLT styles that are applied if the expression is true

XP Conditional Processing, Part 2 XPath supports several different comparison operators to compare one value to another. The most commonly used comparison operator is the equals symbol (=), which is used to test whether two values are equal. For example, the following if statement tests whether the symbol attribute of the sName element is equal to "BA": <xsl:if test="sName/@symbol='BA'">
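
The test above can be placed in a template as in this sketch. The text written when the test succeeds is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" />

  <xsl:template match="stock">
    <h2>
      <xsl:value-of select="sName" />
      <!-- Written only for the stock whose symbol is BA -->
      <xsl:if test="sName/@symbol='BA'">
        <xsl:text> (selected)</xsl:text>
      </xsl:if>
    </h2>
  </xsl:template>
</xsl:stylesheet>
```

Note that the inner string uses single quotes because the test attribute itself is delimited by double quotes.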

XP Conditional Processing, Part 4 If you want to test for multiple conditions and display different outcomes, you need to apply the following choose structure: <xsl:choose> <xsl:when test="expression1"> styles </xsl:when> <xsl:when test="expression2"> styles </xsl:when> ... <xsl:otherwise> styles </xsl:otherwise> </xsl:choose> where expression1, expression2, and so forth are expressions that are either true or false.
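
A hedged sketch of the choose structure follows. It assumes the current and open attributes mentioned earlier sit on a hypothetical today child of each stock element; the up, down, and unchanged labels are also assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" />

  <xsl:template match="stock">
    <p>
      <xsl:value-of select="sName" />
      <xsl:text>: </xsl:text>
      <xsl:choose>
        <!-- Comparison operators are written as entities inside attributes -->
        <xsl:when test="today/@current &gt; today/@open">up</xsl:when>
        <xsl:when test="today/@current &lt; today/@open">down</xsl:when>
        <!-- Used when none of the when tests is true -->
        <xsl:otherwise>unchanged</xsl:otherwise>
      </xsl:choose>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

Only the first when element whose test is true is applied; later ones are skipped.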

XP Filtering XML with Predicates, Part 1 A predicate is part of a location path that restricts the node set to only those nodes that fulfill a specified condition The general syntax for a predicate is: node-set[condition] – node-set is an XPath expression that references a particular node set – condition is an expression for a condition that any nodes in the node set must fulfill 286© 2015 Out of Bounds Technology

XP Filtering XML with Predicates, Part 2 A predicate can also indicate the position of a node in the node tree The general syntax is: node-set[position] – position is an integer indicating the position of the node For example, the expression: stock[3] selects the third stock element from the source document 287© 2015 Out of Bounds Technology

XP Filtering XML with Predicates, Part 3 A predicate can also contain an XPath function. The last() function returns the position of the last node in the node set being processed, so stock[last()] selects the last stock element. The position() function returns the position value of the node.

XP Constructing Elements and Attributes with XSLT, Part 1 A result tree is composed of the element, attribute, text, and other nodes. To construct an element node in the result tree, XSLT uses the following tag: <xsl:element name="name" namespace="uri"> styles </xsl:element> – name attribute assigns a name to the element – namespace attribute provides a namespace

XP Constructing Elements and Attributes with XSLT, Part 2 Attributes are constructed in XSLT using the following tag: <xsl:attribute name="name" namespace="uri"> styles </xsl:attribute> – name attribute specifies the name of the attribute – namespace attribute indicates the namespace
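
The two construction tags can be sketched together as follows. The div element and id attribute written to the result are hypothetical choices.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" />

  <xsl:template match="stock">
    <!-- Constructs a div element in the result tree -->
    <xsl:element name="div">
      <!-- Attributes must be added before any child content -->
      <xsl:attribute name="id">
        <xsl:value-of select="sName/@symbol" />
      </xsl:attribute>
      <xsl:value-of select="sName" />
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

For a stock whose symbol is BA, this writes a div element with id="BA" that contains the stock name.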

XP Constructing Elements and Attributes with XSLT, Part 3 Rather than nesting the entire collection of attributes, those attributes can be grouped within an attribute set, which allows you to add several attributes to the same element without having a long nested statement 292© 2015 Out of Bounds Technology

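XP Constructing Elements and Attributes with XSLT, Part 4 To create an attribute set you apply the following attribute-set element: <xsl:attribute-set name="text" use-attribute-sets="name-list"> <xsl:attribute name="text"> styles </xsl:attribute> ... </xsl:attribute-set> where the name attribute of the attribute-set element contains the name of the set, the name attribute of each nested attribute element contains the name of an individual attribute created within that set, and use-attribute-sets optionally lists other attribute sets to include.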
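
A minimal sketch of an attribute set and its use follows. The cellStyle set name, the class and align attributes, and the td element are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" />

  <!-- Attribute sets are declared at the top level of the style sheet -->
  <xsl:attribute-set name="cellStyle">
    <xsl:attribute name="class">stockCell</xsl:attribute>
    <xsl:attribute name="align">right</xsl:attribute>
  </xsl:attribute-set>

  <xsl:template match="stock">
    <!-- Both attributes in the set are added to the constructed td element -->
    <xsl:element name="td" use-attribute-sets="cellStyle">
      <xsl:value-of select="sName" />
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```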

XP Constructing Elements and Attributes with XSLT, Part 5 XSLT also includes elements to write comments and processing instructions to the result tree. To construct a comment node, use the element <xsl:comment> comment text </xsl:comment> where comment text is the text that should be placed within a comment tag.

XP Constructing Elements and Attributes with XSLT, Part 6 To create a processing instruction node, use the element <xsl:processing-instruction name="name"> attributes </xsl:processing-instruction> – name attribute provides the name of the processing instruction – attributes are attributes contained within the processing instruction
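
Comment and processing instruction nodes can be sketched together as below. The report element, the comment text, and the styles.css file name are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" />

  <xsl:template match="/">
    <!-- Writes <?xml-stylesheet type="text/css" href="styles.css"?> -->
    <xsl:processing-instruction name="xml-stylesheet">type="text/css" href="styles.css"</xsl:processing-instruction>

    <!-- Writes an XML comment into the result document -->
    <xsl:comment>Generated from the portfolio source document</xsl:comment>

    <report>
      <xsl:value-of select="portfolio/date" />
    </report>
  </xsl:template>
</xsl:stylesheet>
```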
