Published by Arron Lamb. Modified over 9 years ago.
1
Structuring XML Using DTD and Schema
Ching-Long Yeh, PhD (葉慶隆)
Department of Computer Science and Engineering, Tatung University, Taipei 104, Taiwan
Email: chingyeh@cse.ttu.edu.tw
URL: http://www.cse.ttu.edu.tw/~chingyeh
2
Content
–XML Document Basics
–DTD Syntax Review
–An Introduction to XML Schema
–Types of Interaction with Document
–DTD in Electronic Business
–Conclusion
3
XML Document Basics
4
Structure, Content, and Format
Central to XML is the concept that documents have structure, content, and format. These three ingredients combine to form a document. They interrelate in subtle ways, and they are easy to confuse as you work with your documents.
5
What is Structure?
The structure defines how the document is laid out and in what order its elements are assembled.
For example, a bicycle assembly manual might consist of the following sections, in this order:
–an introduction that describes the document and lists the manufacturer’s address,
–assembly instructions,
–a parts list,
–instructions for ordering replacement parts,
–troubleshooting advice, and
–an index.
6
What is Content?
Content is the actual data within a document. The words and illustrations that make up a bicycle assembly manual are its content.
7
What is Format?
Format is how the words, sentences, and paragraphs are visually presented and distinguished from one another within a document. Boldface for titles, italics for special terms, and blank lines between sections are examples of document formats. People often confuse format with structure.
8
Why Are Structure, Content, and Format Important in XML?
XML defines the structure and separates the content from the delivery-specific format. Through this approach, the actual document (its content and structure) becomes mobile.
9
Indicating Structure Through Visual Cues
PRODUCT ADVISORY
Number: 146 Type: Parts Date: 8/15/95
Subject: Revised Replacement Parts...
Model 501 User Replaceable Parts
The parts list identified in the AnyCorp Model 501...
New Parts List
1. 345-234 (Filter, cooling fan)
2. 148-745 (Fuse, power: 1.5amp)
3...

Product Advisory
Number: 146
Type: Parts
Date: 8/15/95
Revised:
Subject: Revised Replacement...
Model 501 User-Replaceable Parts
The parts list identified in the...
New Parts List
1. 345-234 (Filter, cooling fan)
2. 148-745 (Fuse, power: 1.5amp)
3....
10
Defining Structures in XML
The structure of a document (its type) is defined by a document type definition, or DTD. The DTD lays out the rules for a document through the use of elements, attributes, and entities.
11
Defining Structures in XML
<!DOCTYPE advisory [
<!ATTLIST graphic filename CDATA #REQUIRED
                  artno    CDATA #IMPLIED>
]>
12
Using Structures in XML
Number: 146
Type: 146
Date: 8/15/95
Revised: 9/29/95
Model 501 Nebulation
Subject: Revised Replacement Parts List (AnyCorp Model 501)
Model 501 User-Replaceable Parts
The parts list identified in the AnyCorp Model 501 User's Maintenance Guide has been superseded, effective immediately. User-Replaceable parts are identified in the revised parts list below. Parts orders which reference items on the previous list (dated 2/5/94) will be honored up to 3/14/96. Customers are advised to order from this revised list in order that they may achieve higher reliability at a lower unit cost. Questions on this subject should be directed to the Central Spares Organization.
New Parts List
345-234 (Filter, cooling fan)
148-745 (Fuse, power, 1.5amp)
345-712 (Lamp, Indicator)
2346-92 (Disk, cleaning)
347-622 (Swabs, cleaning)
13
Well-Formed and Valid Documents
XML has two different notions of “correct.”
Valid documents
–Declare conformance to a DTD in a document type declaration
–“Using the right words in the right place”
–Type-valid
Well-formed documents
–Markup is intelligible.
–“Getting the pronunciation right”
–Non-type-valid
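The well-formed half of this distinction can be checked with any XML parser. The sketch below is an illustration only: the two fragments are hypothetical, and Python's standard-library parser does not validate against a DTD, so it says nothing about validity.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as XML, i.e. the markup is intelligible."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical fragments, not taken from the slides.
good = "<advisory><number>146</number></advisory>"
bad = "<advisory><number>146</advisory></number>"  # end tags in the wrong order

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False
```

A document that passes this check may still be invalid; validity additionally requires that it conform to the DTD it declares.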
14
Example: Table
15
Example: Table
16
Example: Table
17
Example: Database Publishing
18
Example: A DTD for B2B EC
RosettaNet PIP 3A2 Price and Availability Query, Version 1.2
Available at http://www.rosettanet.org
19
DTD Syntax Review
20
DTD Syntax
Seven major headings:
–document type declarations
–element types
–attributes
–entities
–notations
–conditional sections
–processing instructions
21
Document Type Declaration
The document type declaration is XML's mechanism for defining constraints on the logical structure of a document and for supporting the use of predefined storage units.
The XML document type declaration contains or points to markup declarations that provide a grammar for a class of documents.
22
Document Type Declaration
<!DOCTYPE label[ ]>
Rock N. Robyn
Jay Bird Street
Baltimore MD USA 43214
23
Document Type Declaration
<!DOCTYPE LABEL SYSTEM "http://www.sgmlsource.com/dtds/label.dtd">...
24
Element Type Declaration
Elements provide the basic logical structure for XML documents.
Element type declaration
[45] elementdecl ::= '<!ELEMENT' S Name S contentspec S? '>'
[46] contentspec ::= 'EMPTY' | 'ANY' | Mixed | children
Element-content models
[47] children ::= (choice | seq) ('?' | '*' | '+')?
[48] cp ::= (Name | choice | seq) ('?' | '*' | '+')?
[49] choice ::= '(' S? cp ( S? '|' S? cp )* S? ')'
[50] seq ::= '(' S? cp ( S? ',' S? cp )* S? ')'
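An element-content model reads much like a regular expression over child element names: a sequence (,) is concatenation, a choice (|) is alternation, and ?, * and + carry the same meaning in both notations. The sketch below relies on that correspondence. The content model and element names are hypothetical, and the translation covers only element content, not EMPTY, ANY, or mixed content.

```python
import re


def model_to_regex(content_model):
    """Translate a DTD element-content model into a regex over child names."""
    # Wrap each element name in a group that ends in ';' so that
    # adjacent names cannot run together after the commas are removed.
    pattern = re.sub(
        r"[A-Za-z_][\w.-]*",
        lambda m: "(?:" + re.escape(m.group(0)) + ";)",
        content_model,
    )
    # A sequence is plain concatenation in a regex, so drop commas and spaces.
    return re.compile(re.sub(r"[,\s]", "", pattern))


def children_match(content_model, child_names):
    """Check a list of child element names against a content model."""
    encoded = "".join(name + ";" for name in child_names)
    return model_to_regex(content_model).fullmatch(encoded) is not None


# Hypothetical content model: a title, one or more paragraphs,
# then any number of notes or warnings.
MODEL = "(title, para+, (note | warning)*)"

print(children_match(MODEL, ["title", "para", "para", "note"]))  # True
print(children_match(MODEL, ["title"]))                          # False
print(children_match(MODEL, ["para", "title"]))                  # False
```

A validating parser performs this kind of check for every element in the document; the regex here only mimics it for one flat list of children.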
25
Element Type Declaration
26
Attributes
Attributes provide meta-data for elements, such as a security level, a revision status, or a unique identifier.
Use an attribute list declaration to declare attributes for an element. Each row below gives an attribute name, an attribute type, and a default value:
<!ATTLIST sample
  id     ID            #IMPLIED
  n      CDATA         #REQUIRED
  status (draft|final) "final">
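A declared default value is filled in by the parser when the attribute is absent from the instance. The sketch below shows this with Python's expat binding and the ATTLIST from this slide; the <!ELEMENT sample EMPTY> declaration and the instance <sample n="1"/> are assumptions added to make a complete document.

```python
import xml.parsers.expat

# The ATTLIST is the one from the slide; the ELEMENT declaration and
# the instance element are assumptions added for this illustration.
DOCUMENT = """<!DOCTYPE sample [
<!ELEMENT sample EMPTY>
<!ATTLIST sample
  id     ID            #IMPLIED
  n      CDATA         #REQUIRED
  status (draft|final) "final">
]>
<sample n="1"/>"""

seen = {}


def start_element(name, attrs):
    # attrs holds the specified attributes plus any declared defaults.
    seen[name] = dict(attrs)


parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = start_element
parser.Parse(DOCUMENT, True)

print(seen["sample"])
```

The status attribute appears with the value final even though the instance never mentions it, while the #IMPLIED id attribute is simply absent. expat is not a validating parser, so it would not complain if the #REQUIRED attribute n were missing.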
27
Entities
There are two types of entities:
–general entities: apply within the top-level element and its attribute values.
–parameter entities: apply within the internal and external DTD subsets.
28
Entities: General Entities
The &xml; is derived from ISO 8879, an International Standard
expands to:
The Extensible Markup Language is derived from ISO 8879, an International Standard
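The expansion can be observed with a parser that reads the internal DTD subset. In this sketch the entity declaration is reconstructed from the expanded sentence on the slide, and the para wrapper element is an assumption.

```python
import xml.etree.ElementTree as ET

# The entity declaration is inferred from the slide's expanded text;
# the <para> wrapper element is a hypothetical name.
DOCUMENT = """<!DOCTYPE para [
<!ENTITY xml "Extensible Markup Language">
]>
<para>The &xml; is derived from ISO 8879, an International Standard</para>"""

para = ET.fromstring(DOCUMENT)
print(para.text)
```

The parser replaces the &xml; reference with the declared replacement text before the application sees the content.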
29
Entities: Parameter Entity
<!ELEMENT para (%inline;)*>
30
Notations
Notations are used to include non-XML content (such as graphics, sounds, video, or source-code listings) in XML documents.
While the XML parser knows nothing about the specific notations, it can pass them on to the processing software to let it know what kinds of data to handle.
<!NOTATION TeX PUBLIC "+//ISBN 0-201-13448-9::Knuth//NOTATION The TeXbook//EN">
31
Conditional Sections
In the external DTD subset and external parameter entities, XML allows conditional sections that the parser can include or ignore, depending on the value of the keyword at the start.
<![IGNORE [ ]]>
<![%include-para;[ ]]>
Overriding a parameter entity:
<!DOCTYPE book SYSTEM "book.dtd"[ ]>
32
Processing Instructions
An XML parser will pass processing instructions (PIs) on to your application, but it is up to you to do something useful with them.
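A minimal sketch of that hand-off, using Python's expat binding. The printer target and its page-break data are hypothetical; the parser attaches no meaning to them and only reports them to the handler.

```python
import xml.parsers.expat

received = []


def handle_pi(target, data):
    # The parser only reports the PI; acting on it is the application's job.
    received.append((target, data))


parser = xml.parsers.expat.ParserCreate()
parser.ProcessingInstructionHandler = handle_pi
# Hypothetical document with one processing instruction.
parser.Parse("<doc><?printer page-break?><p>text</p></doc>", True)

print(received)  # [('printer', 'page-break')]
```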
33
Introduction to XML Schema
34
Introduction
The new XML Schema system aims to provide a rich grammatical structure for XML documents that overcomes the limitations of the DTD.
35
What is a Schema?
A schema is a model for describing the structure of information.
In the context of XML, a schema describes a model for a whole class of documents.
A schema might also be viewed as an agreement on a common vocabulary for a particular application that involves exchanging documents.
36
What is a Schema?
In schemas, models are described in terms of constraints. There are two kinds of constraints that you can give:
–content model constraints describe the order and sequence of elements, and
–datatype constraints describe valid units of data.
37
What is a Schema?
For example, a schema might describe a valid <address> with the content model constraint that
–it consists of a <name> element, followed by
–one or more <street> elements, followed by
–exactly one <city>, <state>, and <zip> element.
–The content of a <zip> might have a further datatype constraint that it consist of either a sequence of exactly five digits or a sequence of five digits, followed by a hyphen, followed by a sequence of exactly four digits. No other text is a valid ZIP code.
Namron H. Slaw 256 Eight Bit Lane East Yahoo MA CT blue invalid
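The ZIP code rule is a datatype constraint that a pattern can express directly. The sketch below writes it as a Python regular expression; it illustrates the constraint itself and is not XML Schema syntax.

```python
import re

# Exactly five digits, optionally followed by a hyphen and exactly four digits.
ZIP_CODE = re.compile(r"[0-9]{5}(-[0-9]{4})?")


def is_valid_zip(text):
    """Return True if the whole string satisfies the ZIP code constraint."""
    return ZIP_CODE.fullmatch(text) is not None


print(is_valid_zip("20003"))       # True
print(is_valid_zip("20003-1234"))  # True
print(is_valid_zip("blue"))        # False
```

A content model alone cannot reject the value blue, because a DTD sees only character data inside the element; this is the gap that datatype constraints fill.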
38
Limitations of DTD
XML inherited DTDs from SGML.
DTDs can be used to define content models and, to a limited extent, the datatypes of attributes, but they have a number of obvious limitations:
–different (non-XML) syntax
–no support for namespaces
–extremely limited datatyping
–a complex and fragile extension mechanism based on little more than string substitution (no explicit relationship)
39
Features of Schema
–Richer datatypes: booleans, numbers, dates and times, URIs, integers, decimal numbers, real numbers, intervals of time, etc.
–User-defined types
–Attribute grouping
–Refinable archetypes
–Namespace support
40
Validity
Reasons to validate documents:
–EC: confirm that what you received is exactly what you expect.
–B2B: validate before inserting into your database.
–XML documents used for control purposes
Content model validity tests whether the order and nesting of tags is correct.
Datatype validity is the ability to test whether specific units of information are of the correct type and fall within the specified legal values.
41
Illustrations of XML Schema
An XML document fragment: 123456789 J123456
DTD fragment describing the above elements
XML Schema fragment describing the above elements
42
Using Namespaces in XML Schema
One person may be processing documents from many other parties, and the different parties may want to represent their data elements differently.
Moreover, a single document may need to refer separately to elements with the same name that were created by different parties.
How can you distinguish between such different definitions with the same name?
XML Schema uses the concept of namespaces to distinguish the definitions.
43
Using Namespaces in XML Schema
A given XML Schema defines a set of new names. The names defined in a schema are said to belong to its target namespace.
Definitions and declarations in a schema can refer to names that may belong to other namespaces. We refer to those namespaces as source namespaces.
Each schema has one target namespace and possibly many source namespaces. In fact, every name in a given schema belongs to some namespace.
The names of the namespaces can be fairly long, but they can be abbreviated with xmlns declarations in the XML Schema document.
44
Using Namespaces in XML Schema
Target and source namespaces
<xsd:schema targetNamespace='http://www.SampleStore.com/Account'
            xmlns:xsd='http://www.w3.org/1999/XMLSchema'
            xmlns:ACC='http://www.SampleStore.com/Account'>
45
Using Namespaces in XML Schema
Multiple source namespaces, importing a namespace
<schema targetNamespace='http://www.SampleStore.com/Account'
        xmlns='http://www.w3.org/1999/XMLSchema'
        xmlns:ACC='http://www.SampleStore.com/Account'
        xmlns:PART='http://www.PartnerStore.com/PartsCatalog'>
<import namespace='http://www.PartnerStore.com/PartsCatalog'
        schemaLocation='http://www.ProductStandards.org/repository/alpha.xsd'/>
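The effect of the xmlns abbreviations can be seen by parsing a small instance that uses both prefixes. The namespace URIs are the ones from these slides; the account and name elements are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs come from the slides; the element names are hypothetical.
DOCUMENT = """<ACC:account xmlns:ACC='http://www.SampleStore.com/Account'
             xmlns:PART='http://www.PartnerStore.com/PartsCatalog'>
  <ACC:name>example</ACC:name>
  <PART:name>example</PART:name>
</ACC:account>"""

root = ET.fromstring(DOCUMENT)
# ElementTree reports each tag as {namespace-URI}local-name.
tags = [child.tag for child in root]
for tag in tags:
    print(tag)
```

Both children are called name, yet the parser reports them under different namespace URIs, which is how two parties' definitions of the same name are kept apart. The prefix is only a local abbreviation; the URI is what identifies the namespace.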
46
Defining Elements
To define an element is to define its name and content model.
In XML Schema, the content model of an element is defined by its type.
The instance elements in an XML document can then have only values that fit the types defined in its schema.
47
Defining Elements
A type can be simple or complex.
A simple type cannot contain elements or attributes in its value.
A complex type can create the effect of embedding elements in other elements, or it can associate attributes with an element.
The XML Schema spec also includes predefined simple types.
A derived simple type constrains the values of its base type.
48
Defining Elements
Simple, non-nested elements have a simple type
–An element that does not contain attributes or other elements can be defined to be of a simple type, predefined or user-defined, such as string, integer, decimal, time, ProductCode, etc.
49
Defining Elements
Elements with attributes must have a complex type
–If you want to add an attribute, you must define price as a complex type.
–We have defined what is called an anonymous type, where no explicit name is given to the complex type. In other words, the name attribute of the complexType element is not defined.
50
Defining Elements
Elements that embed other elements must have a complex type
XML Document: Cool XML, Cool Guy
XML DTD
XML Schema
51
Defining Elements
A complex type defined with global simple types
52
Defining Elements
Hiding BookType as a local type
53
Defining Elements
Expressing sophisticated constraints on elements
–XML Schema offers greater flexibility than DTD for expressing constraints on the content model of elements.
–At the simplest level, as in DTD, you can associate attributes with an element declaration and indicate that exactly one (no indicator), zero or one (?), zero or more (*), or one or more (+) elements from a given set of elements can occur in it.
–You can express additional constraints in XML Schema using, for example, the minOccurs and maxOccurs attributes of the element element, and the choice, group, and all elements.
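The DTD occurrence indicators correspond to particular minOccurs and maxOccurs pairs, which the sketch below spells out. The function name is hypothetical; unbounded is the XML Schema value for "no upper limit".

```python
# DTD occurrence indicator -> (minOccurs, maxOccurs) in XML Schema.
OCCURRENCE = {
    "": (1, 1),             # no indicator: exactly one
    "?": (0, 1),            # zero or one
    "*": (0, "unbounded"),  # zero or more
    "+": (1, "unbounded"),  # one or more
}


def occurs_attributes(indicator):
    """Return the minOccurs/maxOccurs attribute values for a DTD indicator."""
    min_occurs, max_occurs = OCCURRENCE[indicator]
    return {"minOccurs": str(min_occurs), "maxOccurs": str(max_occurs)}


print(occurs_attributes("+"))  # {'minOccurs': '1', 'maxOccurs': 'unbounded'}
```

The reverse mapping does not always exist: a range such as minOccurs 2 and maxOccurs 5 has no single DTD indicator, which is one sense in which the schema constraints are more flexible.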
54
The Purchase Order Schema (1)
Purchase order schema for Example.com. Copyright 2000 Example.com. All rights reserved.
55
The Purchase Order Schema (2)
56
Simple Types
57
Simple Types
20003 15037 95977 95945
58
Simple Types
PA NY CA NY LA AK
sixStates is declared to be an element of type SixUSStates.
59
Types of Interaction with Document
60
Types of Interaction with Documents
Most documents stored in XML form are created for the purpose of conveying information or keeping track of information.
Types of interactions people have with documents:
–creation and modification
–management, storage, and archiving
–utilization.
61
Types of Interaction with Document
Diagram with three groups:
–Document creation and modification
–Document management and storage
–Document utilization
Diagram labels: Creation, Import, Update, Review/validation, Conversion/transformation, Document classification, Document assembly, Document archival, Document storage, Printing, Exchange, Searching and viewing, Useful database information, Building alternate documents, Online searching, viewing, exchange, export, Extraction, analysis
62
DTD in Electronic Business
63
RosettaNet: An EB Standard
RosettaNet is a consortium of major information technology (IT), electronic components (EC), and semiconductor manufacturing (SM) companies working to create and implement industry-wide EB process standards.
–Perfect real-time information.
–Efficient e-business processes.
–Dynamic trading-partner relationships.
–New business opportunities.
64
RosettaNet Focus
Diagram comparing a human-to-human business exchange with a partner-to-partner eBusiness exchange.
Diagram labels: Telephone, Sound, Alphabet, Words, Grammar, Dialog, Business Process, Internet, XML, Dictionary, Framework, RosettaNet DIALOG, PIP, eBusiness Process, Ecom Application
65
Conclusion
Schemas greatly improve on DTDs.
Certain kinds of applications can be made more interoperable by XML Schema.
DTDs are well understood, and they do offer a good way to describe the structure of a document for interchange.
It will take some time before XML Schemas are as well understood.