Introduction to XML: DTD

Slides:



Advertisements
Similar presentations
XML I.
Advertisements

Defining XML The Document Type Definition. Document Type Definition text syntax for defining –elements of XML –attributes (and possibly default values)
XML 6.3 DTD 6. XML and DTDs A DTD (Document Type Definition) describes the structure of one or more XML documents. Specifically, a DTD describes:  Elements.
XML Document Type Definitions ( DTD ). 1.Introduction to DTD An XML document may have an optional DTD, which defines the document’s grammar. Since the.
Document Type Definition DTDs CS-328. What is a DTD Defines the structure of an XML document Only the elements defined in a DTD can be used in an XML.
CS 898N – Advanced World Wide Web Technologies Lecture 21: XML Chin-Chih Chang
Introduction to XLink Transparency No. 1 XML Information Set W3C Recommendation 24 October 2001 (1stEdition) 4 February 2004 (2ndEdition) Cheng-Chia Chen.
A Technical Introduction to XML Transparency No. 1 XML quick References.
 2002 Prentice Hall, Inc. All rights reserved. ISQA 407 XML/WML Winter 2002 Dr. Sergio Davalos.
XML Verification Well-formed XML document  conforms to basic XML syntax  contains only built-in character entities Validated XML document  conforms.
Document Type Definitions. XML and DTDs A DTD (Document Type Definition) describes the structure of one or more XML documents. Specifically, a DTD describes:
VALIDATING AN XML DOCUMENT
Introduction to XML This material is based heavily on the tutorial by the same name at
Tutorial 3: XML Creating a Valid XML Document. 2 Creating a Valid Document You validate documents to make certain necessary elements are never omitted.
XP New Perspectives on XML Tutorial 3 1 DTD Tutorial – Carey ISBN
Validating DOCUMENTS with DTDs
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Document Type Definition.
XML Anisha K J Jerrin Thomas. Outline  Introduction  Structure of an XML Page  Well-formed & Valid XML Documents  DTD – Elements, Attributes, Entities.
Introduction to XML cs3505. References –I got most of this presentation from this site –O’reilly tutorials.
Chapter 4: Document Type Definitions. Chapter 4 Objectives Learn to create DTDs Validate an XML document against a DTD Use DTDs to create XML documents.
XML CPSC 315 – Programming Studio Fall 2008 Project 3, Lecture 1.
Document Type Definitions Kanda Runapongsa Dept. of Computer Engineering Khon Kaen University.
XML What is XML? XML v.s. HTML XML Components Well-formed and Valid Document Type Definition (DTD) Extensible Style Language (XSL) SAX and DOM.
1 herbert van de sompel CS 502 Computing Methods for Digital Libraries Cornell University – Computer Science Herbert Van de Sompel
XML 1 Enterprise Applications CE00465-M XML. 2 Enterprise Applications CE00465-M XML Overview Extensible Mark-up Language (XML) is a meta-language that.
XML Syntax - Writing XML and Designing DTD's
XP 1 DECLARING A DTD A DTD can be used to: –Ensure all required elements are present in the document –Prevent undefined elements from being used –Enforce.
What is XML?  XML stands for EXtensible Markup Language  XML is a markup language much like HTML  XML was designed to carry data, not to display data.
XML (2) DTD Sungchul Hong.
Processing of structured documents Spring 2002, Part 2 Helena Ahonen-Myka.
Tutorial 1: XML Creating an XML Document. 2 Introducing XML XML stands for Extensible Markup Language. A markup language specifies the structure and content.
1 Tutorial 13 Validating Documents with DTDs Working with Document Type Definitions.
Avoid using attributes? Some of the problems using attributes: Attributes cannot contain multiple values (child elements can) Attributes are not easily.
 2002 Prentice Hall, Inc. All rights reserved. Chapter 6 – Document Type Definition (DTD) Outline 6.1Introduction 6.2Parsers, Well-formed and Valid XML.
Lecture 6 XML DTD Content of.xml fileContent of.dtd file.
1 Chapter 10: XML What is XML What is XML Basic Components of XML Basic Components of XML XPath XPath XQuery XQuery.
XML - DTD Week 4 Anthony Borquez. What can XML do? provides an application independent way of sharing data. independent groups of people can agree to.
CP3024 Lecture 9 XML: Extensible Markup Language.
XML Extensible Markup Language Aleksandar Bogdanovski Programing Enviroment LABoratory
XML Documents Chao-Hsien Chu, Ph.D. School of Information Sciences and Technology The Pennsylvania State University Elements Attributes Comments PI Document.
Introduction to XML This presentation covers introductory features of XML. What XML is and what it is not? What does it do? Put different related technologies.
XML Instructor: Charles Moen CSCI/CINF XML  Extensible Markup Language  A set of rules that allow you to create your own markup language  Designed.
XP 1 Creating an XML Document Developing an XML Document for the Jazz Warehouse XML Tutorial.
Lecture 16 Introduction to XML Boriana Koleva Room: C54
1 Introduction to XML XML stands for Extensible Markup Language. Because it is extensible, XML has been used to create a wide variety of different markup.
An Introduction to XML Sandeep Bhattaram
McGraw-Hill/Irwin © 2004 by The McGraw-Hill Companies, Inc. All rights reserved. Understanding How XML Works Ellen Pearlman Eileen Mullin Programming the.
XML Introduction. What is XML? XML stands for eXtensible Markup Language XML stands for eXtensible Markup Language XML is a markup language much like.
1 Dr Alexiei Dingli XML Technologies DTD. 2 Document Type Definition Defines –the legal building blocks of an XML document –the document structure –The.
1/11 ITApplications XML Module Session 3: Document Type Definition (DTD) Part 1.
XML Design Goals 1.XML must be easily usable over the Internet 2.XML must support a wide variety of applications 3.XML must be compatible with SGML 4.It.
Introduction to XML February 07, From HTML to XML As mentioned in previous classes, if you know HTML, then you already know XML… really! In this.
Beginning XML 3 rd Edition. Chapter 4: Document Type Definitions.
INFSY 547: WEB-Based Technologies Gayle J Yaverbaum, PhD Professor of Information Systems Penn State Harrisburg.
SNU OOPSLA Lab. Logical structure © copyright 2001 SNU OOPSLA Lab.
QUALITY CONTROL WITH SCHEMAS CSC1310 Fall BASIS CONCEPTS SchemaSchema is a pass-or-fail test for document Schema is a minimum set of requirements.
Well Formed XML The basics. A Simple XML Document Smith Alice.
Introduction to DTD A Document Type Definition (DTD) defines the legal building blocks of an XML document. It defines the document structure with a list.
Document Type Definition (DTD) Eugenia Fernandez IUPUI.
XML CORE CSC1310 Fall XML DOCUMENT XML document XML document is a convenient way for parsers to archive data. In other words, it is a way to describe.
C Copyright © 2011, Oracle and/or its affiliates. All rights reserved. Introduction to XML Standards.
Introduction to XML Kanda Runapongsa Dept. of Computer Engineering Khon Kaen University.
Jackson, Web Technologies: A Computer Science Perspective, © 2007 Prentice-Hall, Inc. All rights reserved Chapter 7 Representing Web Data:
CITA 330 Section 2 DTD. Defining XML Dialects “Well-formedness” is the minimal requirement for an XML document; all XML parsers can check it Any useful.
Extensible Markup Language (XML) Pat Morin COMP 2405.
Session III Chapter 6 – Creating DTDs
New Perspectives on XML
Session II Chapter 6 – Creating DTDs
Allyson Falkner Spokane County ISD
XML IST 421.
Presentation transcript:

Introduction to XML: DTD Jaana Holvikivi

Document type definition: structure Topics: Elements Attributes Entities Processing instructions (PI) DTD design Jaana Holvikivi

DTD <!– Document type description (DTD) example (part) --> <!ELEMENT university (department+)> <!ELEMENT department (name, address)> <!ELEMENT name (#PCDATA)> <!ELEMENT address (#PCDATA)> Document type description, structural description one rule /element name content a grammar for document instances ”regular clauses" (not necessary) Jaana Holvikivi

DTD: advantages validating parsers check that the document conforms to the DTD enforces logical use of tags there are existing DTD standards for many application areas common vocabulary Jaana Holvikivi

Well-formed documents An XML document is well-formed if its elements are properly nested so that it has a hierarchical tree structure, and all elements have an end tag (or are empty elements) it has one and only one root element complies with the basic syntax and structural rules of the XML 1.0 specification: rules for characters, white space, quotes, etc. and its every parsed entity is well-formed Jaana Holvikivi

Validity An XML-document is valid if it is well-formed it has an attached DTD (or schema) it conforms to the DTD (or schema) Validity is checked with a validating parser, either the whole document at once (”batch") interactively Jaana Holvikivi

Document type declaration Shared <!DOCTYPE catalog PUBLIC ”-//ORG_NAME//DTD CATALOG//EN"> - flag(-/+) indicates a less important standard ISO –standards start with ”ISO ORG_NAME the owner of the DTD DTD file type CATALOG document name EN language the document type definition can be included in the internal database of the processor (no connection needed) <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"> Jaana Holvikivi

External or internal DTD Document type declaration format: <!DOCTYPE document_element source location [internal subset of DTD] > internal DTD, DTD and instance in the same file: <!DOCTYPE catalog SYSTEM [<!ELEMENT catalog … and so on …]> simple example: <!DOCTYPE Mymessage SYSTEM [<!ELEMENT Mymessage (#PCDATA)]> external DTD, examples <!DOCTYPE dictionary SYSTEM ”dictionary.dtd”> <!DOCTYPE dictionary SYSTEM ”http://www.evtek.fi/DTD/dictionary.dtd”> Jaana Holvikivi

External or internal DTD internal and external: <!DOCTYPE Mymessage SYSTEM ”myDTD.dtd” [<!ELEMENT Mymessage … and so on… … (#PCDATA)]> Shared document type: <!DOCTYPE Dictionary PUBLIC ”http://www.evtek.fi/DTD/dictionary.dtd”> Jaana Holvikivi

Element type declaration <!ELEMENT country (capital)> element name element content content declaration, content model delimiters (<!, >, (, )) and keyword (ELEMENT) Jaana Holvikivi

Sub-elements, children Children in specified order: <!ELEMENT country (cname, capital, population)> choice of a child (”pipe” |): <!ELEMENT country (cname | official_name)> optional singular element ? only one or zero: <!ELEMENT country (cname, capital, population?)> Jaana Holvikivi

Cardinality operators Number of occurrences * zero or more, optional <!ELEMENT country (cname, capital, city*)> + one or more child elements, required <!ELEMENT country (cname, neighbour_country+)> ? optional singular element [none] one required singular element repeating group: <!ELEMENT country (cname, (city, city_population)*)> Jaana Holvikivi

Element content model Data <!ELEMENT cname (#PCDATA)> Elements "parsed character data" Elements sub-elements ( child elements) Mixed content data and elements <!ELEMENT para (#PCDATA | sub | super)*> #PCDATA must be first in content model, group has options child element sequence, choices and cardinality cannot be specified Jaana Holvikivi

Empty element and ANY <!ELEMENT image EMPTY> in document instance must be <image/> not allowed: <image></image> a regular declaration <!ELEMENT im (...)> allows both ways <im/>, <im></im> or <im>...</im> <!ELEMENT some ANY> element can contain any declared element, flexible, maybe too flexible? Jaana Holvikivi

Short example: Dictionary <!ELEMENT dictionary (word_article)*> <!ELEMENT word_article (head_word, pronunciation, sense+)> <!ELEMENT head_word (#PCDATA)> <!ELEMENT pronunciation (#PCDATA)> <!ELEMENT sense (definition, example*)> <!ELEMENT definition (#PCDATA)> <!ELEMENT example (#PCDATA)> Jaana Holvikivi

Dictionary XML <?xml version="1.0" ?> <!DOCTYPE dictionary SYSTEM "dict.dtd"> <dictionary> <word_article> <head_word> carry </head_word> <pronunciation> kaeri </pronunciation> Jaana Holvikivi

support the weight of and move from place to place </definition> <sense> <definition> support the weight of and move from place to place </definition> <example> Railways and ships carry goods. </example> He carried the news to everyone. </sense> wear, possess </definition> I never carry much money with me. </word_article> Jaana Holvikivi

<word_article> <head_word>gossamer </head_word> <pronunciation> gosomo </pronunciation> <sense> <definition>fine, silky substance of webs made by small spiders </definition> </sense> </word_article> </dictionary> Jaana Holvikivi

Content models Try to make definitions unambiguous (clear) wrong: (item?, item) right: (item, item?) wrong ((surname, employee) | (surname, customer)) right: (surname, (employee | customer)) <!ELEMENT BookCatalog (Catalog, Publisher+, Book*)> document must have <BookCatalog> top level element contains always one <Catalog> and at least one <Publisher> child <Book> elements may not be present Jaana Holvikivi

Attribute declarations ATTLIST Attributes can be used to describe the metadata or properties of the associated element attributes are also an alternative way to markup data <!ATTLIST country population NMTOKEN #IMPLIED language CDATA #REQUIRED continent (Europe | America | Asia ) "Europe"> CDATA character data, any text enumerated values (choice list) <!ATTLIST country continent (Europe | America | Asia ) "Europe"> remember that XML is case sensitive default value given above the parser may supply a default value if it is not given Jaana Holvikivi

Attribute defaults #REQUIRED #IMPLIED the attribute must appear in every instance of the element #IMPLIED optional enumerated values can have a default no default for implied/required <!ATTLIST catalog type CDATA #REQUIRED> <!ATTLIST catalog type NMTOKEN #IMPLIED> <!ATTLIST catalog type (phone | e-mail)> <!ATTLIST catalog type (phone | e-mail) "phone"> Jaana Holvikivi

Attribute types NMTOKEN NMTOKENS name token <country population = "100"> NMTOKENS a list of name tokens delimited by white space These types are useful primarily to processing applications. The types are used to specify a valid name(s). You might use them when you are associating some other component with the element, such as a Java class or a security algorithm <!ATTLIST DATA AUTHORIZED_USERS NMTOKENS #IMPLIED> <DATA SECURITY="ON" AUTHORIZED_USERS = "IggieeB SelenaS GuntherB"> element content </DATA> Jaana Holvikivi

Attribute types ID IDREF IDREFS attribute value is the unique identifier for this element instance, must be a valid XML name IDREF reference to the element that has the same value as that of the IDREF IDREFS a list of IDREFs delimited by white space Jaana Holvikivi

Attribute defaults <!ATTLIST country position #FIXED "independent"> attribute must match the default value why: for example to supply a value for an application reserved attributes, xml:lang xml:space prefix 'xml:' Jaana Holvikivi

Element vs attribute When to mark up with an element, when to use attributes? Element to describe structures, expandable when shown in the output contents cannot be defined as strictly as with attributes Attribute no structure, no multiple values “internal” information default values possible Jaana Holvikivi

Entities Each XML document is an entity, could comprise of several entities document entity ”subdocuments" = entities general entities: internal or external parsed or unparsed (external only) a parsed entity can include any well-formed content (replacement text) entity declaration entity reference all unparsed entities must have an associated notation Jaana Holvikivi

Internal text entities Predefined string <!ENTITY evitech ”Espoo Vantaa Institute of Technology"> I study at the &evitech; Single versus double quotes <!ENTITY sent 'His foot is 12" long'> <!ENTITY sent "His foot is 12" long"> Entity reference character string > > < < " " &apos; ' &amps; & < < A A < < (hexadecimal) ￸ ... (Unicode) Jaana Holvikivi

CDATA: special characters in XML <action> <script language ='Javascript'> <![CDATA[ function Fhello() { if (n >1 && m > 8) alert ("Hello"); } ]]> </script> </action> Jaana Holvikivi

External entity Outside the document entity itself within the same resource: <!ENTITY myfile SYSTEM "extra_files/file.xml"> public location <!ENTITY myfile PUBLIC "... description..."> needs an index an unparsed external entity is a reference to an external resource (I.e. an image file) Binary file file type has to be declared <!ENTITY myphoto SYSTEM "/figures/photo.gif" NDATA GIF> Use: Take a look at my photo <picture name="myphoto"/>. Jaana Holvikivi

ENTITY ENTITIES Notation <!ENTITY mypicture "123.jpg"> <!ELEMENT pic EMPTY> <!ATTLIST pic picfile ENTITY mypicture> in the document instance: <pic picfile="mypicture”/> ENTITIES a list of ENTITY names Notation <!ATTLIST image format NOTATION (TeX | TIFF)> Jaana Holvikivi

Entity references, summary General parsed entity reference in the document instance not in the DTD entity hierarchy unparsed entity no references from the text given as attribute values parameter entity in DTD, not in the document instance Jaana Holvikivi

Parameter entity Only usable within a DTD (not in an XML document) <!ENTITY % parapart "(emph | supersc | subsc)"> <!ELEMENT paragraph (%parapart | bold)> <!ELEMENT list (%parapart | item)*> <!ELEMENT paragraph (emph | supersc | subsc | bold)> Jaana Holvikivi

Notation declaration <!NOTATION PIXI SYSTEM ""> <!NOTATION TIFF SYSTEM "C:\APPS\Show_tiff.exe"> Entity declaration refers to notation: <!ENTITY Logo SYSTEM "logo.tif" NDATA TIFF> Notation provides information for an application how to process unparsed entities Jaana Holvikivi

Without a DTD: Attributes have no default values attributes are always text type CDATA all attributes are optional entities cannot be declared only standard entities are possible (&apos;) element contents are not clearly defined elements, data or mixed Jaana Holvikivi

DTD design XML often replaces a previous system when transforming to XML a standard DTD could be selected (with possible modifications) partners, affiliations a new DTD is designed DTD design based on existing document models in the company (representative) model documents other designers consulted Jaana Holvikivi

Document analysis Document features name, could it be without a name? how many occurrences preceding/ following information, regularity parts of the document standard contents (automatically generated) XML document (or parts of it) maybe generated from a data base use data base relation, descriptions and models (UML) when designing DTD Jaana Holvikivi

DTD design Standard DTD or new? Element order, granularity, structure Compatibility and data exchange processing needs, applications future needs, linking consistent names Element order, granularity, structure element vs. attribute? Rules? Order of rules ? comments? modularity? Naming style: short or descriptive, upper or lower case? Jaana Holvikivi

Tree diagrams farm farmer farm_hand farm dairy_farm forestry_farm <!ELEMENT farm (farmer, farm_hand)> <!ELEMENT farm (dairy_farm | forestry_farm)> dairy_farm name #PCdata Jaana Holvikivi

Standard DTDs http://www.xml.com/pub/rg/DTDs http://www.xml.org/ http://xml.coverpages.org/ http://www.ebxml.org/specs/ebBPSS.dtd MathML, CML (chemistry), UXF (UML eXchange Format), SMIL (multimedia), RDF (Resource Description Framework), HumanML (natural language), DocBook, etc. Jaana Holvikivi