2003.10.09 - SLIDE 1 IS 202 – FALL 2003 Prof. Ray Larson & Prof. Marc Davis UC Berkeley SIMS Tuesday and Thursday 10:30 am - 12:00 pm Fall 2003

Presentation transcript:

SLIDE 1: IS 202 – FALL 2003
Prof. Ray Larson & Prof. Marc Davis
UC Berkeley SIMS
Tuesday and Thursday 10:30 am - 12:00 pm, Fall 2003
SIMS 202: Information Organization and Retrieval
Lecture 14: Metadata and Markup

SLIDE 2: Lecture Overview
Review
 –XML and Document Engineering
Metadata And Markup
 –XML As A Metadata Lingua Franca
   METS
 –SGML vs. XML
   DTD Construction
 –XML Schemas
 –XML For Protocols And Metadata Languages
Readings/Discussion

SLIDE 5: XML as a common syntax
XML (and SGML) provide a way of expressing the structure of documents that can be verified and validated by document processing systems
“Documents” can be metadata structures
 –Such as the description of a particular photograph in our Phone project
XML thus provides a way of representing metadata descriptions as well as the content that they describe

SLIDE 6: XML as a common syntax
All XML documents follow some simple rules that make them interchangeable and usable across different systems
 –All data and markup is in Unicode
 –All elements are marked by begin and end tags
 –All markup is case-sensitive
 –XML DTDs and/or Schemas define the valid structure (and sometimes content) of the documents
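
The tag and case-sensitivity rules above can be tried out with any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the `photo` and `caption` element names are made up for illustration and do not come from the lecture.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical photo description (element names are illustrative only).
good = "<photo><caption>Sather Gate</caption></photo>"
# <Caption> and </caption> differ in case, so they are different names.
bad_case = "<photo><Caption>Sather Gate</caption></photo>"
# The caption element has a begin tag but no end tag.
bad_unclosed = "<photo><caption>Sather Gate</photo>"

print(is_well_formed(good), is_well_formed(bad_case), is_well_formed(bad_unclosed))
```

Note that this only checks well-formedness. Checking validity against a DTD or Schema needs a validating parser, which the standard library does not provide.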

SLIDE 7: Example – METS
METS – the Metadata Encoding and Transmission Standard is a new Schema intended to provide:
 –“a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library, expressed using the XML schema language of the World Wide Web Consortium”
METS can be used to “wrap” complex sets of data (the actual data, with rules for encoding binary forms), the metadata describing the parts of that data, and the sequence and conditions under which the data can or should be presented or displayed

SLIDE 9: SGML/XML Structure
An SGML document consists of three parts:
 –The SGML Declaration
 –The Document Type Definition (DTD)
 –The Document Instance
An XML document REQUIRES only the document instance, but for effective processing a DTD is very important
XML Schema (later) provides an alternative to DTDs for XML applications

SLIDE 10: Document Type Definitions
The DTD describes the structural elements and "shorthand" markup for a particular document type and defines:
 –Names of "legal" elements
 –How many times elements can appear
 –The order of elements in a document
 –Whether markup can be omitted (SGML only)
 –Contents of elements (i.e., nested structures)
 –Attributes associated with elements
 –Names of "entities"
 –Short-hand conventions for element tags (SGML only)

SLIDE 11: DTD Components
The major components of a DTD are:
 –Entity Declarations
 –Element Declarations
 –Attribute Declarations

SLIDE 12: Document Type Definitions
Entity Declarations are a "macro" definition facility for both DTD and Document instance parts
 –General Internal Entity Definitions referenced by &name;
 –General External Entity Definitions referenced by &name;
 –Parameter Entity Definitions (used only inside DTDs) referenced by %name; or %name
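
As a rough illustration of the "macro" behavior of a general internal entity, the sketch below declares one in an internal DTD subset and lets Python's standard parser expand the `&name;` reference. The `memo` element and the `sims` entity are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical memo with one general internal entity declared in the
# internal DTD subset (names are illustrative, not from the lecture).
doc = """<!DOCTYPE memo [
  <!ENTITY sims "School of Information Management and Systems">
]>
<memo>Posted by the &sims;.</memo>"""

memo = ET.fromstring(doc)
# The parser replaces &sims; with the declared replacement text.
expanded = memo.text
print(expanded)
```

External entities and parameter entities are not covered here, since the standard-library parser does not fetch external declarations by default.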

SLIDE 13: Document Type Definitions
SGML Element Declarations define the structural elements of a document and its associated markup
 –Omitted tag minimization indicates whether start-tags or end-tags can be omitted in the markup; the (o) or (-) indicators are required in SGML but can NOT be used in XML

SLIDE 14: Document Type Definitions
Content model provides a nested structural description of the elements that make up this element, e.g.:...
 –ANY (in SGML) may be used to indicate a content model of any elements in the DTD, in any order
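
One way to think about a content model is as a regular expression over the sequence of child element names. The sketch below is only an analogy under stated assumptions: the content model `(to, from, date?, para+)` and its element names are hypothetical, since the slide's own example did not survive in the transcript.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content model for a memo: (to, from, date?, para+)
# Each child name is followed by one space so names cannot run together.
MEMO_MODEL = re.compile(r"to from (date )?(para )+")


def matches_model(xml_text: str) -> bool:
    """Check the order and count of the root element's direct children."""
    root = ET.fromstring(xml_text)
    child_names = "".join(child.tag + " " for child in root)
    return MEMO_MODEL.fullmatch(child_names) is not None


ok = "<memo><to/><from/><para/><para/></memo>"
wrong_order = "<memo><from/><to/><para/></memo>"
missing_para = "<memo><to/><from/><date/></memo>"

print(matches_model(ok), matches_model(wrong_order), matches_model(missing_para))
```

The DTD operators map directly: `,` is sequence, `?` is optional, `+` is one or more, `*` is zero or more, and `|` is choice.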

SLIDE 15: Document Type Definitions
Same content model in XML
<!DOCTYPE memo [ … ]>
 –Note the XML processing instruction “Prolog”
 –Note that & in previous page is not legal XML

SLIDE 16: Document Type Definitions
Declared content can be: PCDATA, CDATA, RCDATA, EMPTY
Inclusion and Exclusion lists can be used to indicate elements that can occur or are forbidden to occur in any sub-elements of the content model (NOT in XML), e.g.:
 –Says that element fn can appear anyplace in the memo

SLIDE 17: Document Type Definitions
Attribute Declarations define attributes associated with (potentially) each element of a document and provide the acceptable values for those attributes

SLIDE 18: Attributes
Example
 –In markup of a document:
 also, because of the default set:
 would be the same as
There are a variety of special defaults and data types that can be given in attribute definitions
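
To show what a declared default does, here is a small sketch in which an attribute is declared with a default value and then left out of the markup. The `status` attribute and its values are hypothetical; the slide's original example is not preserved. It relies on the standard-library parser (Expat) applying defaults declared in the internal DTD subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical declaration: memo has a status attribute that defaults to "draft".
doc = """<!DOCTYPE memo [
  <!ELEMENT memo (#PCDATA)>
  <!ATTLIST memo status (draft|final) "draft">
]>
<memo>Budget meeting moved to Friday.</memo>"""

memo = ET.fromstring(doc)
# The attribute was omitted in the markup, so the declared default is supplied.
status = memo.get("status")
print(status)
```

Writing `<memo status="draft">` explicitly would therefore give the same result as omitting the attribute.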

SLIDE 19: Sample SGML DTD
<!doctype ELIB-TEXTS [
<!-- This is a DTD for bibliographic records extracted from the elib/rfc1357 simple bibliographic format. -->
<!ELEMENT ELIB-BIB - - (BIB-VERSION, ID, ENTRY?, DATE?, TITLE*, ORGANIZATION*,
  (SERIES | TYPE | REVISION | REVISION-DATE | AUTHOR-PERSONAL | AUTHOR-INSTITUTIONAL | AUTHOR-CONTRIBUTING-PERSONAL | AUTHOR-CONTRIBUTING-PERSONAL | AUTHOR-CONTRIBUTING-INSTITUTIONAL | CONTACT AUTHOR | PROJECT | PAGES | BIOREGION | CERES-BIOREGION | TEXTSOUP | LOCATION | ULTIMATE-CLIENT | URL | KEYWORDS | NOTES | ABSTRACT)*,
  (TEXT-REF | PAGED-REF)* )>
… etc…
]>

SLIDE 20: XML Version
<!DOCTYPE ELIB-TEXTS [
<!-- This is a DTD for bibliographic records extracted from the elib/rfc1357 simple bibliographic format. -->
<!ELEMENT ELIB-BIB (BIB-VERSION, ID, ENTRY?, DATE?, TITLE*, ORGANIZATION*,
  (SERIES | TYPE | REVISION | REVISION-DATE | AUTHOR-PERSONAL | AUTHOR-INSTITUTIONAL | AUTHOR-CONTRIBUTING-PERSONAL | AUTHOR-CONTRIBUTING-PERSONAL | AUTHOR-CONTRIBUTING-INSTITUTIONAL | CONTACT AUTHOR | PROJECT | PAGES | BIOREGION | CERES-BIOREGION | TEXTSOUP | LOCATION | ULTIMATE-CLIENT | URL | KEYWORDS | NOTES | ABSTRACT)*,
  (TEXT-REF | PAGED-REF)* )>
… etc…
]>

SLIDE 21: Document Using That DTD
ELIB-v1.0 6 February March 1, 1993 Water Conditions in California Report 2 California Department of Water Resources bulletin California Department of Water Resources 17 /elib/data/disk/disk5/documents/6/HYPEROCR/hyperocr.html /elib/data/disk/disk5/documents/6/OCR-ASCII-NOZONE

SLIDE 22: Dublin Core Review…
Simple metadata for describing internet resources
For “Document-Like Objects”
15 Elements

SLIDE 23: Dublin Core Elements
Title
Creator
Subject
Description
Publisher
Other Contributors
Date
Resource Type
Format
Resource Identifier
Source
Language
Relation
Coverage
Rights Management

SLIDE 24: DC XML DTD Implementation
There have been various versions
This one is the one recommended (required) by the Open Archives Initiative Metadata Harvesting Protocol (OAI-MHP)
Uses XML Name Spaces
Available at

SLIDE 25: DC Element and Attribute Definitions
<!-- An entity primarily responsible for making the content of the resource. -->
<!-- An entity responsible for making contributions to the content of the resource. -->

SLIDE 26: DC Element Definitions (cont.)
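
For a sense of what a namespaced Dublin Core description looks like in practice, here is a minimal sketch that builds one with Python's standard library. The `record` wrapper element is hypothetical, and the choice of which values go in which element is illustrative; the values themselves are borrowed from the sample document shown earlier. The namespace URI is the standard Dublin Core elements namespace.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core elements namespace
ET.register_namespace("dc", DC_NS)

record = ET.Element("record")  # hypothetical wrapper element
for name, value in [
    ("title", "Water Conditions in California"),
    ("creator", "California Department of Water Resources"),
    ("type", "bulletin"),
]:
    # ElementTree writes namespaced names as {namespace-uri}local-name.
    ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

The `dc:` prefix is just a local abbreviation; it is the namespace URI that identifies the elements as Dublin Core.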

SLIDE 27: A More Complex SGML DTD
<!DOCTYPE USMARC [
<!ATTLIST USMARC Material (BK|AM|CF|MP|MU|VM|SE) "BK" id CDATA #IMPLIED>
<!-- Author's Note: the id attribute for the USMARC element is intended to hold a unique record number for each MARC record in the local database. That is to say, it is intended ONLY as an aid in maintaining the local database of MARC records -->
<!ELEMENT Leader - O (LRL, RecStat, RecType, BibLevel, UCP, IndCount, SFCount, BaseAddr, EncLevel, DscCatFm, LinkRec, EntryMap)>
…etc…

SLIDE 28: More Complex DTD (cont.)
<!ELEMENT VarDFlds - O (NumbCode, MainEnty?, Titles, EdImprnt?, PhysDesc?, Series?, Notes?, SubjAccs?, AddEnty?, LinkEnty?, SAddEnty?, HoldAltG?, Fld9XX?)>
<!ELEMENT NumbCode - O (Fld010?, Fld011?, Fld015?, Fld017*, Fld018?, Fld019*, Fld020*, Fld022*, Fld023*, Fld024*, Fld025*, Fld027*, Fld028*, Fld029*, Fld030*, Fld032*, Fld033*, Fld034*, Fld035*, Fld036?, Fld037*, Fld039*, Fld040?, Fld041?, Fld042?, Fld043?, Fld044?, Fld045?, Fld046?, Fld047?, Fld048*, Fld050*, Fld051*, Fld052*, Fld055*, Fld060*, Fld061*, Fld066?, Fld069*, Fld070*, Fld071*, Fld072*, Fld074*, Fld080?, Fld082*, Fld084*, Fld086*, Fld088*, Fld090*, Fld096*)>
<!ELEMENT Titles - O (Fld210?, Fld211*, Fld212*, Fld214*, Fld222*, Fld240?, Fld242*, Fld243?, Fld245, Fld246*, Fld247*)>
<!ELEMENT EdImprnt - O (Fld250?, Fld254?, Fld255*, Fld256?, Fld257?, Fld260?, Fld261?, Fld262?, Fld263?, Fld265?)>
<!ELEMENT PhysDesc - O (Fld300*, Fld305*, Fld306?, Fld310?, Fld315?, Fld321*, Fld340*, Fld350?, Fld351*, Fld355*, Fld357*, Fld362*)>
…etc…

SLIDE 29: Complex DTD (cont.)
<!ATTLIST Fld245 AddEnty (No|Yes|Blank) #IMPLIED NFChars (0|1|2|3|4|5|6|7|8|9|Blnk) #IMPLIED>
…etc…

SLIDE 30: Document Markup
All document markup is derived from the DTD for the particular document type
In SGML the DTD should be referenced in the document using the DOCTYPE declaration: or or
The doctype_declaration_subset can be any combination of elements, entity, and attribute declarations

SLIDE 31: HTML
HTML was not originally "real" SGML, the DTD was invented after the language
It is often more concerned with the form of the output on the screen than with the structural contents of the HTML docs
Relies on the application (such as Netscape) to implement interesting actions like hypertext linking
XHTML is now a W3C “recommendation” that applies XML conventions to HTML, and provides a growing set of capabilities within an XML framework (our phones use XHTML)

SLIDE 33: What are XML Schemas?
An XML vocabulary for expressing your data's structure AND content types, and even the business rules involved in processing the data
Written in XML themselves
Support namespaces for combining multiple schemas in the same documents
 –The slides in this section are based on an XML tutorial by Roger L. Costello

SLIDE 34: Example
Is this data valid? To be valid, it must meet these constraints (data business rules):
1. The location must be comprised of a latitude, followed by a longitude, followed by an indication of the uncertainty of the lat/lon measurements.
2. The latitude must be a decimal with a value between -90 and +90.
3. The longitude must be a decimal with a value between -180 and +180.
4. For both latitude and longitude the number of digits to the right of the decimal point must be exactly six digits.
5. The value of uncertainty must be a non-negative integer.
6. The uncertainty units must be either meters or feet.
We can express all these data constraints using XML Schemas
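
To make the six rules concrete, here is a hand-written check in Python. It is not an XML Schema and not a schema validator; it only spells out what a validator would test. The element names (`location`, `latitude`, `longitude`, `uncertainty`) and the `units` attribute are assumptions based on the wording of the rules, and the sample values are invented.

```python
import re
import xml.etree.ElementTree as ET
from decimal import Decimal

# Optional sign, digits, a decimal point, then exactly six fraction digits.
SIX_FRACTION_DIGITS = re.compile(r"[+-]?\d+\.\d{6}")


def check_location(xml_text: str) -> list:
    """Return a list of rule violations; an empty list means the data is valid."""
    loc = ET.fromstring(xml_text)
    # Rule 1: latitude, then longitude, then uncertainty.
    if [child.tag for child in loc] != ["latitude", "longitude", "uncertainty"]:
        return ["rule 1: expected latitude, longitude, uncertainty in that order"]
    lat, lon, unc = list(loc)
    errors = []
    # Rules 2-4: decimal range and exactly six fraction digits.
    for rule, elem, limit in ((2, lat, 90), (3, lon, 180)):
        text = (elem.text or "").strip()
        if not SIX_FRACTION_DIGITS.fullmatch(text):
            errors.append(f"rule 4: {elem.tag} needs exactly six fraction digits")
        elif not -limit <= Decimal(text) <= limit:
            errors.append(f"rule {rule}: {elem.tag} out of range")
    # Rule 5: non-negative integer.
    if not re.fullmatch(r"\d+", (unc.text or "").strip()):
        errors.append("rule 5: uncertainty must be a non-negative integer")
    # Rule 6: units limited to two values.
    if unc.get("units") not in ("meters", "feet"):
        errors.append("rule 6: units must be meters or feet")
    return errors


valid = ('<location><latitude>32.904237</latitude><longitude>73.620290</longitude>'
         '<uncertainty units="meters">2</uncertainty></location>')
invalid = ('<location><latitude>95.000000</latitude><longitude>73.620290</longitude>'
           '<uncertainty units="miles">2</uncertainty></location>')

print(check_location(valid))
print(check_location(invalid))
```

The point of XML Schemas is that these checks are declared once in the schema instead of being hand-coded like this in every application.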

SLIDE 35: Validating your data
 -check that the latitude is between -90 and +90
 -check that the longitude is between -180 and +180
 -check that the fraction digits is 6 for lat and lon
 ...
XML Schema validator: Data is ok!

SLIDE 36: Purpose of XML Schemas
Specify:
 –the structure of instance documents
   "this element contains these elements, which contains these other elements, etc"
 –the datatype of each element/attribute
   "this element shall hold an integer with the range 0 to 12,000" (DTDs don't do too well with specifying datatypes like this)

SLIDE 37: Motivation for XML Schemas
Why Schemas? People are dissatisfied with DTDs
 –It's a different syntax
   You write your XML (instance) document using one syntax and the DTD using another syntax --> bad, inconsistent
 –Limited datatype capability
   DTDs support a very limited capability for specifying datatypes. You can't, for example, express "I want the element to hold an integer with a range of 0 to 12,000"
 –Desire a set of datatypes compatible with those found in databases
   DTD supports 10 datatypes; XML Schemas supports 44+ datatypes

SLIDE 38: Highlights of XML Schemas
XML Schemas are a tremendous advancement over DTDs:
 –Enhanced datatypes
   44+ versus 10
   Can create your own datatypes
   –Example: "This is a new type based on the string type and elements of this type must follow this pattern: ddd-dddd, where 'd' represents a digit".
 –Written in the same syntax as instance documents
   less syntax to remember
 –Object-oriented'ish
   Can extend or restrict a type (derive new type definitions on the basis of old ones)
 –Can express sets, i.e., can define the child elements to occur in any order
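
The ddd-dddd example can be approximated outside of a schema with an ordinary regular expression. This is only a sketch of the idea, not how a schema validator is implemented; the names `DDD_DDDD` and `follows_pattern` are invented here.

```python
import re

# An XML Schema pattern has to match the whole value, so fullmatch
# (rather than search or match) is the closest Python analog.
DDD_DDDD = re.compile(r"\d{3}-\d{4}")


def follows_pattern(value: str) -> bool:
    """Return True if `value` is three digits, a hyphen, then four digits."""
    return DDD_DDDD.fullmatch(value) is not None


print(follows_pattern("555-1212"), follows_pattern("5551212"), follows_pattern("555-12123"))
```

In a schema, the same restriction would be declared once as a new type derived from the string type and reused by every element of that type.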

SLIDE 39: Highlights of XML Schemas
Can specify element content as being unique (keys on content) and uniqueness within a region
Can define multiple elements with the same name but different content
Can define elements with nil content
Can define substitutable elements - e.g., the "Book" element is substitutable for the "Publication" element.

SLIDE 40: BookStore.dtd

SLIDE 41
DTD vocabulary: ATTLIST, ELEMENT, ID, #PCDATA, NMTOKEN, ENTITY, CDATA
New vocabulary: BookStore, Book, Title, Author, Date, ISBN, Publisher
This is the vocabulary that DTDs provide to define your new vocabulary

SLIDE 42
XML Schema vocabulary: element, complexType, schema, sequence, string, integer, boolean
New vocabulary (targetNamespace): BookStore, Book, Title, Author, Date, ISBN, Publisher
This is the vocabulary that XML Schemas provide to define your new vocabulary
One difference between XML Schemas and DTDs is that the XML Schema vocabulary is associated with a name (namespace). Likewise, the new vocabulary that you define must be associated with a name (namespace). With DTDs neither set of vocabulary is associated with a name (namespace) [DTDs pre-dated namespaces].

SLIDE 43: BookStore.xsd
<xsd:schema xmlns:xsd=" targetNamespace=" xmlns=" elementFormDefault="qualified">
xsd = Xml-Schema Definition

SLIDE 44
<xsd:schema xmlns:xsd=" targetNamespace=" xmlns=" elementFormDefault="qualified">
<!ELEMENT Book (Title, Author, Date, ISBN, Publisher)>

SLIDE 45
<xsd:schema xmlns:xsd=" targetNamespace=" xmlns=" elementFormDefault="qualified">
All XML Schemas have "schema" as the root element.

SLIDE 46
<xsd:schema xmlns:xsd=" targetNamespace=" xmlns=" elementFormDefault="qualified">
The elements and datatypes that are used to construct schemas
 - schema
 - element
 - complexType
 - sequence
 - string
come from the XMLSchema namespace

SLIDE 47
XMLSchema Namespace: element, complexType, schema, sequence, string, integer, boolean

SLIDE 48
<xsd:schema xmlns:xsd=" targetNamespace=" xmlns=" elementFormDefault="qualified">
Says that the elements defined by this schema
 - BookStore
 - Book
 - Title
 - Author
 - Date
 - ISBN
 - Publisher
are to go in this namespace

SLIDE 49
Book Namespace (targetNamespace): BookStore, Book, Title, Author, Date, ISBN, Publisher

SLIDE 50
<xsd:schema xmlns:xsd=" targetNamespace=" xmlns=" elementFormDefault="qualified">
This is referencing a Book element declaration. The Book in what namespace? Since there is no namespace qualifier it is referencing the Book element in the default namespace, which is the targetNamespace! Thus, this is a reference to the Book element declaration in this schema.
The default namespace is the targetNamespace!

SLIDE 51
<xsd:schema xmlns:xsd=" targetNamespace=" xmlns=" elementFormDefault="qualified">
This is a directive to any instance documents which conform to this schema: Any elements used by the instance document which were declared in this schema must be namespace qualified.

SLIDE 52: Referencing a schema in an XML instance document
<BookStore xmlns =" xmlns:xsi=" xsi:schemaLocation=" BookStore.xsd">
My Life and Times Paul McCartney July, McMillin Publishing
1. First, using a default namespace declaration, tell the schema-validator that all of the elements used in this instance document come from that namespace.
2. Second, with schemaLocation tell the schema-validator that this namespace is defined by BookStore.xsd (i.e., schemaLocation contains a pair of values).
3. Third, tell the schema-validator that the schemaLocation attribute we are using is the one in the XMLSchema-instance namespace
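
The three steps can be seen by parsing an instance document and looking at what the parser reports. In this sketch the BookStore namespace URI is a placeholder (`http://example.org/books`), because the real URI is missing from the transcript; the XMLSchema-instance namespace URI is the standard W3C one. No schema validation is performed here.

```python
import xml.etree.ElementTree as ET

BOOKS_NS = "http://example.org/books"  # placeholder, not the URI from the slides
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

instance = f"""<BookStore xmlns="{BOOKS_NS}"
    xmlns:xsi="{XSI_NS}"
    xsi:schemaLocation="{BOOKS_NS} BookStore.xsd">
  <Book>
    <Title>My Life and Times</Title>
    <Author>Paul McCartney</Author>
  </Book>
</BookStore>"""

root = ET.fromstring(instance)

# Step 1: the default namespace declaration puts every element in BOOKS_NS.
root_name = root.tag

# Steps 2 and 3: schemaLocation lives in the XMLSchema-instance namespace and
# holds a pair of values: the namespace, then the schema document defining it.
namespace, schema_doc = root.get(f"{{{XSI_NS}}}schemaLocation").split()

title = root.find("b:Book/b:Title", {"b": BOOKS_NS}).text
print(root_name, namespace, schema_doc, title)
```

Because of the default namespace, a lookup for plain `Book` would find nothing; the element's full name includes the namespace.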

SLIDE 53
XMLSchema-instance Namespace: schemaLocation, type, noNamespaceSchemaLocation, nil

SLIDE 54: Referencing a schema in an XML instance document
BookStore.xsd: targetNamespace=" - defines elements in namespace
BookStore.xml: schemaLocation=" BookStore.xsd" - uses elements from namespace
A schema defines a new vocabulary. Instance documents use that new vocabulary.

SLIDE 55: Note multiple levels of checking
BookStore.xml, BookStore.xsd, XMLSchema.xsd (schema-for-schemas)
Validate that the xml document conforms to the rules described in BookStore.xsd
Validate that BookStore.xsd is a valid schema document, i.e., it conforms to the rules described in the schema-for-schemas

SLIDE 56: Default Value for minOccurs and maxOccurs
The default value for minOccurs is "1"
The default value for maxOccurs is "1"
Equivalent!
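
The equivalence can be illustrated by reading element declarations and filling in "1" whenever minOccurs or maxOccurs is absent. The schema fragment below is made up for this sketch (the slide's own declarations are not preserved), and `occurrence_bounds` is a hypothetical helper, not part of any schema library.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical fragment: Title omits both attributes, Author states them.
schema_fragment = f"""<xsd:sequence xmlns:xsd="{XSD_NS}">
  <xsd:element name="Title" type="xsd:string"/>
  <xsd:element name="Author" type="xsd:string" minOccurs="1" maxOccurs="1"/>
  <xsd:element name="Note" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>"""


def occurrence_bounds(element):
    """Return (minOccurs, maxOccurs) as strings, applying the default of "1"."""
    return element.get("minOccurs", "1"), element.get("maxOccurs", "1")


bounds = {
    el.get("name"): occurrence_bounds(el)
    for el in ET.fromstring(schema_fragment).iter(f"{{{XSD_NS}}}element")
}
print(bounds)
```

Title and Author end up with the same bounds, which is the sense in which the two declarations are equivalent.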

SLIDE 57: Much More to XMLSchema!
This was an overview of some basics
There are many other features, such as:
 –The ability to import other schemas or parts of schemas
 –Ability to specify many data types
 –Etc.
XMLSchema definitions are at W3C – is a good place to start

SLIDE 59: Other Protocols and Metadata Systems Using XML
SOAP (Simple Object Access Protocol)
DAV/DASL (Distributed Authoring and Versioning)
SDLIP (Simple Digital Library Interoperability Protocol)
RDF (Resource Description Framework)
ADL Gazetteer Protocol
OAI-MHP (already discussed)
MPEG-7 (more next time)
METS
Also versions of MARC and other formats in XML

SLIDE 60: SGML and XML Sources and Resources
Books:
 –van Herwijnen, Eric. Practical SGML. (2nd Ed.) Boston: Kluwer Academic Publishers,
 –Goldfarb, Charles F. The SGML Handbook. Oxford: Clarendon Press,
 (and MANY XML books)
Web Sites:
 –The W3C web site (all XML standards documents)
 –Robin Cover’s SGML/XML Site

SLIDE 62: Discussion – Vam Makam
Kirk covers examples of DTDs for books and newspapers. Many individuals and corporations have been creating numerous DTDs for themselves and for general purposes. What are some innovative and useful ideas for areas where designing DTDs might be useful? For ideas that may have already been thought of, how could they be improved or extended?

SLIDE 63: Discussion – Vam Makam
However recently XML DTDs emerged, newer ideas such as XML Schemas have presented themselves as a better option. Given the thought and work that have gone into designing existing DTDs, at what point is it worth converting an existing DTD to an XML Schema? Now that you have learned how to design a DTD and have basic knowledge of XML, which existing technologies become more useful when combined with XML?

SLIDE 64: Discussion – Annie Yeh
Kirk addresses the advantages of using external DTDs, the reusability of public DTDs, the ability to focus on content rather than structure, easier management of multiple documents, and easier data error checking. What are some of the existing repositories in which we can store these DTDs? What are some of the ways in which we can facilitate this process? What are their pros and cons? What are some of the more ideal interfaces with which to facilitate this?

SLIDE 65: Discussion – Annie Yeh
What are the differences between DTDs and Schemas, and what are the pros and cons of each?

SLIDE 66: Next Time
Metadata for Motion Pictures: MPEG-7
Readings/Discussion
 –MPEG-7 (Part 1) (J. M. Martinez, R. Koenen, F. Pereira)
 –MPEG-7 (Part 2) (J. Martinez)

SLIDE 67