Document Type Definitions XML Schema

Slides:



Advertisements
Similar presentations
1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:
Advertisements

XML 6.3 DTD 6. XML and DTDs A DTD (Document Type Definition) describes the structure of one or more XML documents. Specifically, a DTD describes:  Elements.
1 XML DTD & XML Schema Monica Farrow G30
CMPT 354, Simon Fraser University, Fall 2008, Martin Ester 311 Database Systems I The Semistructured Data Model.
2/6/05Salman Azhar: Database Systems1 XML Salman Azhar Semi-structured Data XML (Extensible Markup Language) Well-formed and Valid XML Document Type Definitions.
CS 898N – Advanced World Wide Web Technologies Lecture 21: XML Chin-Chih Chang
Semistructured-Data Model Sept. 2014Yangjun Chen ACS Semistructured-Data Model Semistructured data XML Document type definitions XML schema.
CSE 636 Data Integration XML Semistructured Data Document Type Definitions.
1 XPath Path Expressions Conditions. 2 Paths in XML Documents uXPath is a language for describing paths in XML documents. uReally think of the semistructured.
1 CS145 Introduction About CS145 Relational Model, Schemas, SQL Semistructured Model, XML.
Winter 2002Arthur Keller – CS 18018–1 Schedule Today: Mar. 12 (T) u Semistructured Data, XML, XQuery. u Read Sections Assignment 8 due. Mar. 14.
1 XML Document Type Definitions XML Schema. 2 Well-Formed and Valid XML uWell-Formed XML allows you to invent your own tags. uValid XML conforms to a.
Semi-structured Data. Facts about the Web Growing fast Popular Semi-structured data –Data is presented for ‘human’-processing –Data is often ‘self-describing’
Fall 2001Arthur Keller – CS 18017–1 Schedule Nov. 27 (T) Semistructured Data, XML. u Read Sections Assignment 8 due. Nov. 29 (TH) The Real World,
1 XML Semistructured Data Extensible Markup Language Document Type Definitions.
XML(EXtensible Markup Language). XML XML stands for EXtensible Markup Language. XML is a markup language much like HTML. XML was designed to describe.
1 XML Query Languages XPATH XQUERY. 2 XPATH and XQUERY uXPATH is a language for describing paths in XML documents. wReally think of the semistructured.
1 XQuery Values FLWR Expressions Other Expressions.
Document Type Definitions. XML and DTDs A DTD (Document Type Definition) describes the structure of one or more XML documents. Specifically, a DTD describes:
Applied Component-Based Software Engineering XML Basics
Introduction to XML This material is based heavily on the tutorial by the same name at
1 Advanced Topics XML and Databases. 2 XML u Overview u Structure of XML Data –XML Document Type Definition DTD –Namespaces –XML Schema u Query and Transformation.
1 XML Semistructured Data Extensible Markup Language Document Type Definitions.
XML Document Type Definitions XML Schema. Motivation for Semistructured data Serves as a model suitable for integration of databases Notations such as.
4/20/2017.
XML – Data Model, DTD and Schema
XP New Perspectives on XML Tutorial 3 1 DTD Tutorial – Carey ISBN
Database Systems Part VII: XML
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Document Type Definition.
XML Anisha K J Jerrin Thomas. Outline  Introduction  Structure of an XML Page  Well-formed & Valid XML Documents  DTD – Elements, Attributes, Entities.
Why XML ? Problems with HTML HTML design - HTML is intended for presentation of information as Web pages. - HTML contains a fixed set of markup tags. This.
Lecture 6 of Advanced Databases XML Schema, Querying & Transformation Instructor: Mr.Ahmed Al Astal.
XML Semi-structured data XML Document Type Definitions (DTD)
1 Lecture 5: XML and XQuery. 2 Semistructured Data uAnother data model, based on trees. uMotivation: flexible representation of data. wOften, data comes.
CSCE 520- Relational Data Model Lecture 2. Relational Data Model The following slides are reused by the permission of the author, J. Ullman, from the.
August Chapter 2 - Markup and Core Concepts Learning XML by Erik T. Ray Slides were developed by Jack Davis College of Information Science and Technology.
XP 1 DECLARING A DTD A DTD can be used to: –Ensure all required elements are present in the document –Prevent undefined elements from being used –Enforce.
What is XML?  XML stands for EXtensible Markup Language  XML is a markup language much like HTML  XML was designed to carry data, not to display data.
Session IV Chapter 9 – XML Schemas
Avoid using attributes? Some of the problems using attributes: Attributes cannot contain multiple values (child elements can) Attributes are not easily.
Winter 2006Keller, Ullman, Cushing18–1 Plan 1.Information integration: important new application that motivates what follows. 2.Semistructured data: a.
1 CE223 Database Systems Introduction DBMS Overview, Relational Model, Schemas, SQL Semistructured Model, XML.
Of 33 lecture 3: xml and xml schema. of 33 XML, RDF, RDF Schema overview XML – simple introduction and XML Schema RDF – basics, language RDF Schema –
Copyrighted material John Tullis 10/17/2015 page 1 04/15/00 XML Part 3 John Tullis DePaul Instructor
1 CS1368 Introduction* Relational Model, Schemas, SQL Semistructured Model, XML * The slides in this lecture are adapted from slides used in Standford's.
1 Introduction Relational Model, Schemas, SQL Semistructured Model, XML The slides were made by Jeffrey D. Ullman for the Introduction to Databases course.
Jeff Ullman: Introduction to XML 1 XML Semistructured Data Extensible Markup Language Document Type Definitions.
An Introduction to XML Sandeep Bhattaram
Semistructured Data Extensible Markup Language Document Type Definitions Zaki Malik November 04, 2008.
Tutorial 13 Validating Documents with Schemas
Management of XML and Semistructured Data Lecture 10: Schemas Monday, April 30, 2001.
The Semistructured-Data Model Programming Languages for XML Spring 2011 Instructor: Hassan Khosravi.
Exam II Syllabus uStorage & Buffer Management uIndexing: Btrees & Hash uMulti-dimensional Indexing uQuery processing (relational ops) uQuery optimization.
Semistructured-Data Model. Lu Chaojun, SJTU 2 Semistructured Data Structured data has a separate schema to describe its structure. –Advantage: efficient.
XML Validation II Schemas Robin Burke ECT 360. Outline Namespaces Documents  Data types XML Schemas Elements Attributes Derived data types RELAX NG.
CSCE 520- Relational Data Model Lecture 2. Oracle login Login from the linux lab or ssh to one of the linux servers using your cse username and password.
QUALITY CONTROL WITH SCHEMAS CSC1310 Fall BASIS CONCEPTS SchemaSchema is a pass-or-fail test for document Schema is a minimum set of requirements.
XSD: XML Schema Language Kanda Runapongsa Dept. of Computer Engineering Khon Kaen University.
XML Query Languages XPATH XQUERY Zaki Malik November 11, 2008.
XML Validation II Advanced DTDs + Schemas Robin Burke ECT 360.
SEMI-STRUCTURED DATA (XML) 1. SEMI-STRUCTURED DATA ER, Relational, ODL data models are all based on schema Structure of data is rigid and known is advance.
CITA 330 Section 4 XML Schema. XML Schema (XSD) An alternative industry standard for defining XML dialects More expressive than DTD Using XML syntax Promoting.
Extensible Markup Language (XML) Pat Morin COMP 2405.
Semistructured-Data Model
Semistructured-Data Model
Introduction to Database Systems, CS420
Web Programming Maymester 2004
Lecture 9: XML Monday, October 17, 2005.
CE223 Database Systems Introduction
Semi-Structured data (XML)
Presentation transcript:

Document Type Definitions XML Schema

Semistructured Data Another data model, based on trees. Motivation: flexible representation of data. Motivation: sharing of documents among systems and databases.

Graphs of Semistructured Data Nodes = objects. Labels on arcs (like attribute names). Atomic values at leaf nodes (nodes with no arcs out). Flexibility: no restriction on: Labels out of a node. Number of successors with a given label.

Example: Data Graph Notice a new kind of data. root beer beer bar The beer object for Bud manf manf prize name A.B. name year award servedAt Bud M’lob 1995 Gold The bar object for Joe’s Bar name addr Joe’s Maple

Example: HTML <!DOCTYPE html> <html> <body> <h1>My First Heading</h1> <p>My first paragraph.</p> </body> </html>

XML XML = Extensible Markup Language. While HTML uses tags for formatting (e.g., “italic”), XML uses tags for semantics (e.g., “this is an address”). Key idea: create tag sets for a domain (e.g., economics), and translate all data into properly tagged XML documents.

<. xml version = "1. 0" encoding = "utf-8" <?xml version = "1.0" encoding = "utf-8" ?> <db> <person id="01"> <name>John</name> <sal>8000</sal> </person> <bar> <owner person_id="01"/> <addr> <city>N. Y.</city> <street>Apple</street> <number>6</number> </addr> <tel>213-234</tel> <tel>213-233</tel> </bar> </db>

db person bar id name sal owner addr tel tel person_id 01 John 80000 213-234 213-233 01 city street number N. Y. Apple 6

Well-Formed and Valid XML Well-Formed XML allows you to invent your own tags. Valid XML conforms to a certain DTD.

Well-Formed XML Start the document with a declaration, surrounded by <?xml … ?> . Normal declaration is: <?xml version = ”1.0” standalone = ”yes” ?> “standalone” = “no DTD provided.” Balance of document is a root tag surrounding nested tags.

Tags Tags are normally matched pairs, as <FOO> … </FOO>. Unmatched tags also allowed, as <FOO/> Tags may be nested arbitrarily. XML tags are case-sensitive.

Example: Well-Formed XML <?xml version = “1.0” standalone = “yes” ?> <BARS> <BAR><NAME>Joe’s Bar</NAME> <BEER><NAME>Bud</NAME> <PRICE>2.50</PRICE></BEER> <BEER><NAME>Miller</NAME> <PRICE>3.00</PRICE></BEER> </BAR> <BAR> … </BARS> A NAME subelement Root tag Tags surrounding a BEER element A BEER subelement

DTD Structure <!DOCTYPE <root tag> [ <!ELEMENT <name>(<components>)> . . . more elements . . . ]>

DTD Elements The description of an element consists of its name (tag), and a parenthesized description of any nested tags. Includes order of subtags and their multiplicity. Leaves (text elements) have #PCDATA (Parsed Character DATA ) in place of nested tags.

Example: DTD <!DOCTYPE BARS [ <!ELEMENT BARS (BAR*)> A BARS object has zero or more BAR’s nested within. <!DOCTYPE BARS [ <!ELEMENT BARS (BAR*)> <!ELEMENT BAR (NAME, BEER+)> <!ELEMENT NAME (#PCDATA)> <!ELEMENT BEER (NAME, PRICE)> <!ELEMENT PRICE (#PCDATA)> ]> A BAR has one NAME and one or more BEER subobjects. NAME and PRICE are text. A BEER has a NAME and a PRICE.

Element Descriptions Subtags must appear in order shown. A tag may be followed by a symbol to indicate its multiplicity. * = zero or more. + = one or more. ? = zero or one. Symbol | can connect alternative sequences of tags.

Example: Element Description A name is an optional title (e.g., “Prof.”), a first name, and a last name, in that order, or it is an IP address: <!ELEMENT NAME ( (TITLE?, FIRST, LAST) | IPADDR )>

Use of DTD’s Set standalone = “no”. Either: Include the DTD as a preamble of the XML document, or Follow DOCTYPE and the <root tag> by SYSTEM and a path to the file where the DTD can be found.

Example: (a) The DTD The document <?xml version = “1.0” standalone = “no” ?> <!DOCTYPE BARS [ <!ELEMENT BARS (BAR*)> <!ELEMENT BAR (NAME, BEER+)> <!ELEMENT NAME (#PCDATA)> <!ELEMENT BEER (NAME, PRICE)> <!ELEMENT PRICE (#PCDATA)> ]> <BARS> <BAR><NAME>Joe’s Bar</NAME> <BEER><NAME>Bud</NAME> <PRICE>2.50</PRICE></BEER> <BEER><NAME>Miller</NAME> <PRICE>3.00</PRICE></BEER> </BAR> <BAR> … </BARS> The DTD The document

Example: (b) Assume the BARS DTD is in file bar.dtd. <?xml version = “1.0” standalone = “no” ?> <!DOCTYPE BARS SYSTEM ”bar.dtd”> <BARS> <BAR><NAME>Joe’s Bar</NAME> <BEER><NAME>Bud</NAME> <PRICE>2.50</PRICE></BEER> <BEER><NAME>Miller</NAME> <PRICE>3.00</PRICE></BEER> </BAR> <BAR> … </BARS> Get the DTD from the file bar.dtd

Attributes Opening tags in XML can have attributes. In a DTD, <!ATTLIST E . . . > declares attributes for element E, along with its datatype.

Example: Attributes Bars can have an attribute kind, a character string describing the bar. <!ELEMENT BAR (NAME BEER*)> <!ATTLIST BAR kind CDATA #IMPLIED> Character string type; no tags Attribute is optional opposite: #REQUIRED

Example: Attribute Use In a document that allows BAR tags, we might see: <BAR kind = ”sushi”> <NAME>Homma’s</NAME> <BEER><NAME>Sapporo</NAME> <PRICE>5.00</PRICE></BEER> ... </BAR>

ID’s and IDREF’s Attributes can be pointers from one object to another. Compare to HTML’s NAME = ”foo” and HREF = ”#foo”. Allows the structure of an XML document to be a general graph, rather than just a tree.

Creating ID’s Give an element E an attribute A of type ID. When using tag <E > in an XML document, give its attribute A a unique value. Example: <E A = ”xyz”>

Creating IDREF’s To allow elements of type F to refer to another element with an ID attribute, give F an attribute of type IDREF. Or, let the attribute have type IDREFS, so the F -element can refer to any number of other elements.

Example: ID’s and IDREF’s A new BARS DTD includes both BAR and BEER subelements. BARS and BEERS have ID attributes name. BARS have SELLS subelements, consisting of a number (the price of one beer) and an IDREF theBeer leading to that beer. BEERS have attribute soldBy, which is an IDREFS leading to all the bars that sell it.

The DTD <!DOCTYPE BARS [ <!ELEMENT BARS (BAR*, BEER*)> Bar elements have name as an ID attribute and have one or more SELLS subelements. <!DOCTYPE BARS [ <!ELEMENT BARS (BAR*, BEER*)> <!ELEMENT BAR (SELLS+)> <!ATTLIST BAR name ID #REQUIRED> <!ELEMENT SELLS (#PCDATA)> <!ATTLIST SELLS theBeer IDREF #REQUIRED> <!ELEMENT BEER EMPTY> <!ATTLIST BEER name ID #REQUIRED> <!ATTLIST BEER soldBy IDREFS #IMPLIED> ]> SELLS elements have a number (the price) and one reference to a beer. Explained next Beer elements have an ID attribute called name, and a soldBy attribute that is a set of Bar names.

Example: A Document <BARS> <BAR name = ”JoesBar”> <SELLS theBeer = ”Bud”>2.50</SELLS> <SELLS theBeer = ”Miller”>3.00</SELLS> </BAR> … <BEER name = ”Bud” soldBy = ”JoesBar SuesBar …” /> … </BARS>

Empty Elements We can do all the work of an element in its attributes. Like BEER in previous example. Another example: SELLS elements could have attribute price rather than a value that is a price.

Example: Empty Element In the DTD, declare: <!ELEMENT SELLS EMPTY> <!ATTLIST SELLS theBeer IDREF #REQUIRED> <!ATTLIST SELLS price CDATA #REQUIRED> Example use: <SELLS theBeer = ”Bud” price = ”2.50” /> Note exception to “matching tags” rule

XML Schema A more powerful way to describe the structure of XML documents. XML-Schema declarations are themselves XML documents. They describe “elements” and the things doing the describing are also “elements.”

Structure of an XML-Schema Document <? xml version = … ?> <xs:schema xmlns:xs = ”http://www.w3.org/2001/XMLschema”> . . . </xs:schema> So uses of ”xs” within the schema element refer to tags from this namespace. Defines ”xs” to be the namespace described in the URL shown. Any string in place of ”xs” is OK.

The xs:element Element Has attributes: name = the tag-name of the element being defined. type = the type of the element. Could be an XML-Schema type, e.g., xs:string. Or the name of a type defined in the document itself.

Example: xs:element <xs:element name = ”NAME” type = ”xs:string” /> Describes elements such as <NAME>Joe’s Bar</NAME>

Complex Types To describe elements that consist of subelements, we use xs:complexType. Attribute name gives a name to the type. Typical subelement of a complex type is xs:sequence, which itself has a sequence of xs:element subelements. Use minOccurs and maxOccurs attributes to control the number of occurrences of an xs:element.

Example: a Type for Beers <xs:complexType name = ”beerType”> <xs:sequence> <xs:element name = ”NAME” type = ”xs:string” minOccurs = ”1” maxOccurs = ”1” /> <xs:element name = ”PRICE” type = ”xs:float” minOccurs = ”0” maxOccurs = ”1” /> </xs:sequence> </xs:complexType> Exactly one occurrence Like ? in a DTD

An Element of Type beerType <xxx> <NAME>Bud</NAME> <PRICE>2.50</PRICE> </xxx> We don’t know the name of the element of this type.

Example: a Type for Bars <xs:complexType name = ”barType”> <xs:sequence> <xs:element name = ”NAME” type = ”xs:string” minOccurs = ”1” maxOccurs = ”1” /> <xs:element name = ”BEER” type = ”beerType” minOccurs = ”0” maxOccurs = ”unbounded” /> </xs:sequence> </xs:complexType> Like * in a DTD

xs:attribute xs:attribute elements can be used within a complex type to indicate attributes of elements of that type. attributes of xs:attribute: name and type as for xs.element. use = ”required” or ”optional”.

Example: xs:attribute <xs:complexType name = ”beerType”> <xs:attribute name = ”name” type = ”xs:string” use = ”required” /> <xs:attribute name = ”price” type = ”xs:float” use = ”optional” /> </xs:complexType>

An Element of This New Type beerType <xxx name = ”Bud” price = ”2.50” /> We still don’t know the element name. The element is empty, since there are no declared subelements.

Restricted Simple Types xs:simpleType can describe enumerations and range-restricted base types. name is an attribute xs:restriction is a subelement.

Restrictions Attribute base gives the simple type to be restricted, e.g., xs:integer. xs:{min, max}{Inclusive, Exclusive} are four attributes that can give a lower or upper bound on a numerical range. xs:enumeration is a subelement with attribute value that allows enumerated types.

Example: license Attribute for BAR <xs:simpleType name = ”license”> <xs:restriction base = ”xs:string”> <xs:enumeration value = ”Full” /> <xs:enumeration value = ”Beer only” /> <xs:enumeration value = ”Sushi” /> </xs:restriction> </xs:simpleType>

Example: Prices in Range [1,5) <xs:simpleType name = ”price”> <xs:restriction base = ”xs:float” minInclusive = ”1.00” maxExclusive = ”5.00” /> </xs:simpleType>

Keys in XML Schema An xs:element can have an xs:key subelement. Meaning: within this element, all subelements reached by a certain selector path will have unique values for a certain combination of fields. Example: within one BAR element, the name attribute of a BEER element is unique.

Example: Key <xs:element name = ”BAR” … > . . . And @ indicates an attribute rather than a tag. <xs:element name = ”BAR” … > . . . <xs:key name = ”barKey”> <xs:selector xpath = ”BEER” /> <xs:field xpath = ”@name” /> </xs:key> </xs:element> XPath is a query language for XML. All we need to know here is that a path is a sequence of tags separated by /.

Foreign Keys An xs:keyref subelement within an xs:element says that within this element, certain values (defined by selector and field(s), as for keys) must appear as values of a certain key.

Example: Foreign Key Suppose that we have declared that subelement NAME of BAR is a key for BARS. The name of the key is barKey. We wish to declare DRINKER elements that have FREQ subelements. An attribute bar of FREQ is a foreign key, referring to the NAME of a BAR.

Example: Foreign Key in XML Schema <xs:element name = ”DRINKERS” . . . <xs:keyref name = ”barRef” refers = ”barKey” <xs:selector xpath = ”DRINKER/FREQ” /> <xs:field xpath = ”@bar” /> </xs:keyref> </xs:element>