Lecture 10: XML. Wednesday, October 18, 2006. Outline: XML (4.6, 4.7): syntax, semistructured data, DTDs.

Presentation transcript:

1 Lecture 10 XML Wednesday, October 18, 2006

2 XML Outline XML (4.6, 4.7) –Syntax –Semistructured data –DTDs

3 Additional Readings on XML Main source: (but hard to read) Strongly recommended readings: For XPath and XQuery:

4 XML A flexible syntax for data Used in: –Data exchange –Flexible databases: e.g. property lists –Configuration files: e.g. Web.Config –Document markup: e.g. XHTML Roots: SGML - a very nasty language We will study only XML as data

5 XML for Data Exchange Relational data does not have a syntax –I can’t “give” you my relational database –Examples of syntaxes: CSV (comma-separated values), ASN.1 XML = syntax for data –But XML is not relational: semistructured Usage: –Export: Database → XML –Transport/transform XML –Import: XML → Databases or application

6 XML for Databases Relational databases have rigid schema –Schema evolution is costly XML is flexible: semistructured data –Store data in XML Warning: not normal form ! Not even 1NF –Don’t try this at home

7 From HTML to XML HTML describes the presentation

8 HTML Bibliography Foundations of Databases Abiteboul, Hull, Vianu Addison Wesley, 1995 Data on the Web Abiteboul, Buneman, Suciu Morgan Kaufmann, 1999

9 XML Syntax Foundations… Abiteboul Hull Vianu Addison Wesley 1995 … XML describes the content

10 XML Terminology tags: book, title, author, … start tag: <book>, end tag: </book> elements: <book>…</book>, <author>…</author> elements are nested empty element: <tag></tag>, abbreviated <tag/> an XML document: single root element well-formed XML document: if it has matching tags
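The well-formedness rules on this slide (matching, properly nested tags and a single root element) can be checked with any XML parser. Below is a minimal sketch using Python's standard `xml.etree.ElementTree`. The helper `is_well_formed` is introduced here for illustration, and the tag names `bibliography`, `publisher` and `year` are assumptions: the slide itself names only `book`, `title` and `author`.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if text parses as XML: matching tags and one root element."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = """
<bibliography>
  <book>
    <title>Foundations of Databases</title>
    <author>Abiteboul</author>
    <author>Hull</author>
    <author>Vianu</author>
    <publisher>Addison Wesley</publisher>
    <year>1995</year>
  </book>
</bibliography>
"""
crossed_tags = "<book><title>Foundations of Databases</book></title>"  # tags overlap
two_roots = "<book/><book/>"  # more than one root element

for name, text in [("good", good), ("crossed_tags", crossed_tags), ("two_roots", two_roots)]:
    print(name, is_well_formed(text))
```

Only the first document is accepted; the other two fail before any schema or DTD is consulted, which is what distinguishes well-formedness from validity.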

11 More XML: Attributes Foundations of Databases Abiteboul … 1995

12 Attributes vs. Elements Foundations of DBs Abiteboul … 1995 Foundations of DBs Abiteboul … USD Attributes are an alternative way to represent data
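To make the two alternatives concrete, here is a small sketch that stores the same facts once as attributes and once as child elements, then reads them back with `xml.etree.ElementTree`. The markup is an assumption reconstructed for illustration (the tags were lost from the transcript): the names `price` and `currency` and the value `55` are hypothetical; only the title, author, year and `USD` appear above.

```python
import xml.etree.ElementTree as ET

# Price and currency stored as attributes of <book> (hypothetical names/values).
as_attributes = ET.fromstring(
    '<book price="55" currency="USD">'
    "<title>Foundations of DBs</title><author>Abiteboul</author><year>1995</year>"
    "</book>"
)

# The same facts stored as child elements.
as_elements = ET.fromstring(
    "<book>"
    "<title>Foundations of DBs</title><author>Abiteboul</author><year>1995</year>"
    "<price>55</price><currency>USD</currency>"
    "</book>"
)

# Attributes are read with get(); element content with findtext().
price_from_attribute = as_attributes.get("price")
price_from_element = as_elements.findtext("price")
currency_from_attribute = as_attributes.get("currency")
currency_from_element = as_elements.findtext("currency")

print(price_from_attribute, currency_from_attribute)
print(price_from_element, currency_from_element)
```

Both encodings carry the same information; the choice mostly affects how the data is queried and whether the value can later be repeated or given internal structure.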

13 Comparison
Elements | Attributes
Ordered | Unordered
May be repeated | Must be unique
May be nested | Must be atomic
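Two rows of this comparison can be observed directly with a parser. The sketch below, using `xml.etree.ElementTree`, shows repeated elements coming back in document order and a repeated attribute name being rejected as not well formed. The sample documents and the helper `parses` are made up for illustration.

```python
import xml.etree.ElementTree as ET

def parses(text: str) -> bool:
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Elements may be repeated, and their order is preserved.
book = ET.fromstring(
    "<book><author>Abiteboul</author><author>Hull</author><author>Vianu</author></book>"
)
authors = [author.text for author in book.findall("author")]

# An attribute name must be unique within one start tag.
repeated_attribute_accepted = parses('<book price="55" price="60"/>')

print(authors)
print(repeated_attribute_accepted)
```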

14 XML vs. HTML What are the differences between XML and HTML? In class

15 More XML: Oids and References Jane Mary Oids and references in XML are just syntax. They are just keys/foreign keys, designed by someone who didn’t take 444. Don’t use them: use your own foreign keys instead.
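The point that oids and references are "just syntax" can be seen by resolving them by hand: without a DTD the parser treats `id` and `idref` as ordinary attributes, so following a reference is a plain key lookup. A minimal sketch follows; the attribute names, the id values and the `children` element are assumptions for illustration, since the original markup did not survive in the transcript.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<persons>
  <person id="o555"><name>Jane</name></person>
  <person id="o456"><name>Mary</name><children idref="o555"/></person>
</persons>
""")

# "Primary key" index: id value -> person element.
by_id = {person.get("id"): person for person in doc.iter("person")}

# "Foreign key" lookup: each whitespace-separated idref names another person.
mary = by_id["o456"]
references = mary.find("children").get("idref").split()
child_names = [by_id[ref].findtext("name") for ref in references]

print(child_names)
```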

16 More XML: CDATA Section Syntax: <![CDATA[ …any text here… ]]> Example: … <>]]>
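A short sketch of what a CDATA section does: everything between `<![CDATA[` and `]]>` is taken as literal text, so characters that would otherwise start markup need no escaping. The element name `example` and the text inside the section are assumptions; only the trailing `<>]]>` survives in the transcript.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<example><![CDATA[ some text here </notAtag> <>]]></example>")

# The CDATA content arrives as plain text; nothing inside it was parsed as a tag.
text = doc.text
child_count = len(doc)

print(repr(text))
print(child_count)
```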

17 More XML: Entity References Syntax: &entityname; Example: this is less than &lt; Some entities:
&lt; | <
&gt; | >
&amp; | &
&apos; | '
&quot; | "
&#n; | Unicode char (by number)
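Entity references are resolved by the parser, so an application sees the character and not the `&…;` form; when the text is written back out, the serializer escapes it again. A minimal sketch with `xml.etree.ElementTree` (the element name `e` is arbitrary):

```python
import xml.etree.ElementTree as ET

# The parser replaces each reference with the character it stands for.
decoded = ET.fromstring("<e>this is less than &lt;</e>").text

# The five predefined entities plus one numeric character reference (&#65; is "A").
all_entities = ET.fromstring("<e>&lt;&gt;&amp;&apos;&quot;&#65;</e>").text

# Serializing escapes the character again, giving back valid XML.
element = ET.Element("e")
element.text = "this is less than <"
serialized = ET.tostring(element, encoding="unicode")

print(decoded)
print(all_entities)
print(serialized)
```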

18 More XML: Processing Instructions Syntax: <?target argument?> Example: Alarm Clock 19.99 What do they mean?

19 More XML: Comments Syntax: <!-- comment text --> Yes, they are part of the data model!
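That comments and processing instructions are part of the data model shows up in a DOM parse, where both appear as nodes alongside elements. The sketch below uses Python's standard `xml.dom.minidom`. The document is an assumption built around the "Alarm Clock 19.99" example: the tag names, the `ringBell` instruction and the comment text are hypothetical.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<product><name>Alarm Clock</name><?ringBell 20?>"
    "<!-- on sale --><price>19.99</price></product>"
)

# Classify each child of <product> by its DOM node type, in document order.
children = []
for node in doc.documentElement.childNodes:
    if node.nodeType == node.ELEMENT_NODE:
        children.append(("element", node.tagName))
    elif node.nodeType == node.PROCESSING_INSTRUCTION_NODE:
        children.append(("pi", node.target, node.data))
    elif node.nodeType == node.COMMENT_NODE:
        children.append(("comment", node.data))

for child in children:
    print(child)
```

What a processing instruction means is up to the application named by its target; the parser only hands over the target and the data string.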

20 XML Namespaces name ::= [prefix:]localpart … 15 … The namespace URL means nothing as a URL; it is just a unique name

21 XML Namespaces syntactic: … semantic: provide URL for schema … Belong to this namespace
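A sketch of how a parser treats namespaces: the prefix is only a local abbreviation, and the real name of an element is the pair (namespace name, local part). `xml.etree.ElementTree` writes that pair as `{uri}localpart`. The namespace URI and the tag names below are hypothetical; the URI is never fetched and serves only as a unique string.

```python
import xml.etree.ElementTree as ET

ISBN_NS = "http://www.example.org/isbn"  # hypothetical; just a unique name

doc = ET.fromstring(
    f'<book xmlns:isbn="{ISBN_NS}">'
    "<isbn:number>15</isbn:number><number>7</number>"
    "</book>"
)

# The qualified and the unqualified <number> are different names.
tags = [child.tag for child in doc]

# Lookup uses the URI; the prefix chosen in the query need not match the document's.
qualified = doc.find("x:number", {"x": ISBN_NS}).text
unqualified = doc.find("number").text

print(tags)
print(qualified, unqualified)
```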

22 XML Semantics: a Tree! Mary Maple 345 Seattle John Thailand [Tree figure: element nodes data, person, name, address, street, no, city, phone; attribute node id = o555; text nodes Mary, Maple, 345, Seattle, John, Thai…] Node kinds: element node, text node, attribute node. Order matters!
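The tree view can be sketched by walking a parsed document and listing element, attribute and text nodes in document order. The markup below is an assumption rebuilt from the labels that survive from the figure; in particular, attaching `id="o555"` to the first person and leaving out the phone are guesses. `outline` is a helper written for this illustration.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<data>
  <person id="o555">
    <name>Mary</name>
    <address><street>Maple</street><no>345</no><city>Seattle</city></address>
  </person>
  <person>
    <name>John</name>
    <address>Thailand</address>
  </person>
</data>
""")

def outline(node, depth=0):
    """Yield one line per element, attribute and non-blank text node, in document order."""
    yield "  " * depth + f"element {node.tag}"
    for key, value in node.attrib.items():
        yield "  " * (depth + 1) + f"attribute {key}={value}"
    if node.text and node.text.strip():
        yield "  " * (depth + 1) + f"text {node.text.strip()}"
    for child in node:
        yield from outline(child, depth + 1)

lines = list(outline(doc))
print("\n".join(lines))
```

Because children are visited in the order they appear, swapping the two `person` elements in the input changes the output, which is the sense in which order matters.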

23 XML Data XML is self-describing Schema elements become part of the data –Relational schema: persons(name,phone) –In XML <persons>, <name>, <phone> are part of the data, and are repeated many times Consequence: XML is much more flexible XML = semistructured data

24 Mapping Relational Data to XML Data
Persons
Name | Phone
John | 3634
Sue | 6343
Dick | 6363
The canonical mapping: XML: John 3634 Sue 6343 Dick 6363 [Tree figure: root persons with one row child per tuple; each row has name and phone children]
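The canonical mapping can be sketched as a short function: the table becomes the root element, each tuple becomes a `row` element, and each column becomes a child element named after the column. The function name `canonical_xml` is introduced here for illustration.

```python
import xml.etree.ElementTree as ET

def canonical_xml(table_name, columns, rows):
    """Map a table to XML: one <row> per tuple, one child element per column."""
    root = ET.Element(table_name)
    for row in rows:
        row_element = ET.SubElement(root, "row")
        for column, value in zip(columns, row):
            ET.SubElement(row_element, column).text = str(value)
    return root

persons = canonical_xml(
    "persons",
    ["name", "phone"],
    [("John", 3634), ("Sue", 6343), ("Dick", 6363)],
)
xml_text = ET.tostring(persons, encoding="unicode")
print(xml_text)
```

The column names are repeated in every row, which is the "self-describing" cost noted on the previous slide.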

25 Mapping Relational Data to XML Data Application specific mapping
Persons
Name | Phone
John | 3634
Sue | 6343
Orders
PersonName | Date | Product
John | 2002 | Gizmo
John | 2004 | Gadget
Sue | 2002 | Gadget
XML: John Gizmo 2004 Gadget Sue Gadget
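One possible application-specific mapping nests each person's orders inside that person's element, so the foreign key `PersonName` disappears into the nesting. This is a sketch under assumptions: the element names `person`, `order`, `date` and `product` are hypothetical, since the slide's markup was lost.

```python
import xml.etree.ElementTree as ET

persons = [("John", 3634), ("Sue", 6343)]
orders = [("John", 2002, "Gizmo"), ("John", 2004, "Gadget"), ("Sue", 2002, "Gadget")]

root = ET.Element("persons")
for name, phone in persons:
    person = ET.SubElement(root, "person")
    ET.SubElement(person, "name").text = name
    ET.SubElement(person, "phone").text = str(phone)
    # Join on PersonName: nest the matching orders under this person.
    for person_name, date, product in orders:
        if person_name == name:
            order = ET.SubElement(person, "order")
            ET.SubElement(order, "date").text = str(date)
            ET.SubElement(order, "product").text = product

john = root.find("person")
john_products = [order.findtext("product") for order in john.findall("order")]
total_orders = len(root.findall("person/order"))

print(ET.tostring(root, encoding="unicode"))
```

The result is a nested collection, which a single relation in first normal form cannot express directly.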

26 XML is Semi-structured Data Missing attributes: John 1234 Joe (no phone!) Could represent in a table with nulls:
name | phone
John | 1234
Joe | -

27 XML is Semi-structured Data Repeated attributes: Mary (two phones!) Impossible in tables:
name | phone
Mary | ???
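Missing and repeated attributes are handled uniformly if each person's phones are read as a list, which may be empty or hold several values. A minimal sketch follows; the tag names and Mary's two phone numbers are hypothetical, since the numbers did not survive in the transcript.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<persons>
  <person><name>John</name><phone>1234</phone></person>
  <person><name>Joe</name></person>
  <person><name>Mary</name><phone>2345</phone><phone>3456</phone></person>
</persons>
""")

# findall returns zero, one or many <phone> children without special cases.
phones = {
    person.findtext("name"): [phone.text for phone in person.findall("phone")]
    for person in doc
}

print(phones)
```

In a single table with one `phone` column, Joe would need a NULL and Mary would need either a second row or a second table.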

28 XML is Semi-structured Data Attributes with different types in different objects Nested collections (no 1NF) Heterogeneous collections: – … contains both …s and …s John Smith 1234 Structured name!

29 Document Type Definitions DTD part of the original XML specification an XML document may have a DTD XML document: Well-formed = if tags are correctly closed Valid = if it has a DTD and conforms to it validation is useful in data exchange

30 DTD Goals: Define what tags and attributes are allowed Define how they are nested Define how they are ordered Superseded by XML Schema, but XML Schema is very complex, so DTDs are still widely used

31 Very Simple DTD <!DOCTYPE company [ … ]>

32 Very Simple DTD Example of valid XML document: John B Jim B

33 DTD: The Content Model Content model: –Complex = a regular expression over other elements –Text-only = #PCDATA –Empty = EMPTY –Any = ANY –Mixed content = (#PCDATA | A | B | C)* content model

34 DTD: Regular Expressions (a DTD declaration and matching XML for each) sequence: A, B optional: A? Kleene star: A* alternation: A | B
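Since a complex content model is a regular expression over child element names, a validity check can be sketched by turning each model into an ordinary regex and matching it against the sequence of child tags. This is an illustration of the idea, not how a real DTD validator is implemented. The element names and the two content models are assumptions, because the declarations of the `company` DTD were lost from the transcript.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content models, as regexes over a sequence of child tag names
# in which every name is followed by one space, e.g. "ssn name office ".
CONTENT_MODELS = {
    "company": r"((person|product) )*",      # DTD: (person | product)*
    "person": r"ssn name office (phone )?",  # DTD: (ssn, name, office, phone?)
}

def children_match(element) -> bool:
    """Check one element's child sequence against its content model, if it has one."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return True  # no model declared in this sketch: treated as unconstrained
    child_tags = "".join(child.tag + " " for child in element)
    return re.fullmatch(model, child_tags) is not None

def is_valid(root) -> bool:
    """A document is valid here if every element satisfies its content model."""
    return all(children_match(element) for element in root.iter())

valid_doc = ET.fromstring(
    "<company>"
    "<person><ssn>123456789</ssn><name>John</name><office>B432</office><phone>1234</phone></person>"
    "<person><ssn>987654321</ssn><name>Jim</name><office>B123</office></person>"
    "<product/>"
    "</company>"
)
invalid_doc = ET.fromstring(
    "<company><person><name>John</name><ssn>123456789</ssn><office>B432</office></person></company>"
)

print(is_valid(valid_doc))
print(is_valid(invalid_doc))
```

Both documents are well formed; only the first is valid against these models, because in the second `name` comes before `ssn`, which the sequence operator forbids.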