Lecture 10: XML. Wednesday, October 18, 2006


Slide 1: Lecture 10, XML
Wednesday, October 18, 2006

Slide 2: XML Outline
XML (4.6, 4.7)
– Syntax
– Semistructured data
– DTDs

Slide 3: Additional Readings on XML
– Main source: www.w3.org (but hard to read): http://www.w3.org/XML/
– Strongly recommended readings: http://www.w3.org/XML/1999/XML-in-10-points and www.zvon.org/xxl/XMLTutorial/General/book_en.html
– For XPath and XQuery: http://www.galaxquery.org/

Slide 4: XML
A flexible syntax for data. Used in:
– Data exchange
– Flexible databases: e.g. property lists
– Configuration files: e.g. Web.Config
– Document markup: e.g. XHTML
Roots: SGML, a very nasty language.
We will study only XML as data.

Slide 5: XML for Data Exchange
Relational data does not have a syntax
– I can't "give" you my relational database
– Examples of syntaxes: CSV (comma-separated values), ASN.1
XML = syntax for data
– But XML is not relational: it is semistructured
Usage:
– Export: Database → XML
– Transport/transform XML
– Import: XML → Databases or application

Slide 6: XML for Databases
Relational databases have a rigid schema
– Schema evolution is costly
XML is flexible: semistructured data
– Store data in XML
Warning: not a normal form! Not even 1NF
– Don't try this at home

Slide 7: From HTML to XML
HTML describes the presentation.

Slide 8: HTML
Bibliography
– Foundations of Databases. Abiteboul, Hull, Vianu. Addison Wesley, 1995
– Data on the Web. Abiteboul, Buneman, Suciu. Morgan Kaufmann, 1999

Slide 9: XML Syntax
Foundations… Abiteboul Hull Vianu Addison Wesley 1995 …
XML describes the content.

Slide 10: XML Terminology
– tags: book, title, author, …
– start tag: <book>, end tag: </book>
– elements: <book>…</book>, <author>…</author>
– elements are nested
– empty element: <book></book>, abbreviated <book/>
– an XML document: has a single root element
– well-formed XML document: one whose tags match
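
The terminology above can be checked mechanically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` parser, which rejects documents that are not well formed; the helper name `is_well_formed` and the sample strings are illustrative assumptions, not part of the lecture.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if text parses as one root element with properly matched tags."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Illustrative documents (assumed, not taken from the slides).
good = "<book><title>Foundations of Databases</title><author>Abiteboul</author></book>"
mismatched = "<book><title>Foundations of Databases</book></title>"  # tags cross
two_roots = "<book/><book/>"                                         # no single root

# An empty element and its abbreviation parse to the same tree.
long_form = ET.tostring(ET.fromstring("<book></book>"))
short_form = ET.tostring(ET.fromstring("<book/>"))

print(is_well_formed(good), is_well_formed(mismatched), is_well_formed(two_roots))
```

Note that this only tests well-formedness; checking a document against a DTD is a separate step that `ElementTree` does not perform.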

Slide 11: More XML: Attributes
Foundations of Databases Abiteboul … 1995

Slide 12: Attributes vs. Elements
– Using attributes: Foundations of DBs Abiteboul … 1995
– Using elements only: Foundations of DBs Abiteboul … 1995 55 USD
Attributes are an alternative way to represent data.

Slide 13: Comparison
Elements          Attributes
Ordered           Unordered
May be repeated   Must be unique
May be nested     Must be atomic
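
Each row of the comparison can be observed with a parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the `price` and `currency` attribute names and the `author` element name are assumptions chosen to fit the book example, since the original markup is not shown.

```python
import xml.etree.ElementTree as ET

# Elements may be repeated, and their document order is preserved.
book = ET.fromstring(
    "<book><author>Abiteboul</author><author>Hull</author><author>Vianu</author></book>"
)
authors = [a.text for a in book.findall("author")]

# Attributes must be unique: repeating a name makes the document ill-formed.
try:
    ET.fromstring('<book price="55" price="60"/>')
    duplicate_attribute_ok = True
except ET.ParseError:
    duplicate_attribute_ok = False

# Attributes are atomic strings looked up by name, not by position.
priced = ET.fromstring('<book currency="USD" price="55"/>')
price = priced.get("price")

print(authors, duplicate_attribute_ok, price)
```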

Slide 14: XML vs. HTML
What are the differences between XML and HTML? (In class)

Slide 15: More XML: Oids and References
Jane Mary
– Oids and references in XML are just syntax
– They are just keys/foreign keys, designed by someone who didn't take 444
– Don't use them: use your own foreign keys instead

Slide 16: More XML: CDATA Section
Syntax: <![CDATA[ any text here ]]>
Example: <![CDATA[ … <>]]>
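
A CDATA section lets text that looks like markup pass through unparsed. The following minimal sketch, using Python's standard `xml.etree.ElementTree`, shows the effect; the `example` element and the text inside the section are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Inside CDATA, "</notAtag>" and "<>" are ordinary characters, not markup.
doc = "<example><![CDATA[ some text </notAtag> <> ]]></example>"
elem = ET.fromstring(doc)

cdata_text = elem.text      # the raw characters between <![CDATA[ and ]]>
child_count = len(elem)     # no child elements were created

print(repr(cdata_text), child_count)
```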

Slide 17: More XML: Entity References
Syntax: &entityname;
Example: this is less than &lt;
Some entities:
&lt;     <
&gt;     >
&amp;    &
&apos;   '
&quot;   "
&#n;     the character with Unicode code point n
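
Entity references are expanded when a document is parsed and must be reintroduced when text is written back out. A minimal sketch with Python's standard library follows; the `element` tag and sample strings are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Parsing replaces each entity reference with the character it names.
elem = ET.fromstring("<element>this is less than &lt; and &#38; is an ampersand</element>")
parsed_text = elem.text

# All five predefined entities in one element.
predefined = ET.fromstring("<e>&lt;&gt;&amp;&apos;&quot;</e>").text

# Going the other way: escape() rewrites &, < and > so raw text is safe inside an element.
escaped = escape("a < b & c > d")

print(parsed_text)
print(predefined)
print(escaped)
```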

Slide 18: More XML: Processing Instructions
Syntax: <?target argument?>
Example: Alarm Clock 19.99
What do they mean?

Slide 19: More XML: Comments
Syntax: <!-- comment text -->
Yes, they are part of the data model!

Slide 20: XML Namespaces
name ::= [prefix:]localpart
… 15 …
The namespace name means nothing as a URL; it is just a unique name.

Slide 21: XML Namespaces
– syntactic: …
– semantic: provide URL for schema
– … belong to this namespace
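
To make the prefix:localpart rule concrete, here is a minimal sketch with Python's standard `xml.etree.ElementTree`. The namespace name `http://example.org/isbn-namespace`, the `isbn` prefix, and the `book` and `number` elements are all hypothetical; the value 15 echoes the slide. The parser replaces each prefix with the full namespace name, which shows that the prefix is only shorthand and the name is only a unique string.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace name: it is never fetched, it only has to be unique.
doc = """
<book xmlns:isbn="http://example.org/isbn-namespace">
  <number>15</number>
  <isbn:number>15</isbn:number>
</book>
"""
root = ET.fromstring(doc)

# ElementTree expands a prefixed name to "{namespace-name}localpart".
tags = [child.tag for child in root]

# Queries may use any prefix, as long as it maps to the same namespace name.
ns = {"i": "http://example.org/isbn-namespace"}
qualified = root.find("i:number", ns)
unqualified = root.find("number")

print(tags)
```

The two `number` elements are distinct names even though their local parts are identical.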

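Slide 22: XML Semantics: a Tree!
Document text: Mary Maple 345 Seattle John Thailand 23456
[Figure: the document drawn as a tree. The root element data has two person children. The first person has an attribute id = o555, a name (Mary) and an address with children street (Maple), no (345) and city (Seattle). The second person has a name (John), an address and a phone (23456). Legend: element node, text node, attribute node.]
Order matters!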
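
The tree view can be reproduced by parsing a document and walking it. The sketch below uses Python's standard `xml.etree.ElementTree`; the element and attribute names come from the figure labels, while the exact nesting of the second person's address is an assumption.

```python
import xml.etree.ElementTree as ET

doc = (
    "<data>"
    '<person id="o555"><name>Mary</name>'
    "<address><street>Maple</street><no>345</no><city>Seattle</city></address></person>"
    "<person><name>John</name><address>Thailand</address><phone>23456</phone></person>"
    "</data>"
)
root = ET.fromstring(doc)

def outline(elem, depth=0):
    """List element, attribute and text nodes in document order, indented by depth."""
    lines = ["  " * depth + f"element {elem.tag}"]
    for name, value in elem.attrib.items():
        lines.append("  " * (depth + 1) + f"attribute {name} = {value}")
    if elem.text and elem.text.strip():
        lines.append("  " * (depth + 1) + f"text {elem.text.strip()}")
    for child in elem:  # children are visited in the order they appear
        lines.extend(outline(child, depth + 1))
    return lines

tree_lines = outline(root)
print("\n".join(tree_lines))
```

Because children are stored as an ordered list, swapping two sibling elements yields a different tree.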

Slide 23: XML Data
XML is self-describing: schema elements become part of the data
– Relational schema: persons(name, phone)
– In XML, <persons>, <name>, <phone> are part of the data, and are repeated many times
Consequence: XML is much more flexible
XML = semistructured data

Slide 24: Mapping Relational Data to XML Data
The canonical mapping:

Persons
Name   Phone
John   3634
Sue    6343
Dick   6363

XML: a persons root with one row element per tuple, each holding a name and a phone:
persons
  row: name "John", phone 3634
  row: name "Sue", phone 6343
  row: name "Dick", phone 6363
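
The canonical mapping is mechanical enough to write as a short function. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function name `table_to_xml` is hypothetical, and the `row`, `name` and `phone` element names follow the tree on the slide.

```python
import xml.etree.ElementTree as ET

def table_to_xml(table_name, columns, rows):
    """Canonical mapping: one root per table, one <row> per tuple, one child per column."""
    root = ET.Element(table_name)
    for row in rows:
        row_elem = ET.SubElement(root, "row")
        for column, value in zip(columns, row):
            ET.SubElement(row_elem, column).text = str(value)
    return root

persons = table_to_xml(
    "persons",
    ["name", "phone"],
    [("John", 3634), ("Sue", 6343), ("Dick", 6363)],
)
xml_text = ET.tostring(persons, encoding="unicode")
print(xml_text)
```

The column names are repeated in every row, which is the sense in which the schema becomes part of the data.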

Slide 25: Mapping Relational Data to XML Data
Application-specific mapping:

Persons
Name   Phone
John   3634
Sue    6343

Orders
PersonName   Date   Product
John         2002   Gizmo
John         2004   Gadget
Sue          2002   Gadget

XML: John 3634 2002 Gizmo 2004 Gadget Sue 6343 2004 Gadget
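
An application-specific mapping typically nests related tuples instead of emitting one flat list per table. The sketch below nests each person's orders under that person, using Python's standard `xml.etree.ElementTree`. The function name `nest_orders` and the element names `person`, `order`, `date` and `product` are assumptions; the rows are those of the Persons and Orders tables as printed.

```python
import xml.etree.ElementTree as ET

persons = [("John", 3634), ("Sue", 6343)]
orders = [("John", 2002, "Gizmo"), ("John", 2004, "Gadget"), ("Sue", 2002, "Gadget")]

def nest_orders(persons, orders):
    """Join Orders to Persons on the person name and nest each order under its person."""
    root = ET.Element("persons")
    for name, phone in persons:
        person = ET.SubElement(root, "person")
        ET.SubElement(person, "name").text = name
        ET.SubElement(person, "phone").text = str(phone)
        for person_name, date, product in orders:
            if person_name == name:  # the foreign key becomes nesting
                order = ET.SubElement(person, "order")
                ET.SubElement(order, "date").text = str(date)
                ET.SubElement(order, "product").text = product
    return root

nested = nest_orders(persons, orders)
print(ET.tostring(nested, encoding="unicode"))
```

The PersonName column disappears from the output because the nesting itself records which person an order belongs to.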

Slide 26: XML is Semi-structured Data
Missing attributes:
John 1234 Joe (no phone!)
Could be represented in a table with nulls:

name   phone
John   1234
Joe    -

Slide 27: XML is Semi-structured Data
Repeated attributes:
Mary 2345 3456 (two phones!)
Impossible in tables:

name   phone
Mary   2345   3456   ???

Slide 28: XML is Semi-structured Data
– Attributes with different types in different objects: John Smith 1234 (structured name!)
– Nested collections (no 1NF)
– Heterogeneous collections: one element contains children of two different kinds
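
The three irregularities from these slides (a missing phone, two phones, and a name that is structured in one record but plain text in another) can sit side by side in one document. The following minimal sketch reads such a document with Python's standard `xml.etree.ElementTree`; the element names `persons`, `person`, `name`, `first`, `last` and `phone`, and the helper functions, are assumptions.

```python
import xml.etree.ElementTree as ET

doc = """
<persons>
  <person><name>John</name><phone>1234</phone></person>
  <person><name>Joe</name></person>
  <person><name>Mary</name><phone>2345</phone><phone>3456</phone></person>
  <person><name><first>John</first><last>Smith</last></name><phone>1234</phone></person>
</persons>
"""
root = ET.fromstring(doc)

def phones(person):
    """Zero, one or many phones: findall returns a list in every case."""
    return [p.text for p in person.findall("phone")]

def display_name(person):
    """Handle both a text-only name and a name with nested parts."""
    name = person.find("name")
    if len(name) == 0:  # no child elements: plain text
        return name.text
    return " ".join(part.text for part in name)

summary = [(display_name(p), phones(p)) for p in root]
print(summary)
```

Code that consumes semistructured data has to branch on shape in this way, which is the price paid for the flexibility.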

Slide 29: Document Type Definitions (DTD)
– Part of the original XML specification
– An XML document may have a DTD
– An XML document is:
  well-formed if its tags are correctly closed
  valid if it has a DTD and conforms to it
– Validation is useful in data exchange

Slide 30: DTD
Goals:
– Define what tags and attributes are allowed
– Define how they are nested
– Define how they are ordered
Superseded by XML Schema
– XML Schema is very complex, so DTDs are still widely used

Slide 31: Very Simple DTD
<!DOCTYPE company [ … ]>

Slide 32: Very Simple DTD
Example of a valid XML document:
123456789 John B432 1234 987654321 Jim B123 ...

Slide 33: DTD: The Content Model
Content model:
– Complex = a regular expression over other elements
– Text-only = #PCDATA
– Empty = EMPTY
– Any = ANY
– Mixed content = (#PCDATA | A | B | C)*

Slide 34: DTD: Regular Expressions
Operator      DTD notation
sequence      A, B
optional      A?
Kleene star   A*
alternation   A | B
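
Since a complex content model is a regular expression over child element names, its meaning can be imitated with an ordinary regex applied to the sequence of child tags. The sketch below does this in Python for a hypothetical content model, person (name, phone*, (email | fax)?). It is not a DTD validator, and the standard `xml.etree.ElementTree` module does not validate against DTDs; the model, element names and helper function are all assumptions for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical model: person (name, phone*, (email | fax)?)
# Each child tag is written followed by one space, so the regex works on whole names.
PERSON_MODEL = re.compile(r"name (phone )*((email|fax) )?")

def children_match(elem, model):
    """True if the ordered list of child tags is accepted by the content model."""
    child_tags = "".join(child.tag + " " for child in elem)
    return model.fullmatch(child_tags) is not None

ok_star = ET.fromstring("<person><name/><phone/><phone/></person>")    # Kleene star
ok_choice = ET.fromstring("<person><name/><email/></person>")          # alternation
ok_minimal = ET.fromstring("<person><name/></person>")                 # optional parts omitted
bad_order = ET.fromstring("<person><phone/><name/></person>")          # sequence violated
bad_both = ET.fromstring("<person><name/><email/><fax/></person>")     # both alternatives

results = [children_match(e, PERSON_MODEL)
           for e in (ok_star, ok_choice, ok_minimal, bad_order, bad_both)]
print(results)
```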

