Lecture 15: Midterm Review


1 Lecture 15: Midterm Review
Friday, October 31, 2003

2 XML Storage
Most often the XML data is small
  E.g. a SOAP message
  Parsed directly into the application (DOM API)
Sometimes XML data is large
  need to store/process it in a database
The XML storage problem:
  How do we choose the schema of the database ?

3 XML Storage
Three solutions:
  Schema derived from DTD
  Storing XML as a graph: “Edge relation”
  Store it as a BLOB
    Simple, boring, inefficient
    Won’t discuss in class

4 Designing a Schema from DTD
Design a relational schema for:
<!DOCTYPE company [
  <!ELEMENT company ((person|product)*)>
  <!ELEMENT person (ssn, name, office?, phone*)>
  <!ELEMENT ssn (#PCDATA)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT office (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
  <!ELEMENT product (pid, name, ((price,availability)|description))>
  <!ELEMENT pid (#PCDATA)>
  <!ELEMENT description (#PCDATA)>
]>

5 Designing a Schema from DTD
First, construct the DTD graph:
[DTD graph: company → person (*), product (*); person → ssn, name, office, phone (*); product → pid, name, price, avail., descr.]
We ignore the order

6 Designing a Schema from DTD
Next, design the relational schema, using common sense.
[DTD graph: company → person (*), product (*); person → ssn, name, office, phone (*); product → pid, name, price, avail., descr.]
Person(ssn, name, office)
Phone(ssn, phone)
Product(pid, name, price, avail., descr.)
Which attributes may be NULL ? (Look at the DTD)
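As a minimal sketch, the three relations can be declared in SQLite from Python. The column types, the keys, the foreign key, and the CHECK constraint are assumptions; the slide only gives relation and attribute names. Nullability is read off the DTD: office? may be missing, and a product has either (price, availability) or description.

```python
import sqlite3

# Assumed DDL: the slide lists only relation names and attributes.
DDL = """
CREATE TABLE Person (
    ssn    TEXT PRIMARY KEY,
    name   TEXT NOT NULL,
    office TEXT              -- office? in the DTD, so it may be NULL
);
CREATE TABLE Phone (
    ssn   TEXT NOT NULL REFERENCES Person(ssn),
    phone TEXT NOT NULL      -- phone* becomes zero or more rows
);
CREATE TABLE Product (
    pid          TEXT PRIMARY KEY,
    name         TEXT NOT NULL,
    price        TEXT,
    availability TEXT,
    description  TEXT,
    -- ((price, availability) | description): exactly one branch is filled in
    CHECK ((price IS NOT NULL AND availability IS NOT NULL AND description IS NULL)
        OR (price IS NULL AND availability IS NULL AND description IS NOT NULL))
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)

# Invented sample rows: a person without an office, a product with a price.
conn.execute("INSERT INTO Person VALUES ('123', 'Ann', NULL)")
conn.execute("INSERT INTO Product VALUES ('p1', 'Gizmo', '9.99', 'yes', NULL)")
```

The CHECK constraint is one possible way to encode the choice in the DTD; leaving all three columns plainly nullable would also match the slide.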

7 Designing a Schema from DTD
What happens to queries:
FOR $x IN /company/product[description]
RETURN <answer> { $x/name, $x/description } </answer>

SELECT Product.name, Product.description
FROM Product
WHERE Product.description IS NOT NULL
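To see the translation run, here is a small sketch that executes the SQL above against an in-memory SQLite table. The sample rows and the column types are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Product table from the derived schema; types and rows are assumptions.
conn.execute("CREATE TABLE Product (pid TEXT, name TEXT, price TEXT, "
             "availability TEXT, description TEXT)")
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?, ?, ?)",
    [
        ("p1", "Gizmo", "9.99", "yes", None),            # price/availability branch
        ("p2", "Widget", None, None, "A small widget"),  # description branch
    ],
)

# The XPath filter [description] becomes a NULL test on one table: no joins.
rows = conn.execute(
    "SELECT Product.name, Product.description "
    "FROM Product WHERE Product.description IS NOT NULL"
).fetchall()
print(rows)
```

Only the product stored with a description is returned, which mirrors what the XQuery filter selects.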

8 Storing XML as a Graph
Sometimes we don’t have a DTD:
  How can we store the XML data ?
Every XML instance is a tree
  Store the edges in an Edge table
  Store the #PCDATA in a Value table

9 Storing XML as a Graph
Can be ANY XML data (don’t know DTD)
[Figure: XML tree with nodes numbered 1 to 11; tags db, book, publisher, title, author, state; text values “Complete Guide to DB2”, “Chamberlin”, “Transaction Processing”, “Bernstein”, “Newcomer”, “Morgan Kaufman”, “CA”]

Edge
Source  Tag     Dest
        db      1
        book    2
        title   3
        author  4
. . .

Value
Source  Val
3       Complete guide . . .
4       Chamberlin
6       . . .
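One way to fill the two tables is to walk the parsed tree and emit one Edge row per element and one Value row per text node. The sketch below uses Python's `xml.etree.ElementTree`; the function name `shred`, the document-order numbering, and the id 0 for the source of the root edge are assumptions, and attributes and mixed content are ignored.

```python
import xml.etree.ElementTree as ET

def shred(xml_text, root_source=0):
    """Return (edges, values) for an XML string.

    edges:  list of (source, tag, dest), one per element
    values: list of (source, val), one per element with non-blank text
    Node ids are assigned in document order starting at root_source + 1;
    this numbering is an assumption, any unique ids would do.
    """
    edges, values = [], []
    next_id = root_source

    def visit(elem, parent_id):
        nonlocal next_id
        next_id += 1
        node_id = next_id
        edges.append((parent_id, elem.tag, node_id))
        text = (elem.text or "").strip()
        if text:
            values.append((node_id, text))
        for child in elem:
            visit(child, node_id)

    visit(ET.fromstring(xml_text), root_source)
    return edges, values

doc = """<db>
  <book><title>Complete Guide to DB2</title><author>Chamberlin</author></book>
</db>"""
edges, values = shred(doc)
print(edges)
print(values)
```

No DTD is consulted anywhere, which is the point of the Edge approach: the same two lists come out of any well-formed document.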

10 Storing XML as a Graph
What happens to queries:
FOR $x IN /db/book[author/text()="Chamberlin"]
RETURN $x/title
[Figure: the query as a tree pattern. Edge variables xdb, xbook, xauthor, xtitle match the tags db, book, author, title; vauthor matches “Chamberlin”; vtitle is the return value]

11 Storing XML as a Graph
What happens to queries: A 6-way join !!!
SELECT vtitle.value
FROM Edge xdb, Edge xbook, Edge xauthor, Edge xtitle, Value vauthor, Value vtitle
WHERE xdb.source = and xdb.tag = 'db'
  and xdb.dest = xbook.source and xbook.tag = 'book'
  and xbook.dest = xauthor.source and xauthor.tag = 'author'
  and xbook.dest = xtitle.source and xtitle.tag = 'title'
  and xauthor.dest = vauthor.source and vauthor.value = 'Chamberlin'
  and xtitle.dest = vtitle.source
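The join can be tried out on a toy instance. This sketch loads a few Edge and Value rows into SQLite and runs the query from the slide. The sample rows, the column types, and the id 0 for the source of the root edge are assumptions; the transcript does not show which value xdb.source is compared against.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Edge  (source INTEGER, tag TEXT, dest INTEGER);
CREATE TABLE Value (source INTEGER, value TEXT);
""")

# Assumed sample data: two books under db; the root edge starts at node 0.
conn.executemany("INSERT INTO Edge VALUES (?, ?, ?)", [
    (0, "db", 1),
    (1, "book", 2), (2, "title", 3), (2, "author", 4),
    (1, "book", 5), (5, "title", 6), (5, "author", 7), (5, "author", 8),
])
conn.executemany("INSERT INTO Value VALUES (?, ?)", [
    (3, "Complete Guide to DB2"), (4, "Chamberlin"),
    (6, "Transaction Processing"), (7, "Bernstein"), (8, "Newcomer"),
])

ROOT = 0  # assumed id for the source of the root edge
QUERY = """
SELECT vtitle.value
FROM Edge xdb, Edge xbook, Edge xauthor, Edge xtitle,
     Value vauthor, Value vtitle
WHERE xdb.source = ? AND xdb.tag = 'db'
  AND xdb.dest = xbook.source AND xbook.tag = 'book'
  AND xbook.dest = xauthor.source AND xauthor.tag = 'author'
  AND xbook.dest = xtitle.source AND xtitle.tag = 'title'
  AND xauthor.dest = vauthor.source AND vauthor.value = 'Chamberlin'
  AND xtitle.dest = vtitle.source
"""
titles = [row[0] for row in conn.execute(QUERY, (ROOT,))]
print(titles)
```

Each step of the path /db/book/author and /db/book/title costs one Edge alias, and each text comparison or returned text costs one Value alias, which is where the six tables come from.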

12 Storing XML as a Graph
Edge relation summary:
Same relational schema for every XML document:
  Edge(Source, Tag, Dest)
  Value(Source, Val)
Generic: works for every XML instance
But inefficient:
  Repeat tags multiple times
  Need many joins to reconstruct data

13 Other XML Topics
Name spaces
XML API:
  DOM = “Document Object Model”
XML languages:
  XSLT
XML Schema
Xlink, XPointer
SOAP
Available from (but don’t spend rest of your life reading those standards !)

14 Research on XML Data Management at UW
Processing:
  Query languages (XML-QL, a precursor of XQuery)
  Tukwila
  XML updates
XML publishing/storage
  SilkRoute: silkroute.sourceforge.net
  STORED
XML tools
  XML Compressor: Xmill
  XML Toolkit (xsort, xagg, xgrep, xtransf, etc): xmltk.sourceforge.net
Theory:
  Typechecking
  Xpath, Xquery containment

15 The Midterm
Four questions:
  SQL
  E/R
  Schema design (BCNF/3NF)
  XML

16 The Midterm
Open book exam (books/notebooks/lectures)
But no computers
No retakes
Your score: Max(midterm-score, 100 - 1.2(100 - final-score))
No retakes on final: you *must* take it on Dec. 12
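The scoring rule can be checked with a few lines of arithmetic. This sketch reads the formula as Max(midterm-score, 100 - 1.2 × (100 - final-score)), assuming the closing parenthesis belongs at the very end; the function name `midterm_credit` is made up.

```python
def midterm_credit(midterm_score, final_score):
    """Max(midterm-score, 100 - 1.2 * (100 - final-score)), as read from the slide."""
    return max(midterm_score, 100 - 1.2 * (100 - final_score))

# A strong final can lift a weak midterm; a weak final never lowers it.
print(midterm_credit(60, 90))
print(midterm_credit(80, 50))
```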

17 1. SQL
Selection/project/join
  Understand duplicates well
Aggregate queries
  avoid nested queries when a GROUP BY suffices
Nested queries
  More difficult: ANY, ALL, NOT IN
Updates, table creations, views

18 2. E/R Diagrams
One/many vs. many/many relationships
Inheritance
Translation to relations
  Remember: no table for one/many !

19 3. Schema Design
What does AB → CD mean ?
Compute {AB}+
Compute all keys
Decompose in BCNF or 3NF
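Computing {AB}+ is a small fixpoint loop: keep adding the right-hand side of every dependency whose left-hand side is already covered. The sketch below assumes a hypothetical relation R(A, B, C, D) with the single dependency AB → CD, and represents each dependency as a pair of attribute strings.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the dependency if its left side is covered and it adds something.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Hypothetical example: R(A, B, C, D) with AB -> CD.
schema = set("ABCD")
fds = [("AB", "CD")]

print(sorted(closure("AB", fds)))     # every attribute is reached
print(closure("AB", fds) == schema)   # so AB is a superkey of R
print(sorted(closure("A", fds)))      # A alone determines nothing new
```

A set X is a key when its closure is the whole schema and no proper subset of X has that property, so the same function is the building block for "compute all keys".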

20 4. XML
XML:
  Basic syntax: elements + attributes
  DTDs: elements only
  The tree model
  Canonical XML view of a relation (<row>...)
XPath
XQuery
Basic principles in publishing/storing data

21 Final Thoughts
Open book
  But read the book before the exam
Some question(s) may be hard(er)
  Answer first the questions that are easier
The answers should not be very complex

22 Exercises
AB → C, C → D, D → B, D → E
Decompose in BCNF
General strategy: find X s.t.: X ≠ X+ ≠ all attributes

23 BCNF
AB → C, C → D, D → B, D → E
Try AB+ = ABCDE nope…
Try C+ = CDBE yep… decompose CDBE, CA
Continue in CDBE: try D+ = DBE yep… decompose DBE, DC
Continue in DBE: D+ = DBE, all of DBE, so DBE is already in BCNF; it can still be split into DE, DB
Answer: CA, DC, DE, DB (or CA, DC, DBE)
Notice: AB is a key, but A and B are separated. Clearly 3NF differs
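The same search can be sketched in code: look for a set X with X ≠ X+ ≠ all attributes, split on it, and recurse. This assumes the exercise's dependencies are AB → C, C → D, D → B, D → E; the function names are made up, and the subset enumeration is exponential, so it only suits small exercises. Note that the sketch stops at DBE, because D+ already covers all of DBE; the slide's further split into DE, DB is also in BCNF.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def find_violation(rel, fds):
    """Return (X, X+ restricted to rel) with X != X+ != rel, or None."""
    rel = set(rel)
    for size in range(1, len(rel)):
        for combo in combinations(sorted(rel), size):
            x = set(combo)
            x_plus = closure(x, fds) & rel
            if x != x_plus and x_plus != rel:
                return x, x_plus
    return None

def bcnf(rel, fds):
    """Decompose rel into BCNF relations, returned as sorted attribute strings."""
    rel = set(rel)
    violation = find_violation(rel, fds)
    if violation is None:
        return ["".join(sorted(rel))]
    x, x_plus = violation
    # Split into X+ and X together with the attributes outside X+, then recurse.
    return bcnf(x_plus, fds) + bcnf(x | (rel - x_plus), fds)

fds = [("AB", "C"), ("C", "D"), ("D", "B"), ("D", "E")]
result = bcnf("ABCDE", fds)
print(result)
```

The first violation found is C, as on the slide, giving CDBE and CA; inside CDBE the violation is D, giving DBE and DC.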

24 3NF
AB → C, C → D, D → B, D → E
Minimal keys: AB, AC, AD
Try C+ = CDBE yes, BUT: D, B part of keys, so decompose on C → E only: CE, CABD
Continue in CABD
Try A+ = A nope…
Try B+ = B nope…
Try AB+ = ABCD nope…
Try C+ = CDB yes, but B, D part of keys
Try D+ = DB again B part of a key
Try others… nope.
Answer: CE, CABD. Notice: DE, DABC also OK
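The minimal keys and the 3NF condition can be checked by brute force. The sketch below assumes the dependencies AB → C, C → D, D → B, D → E and tests, for every attribute set X that is not a superkey, that everything X determines is part of some key. The function names are made up, and the enumeration is exponential, which is acceptable for five attributes.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def subsets(rel):
    """All non-empty subsets of rel, smallest first."""
    for size in range(1, len(rel) + 1):
        for combo in combinations(sorted(rel), size):
            yield set(combo)

def minimal_keys(rel, fds):
    """Minimal subsets of rel whose closure contains all of rel."""
    rel = set(rel)
    keys = []
    for x in subsets(rel):
        if any(k <= x for k in keys):
            continue  # contains a smaller key, so not minimal
        if rel <= closure(x, fds):
            keys.append(x)
    return keys

def is_3nf(rel, fds):
    """True if every non-superkey X determines only prime attributes of rel."""
    rel = set(rel)
    prime = set().union(*minimal_keys(rel, fds))
    for x in subsets(rel):
        x_plus = closure(x, fds) & rel
        if x_plus != rel and not (x_plus - x) <= prime:
            return False
    return True

fds = [("AB", "C"), ("C", "D"), ("D", "B"), ("D", "E")]
keys = ["".join(sorted(k)) for k in minimal_keys("ABCDE", fds)]
print(keys)
print(is_3nf("ABCDE", fds), is_3nf("ABCD", fds), is_3nf("CE", fds))
```

In ABCD every attribute belongs to some key, so no dependency can violate 3NF there; E is the only non-prime attribute, which is why it is the one split off.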

