1 XML Algebra Comparison between: XPERANTO NIAGARA.



2 Part I
NIAGARA (XML Query Optimization)
 - XML Algebra
 - Data Model
 - Operator
 - Query Plan
 - Equivalent Rules
XPERANTO (XML Query to SQL)
 - XML Algebra
 - Data Model
 - Operator
 - Query Plan
 - Composition Rules
 - Translation Example

3 Example of Telephone Bill
<!DOCTYPE invoice [
  <!ELEMENT invoice (account_number, bill_period, carrier+, itemized_call*, total)>
  <!ATTLIST itemized_call
    no ID #REQUIRED
    date CDATA #REQUIRED
    number_called CDATA #REQUIRED
    time CDATA #REQUIRED
    rate (NIGHT|DAY) #REQUIRED
    min CDATA #REQUIRED
    amount CDATA #REQUIRED>
]>
Sample document values: bill_period "Jun 9 - Jul 8, 2000", carrier "Sprint", total "$0.35".

4 Example XQuery
Count the number of itemized_calls in calling area 973, grouped by the calling rate.
User XQuery: { FOR $rate IN LET $itemized_call := WHERE LIKE ‘973%’ RETURN $rate count($itemized_call) }

5 NIAGARA
Title: Following the Paths of XML Data: An Algebraic Framework for XML Query Evaluation
By: Leonidas Galanis, Efstratios Viglas, David J. DeWitt, Jeffrey F. Naughton, and David Maier.

6 Goals
 - Be independent of schema information
 - Query on both structure and content
 - Generate simple, flexible, yet powerful algebraic expressions
 - Allow re-use of traditional optimization techniques

7 Data Model
A collection of bags of vertices. The vertices in a bag have no order.
Example: [Root “invoice.xml”, invoice, invoice.account_number]
[Figure: the vertices of the example bag, with labels Root invoice.xml, invoice, invoice.account_number, invoice-element-content, carrier-element-content.]

8 Data Model
Bag elements are reachable by path expressions. A path expression consists of two parts:
 - An entry point
 - A relative forward part
Example: account_number:invoice (entry point invoice, relative forward part account_number)

9 Operators
Source S, Follow Φ, Select σ, Join, Rename ρ, Expose ε, Vertex υ, Group γ, Union ∪, Intersection ∩, Difference -, Cartesian Product ×.

10 Source Operator S
Input: a list of documents
Output: a collection of singleton bags
Examples:
 - S(*): all known XML documents
 - S(invoice*.xml): all XML documents whose filename matches “invoice*.xml”
 - S(*, schema.dtd): all known XML documents that conform to schema.dtd

11 Follow operator Φ
Input: a path expression in entry point notation
Functionality: extracts the vertices reachable by the path expression
Output: a new bag that consists of the extracted vertex plus all the contents of the original bag (in the case of unnesting follow)

12 Follow operator (Example*)
Φμ(carrier:invoice)
Input: {[Root invoice.xml, invoice]}
Output: {[Root invoice.xml, invoice, invoice.carrier]}
*Unnesting Follow
[Figure: the input bag with the invoice vertex and its element content; the output bag adds the invoice.carrier vertex and its element content.]
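The unnesting follow can be sketched in Python as below. This is only an illustration under stated assumptions, not NIAGARA's implementation: a bag is modeled as a dict from entry-point name to an ElementTree element, the helper names `source` and `follow` are hypothetical, and the sample document (including the second carrier, MCI) is made up.

```python
import xml.etree.ElementTree as ET

def source(xml_text):
    """S: wrap one parsed document in a collection holding a single bag."""
    root = ET.fromstring(xml_text)
    return [{root.tag: root}]

def follow(collection, child, entry):
    """Unnesting follow for the path expression child:entry.

    Emits one output bag per child vertex reached from the vertex bound
    to `entry`, keeping the contents of the original bag alongside it.
    """
    out = []
    for bag in collection:
        for vertex in bag[entry].findall(child):
            new_bag = dict(bag)                    # copy the original bag
            new_bag[f"{entry}.{child}"] = vertex   # add the extracted vertex
            out.append(new_bag)
    return out

# Hypothetical sample document with two carriers.
doc = "<invoice><carrier>Sprint</carrier><carrier>MCI</carrier></invoice>"
bags = follow(source(doc), "carrier", "invoice")
carriers = [bag["invoice.carrier"].text for bag in bags]
print(carriers)
```

Each output bag still contains the `invoice` entry, which mirrors the slide's output bag [Root invoice.xml, invoice, invoice.carrier].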

13 Select operator σ
Input: a set of bags
Functionality: filters the bags of a collection using a predicate
Output: a set of bags that conform to the predicate
Predicate: logical operators (∧, ∨, ¬) or simple qualifications (<, >, ≤, ≥, =, ≠)

14 Select operator (Example)
σ invoice.carrier = Sprint
Input: {[Root invoice.xml, invoice], [Root invoice.xml, invoice], …}
Output: {[Root invoice.xml, invoice], …}
[Figure: three input bags, each with an invoice vertex and its element content; only the bags whose carrier is Sprint remain in the output.]

15 Join operator
Input: two collections of bags
Functionality: joins the two collections based on a predicate
Output: the concatenation of the pairs of bags that satisfy the predicate

16 Join operator (Example)
Join predicate: account_number:invoice = number:customer
Inputs: {[Root invoice.xml, invoice]} and {[Root customer.xml, customer]}
Output: {[Root invoice.xml, invoice, Root customer.xml, customer]}
[Figure: the invoice bag and the customer bag, each with its element content, concatenated into one output bag.]
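A minimal Python sketch of this join follows. It assumes bags are plain dicts keyed by entry-point name and that the two inputs use disjoint keys. The helper name `join` and all sample values are hypothetical.

```python
def join(left, right, predicate):
    """Concatenate every (left bag, right bag) pair that satisfies predicate."""
    return [{**l, **r} for l in left for r in right if predicate(l, r)]

# Hypothetical sample collections; the values are made up for illustration.
invoices = [
    {"invoice": "inv-A", "invoice.account_number": "1001"},
    {"invoice": "inv-B", "invoice.account_number": "1002"},
]
customers = [
    {"customer": "cust-X", "customer.number": "1002"},
    {"customer": "cust-Y", "customer.number": "1003"},
]

# account_number:invoice = number:customer
joined = join(
    invoices,
    customers,
    lambda l, r: l["invoice.account_number"] == r["customer.number"],
)
print(joined)
```

The nested-loop form is chosen only for brevity. The algebra itself does not prescribe a join algorithm.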

17 Expose operator ε
Input: a list of path expressions of the vertices to be exposed
Output: a set of bags that contain the vertices in the parameter list, in the same order

18 Expose operator (Example)
ε(bill_period, carrier)
Input: {[Root invoice.xml, invoice, invoice.carrier, invoice.bill_period]}
Output: {[Root invoice.xml, invoice.bill_period, invoice.carrier]}
[Figure: the input bag with the invoice, invoice.carrier and invoice.bill_period vertices and their element content; the output bag keeps only bill_period and carrier, in that order.]

19 Vertex operator υ
Creates the actual XML vertex that will encompass everything created by an expose operator.
Example: υ(Customer_invoice)[ε(υ(account)[invoice.account_number], υ(inv_total)[invoice.total])]
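The Customer_invoice example can be approximated in Python as follows. This is a sketch under assumptions: bags are dicts of plain string values, the helpers `expose`, `vertex` and `leaf` are hypothetical names, and the account number 1001 is invented (the total $0.35 comes from the telephone bill example).

```python
import xml.etree.ElementTree as ET

def expose(collection, paths):
    """Expose: keep only the listed entries of each bag, in the order given."""
    return [{p: bag[p] for p in paths} for bag in collection]

def leaf(tag, text):
    """Build an element that holds a single text value."""
    elem = ET.Element(tag)
    elem.text = text
    return elem

def vertex(tag, children):
    """Vertex: build a new element named `tag` around the given children."""
    elem = ET.Element(tag)
    elem.extend(children)
    return elem

# Hypothetical bag; only the total is taken from the example bill.
bag = {
    "invoice.bill_period": "Jun 9 - Jul 8, 2000",
    "invoice.account_number": "1001",
    "invoice.total": "$0.35",
}
exposed = expose([bag], ["invoice.account_number", "invoice.total"])
result = vertex(
    "Customer_invoice",
    [
        leaf("account", exposed[0]["invoice.account_number"]),
        leaf("inv_total", exposed[0]["invoice.total"]),
    ],
)
xml_text = ET.tostring(result, encoding="unicode")
print(xml_text)
```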

20 Other operators
Group γ: used for arbitrary grouping of elements based on their values. Aggregate functions (e.g. average) can be used with the group operator.
Rename ρ: changes the entry point annotation of the elements of a bag. Example: ρ(invoice.bill_period, date)

21 Example XQuery
Count the number of itemized_calls in calling area 973, grouped by the calling rate.
User XQuery: { FOR $rate IN LET $itemized_call := WHERE LIKE ‘973%’ RETURN $rate count($itemized_call) }

22 Query Plan: Algebra
υ(summary)[
  ε(υ(rate)[rate] υ(number_of_calls)[number]) [
    ρ(rate:invoice.itemized_call, rate), ρ(count(invoice.itemized_call), number) [
      γ(rate:invoice.itemized_call, count(invoice.itemized_call)) [
        σ number_called:invoice.itemized_call ► “973%” [
          Φμ(invoice.itemized_call) [S(invoice.xml)]]]]]]
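Read from the innermost operator outwards, the plan is a pipeline: source, follow, select, group, rename/expose, vertex. The Python sketch below walks through those stages on a made-up document. It is an illustration only: the sample calls and phone numbers are hypothetical, the prefix test stands in for the LIKE-style match, and placing rate and number_of_calls directly under summary is an assumption about the output shape.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data; only the attribute names follow the DTD.
INVOICE = """
<invoice>
  <itemized_call no="1" number_called="973-555-0101" rate="NIGHT"/>
  <itemized_call no="2" number_called="212-555-0102" rate="DAY"/>
  <itemized_call no="3" number_called="973-555-0103" rate="NIGHT"/>
  <itemized_call no="4" number_called="973-555-0104" rate="DAY"/>
</invoice>
"""

def run_plan(xml_text):
    root = ET.fromstring(xml_text)                 # source: load the document
    calls = root.findall("itemized_call")          # follow: invoice.itemized_call
    calls = [c for c in calls                      # select: number_called starts with 973
             if c.get("number_called").startswith("973")]
    counts = Counter(c.get("rate") for c in calls) # group: count calls per rate
    summary = ET.Element("summary")                # vertex: the summary element
    for rate, number in sorted(counts.items()):    # rename/expose: rate, number
        ET.SubElement(summary, "rate").text = rate
        ET.SubElement(summary, "number_of_calls").text = str(number)
    return summary

summary = run_plan(INVOICE)
values = [child.text for child in summary]
print(ET.tostring(summary, encoding="unicode"))
```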

23 Equivalent Rules
14 equivalent rules so far.
Definition of auxiliary operators for the equivalences:
 - A > B: path expression A is a prefix of B
 - ┴: the null path expression
 - A∏B: the greatest common prefix of path expressions A and B
 - A∏B: the common prefix of path expressions A and B

24 Equivalent Rules Examples
Rule applications: Follow ordering
Φμ(A)[Φμ(B)] = Φμ(B)[Φμ(A)] iff C < A, C < B: C = A∏B, or A∏B = ┴.
[Figure: path trees for A and B, with and without a common prefix C.]

25 Equivalent Rules Examples
Rule applications: Join commutativity and associativity
(A ⋈ B) ⋈ C = (C ⋈ B) ⋈ A

26 Equivalent Rules Examples
Rule applications: Selection distribution and interchangeability
σc[A ⋈ B] = σc1[A] ⋈ σc2[B], where c is a conjunction of the conditions c1 and c2, each of which refers to only one of the join inputs
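The selection distribution rule can be checked on a small example. The Python sketch below assumes bags are dicts and uses hypothetical helpers `select` and `join` with made-up data. It compares selecting after the join with selecting on each input before the join.

```python
def select(collection, predicate):
    """Keep the bags that satisfy the predicate."""
    return [bag for bag in collection if predicate(bag)]

def join(left, right, predicate):
    """Concatenate every pair of bags that satisfies the join predicate."""
    return [{**l, **r} for l in left for r in right if predicate(l, r)]

# Hypothetical inputs A and B.
A = [{"a.id": 1, "a.rate": "DAY"}, {"a.id": 2, "a.rate": "NIGHT"}]
B = [{"b.id": 1, "b.min": 5}, {"b.id": 2, "b.min": 30}, {"b.id": 2, "b.min": 3}]

def on(l, r):          # join predicate
    return l["a.id"] == r["b.id"]

def c1(bag):           # condition that refers only to A
    return bag["a.rate"] == "NIGHT"

def c2(bag):           # condition that refers only to B
    return bag["b.min"] > 4

def c(bag):            # c is the conjunction of c1 and c2
    return c1(bag) and c2(bag)

late = select(join(A, B, on), c)                  # select after the join
early = join(select(A, c1), select(B, c2), on)    # select pushed below the join
print(late == early, late)
```

Pushing the selections down shrinks both join inputs, which is the usual motivation for applying this rule during optimization.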

27 Equivalent Rules Examples Rule applications Elimination of unused bag elements ε(P)(J[A]) = J(ε(P[A])) iff J uses only elements exposed by P

28 XPERANTO
Goal: XQuery → SQL
References:
 - J. Shanmugasundaram, et al. Querying XML Views of Relational Data, VLDB
 - J. Shanmugasundaram, et al. Efficiently Publishing Relational Data as XML Documents, VLDB
 - J. Shanmugasundaram, Ph.D. Dissertation. July, 2001.

29 Query Processing Architecture
[Figure: the XPERANTO Query Engine sits between the XQuery user and the RDBMS. Labeled components: XQuery Parser, Query Rewrite & View Composition, Computation Pushdown, Tagger Runtime. Labeled flows: XQuery, XQGM, SQL Query, Tagger Graph, Tuples, Query Results. Also shown: User XML View, RDB.]

30 Data Model
Tables of a list of XML fragments.
[Figure: an operator chain over Table: Carrier ($invoice_id, $carrier): Select: $invoice_id = $id; Project: $carrier_entry = <carrier>$carrier</carrier>; Groupby: $carrier = aggXMLFrags($carrier_entry), with output column $carriers. Each operator is annotated with the columns of its output table.]

31 Operators
Table, Project, Select, Join, Groupby, Orderby, Union, Unnest, View, Function
 - Select, Project, Join, Groupby, Orderby and Union have the same semantics as their relational counterparts.
 - Project: to invoke the various functions defined
 - Table/View: to refer to a relational table or XML view
 - Unnest: to unnest an XML list
 - Function: to invoke XQuery valued functions
 - Groupby: to create XML fragments

32 XML Functions & Operators
No. | XML Function | Description | Operators
1 | cr8Elem(Tag, Atts, Clist) | Creates an element with tag name Tag, attribute list Atts, and contents Clist | Project
2 | cr8AttList(A1, ..., An) | Creates a list of attributes from the attributes passed as parameters | Project
3 | cr8Att(Name, Val) | Creates an attribute with name Name and value Val | Project
4 | cr8XMLFragList(C1, ..., Cn) | Creates an XML fragment list from the content parameters | Project
5 | aggXMLFrags(C) | Aggregate XML function that creates an XML fragment list | Groupby
6 | getTagName(Elem) | Returns the element name of Elem | Project, Select
7 | getAttributes(Elem) | Returns the list of attributes of Elem | Project, Select
8 | getContents(Elem) | Returns the XML fragment list of contents of Elem | Project, Select
9 | getAttName(Att) | Returns the name of attribute Att | Project, Select
10 | getAttValue(Att) | Returns the value of attribute Att | Project, Select
11 | isElement(E) | Returns true if E is an element, returns false otherwise | Select
12 | isText(T) | Returns true if T is text, returns false otherwise | Select
13 | Unnest(List) | Superscalar function that unnests a list | Unnest
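To make the constructor and accessor functions concrete, here is a small Python model of a few of them. It is a sketch, not XPERANTO code: the `Elem` and `Att` classes, the tuple encoding of lists, the `to_xml` serializer (which does no escaping) and the `entry` tag with its `rate` attribute are all assumptions made for illustration. Only the function names come from the table.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Att:
    name: str
    val: str

@dataclass(frozen=True)
class Elem:
    tag: str
    atts: tuple
    clist: tuple

# Constructors (used in Project, or Groupby for the aggregate).
def cr8Att(name, val): return Att(name, val)
def cr8AttList(*atts): return tuple(atts)
def cr8XMLFragList(*contents): return tuple(contents)
def cr8Elem(tag, atts, clist): return Elem(tag, atts, clist)
def aggXMLFrags(fragments): return tuple(fragments)

# Accessors and tests (used in Project and Select).
def getTagName(elem): return elem.tag
def getAttributes(elem): return elem.atts
def getContents(elem): return elem.clist
def isElement(e): return isinstance(e, Elem)
def isText(t): return isinstance(t, str)

def to_xml(node):
    """Serialize a fragment; hypothetical helper without any escaping."""
    if isText(node):
        return node
    atts = "".join(f' {a.name}="{a.val}"' for a in node.atts)
    body = "".join(to_xml(c) for c in node.clist)
    return f"<{node.tag}{atts}>{body}</{node.tag}>"

# One entry per rate, then aggregate the entries into a summary element.
entries = [
    cr8Elem("entry", cr8AttList(cr8Att("rate", rate)), cr8XMLFragList(str(n)))
    for rate, n in [("DAY", 20), ("NIGHT", 23)]
]
result = cr8Elem("summary", cr8AttList(), aggXMLFrags(entries))
xml_text = to_xml(result)
print(xml_text)
```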

33 Operators - Examples
Project: $elems = getContents($invoice)
Groupby: $count = count($itemized_call)
[Figure: sample tables for $invoice, $elems, $itemized_call and $count; the example count is 3.]

34 Operators - Examples
Groupby: $entries = aggXMLFrags($entry)
Project: $result = cr8Elem(summary, Att, $entries)
[Figure: sample tables; $entry holds the fragments for DAY 20 and NIGHT 23, $entries aggregates them into one fragment list, and $result wraps that list in a summary element.]

35 Operator - Examples
Unnest: $elem = unnest($elems)
[Figure: $elems holds one list with the fragments DAY 20 and NIGHT 23; $elem holds one row per fragment.]

36 XML Query
User XQuery: { FOR $rate IN distinct(document(“invoice ate) LET $itemized_call := document(“invoice”)/invoi ate] WHERE d LIKE ‘973%’ RETURN $rate count($itemized_call) }
XQGM operators shown in the figure:
 - View: $doc = document(“invoice.xml”)
 - Navigate: produces $rate
 - Select: distinct($rate)
 - Navigate: produces $itemized_call, with $irate and $number
 - Select: $rate = $irate
 - Selection: $number LIKE ‘973%’
 - Groupby: $count = count($itemized_call)
 - Join (Correlated): $rate, $count
 - Project: $entry = $rate $count
 - Groupby: $entries = aggXMLFrags($entry)
 - Project: $result = $entries

37 Navigation in XQGM
Navigate: $account_number = $invoice/account_number
is expressed in XQGM as:
 - Project: $elems = getContents($invoice)
 - Unnest: $elem = unnest($elems)
 - Select: getTagName($elem) = “account_number” (output $account_number)
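The three-operator expansion of a navigation step can be traced in Python. This is a minimal sketch that assumes rows are dicts of column values and uses ElementTree elements as XML fragments; the helper names and the account number 1001 are hypothetical.

```python
import xml.etree.ElementTree as ET

def get_contents(elem):
    """getContents: the list of child fragments of an element."""
    return list(elem)

def get_tag_name(elem):
    """getTagName: the element name."""
    return elem.tag

def unnest(rows, column, new_column):
    """Unnest: one output row per item of the list stored in `column`."""
    return [{**row, new_column: item} for row in rows for item in row[column]]

# Hypothetical invoice; bill period and total follow the example bill.
invoice = ET.fromstring(
    "<invoice>"
    "<account_number>1001</account_number>"
    "<bill_period>Jun 9 - Jul 8, 2000</bill_period>"
    "<total>$0.35</total>"
    "</invoice>"
)

rows = [{"invoice": invoice}]
# Project: $elems = getContents($invoice)
rows = [{**r, "elems": get_contents(r["invoice"])} for r in rows]
# Unnest: $elem = unnest($elems)
rows = unnest(rows, "elems", "elem")
# Select: getTagName($elem) = "account_number"
rows = [r for r in rows if get_tag_name(r["elem"]) == "account_number"]

account_numbers = [r["elem"].text for r in rows]
direct = [e.text for e in invoice.findall("account_number")]
print(account_numbers, direct)
```

The final comparison with `findall` shows that the operator chain returns the same vertices as the direct path step `$invoice/account_number`.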

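38 Default XML View
Each relational table is exposed as an XML element whose row children hold the tuples (e.g. view(“default”)/invoice/row).
Tables:
 - invoice(id, account_number, bill_period, total)
 - carrier(invoice_id, carrier)
 - itemized_call(invoice_id, no, date, number_called, time, rate, min, amount)
Sample data: an invoice row with bill_period Jun 9 – Jun 8, 2000 and total $0.35; a carrier row (1, Sprint); three itemized_call rows with rates NIGHT, DAY, NIGHT.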

39 User Defined XML View
The same relational tables (Invoice, Carrier, Itemized_call) are presented as a nested invoice document like the telephone bill example, with values such as Jun 9 - Jul 8, 2000, Sprint and $0.35.

40 User Defined XML View Cont.
Create view invoice as (
  FOR $invoice IN view(“default”)/invoice/row
  RETURN $invoice/account_number $invoice/bill_period
    FOR $carrier in view(“default”)/carrier/row
    WHERE $carrier/invoice_id = $invoice/id
    RETURN $carrier
    FOR $itemized_call in view(“default”)/itemized_call/row
    WHERE $itemized_call/invoice_id = $invoice/id
    RETURN SORTBY $invoice/total
)

41 XML View XQGM
XQGM for the invoice view definition (Create view invoice as ...). Operators shown in the figure:
 - Table: Invoice ($id, $account_number, $bill_period, $total)
 - Table: Carrier ($invoice_id, $carrier)
 - Select: $invoice_id = $id
 - Project: $carrier_entry = $carrier
 - Groupby: $carrier = aggXMLFrags($carrier_entry), with output column $carriers
 - Subquery producing $items
 - Join (Correlated): $account_number, $bill_period, $total, $carriers, $items
 - Project: $doc = $account_number $bill_period $carriers $itemized_calls $total

42 View Composition
User Query XQGM + User View XQGM: cancel out the Navigation operators by using the composition rules.
View side:
cr8Elem(invoice, cr8AttList(), cr8XMLFragList(
  cr8Elem(account_number, cr8AttList(), cr8XMLFragList($account_number)),
  cr8Elem(bill_period, cr8AttList(), cr8XMLFragList($bill_period)),
  $carriers, $items,
  cr8Elem(total, cr8AttList(), cr8XMLFragList($total))))
Query side (navigation over $invoice):
 - Project: $elems = getContents($invoice)
 - Unnest: $elem = unnest($elems)
 - Select: getTagName($elem) = “account_number” (output $account_number)

43 12 Composition Rules
No. | Function | Composes with | Reduction
1 | getTagName | cr8Elem(Tag, Atts, Clist) | Tag
2 | getAttributes | cr8Elem(Tag, Atts, Clist) | Atts
3 | getContents | cr8Elem(Tag, Atts, Clist) | Clist
4 | getAttName | cr8Att(Name, Val) | Name
5 | getAttValue | cr8Att(Name, Val) | Val
6 | isElement | cr8Elem(Tag, Atts, Clist) | True
7 | isElement | Other than cr8Elem | False
8 | isText | PCDATA | True
9 | isText | Other than PCDATA | False
10 | Unnest | aggXMLFrags(C) | C
11 | Unnest | cr8XMLFragList(C1, ..., Cn) | C1 ∪ ... ∪ Cn
12 | Unnest | cr8AttList(A1, ..., An) | A1 ∪ ... ∪ An
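The composition rules behave like term rewriting: an accessor applied directly to a constructor reduces to one of the constructor's arguments. The Python sketch below encodes expressions as nested tuples and applies rules 1 to 6 and 10 to 12 bottom-up. The tuple encoding, the function name `reduce_term` and the `"union"` marker are assumptions made for illustration; rules 7 to 9 are left out because they would need type information about non-constructor inputs.

```python
def reduce_term(term):
    """Apply accessor-over-constructor reductions to a nested tuple term.

    A term is either an atom (e.g. the string "$total") or a tuple
    (operator_name, argument, ...).
    """
    if not isinstance(term, tuple):
        return term
    op, *args = term
    args = [reduce_term(a) for a in args]          # reduce the arguments first
    inner = args[0] if args and isinstance(args[0], tuple) else None
    if inner is not None:
        if inner[0] == "cr8Elem":
            _, tag, atts, clist = inner
            if op == "getTagName":
                return tag                          # rule 1
            if op == "getAttributes":
                return atts                         # rule 2
            if op == "getContents":
                return clist                        # rule 3
            if op == "isElement":
                return True                         # rule 6
        if inner[0] == "cr8Att":
            _, name, val = inner
            if op == "getAttName":
                return name                         # rule 4
            if op == "getAttValue":
                return val                          # rule 5
        if op == "unnest" and inner[0] == "aggXMLFrags":
            return inner[1]                         # rule 10
        if op == "unnest" and inner[0] in ("cr8XMLFragList", "cr8AttList"):
            return ("union", *inner[1:])            # rules 11 and 12
    return (op, *args)                              # nothing to reduce

account = ("cr8Elem", "account_number", ("cr8AttList",),
           ("cr8XMLFragList", "$account_number"))
invoice = ("cr8Elem", "invoice", ("cr8AttList",),
           ("cr8XMLFragList", account, "$carriers"))

tag = reduce_term(("getTagName", account))
unnested = reduce_term(("unnest", ("getContents", invoice)))
print(tag, unnested)
```

In the second call, getContents over cr8Elem first reduces to the fragment list, and unnest over that list then reduces to a union of its members, which is how a navigation step over a view gets cancelled out.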

44 View Composition Example
[Figure: the navigation operators of the user query (Project: $elems = getContents($invoice); Unnest: $elem = unnest($elems); Select: getTagName($elem) = “account_number”) stacked on the view's top operator (Project: $invoice = $account_number $bill_period $carriers $itemized_calls $total, above Join (Correlated) over $account_number, $bill_period, $total, $carriers, $items). After composition, $account_number is read directly from the Join (Correlated) output.]

45 Computation Pushdown
Goal: XQGM → SQLs + Tagger Graph
Step 1: Query Decorrelation
 - Correlated Join → Outer Unions
 - Reference: P. Seshadri, et al. “Complex Query Decorrelation”, ICDE
Step 2: Tagger Pull-Up
 - XQGM → Tagger Run-Time Graph
 - Use “Sorted Outer Union”
 - Reference: J. Shanmugasundaram, et al. “Efficiently Publishing Relational Data as XML Documents”.
 - Separation of SQL and Tagger operations
 - Semantically equivalent fragment by pattern

46 Comparison
Aspect | XPERANTO | NIAGARA
Goal | XQuery → SQL | XQuery → Algebra
Algebra | XQGM and Tagger Graph | XML Algebra
Data Model | Tables of a list of XML fragments | A collection of bags of vertices
Operators * | 10 operators with 13 functions | 12 operators
Variable Binding | Lot of temporary variables | No variables
Order | Sensitive | Semi-sensitive (missing orderby)
Regular Expression | No support at operator level | Support at operator level
Text-in-context | No support | Support
Level of abstraction | Function level (lower) | Logical level (higher)
Transition rules | Composition rules (ad-hoc) & 1 semantically equivalent pattern (ad-hoc) | Equivalent rules
Operation History | Not maintained | Maintained

47 Conclusions and Future Work
WE NEED OUR OWN ALGEBRA.
More Reading:
 - David Beech, et al. A Formal Data Model and Algebra for XML.
 - Mary Fernandez, et al. An Algebra for XML Query.