1 XQuery to SQL by XML Algebra Tree Brad Pielech, Brian Murphy Thanks: Xin.

Presentation transcript:

1 XQuery to SQL by XML Algebra Tree Brad Pielech, Brian Murphy Thanks: Xin

2 Outline 1. Overview of Rainbow System 2. Process of translating XQuery -> SQL 3. XML Operators 4. Partial translation walkthrough with running example

3 Rainbow System Complete XML SQL system. Uses some ideas from XPERANTO, Niagara, and other systems. Several main subsystems: Document Shredder, View Generator, Query Translation, Query Rewrite, Result Generation. Work in progress.

4 Steps in Translation 1. User inputs XQuery query 2. User Query is converted into an XML Algebra Tree (XAT) 3. Database Mapping Query’s XAT generated 4. Queries are Decorrelated 5. Trees are merged, unnecessary branches cut

5 Steps Continued 6. Computation Pushdown (presentation concludes here) 7. SQL Generation 8. Query Execution 9. Tagging of Results

6 What is the difference between the two queries? The user query is executed over a view of the XML document and specifies what to return and how to return it. The mapping query specifies how the view the user is querying “maps” to the database. Therefore, the two queries must be combined into one in order to process the user’s request correctly.

7 XAT Operators Each XAT is composed of XAT operators, which are similar in concept to relational algebra operators. The operator set combines operators from the Niagara and XPERANTO papers.

8 Set of Operators SQL like (9): Project, Select, Join (Theta, Outer, Semi), Groupby, Orderby, Union (Node, Outer), Cartesian Product. XML like (4): Tagger, Navigate, is(Element, Text), Aggregate. Special: SQL, Function, Source, NameColumn, FOR
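One way to picture an XAT built from these operators is as a tree of small operator records. The sketch below is only an illustration: the `XatOp` class, its field names, and the `render` helper are hypothetical and are not taken from the Rainbow system.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class XatOp:
    """One node of an XML Algebra Tree (hypothetical representation)."""
    kind: str                      # operator name, e.g. "Navigate" or "Select"
    args: tuple = ()               # operator parameters, kept as plain strings
    out: Optional[str] = None      # column the operator produces, if any
    children: List["XatOp"] = field(default_factory=list)

    def render(self, depth: int = 0) -> str:
        """Return an indented, root-first listing of the subtree."""
        head = f"{self.out} := " if self.out else ""
        line = "  " * depth + f"{head}{self.kind}({', '.join(self.args)})"
        return "\n".join([line] + [c.render(depth + 1) for c in self.children])


# Bottom of the example query tree: a Source feeding a Navigate.
source = XatOp("Source", ('"sports.xml"',))
nav = XatOp("Navigate", ('"/"', "sports/organization"), out="$p", children=[source])
print(nav.render())
```

Keeping the produced column (`out`) separate from the parameters makes it easy to ask later which operator depends on which, which is the question the pushdown rules need answered.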

9 SQL like Operators (9)
XAT | Niagara | XPERANTO
Project | Expose | Project
Select | |
Theta Join | Join | Theta Join
Outer Join | N/A | Outer Join
Semi Join | N/A |
Groupby | Group | Groupby
Orderby | N/A | Orderby
Union | |
Outer Union | Union | Outer Union

10 XML like Operators
XAT | Niagara | XPERANTO
Tagger* (pattern) | Vertex | Project: cr8(Elem, AttList, Att, XMLFragList)
Navigate (from, path) | Follow | Project: get(TagName, Attributes, Contents, AttName, AttValue), Unnest
Is | N/A | Select: is(Element, Text)
Aggregate | Group | AggXMLFrags

11 Special Operators
XAT | Niagara | XPERANTO | Description
SQL | N/A | Input | Denote a SQL query.
Function | N/A | Function | Used to represent recursive query.
Source | | Table, View | Identify a data source.
NameColumn | Rename | N/A | Naming of columns.
FOR | N/A | | FOR iteration.

12 Sports XML Document
Boston Red Sox
Nomar Shortstop
Pedro Pitcher
Manny Outfield
…
Fenway Park 33, …
<player name="Pedro" number="45" rookieYear="1991" />
<player name="Nomar" number="5" rookieYear="1997" />
<player name="Manny" number="24" rookieYear="1993" />

13 Example XQuery: list all of the star players’ names on the Boston Red Sox.
{
For $p in document("sports.xml")/sports/organization
Let $a := $p/team/text()
Where $a = "Boston Red Sox"
Return $p/starPlayer/pname/text()
}
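To make the intent of the example query concrete, here is a rough Python equivalent using the standard library's ElementTree. The sample document is an assumption: its element names (`sports`, `organization`, `team`, `starPlayer`, `pname`) are inferred from the paths in the query, since the element tags of the original document are not visible in this transcript.

```python
import xml.etree.ElementTree as ET

# Assumed shape of sports.xml, reconstructed only from the query's paths.
SPORTS_XML = """
<sports>
  <organization>
    <team>Boston Red Sox</team>
    <starPlayer><pname>Nomar</pname></starPlayer>
    <starPlayer><pname>Pedro</pname></starPlayer>
    <starPlayer><pname>Manny</pname></starPlayer>
  </organization>
</sports>
"""


def star_player_names(xml_text, team_name):
    """Mirror the For / Let / Where / Return clauses of the example query."""
    root = ET.fromstring(xml_text)            # root element is <sports>
    names = []
    for p in root.findall("organization"):    # For: iterate over organizations
        a = p.findtext("team")                # Let: bind the team name
        if a == team_name:                    # Where: keep the matching team
            # Return: collect the text of every starPlayer/pname
            names.extend(n.text for n in p.findall("starPlayer/pname"))
    return names


print(star_player_names(SPORTS_XML, "Boston Red Sox"))  # ['Nomar', 'Pedro', 'Manny']
```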

14 XAT Tree for Example Query (root first)
Tagger( V1 )
V1 := Aggregate
Tagger( $pname )
$pname := Navigate($p, starPlayer/pname/text())
Select($a = "Boston Red Sox")
$a := Navigate($p, team/text())
$p := Navigate("/", sports/organization)
Source("sports.xml")

15 RDBMS Tables of Sports Info
Organization:
organizationID | teamName | stadiumName
1 | Boston Red Sox | Fenway Park
Stadium:
stadiumID | sname | Capacity | yearBuilt | ticketHigh | ticketLow
1 | Fenway Park | 33,
StarPlayer:
starPlayerName | starPlayerPosition | organizationID
Nomar | ShortStop | 1
Pedro | Pitcher | 1
Manny | Outfield | 1
PlayerInfo:
PlayerName | Number | rookieYear
Nomar | 5 | 1997
Pedro | |
Manny | |

16 Partial Default XML View 1 Boston Red Sox Fenway Park 1 Fenway Park …
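The default view exposes each table as a list of `row` elements, which is why the mapping query navigates paths such as `/Organization/row`. A minimal sketch of producing such a view from a relational table follows, using SQLite from the Python standard library. The root element name `DB` is hypothetical (the slide does not show one), and only two of the Organization columns are loaded.

```python
import sqlite3
import xml.etree.ElementTree as ET


def default_view(conn, tables):
    """Render each table as <Table><row><col>value</col>...</row></Table>."""
    root = ET.Element("DB")  # hypothetical root name, not shown on the slide
    for table in tables:
        t = ET.SubElement(root, table)
        cur = conn.execute(f"SELECT * FROM {table}")
        cols = [d[0] for d in cur.description]   # column names become tag names
        for values in cur:
            row = ET.SubElement(t, "row")
            for col, val in zip(cols, values):
                ET.SubElement(row, col).text = str(val)
    return root


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Organization (organizationID INTEGER, teamName TEXT)")
conn.execute("INSERT INTO Organization VALUES (1, 'Boston Red Sox')")

view = default_view(conn, ["Organization"])
print(ET.tostring(view, encoding="unicode"))
```

Because the view is a purely mechanical rendering of the tables, navigations over it can later be translated back into column references, which is what makes the SQL generation step possible.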

17 Challenge Question I What is the XQuery that converts the document on the left (default XML view) to the document on the right (user view)? Left: 1 Boston Red Sox Fenway Park … Nomar shortstop 1 … Right: Boston Red Sox Nomar Shortstop Pedro Pitcher Manny Outfield

18 Mapping Query Part I
Create view invoice as (
B1:
FOR $organization IN view("default")/Organization/row
RETURN $organization/teamName/text()
B2:
FOR $starPlayer IN view("default")/StarPlayer/row
WHERE $starPlayer/organizationID = $organization/organizationID
RETURN $starPlayer/starPlayerName/text() $starPlayer/starPlayerPosition/text()

19 Mapping Query Part II
B3:
FOR $stadium IN view("default")/Stadium/row
RETURN $stadium/sname/text() $stadium/capacity/text() $stadium/yearBuilt/text()
B4:
FOR $player IN view("default")/PlayerInfo/row
RETURN
)

20 Cutting Mapping Query The mapping query has data that is unused by the user query, so we can get rid of it: B3 and B4 are completely removed, stadium is removed from B1, and position is removed from B2.

21 Mapping Query XAT General Form $organization := Navigate("/",Organization/row) Source(“default.xml”) FOR $organization More Stuff Some Stuff Source(“default.xml”) FOR $starPlayer  Some Stuff will be shown in Part I  More Stuff in Part II B1 B2 $starPlayer := Navigate("/", StarPlayer/row)

22 Mapping Query XAT Part I B1 O := Tagger( All </sports) All = Aggregate Tagger( V0 </organization) V0 := Aggregate Tagger ( $tname ) $tname := Navigate($organization, teamName/text()) $starPlayer := Navigate("/", StarPlayer/row) Source("default.xml") FOR $starPlayer To: Part II Some Stuff FOR $organization

23 Mapping Query XAT Part II Aggregate $ID := Navigate($organization, organizationID) Select($starPlayerID = $ID) $starPlayerID := Navigate($starPlayer, OrganizationID) $sname := Navigate($starPlayer, starPlayerName) To: Part I B2 Tagger( $sname </starPlayer) More Stuff

24 Decorrelated Mapping XAT Part I Boston Red Sox Nomar Pedro Manny Tagger ( $tname O:= Tagger( All ) All = Aggregate Tagger( V0 </organization) V0 := Aggregate $tname := Navigate($organization, teamName/text()) From Part II

25 Decorrelated Mapping XAT Part II Source("default.xml") $organization := Navigate("/", Organization/row) $starPlayer := Navigate("/", StarPlayer/row) Cartesian Product $ID := Navigate($organization, organizationID) $starPlayerID := Navigate($starPlayer, organizationID) Select($starPlayerID = $ID) $sname := Navigate($starPlayer, starPlayerName) To Part I Aggregate Tagger( $sname </starPlayer)
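The effect of decorrelation can be sketched on plain Python data: the nested FOR blocks of the mapping query are replaced by a Cartesian Product followed by a Select on the correlation predicate. The dictionaries below stand in for the Organization and StarPlayer rows from the slides; the function names are hypothetical, and the sketch ignores grouping and tagging.

```python
from itertools import product

organizations = [{"organizationID": 1, "teamName": "Boston Red Sox"}]
star_players = [
    {"starPlayerName": "Nomar", "organizationID": 1},
    {"starPlayerName": "Pedro", "organizationID": 1},
    {"starPlayerName": "Manny", "organizationID": 1},
]


def correlated(orgs, players):
    """Nested FOR blocks: the inner block is re-evaluated per organization."""
    out = []
    for org in orgs:
        for sp in players:
            if sp["organizationID"] == org["organizationID"]:
                out.append((org["teamName"], sp["starPlayerName"]))
    return out


def decorrelated(orgs, players):
    """Cartesian Product followed by Select($starPlayerID = $ID)."""
    pairs = product(orgs, players)
    selected = (p for p in pairs
                if p[1]["organizationID"] == p[0]["organizationID"])
    return [(o["teamName"], s["starPlayerName"]) for o, s in selected]


print(correlated(organizations, star_players) ==
      decorrelated(organizations, star_players))  # True
```

The decorrelated form has no dependency between the two inputs, so the Select and Cartesian Product can later be merged into a Join and handed to the relational engine.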

26 Progress Report 1. User inputs XQuery query 2. User Query is converted into an XML Algebra Tree (XAT) 3. Database Mapping Query’s XAT generated 4. Queries are Decorrelated 5. Trees are merged, unnecessary branches cut

27 XAT merging Input: User Query XAT + Mapping Query XAT. Output: Simplified composite XAT. Approach: The Tagger from the top of the Mapping Query is linked to the bottom of the User Query, and the Source operator at the bottom of the User Query is deleted. Navigation is then pushed down by using the commutative rules, and navigation operators are canceled out by using the composition rules.

28 Combined XAT V1 := Aggregate $pname = Navigate($p, starPlayer/pname/text()) Select($a = "Boston Red Sox") $a := Navigate($p, team/text()) $p := Navigate(O, sports/organization) Tagger( V1 Tagger( $pname Tagger ( $tname O:= Tagger( All ) All = Aggregate Tagger( V0 </organization) V0 := Aggregate $tname := Navigate($organization, teamName/text()) Top of Mapping Query User Query Rest of Mapping Query

29 Computation Pushdown Part I What is pushdown? After merging the two XATs, there may be redundancies in the larger tree. For example, the user query and mapping query may navigate to the same thing. The decorrelated query tree may also be unorganized and inefficient. Pushdown aims to eliminate these problems.

30 Computation Pushdown Part II XPERANTO mentions pushdown as a means of pushing computation to the relational engine. Niagara defines equivalence rules and specifies several different heuristics for using the rules.

31 XAT Pushdown Example Part I V1 := Aggregate $pname = Navigate($p, starPlayer/pname/text()) Select($a = "Boston Red Sox") $a := Navigate($p, team/text()) $p := Navigate(O, sports/organization) Tagger( V1 Tagger( $pname Tagger ( $tname O:= Tagger( All ) All = Aggregate Tagger( V0 </organization) V0 := Aggregate $tname := Navigate($organization, teamName/text()) Top of Mapping Query User Query Rest of Mapping Query

32 XAT Pushdown Example Part II V1 := Aggregate $pname = Navigate($p, starPlayer/pname/text()) Select($a = "Boston Red Sox") $a := Navigate($p, team/text()) $p := Navigate(O, sports/organization) Tagger( V1 Tagger( $pname Tagger ( $tname O:= Tagger( All ) All = Aggregate Tagger( V0 </organization) V0 := Aggregate $tname := Navigate($organization, teamName/text()) Top of Mapping Query User Query Rest of Mapping Query

33 XAT Pushdown Example Part III V1 := Aggregate $pname := Navigate($p, starPlayer/pname/text()) Select($a = "Boston Red Sox") $a := Navigate($p, team/text()) $p := Navigate(O, sports/organization) Tagger( V1 Tagger( $pname Source("default.xml") Cartesian Product $organization := Navigate("/", Organization/row) $starPlayer := Navigate("/", StarPlayer/row) $ID := Navigate($organization, organizationID) $starPlayerID := Navigate($starPlayer, organizationID) Select($starPlayerID = $ID) $sname := Navigate($starPlayer, starPlayerName) Source("default.xml") User Query Tagger( $sname </starPlayer)

34 XAT Pushdown Example Part IV V1 := Aggregate $pname := Navigate($p, starPlayer/pname/text()) Select($a = "Boston Red Sox") $a := Navigate($p, team/text()) $p := Navigate(O, sports/organization) Tagger( V1 Tagger( $pname Source("default.xml") Cartesian Product $organization := Navigate("/", Organization/row) $starPlayer := Navigate("/", StarPlayer/row) $ID := Navigate($organization, organizationID) $starPlayerID := Navigate($starPlayer, organizationID) Select($starPlayerID = $ID) $sname := Navigate($starPlayer, starPlayerName) Source("default.xml") User Query Tagger( $sname </starPlayer)

35 Challenge Questions II & III What are some of the heuristics we could use during Pushdown? What can / should we try to accomplish? What should the tree look like afterwards? How could we go about pushing things down? What would the algorithm be? How do we know if an operator can be pushed down? When do we stop pushing an operator down?

36 Computation Pushdown Part III Goal: Tagger + SQL operators + XML operators. Use the equivalence rules repository to swap operators. Step 1: Navigation Pushdown. Cancel Mapping Query Taggers and corresponding Aggregates, delete redundant Navigates from the User Query, and rename columns in the Mapping Query. Step 2: SQL Computation Pushdown, by commutative and composition rules.

37 Equivalence Rules Pair-wise rules that determine whether one operator (parent) may be pushed through another (child). Navigate / Navigate: if the parent depends on the child, they may not be swapped. Navigate / Join: Navigate is pushed to the side of the join that its entry point comes from. And many, many more.
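The two rules named on this slide can be sketched as small predicates. This is an illustration under stated assumptions, not the Rainbow rule repository: the `Navigate` record, `can_swap`, and `join_side` are hypothetical names, and "depends on" is read as "the parent's entry point is the column the child produces".

```python
from typing import NamedTuple


class Navigate(NamedTuple):
    out: str     # column produced, e.g. "$a"
    entry: str   # column navigated from (the entry point), e.g. "$p"
    path: str


def can_swap(parent: Navigate, child: Navigate) -> bool:
    """Navigate / Navigate: no swap if the parent reads the child's output."""
    return parent.entry != child.out


def join_side(nav: Navigate, left_cols: set, right_cols: set) -> str:
    """Navigate / Join: push to the side that supplies the entry point."""
    if nav.entry in left_cols:
        return "left"
    if nav.entry in right_cols:
        return "right"
    raise ValueError(f"entry point {nav.entry} not produced by either side")


nav_p = Navigate("$p", "O", "sports/organization")
nav_a = Navigate("$a", "$p", "team/text()")
nav_pname = Navigate("$pname", "$p", "starPlayer/pname/text()")

print(can_swap(nav_pname, nav_a))   # True: both only read $p
print(can_swap(nav_a, nav_p))       # False: $a is navigated from $p

nav_id = Navigate("$ID", "$organization", "organizationID")
print(join_side(nav_id, {"$organization"}, {"$starPlayer"}))  # left
```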

38 Pushdown Results 1. Push Navigates to the correct side of Cartesian Product 2. Create a NameColumn operator that renames $tname into $a 3. Create a 2nd NameColumn operator that renames $sname into $pname 4. Get rid of all Taggers and Aggregates from Mapping Query and Navigates that were crossed out from User Query 5. Merge Select($starPlayerID = $ID) and Cartesian into a Join
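The last step, merging a Select with the Cartesian Product beneath it into a Join, can be sketched as a tree rewrite. The dictionary-based node layout and the function names are hypothetical; the sketch assumes the Navigates have already been pushed to their sides of the product, so the Select sits directly above it.

```python
def node(op, arg=None, *inputs):
    """Build a tree node: operator name, one argument, and child nodes."""
    return {"op": op, "arg": arg, "inputs": list(inputs)}


def merge_select_into_join(n):
    """Rewrite Select(pred) over CartesianProduct(l, r) into Join(pred, l, r)."""
    inputs = [merge_select_into_join(c) for c in n["inputs"]]
    if (n["op"] == "Select" and len(inputs) == 1
            and inputs[0]["op"] == "CartesianProduct"):
        # The Select predicate becomes the join condition.
        return node("Join", n["arg"], *inputs[0]["inputs"])
    return node(n["op"], n["arg"], *inputs)


before = node("Select", "$starPlayerID = $ID",
              node("CartesianProduct", None,
                   node("Source", "default.xml"),
                   node("Source", "default.xml")))
after = merge_select_into_join(before)
print(after["op"], after["arg"])  # Join $starPlayerID = $ID
```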

39 XAT After Computation PushDown Part I V1 := Aggregate Select($a = "Boston Red Sox") Tagger( V1 Tagger( $pname NameColumn( $pname = $sname) NameColumn( $a = $tname) From Part II

40 XAT After Computation PushDown Part II $starPlayerID := Navigate($starPlayer, OrganizationID) $sname := Navigate($starPlayer, starPlayerName) $starPlayer := Navigate("/", StarPlayer/row) Source("default.xml") $ID := Navigate($organization, organizationID) Source("default.xml") $organization := Navigate("/",Organization/row) $tname := Navigate($organization, teamName/text()) Join on ($ID = $starPlayerID) To Part I

41 Rest of the Process 1. Take the Combined XAT from the previous slide and generate a single SQL query. 2. Execute query on local RDBMS 3. Format result tuples according to Tagger 4. Return XML document to user
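These remaining steps can be sketched end to end with SQLite from the Python standard library. The SQL text is one plausible translation of the pushed-down tree (a join on organizationID plus the team-name selection), not the output of the Rainbow generator, and the tag names `result` and `name` are hypothetical because the Tagger patterns are not visible in this transcript.

```python
import sqlite3
from xml.sax.saxutils import escape

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Organization (organizationID INTEGER, teamName TEXT);
CREATE TABLE StarPlayer (starPlayerName TEXT, starPlayerPosition TEXT,
                         organizationID INTEGER);
INSERT INTO Organization VALUES (1, 'Boston Red Sox');
INSERT INTO StarPlayer VALUES ('Nomar', 'ShortStop', 1);
INSERT INTO StarPlayer VALUES ('Pedro', 'Pitcher', 1);
INSERT INTO StarPlayer VALUES ('Manny', 'Outfield', 1);
""")

# Step 1: a single SQL query for Join on ($ID = $starPlayerID) + Select on $a.
SQL = """
SELECT sp.starPlayerName
FROM Organization AS o
JOIN StarPlayer AS sp ON o.organizationID = sp.organizationID
WHERE o.teamName = ?
ORDER BY sp.rowid
"""


def tag(rows, outer="result", inner="name"):
    """Step 3: wrap each result tuple in tags (tag names are assumptions)."""
    body = "".join(f"<{inner}>{escape(r[0])}</{inner}>" for r in rows)
    return f"<{outer}>{body}</{outer}>"


# Step 2: execute on the local RDBMS; step 4: return the XML text.
rows = conn.execute(SQL, ("Boston Red Sox",)).fetchall()
print(tag(rows))
```

Only the Tagger and Aggregate at the top of the tree remain outside the database in this sketch; everything below them is evaluated by the relational engine.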

42 Summary
1. Created XAT of the user query
2. Created XAT for mapping query
   1. Cut information unused by user query
   2. Decorrelated Mapping query
3. Merged two queries into 1 larger XAT
4. Identified weaknesses in combined tree
5. Walked through pushdown steps
6. Displayed final, optimized tree

43 The End!!!