2003. DSRG, Worcester Polytechnic Institute. Beyond the Rainbow: A Pot of Gold ala XML Database Projects. WPI DSRG Group.

Presentation transcript:

2003.DSRG, Worcester Polytechnic Institute2 Motivation. XML is new, and here to stay: a universal, flexible representation of data and the de facto standard for information exchange. XQuery is useful, and here to stay: a powerful query language for XML and the de facto standard for XML querying. Both bring a plenitude of relevant new issues.

2003.DSRG, Worcester Polytechnic Institute3 XML Paradigm. [Diagram: EVE-Middleware on the Internet, connected to a mix of XML and relational (RDB) sources numbered 1 to n.] WWW: a global-scale distributed information system for sharing data. XML queries and updates: searching, querying, integrating, restructuring, updating.

2003.DSRG, Worcester Polytechnic Institute4 What We Aim For. [Diagram: EVE-Middleware on the Internet, connected to a mix of XML and relational (RDB) sources numbered 1 to n.] XML data management middleware technology that is efficient, flexible, scalable, lightweight, resource-sensitive, and adaptive.

2003.DSRG, Worcester Polytechnic Institute5 WPI Project Directions.
RAINBOW (exploiting RDB for XML management): algebraic XQuery processing.
XCube (flexible XML mapping tool): flexible loading/extracting of XML to RDB via XQuery.
Updating Virtual XML Views: update decomposition and trigger propagation.
MASS (native XML query engine): multi-axis, compressed, order-preserving XML storage.

2003.DSRG, Worcester Polytechnic Institute6 WPI Project Directions (continued).
XCache (XML query caching): cache containment and query rewriting.
Materialized XML View Maintenance: incremental algebraic maintenance strategy.
SAXE (XML incremental updating and evolution): lightweight updating by update query rewriting.
RAINDROP (XQuery-based stream processing): adaptive on-the-fly multi-subscription optimization.

2003. DSRG, Worcester Polytechnic Institute7 THE RAINBOW PROJECT

2003.DSRG, Worcester Polytechnic Institute8 XML Meets Relational DBs.
XML: (1) emerging web standard; (2) flexible data representation; (3) powerful query language.
Relational database: (1) widely used to store business data; (2) efficient, reliable, secure DBMS; (3) mature query processing techniques.
Combined: the look and feel of an XML query system with the maturity and technology support of an RDB.

2003.DSRG, Worcester Polytechnic Institute9 Running Example.
Relational tables: book(Bid, Title) with rows (001, TCP/IP Illustrated) and (002, Data on the Web); prices(Bid, Price), whose cell values were lost in extraction.
View query (over the default XML view dxv.xml): FOR $book IN document("dxv.xml")/book/row, $prices IN document("dxv.xml")/prices/row WHERE $book/bid = $prices/bid RETURN $book/title, $prices/price
User query (over the virtual document prices.xml): FOR $t IN document("prices.xml")/book/title RETURN $t
Result of the user query: the titles TCP/IP Illustrated and Data on the Web (XML markup lost in extraction).

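2003.DSRG, Worcester Polytechnic Institute10 XML Default View. A fixed and straightforward mapping scheme. [Example: relational rows with the values Paperback, Texas Holdem', David Sklansky, Straight Flush, Paperback, Dracula, Bram Stoker, ... shown next to their XML default view; table layout and XML markup lost in extraction.]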

2003.DSRG, Worcester Polytechnic Institute11 Generic Loading.
FUNCTION Q1($root){ LET $maintag := gettag($root) RETURN FOR $actual IN $root/* LET $innertag := gettag($actual) RETURN IF ($actual/element()) THEN Q1($actual) ELSE IF ($actual/text()) THEN [element constructor lost in extraction] ELSE "" }
Knowing the schema of the XML document to be loaded makes it possible to drop the unnecessary parts of this query.
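The recursion in the generic loading query can be sketched in Python. This is only an illustration under stated assumptions: the slide's element constructor was lost, so the tuple `(parent_tag, tag, text)` emitted for each text leaf is a hypothetical stand-in, and the sample document (including the `TITLE` and `NOTE` tags) is made up for the example.

```python
import xml.etree.ElementTree as ET


def generic_load(root):
    """Walk any XML tree, mirroring the shape of Q1: recurse into element
    children, emit something for text leaves, and emit nothing otherwise."""
    main_tag = root.tag  # corresponds to $maintag
    rows = []
    for actual in root:  # corresponds to FOR $actual IN $root/*
        inner_tag = actual.tag  # corresponds to $innertag
        if len(actual) > 0:  # $actual has element children
            rows.extend(generic_load(actual))
        elif actual.text and actual.text.strip():  # $actual has text
            # Hypothetical output: the slide's real constructor is unknown.
            rows.append((main_tag, inner_tag, actual.text.strip()))
        # else: the empty-string branch, nothing is emitted
    return rows


# Hypothetical sample document.
doc = ET.fromstring(
    "<BOOK><TITLE>Dracula</TITLE>"
    "<AUTHOR><NAME>Bram Stoker</NAME></AUTHOR><NOTE/></BOOK>"
)
rows = generic_load(doc)
```

The function is called once per inner element, even for subtrees that contribute nothing, which is the overhead that instantiation against a schema is meant to remove.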

2003.DSRG, Worcester Polytechnic Institute12 Instantiation. [Diagram: the Instantiator takes an XML Schema and the recursive XQuery expression and produces a flat XQuery expression.]
The generic loading XQuery expression is recursive.
Pro: it works for every XML document.
Cons: many recursive calls return no value; it contains unnecessary FOR loops, IF clauses, and getName() function calls.

2003.DSRG, Worcester Polytechnic Institute13 Instantiation (Example). Instantiated loading query:
FUNCTION Q1($root){ FOR $book IN $root/BOOK RETURN FOR $name IN $book/AUTHOR/NAME RETURN [element constructor lost in extraction] }
Short, non-recursive, more efficient. But: XML schema dependent! (First step of the CLOCK mapping scheme.)
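For comparison, a minimal Python sketch of the instantiated, schema-dependent form. The paths `BOOK` and `AUTHOR/NAME` come from the slide; the returned value (the name text) and the sample document with its `LIBRARY` root are assumptions, since the slide's RETURN clause was lost.

```python
import xml.etree.ElementTree as ET


def instantiated_load(root):
    """Flat loader that only follows the paths the schema says exist."""
    names = []
    for book in root.findall("BOOK"):  # FOR $book IN $root/BOOK
        for name in book.findall("AUTHOR/NAME"):  # FOR $name IN $book/AUTHOR/NAME
            names.append(name.text)  # hypothetical RETURN value
    return names


# Hypothetical sample document.
library = ET.fromstring(
    "<LIBRARY>"
    "<BOOK><AUTHOR><NAME>Bram Stoker</NAME></AUTHOR></BOOK>"
    "<BOOK><AUTHOR><NAME>David Sklansky</NAME></AUTHOR></BOOK>"
    "</LIBRARY>"
)
names = instantiated_load(library)
```

There is no recursion and no tag inspection here, but the function silently returns nothing for a document that does not follow the assumed schema.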

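2003.DSRG, Worcester Polytechnic Institute14 Flexible Mapping Management. [Diagram: XML, XML', Relation, and Relation' connected through the RDB default view by an XQuery (Load) expression, an XQuery (Extract) expression, and a Reverser; the mappings are labeled F, f, G, g, and H, with steps 1 and 2. Arrow layout lost in extraction.]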

2003.DSRG, Worcester Polytechnic Institute15 XCube in a Nutshell. Easy to use (no new transformation language). Flexible (interchangeable XQuery expressions). Adaptable (to workload, data specifics, ...). General (schema independent). Extendable (with new mapping schemes). Tunable (loading manager). Two steps: (1) generic XQuery loading expressions; (2) XQuery load expression instantiation.

2003.DSRG, Worcester Polytechnic Institute16 Architecture. XAT: XML Algebra Tree. [Diagram. Components: XAT Generator, XAT Decorrelator, XAT Merger, XAT Optimizer, SQL Generator, XAT Executor, RDBMS. Inputs: the user XQuery and the view XQuery that defines a virtual XML document. Intermediate artifacts: user XAT, view XAT, SQL, tuples. Output: user query results in XML.]

2003.DSRG, Worcester Polytechnic Institute17 XQuery-Level Optimization XAT - XML Algebra Tree Model XAT Algebraic Query Plan Optimization XAT Query Plan Reduction

2003.DSRG, Worcester Polytechnic Institute18 User XML Algebra Tree (XAT). User query: FOR $t IN document("prices.xml")/book/title RETURN $t
[Figure: XAT of the user query, top to bottom. 1: top operator over col3. 2: Tagger T from $t to col3. 3: Agg. 6: navigation from R0 along book/title to $t. 7: source S reading "prices.xml" into R0. Operator symbols other than S, T, and Agg were lost in extraction.]

2003.DSRG, Worcester Polytechnic Institute19 View XML Algebra Tree (XAT). View query: FOR $book IN document("dxv.xml")/book/row, $prices IN document("dxv.xml")/prices/row WHERE $book/bid = $prices/bid RETURN $book/title, $prices/price
[Figure: XAT of the view query. 11: Tagger T from col5 to col4. 12: Agg. 22: Tagger T from [col10][col12] to col5. 23: navigation from $book along title to col10. 25: navigation from $prices along price to col12. 26: condition col6=col7. 27: navigation from $book along bid to col6. 28: navigation from $prices along bid to col7. 31: binary operator combining the two branches. 14: navigation from R1 along /book/row to $book. 15: source S reading "dxv.xml" into R1. 20: navigation from R3 along /prices/row to $prices. 21: source S reading "dxv.xml" into R3. Operator symbols other than S, T, and Agg were lost in extraction.]

2003.DSRG, Worcester Polytechnic Institute20 Merged XML Algebra Tree (XAT). [Figure: the user query tree (operators 1, 2, 3, 6) placed on top of the view query tree (operators 11 to 31). The user tree's source operator 7 is replaced by the view's output: col4 now feeds R0. Operator symbols other than S, T, and Agg were lost in extraction.]

2003.DSRG, Worcester Polytechnic Institute21 XQuery-Level Optimization XML Algebra Representation: XAT XAT Query Plan Rewriting XAT Query Plan Reduction

2003.DSRG, Worcester Polytechnic Institute22 XAT Rewrite. Query optimization at the logical algebra level. Goals: redundancy elimination and computation pushdown. Technique: equivalence rewrite rules. Heuristics: push down navigations; remove the construction of intermediate results; combine multiple operators.

2003.DSRG, Worcester Polytechnic Institute23 Before Navigation Pushdown. [Figure: the merged XAT, with the user query operators (1, 2, 3, 6) above the view query operators (11 to 31); col4 from Tagger 11 feeds R0. Operator symbols and tree layout lost in extraction.]

2003.DSRG, Worcester Polytechnic Institute24 After Navigation Pushdown. [Figure: the merged XAT with the same operators as before the pushdown (1, 2, 3, 6 from the user query; 11, 12, 14, 15, 20 to 23, 25 to 28, 31 from the view query); Tagger 11 now produces R0 from col5 directly. Operator symbols and tree layout lost in extraction.]

2003.DSRG, Worcester Polytechnic Institute25 Remove any Taggers? [Figure: the same XAT as after navigation pushdown, shown again to ask which Taggers (2: $t to col3; 11: col5 to R0; 22: [col10][col12] to col5) can be removed. Operator symbols and tree layout lost in extraction.]

2003.DSRG, Worcester Polytechnic Institute26 After Tagger Cancel Out. [Figure: operators 6, 11, 12, and 22 are gone, and navigation 23 now produces $t directly from $book along title. Remaining operators: 1 (top operator over col3), 2 (Tagger $t to col3), 3 (Agg), 26 (condition col6=col7), 31, 27, 14, 15, 23, 28, 20, 21, 25. Operator symbols and tree layout lost in extraction.]

2003.DSRG, Worcester Polytechnic Institute27 After Making Join. [Figure: the condition col6=col7 (operator 26) and the binary operator 31 are combined into a single JOIN col6=col7 (operator 31). Remaining operators: 1, 2, 3, 31, 27, 14, 15, 23, 28, 20, 21, 25. Operator symbols and tree layout lost in extraction.]
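The join-making step is an instance of the standard equivalence that a selection over a product equals a join with the same predicate. The sketch below is a hedged illustration, not Rainbow's implementation: the plan classes `Scan`, `Product`, `Select`, and `Join` and the function `make_join` are hypothetical names, the sketch assumes the lost symbols on operators 26 and 31 are a selection and a product, and only equality predicates between two columns are modeled.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Scan:
    name: str


@dataclass(frozen=True)
class Product:
    left: object
    right: object


@dataclass(frozen=True)
class Select:
    pred: Tuple[str, str]  # (column, column) that must be equal
    child: object


@dataclass(frozen=True)
class Join:
    pred: Tuple[str, str]
    left: object
    right: object


def make_join(plan):
    """Rewrite Select(p, Product(a, b)) into Join(p, a, b), bottom-up."""
    if isinstance(plan, Select):
        child = make_join(plan.child)
        if isinstance(child, Product):
            return Join(plan.pred, child.left, child.right)
        return Select(plan.pred, child)
    if isinstance(plan, (Product, Join)):
        rebuilt = (make_join(plan.left), make_join(plan.right))
        return Product(*rebuilt) if isinstance(plan, Product) else Join(plan.pred, *rebuilt)
    return plan


def evaluate(plan, tables):
    """Tiny interpreter over lists of dict rows, used to compare both plans."""
    if isinstance(plan, Scan):
        return [dict(row) for row in tables[plan.name]]
    if isinstance(plan, Select):
        a, b = plan.pred
        return [row for row in evaluate(plan.child, tables) if row[a] == row[b]]
    pairs = [{**l, **r} for l in evaluate(plan.left, tables)
             for r in evaluate(plan.right, tables)]
    if isinstance(plan, Join):
        a, b = plan.pred
        return [row for row in pairs if row[a] == row[b]]
    return pairs  # Product


tables = {
    "book": [{"$t": "TCP/IP Illustrated", "col6": "001"},
             {"$t": "Data on the Web", "col6": "002"}],
    "prices": [{"col7": "001"}, {"col7": "002"}],
}
before = Select(("col6", "col7"), Product(Scan("book"), Scan("prices")))
after = make_join(before)
```

Both plans return the same rows; the join form simply hands the predicate to an operator that a relational engine can evaluate more efficiently than a product followed by a filter.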

2003.DSRG, Worcester Polytechnic Institute28 XQuery-Level Optimization XML Algebra Representation: XAT XAT Query Plan Rewriting XAT Query Plan Reduction

2003.DSRG, Worcester Polytechnic Institute29 XAT Cleanup. Why: the SQL engine cannot reduce redundancy that originates in the XQuery. How: (1) remove data redundancy by schema cleanup: each operator produces, consumes, and modifies some columns, and from these the minimum schema is computed; (2) remove tree redundancy by cutting unused operators: cutting matrix generation, required-columns analysis, operator cutting.

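2003.DSRG, Worcester Polytechnic Institute30 XAT Operator Properties.
Produced: new columns generated by an operator. Examples: S, T, and one operator whose symbol was lost in extraction.
Consumed: columns required by an operator. Examples: two operators whose symbols were lost in extraction.
Modified: columns modified by an operator. Examples: three operators whose symbols were lost in extraction.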

2003.DSRG, Worcester Polytechnic Institute31 Schema Computation (old schema, computed bottom-up). [Figure: the XAT after making the join, operators 1, 2, 3, 31, 23, 27, 14, 15, 25, 28, 20, 21.]
Node | Parent | Produced | Consumed | Old Schema
21 | 20 | {R3} | {} | {R3}
20 | 28 | {$prices} | {R3} | {R3, $prices}
28 | 25 | {col7} | {$prices} | {R3, $prices, col7}
25 | 31 | {col12} | {$prices} | {R3, $prices, col7, col12}
15 | 14 | {R1} | {} | {R1}
14 | 27 | {$book} | {R1} | {R1, $book}
27 | 23 | {col6} | {$book} | {R1, $book, col6}
23 | 31 | {$t} | {$book} | {R1, $book, col6, $t}
31 | 3 | {} | {col6, col7} | {R1, $book, col6, $t, R3, $prices, col7, col12}
3 | 2 | (not legible) | {} | {R1, $book, col6, $t, R3, $prices, col7, col12}
2 | 1 | {col3} | {$t} | {col3, R1, $book, col6, $t, R3, $prices, col7, col12}
1 | (root) | {} | (not legible) | (not legible)

2003.DSRG, Worcester Polytechnic Institute32 Schema Computation (minimum schema, computed top-down). [Figure: the XAT after making the join, operators 1, 2, 3, 31, 23, 27, 14, 15, 25, 28, 20, 21.]
Node | Parent | Produced | Consumed | Minimum Schema
1 | (root) | {} | {col3} | (not legible)
2 | 1 | (not legible) | {$t} | {col3}
3 | 2 | {} | (not legible) | {$t}
31 | 3 | {} | {col6, col7} | {$t}
23 | 31 | {$t} | {$book} | {col6, $t}
27 | 23 | {col6} | {$book} | {$book, col6}
14 | 27 | {$book} | {R1} | {$book}
15 | 14 | {R1} | {} | {R1}
25 | 31 | {col12} | {$prices} | {col7, col12}
28 | 25 | {col7} | {$prices} | {$prices, col7}
20 | 28 | {$prices} | {R3} | {$prices}
21 | 20 | {R3} | {} | {R3}

2003.DSRG, Worcester Polytechnic Institute33 Schema Computation (new schema). P marks a column the operator produces, C a column it consumes. [Figure: the XAT after making the join.]
# | Parent | Marks | New Schema
1 | (root) | col3: C | (not legible)
2 | 1 | col3: P, $t: C | {col3}
3 | 2 | (none) | {$t}
31* | 3 | col6: C, col7: C | {$t}
23 | 31 | $t: P, $book: C | {col6, $t}
27 | 23 | col6: P, $book: C | {$book, col6}
14 | 27 | $book: P, R1: C | {$book}
15 | 14 | R1: P | {R1}
25 | 31 | col12: P, $prices: C | {col7, col12}
28 | 25 | col7: P, $prices: C | {$prices, col7}
20 | 28 | $prices: P, R3: C | {$prices}
21 | 20 | R3: P | {R3}
*We assume the Join did not modify $t. Otherwise, only node 25 will be deleted.
Intuition: don't keep anything that is not used later.

2003.DSRG, Worcester Polytechnic Institute34 Schema Cleanup Result.
Node | Original Schema | Minimum Schema
1 | {col3, R1, $book, col6, $t, R3, $prices, col7, col12} | {col3}
2 | {col3, R1, $book, col6, $t, R3, $prices, col7, col12} | {col3}
3 | {R1, $book, col6, $t, R3, $prices, col7, col12} | {$t}
31 | {R1, $book, col6, $t, R3, $prices, col7, col12} | {$t}
23 | {R1, $book, col6, $t} | {col6, $t}
27 | {R1, $book, col6} | {$book, col6}
14 | {R1, $book} | {$book}
15 | {R1} | {R1}
25 | {R3, $prices, col7, col12} | {col7, col12}
28 | {R3, $prices, col7} | {$prices, col7}
20 | {R3, $prices} | {$prices}
21 | {R3} | {R3}
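The schema tables above can be reproduced with a short Python sketch. The produced and consumed sets are copied from the slides; the propagation rule itself is an assumption, inferred because it reproduces every row of the result table, and the slides do not state it in this form. In particular, keeping an operator's own produced columns in its minimum schema is an inference from node 25, and the root is assumed to need exactly what it consumes.

```python
# node -> (parent, produced, consumed), copied from the slides' tables.
OPS = {
    1: (None, set(), {"col3"}),
    2: (1, {"col3"}, {"$t"}),
    3: (2, set(), set()),
    31: (3, set(), {"col6", "col7"}),
    23: (31, {"$t"}, {"$book"}),
    27: (23, {"col6"}, {"$book"}),
    14: (27, {"$book"}, {"R1"}),
    15: (14, {"R1"}, set()),
    25: (31, {"col12"}, {"$prices"}),
    28: (25, {"col7"}, {"$prices"}),
    20: (28, {"$prices"}, {"R3"}),
    21: (20, {"R3"}, set()),
}


def children(node):
    return [n for n, (parent, _, _) in OPS.items() if parent == node]


def old_schema(node):
    """Bottom-up: everything produced at or below this operator."""
    schema = set(OPS[node][1])
    for child in children(node):
        schema |= old_schema(child)
    return schema


def min_schema(node):
    """Top-down: columns still needed above, plus the operator's own output.

    Assumed rule (not stated on the slides): a parent needs what it consumes
    plus whatever of its own minimum schema it does not produce itself.
    """
    parent, produced, consumed = OPS[node]
    if parent is None:
        return consumed | produced
    _, parent_produced, parent_consumed = OPS[parent]
    needed_by_parent = parent_consumed | (min_schema(parent) - parent_produced)
    return (needed_by_parent & old_schema(node)) | produced


minimum = {node: min_schema(node) for node in OPS}
```

For example, `minimum[31]` shrinks from eight columns to `{"$t"}`, matching the intuition that nothing should be kept unless it is used later.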

2003.DSRG, Worcester Polytechnic Institute35 XAT Cleanup. Schema cleanup: each operator produces, consumes, and modifies some columns; the minimum schema is then computed. Unused operator cutting: cutting matrix generation, required-columns analysis, operator cutting.

2003.DSRG, Worcester Polytechnic Institute36 Cutting Matrix. Purpose: get rid of unused operators. Equations (not captured in the transcript): propagation of modified columns; propagation of required columns; identification of cuttable nodes.

2003.DSRG, Worcester Polytechnic Institute37 Matrix Computation. Matrix columns: col3, $t, col6, col7, $book, R1, col12, $prices, R3, Cut?. P marks a produced column, C a consumed column. [Figure: the XAT after making the join.]
# | Parent | Marks | Cut?
1 | (root) | col3: C |
2 | 1 | col3: P, $t: C |
3 | 2 | (none legible) |
31* | 3 | col6: C, col7: C |
23 | 31 | $t: P, $book: C |
27 | 23 | col6: P, $book: C |
14 | 27 | $book: P, R1: C |
15 | 14 | R1: P |
25 | 31 | col12: P, $prices: C |
28 | 25 | col7: P, $prices: C |
20 | 28 | $prices: P, R3: C |
21 | 20 | R3: P |
*We assume the Join did not modify $t. Otherwise, only node 25 will be deleted.

2003.DSRG, Worcester Polytechnic Institute38 Matrix Computation (Cont. 1). Same matrix with the required (R) and modified (M) marks added. [Figure: the XAT after making the join.]
# | Parent | Marks | Cut?
1 | (root) | R on four columns (column positions lost in extraction) |
2 | 1 | col3: P, $t: C |
3 | 2 | col3: -, $t: M |
31* | 3 | col6: C, col7: C |
23 | 31 | $t: P, $book: C |
27 | 23 | col6: P, $book: C |
14 | 27 | $book: P, R1: C |
15 | 14 | R1: P |
25 | 31 | col12: P, $prices: C |
28 | 25 | col7: P, $prices: C |
20 | 28 | $prices: P, R3: C |
21 | 20 | R3: P |
*We assume the Join did not modify $t. Otherwise, only node 25 will be deleted.
Intuition: give me only the columns required to get the final result.

2003.DSRG, Worcester Polytechnic Institute39 Matrix Computation (Cont. 2). Same matrix with the cut decisions (X) filled in. [Figure: the XAT after making the join.]
# | Parent | Marks | Cut?
1 | (root) | R on four columns (column positions lost in extraction) |
2 | 1 | col3: P, $t: C |
3 | 2 | col3: -, $t: M |
31* | 3 | col6: C, col7: C | X
23 | 31 | $t: P, $book: C |
27 | 23 | col6: P, $book: C | X
14 | 27 | $book: P, R1: C |
15 | 14 | R1: P |
25 | 31 | col12: P, $prices: C | X
28 | 25 | col7: P, $prices: C | X
20 | 28 | $prices: P, R3: C | X
21 | 20 | R3: P | X
*We assume the Join did not modify $t. Otherwise, only node 25 will be deleted.

2003.DSRG, Worcester Polytechnic Institute40 XAT after Cutting. [Figure: the twelve-operator XAT (1, 2, 3, 31, 23, 27, 14, 15, 25, 28, 20, 21) is reduced to six operators: 1 (top operator over col3), 2 (Tagger $t to col3), 3 (Agg), 23 (navigation from $book along title to $t), 14 (navigation from R1 along /book/row to $book), and 15 (source S reading "dxv.xml" into R1).]
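The cut decisions can be reproduced with a small fixpoint over required columns. This is a sketch under assumptions: the slides' equations were not captured, so the rule used here (keep an operator only if it produces or modifies a required column, and then require what it consumes) is an inference that happens to match both the marked matrix and the footnote. The names `build_ops` and `operators_to_cut` are hypothetical.

```python
def build_ops(join_modifies_t=False):
    """Produced/consumed sets from the slides; 'modified' follows the M mark
    on node 3 and the footnote's assumption about the Join (node 31)."""
    def op(produced, consumed, modified=()):
        return {"produced": set(produced), "consumed": set(consumed),
                "modified": set(modified)}
    return {
        1: op([], ["col3"]),
        2: op(["col3"], ["$t"]),
        3: op([], [], ["$t"]),
        31: op([], ["col6", "col7"], ["$t"] if join_modifies_t else []),
        23: op(["$t"], ["$book"]),
        27: op(["col6"], ["$book"]),
        14: op(["$book"], ["R1"]),
        15: op(["R1"], []),
        25: op(["col12"], ["$prices"]),
        28: op(["col7"], ["$prices"]),
        20: op(["$prices"], ["R3"]),
        21: op(["R3"], []),
    }


def operators_to_cut(ops, root=1):
    """Return (cut operators, required columns) via a fixpoint iteration."""
    required = set(ops[root]["consumed"])  # the final result columns
    kept = {root}
    changed = True
    while changed:
        changed = False
        for node, op in ops.items():
            if node in kept:
                continue
            if (op["produced"] | op["modified"]) & required:
                kept.add(node)
                required |= op["consumed"]
                changed = True
    return set(ops) - kept, required


cut, required = operators_to_cut(build_ops())
cut_if_join_modifies, _ = operators_to_cut(build_ops(join_modifies_t=True))
```

With the default assumption the sketch cuts the join and the whole prices branch; if the join is treated as modifying `$t`, only the unused price navigation (node 25) is removed, as the footnote says.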

2003.DSRG, Worcester Polytechnic Institute41 SQL Generated. [Figure: the twelve-operator XAT next to the six-operator XAT obtained after cutting.]
SQL for the XAT before cleanup: SELECT "$book".title AS "$t", "$book".bid AS "col6", "$prices".price AS "col12", "$prices".bid AS "col7" FROM book "$book", prices "$prices" WHERE "$book".bid = "$prices".bid
SQL for the XAT after cleanup: SELECT "$book".title AS "$t" FROM book "$book"
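The two generated statements can be compared with Python's built-in `sqlite3` module. This is a hedged sketch rather than the Rainbow setup: the table definitions and the price values are hypothetical (the slide's price cells were lost), and the two statements only agree on titles because every book here has exactly one price row, which corresponds to the slides' assumption that the join does not modify `$t`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (bid TEXT, title TEXT)")
conn.execute("CREATE TABLE prices (bid TEXT, price REAL)")
conn.executemany(
    "INSERT INTO book VALUES (?, ?)",
    [("001", "TCP/IP Illustrated"), ("002", "Data on the Web")],
)
# Hypothetical prices: the real values are not in the transcript.
conn.executemany("INSERT INTO prices VALUES (?, ?)", [("001", 10.0), ("002", 20.0)])

# Statement generated from the XAT before cleanup (join plus unused columns).
full_sql = (
    'SELECT "$book".title AS "$t", "$book".bid AS "col6", '
    '"$prices".price AS "col12", "$prices".bid AS "col7" '
    'FROM book AS "$book", prices AS "$prices" '
    'WHERE "$book".bid = "$prices".bid'
)
# Statement generated from the XAT after cleanup.
reduced_sql = 'SELECT "$book".title AS "$t" FROM book AS "$book"'

full_titles = sorted(row[0] for row in conn.execute(full_sql))
reduced_titles = sorted(row[0] for row in conn.execute(reduced_sql))
```

On data where some book has no price row, or several, the two statements would return different titles, so the reduction depends on that assumption holding.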

2003.DSRG, Worcester Polytechnic Institute42 XQuery-Level Optimization XML Algebra Representation: XAT XAT Query Plan Rewriting XAT Query Plan Reduction

2003.DSRG, Worcester Polytechnic Institute43 Performance Gain in Execution

2003.DSRG, Worcester Polytechnic Institute44 Rainbow Engine Overhead. [Chart: overhead of the engine components, including XAT Rewrite and XAT Cleanup; per-component values lost in extraction.] Total: 32,522 ms. Ack.: XQuery parsing uses the Kweelt parser.

2003.DSRG, Worcester Polytechnic Institute46 Related Work.
XPERANTO [VLDBJ 2000]: XQGM vs. XAT; XQuery views over RDB; extension by UDFs for XML features.
SilkRoute [IEEE 2001 (24:2)]: XQuery views over RDB; generates SQL efficiently.
AGORA [VLDB 2000]: syntax-level rewriting.

2003.DSRG, Worcester Polytechnic Institute47 Summary. Efficient XQuery processing. XML Algebra Tree (XAT). XAT optimization: rewriting with equivalence rules; cleanup by schema cleanup and operator cutting. Prototype system implementation.