VOX: Order-sensitive View Maintenance of Materialized XQuery Views. ER 2003, October 14th, 2003. Katica Dimitrova, Maged El-Sayed and Elke Rundensteiner.

Presentation transcript:

VOX: Order-sensitive View Maintenance of Materialized XQuery Views
ER 2003, October 14th, 2003
Katica Dimitrova*, Maged El-Sayed and Elke Rundensteiner
Worcester Polytechnic Institute
*Now at Microsoft

2 Motivation
Views in general
  - Information integration
  - Access control, privacy, etc.
  - Data warehouses
XML views (extra useful)
  - Information inter-portability
  - Crossing gaps between different data models
Materialized views
  - Fast access over complex views
  - Increased availability
  - Query optimization
[Figure: a view defined by a view definition query over RDB, XML, and other sources]

3 Maintaining Materialized Views
When sources are updated, the materialized view may become inconsistent.
Methods of view maintenance
  - Recomputation: recompute the view from scratch from base data
  - Incremental view maintenance: compute changes to the view in response to changes to the base sources
Incremental view maintenance is usually cheaper than full recomputation.
[Figure: a view defined by a view definition query over Source 1, Source 2, Sources 3..n; an update to a source leads to an update to the view]
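
The contrast between the two methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the VOX algorithm: the view is assumed to be a simple filter over (id, price) rows, and the names `recompute_view` and `apply_insert` are hypothetical.

```python
# Hypothetical view: ids of all rows whose price is below a threshold.
THRESHOLD = 60.0

def recompute_view(source):
    """Recomputation: rebuild the view from scratch from the base data."""
    return [row_id for row_id, price in source if price < THRESHOLD]

def apply_insert(view, delta):
    """Incremental maintenance: derive the change to the view from the
    change to the source (delta) alone, without rescanning the source."""
    return view + [row_id for row_id, price in delta if price < THRESHOLD]

source = [("book1", 65.95), ("book2", 39.95)]
view = recompute_view(source)                    # materialize once

delta = [("book3", 55.00), ("book4", 120.00)]    # rows appended to the source
source = source + delta

incremental = apply_insert(view, delta)          # touches only the two new rows
recomputed = recompute_view(source)              # touches all four rows
```

Both paths end in the same view; the incremental one only looks at the inserted rows, which is where the cost advantage comes from when the source is large and the update is small.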

4 Goal
Incrementally maintaining XQuery views
Why is it a challenge?
  - XML features: hierarchical, optional elements, self-typed, IDRefs, ordered
  - Expressiveness of the XQuery language: complex operations such as tagging, unnesting, aggregation
  - Expected large auxiliary information
[Figure: an XML view defined by a view definition XQuery over several XML sources]

5 Basics of VOX Approach: Algebraic
General approaches to view maintenance
  - Algorithmic: a fixed procedure exists for a fixed view type
  - Algebraic: update propagation rules for each algebra operator and each update type
[Figure: the XQuery definition over the XML sources is represented as an algebra tree that produces the XML view. At execution time an operator maps D1 to D2; at view maintenance time it maps an update on D1 to an update on D2.]

6 Example
View definition query (list all books that cost less than $60):
  for $b in document("bib.xml")/bib/book
  where $b/price/text() < 60
  return $b/title
Update: insert element into second book
[Figure: bib.xml and the view extent before and after the update; the books are Advanced Programming in the Unix environment, TCP/IP Illustrated, and Data on the Web]
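
The effect of this update on the view can be reproduced with Python's standard `xml.etree.ElementTree`. This is a sketch by recomputation, only meant to show what the maintained view must look like. The document content is a hypothetical stand-in for bib.xml: the prices are assumed values, and the second book is assumed to have no price before the update.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bib.xml; the prices are assumed values.
BIB = """
<bib>
  <book><title>Advanced Programming in the Unix environment</title><price>65.95</price></book>
  <book><title>TCP/IP Illustrated</title></book>
  <book><title>Data on the Web</title><price>39.95</price></book>
</bib>
"""

def cheap_titles(bib):
    """Emulates: for $b in /bib/book where $b/price/text() < 60 return $b/title."""
    titles = []
    for book in bib.findall("book"):              # iterates in document order
        if any(float(p.text) < 60 for p in book.findall("price")):
            titles.append(book.findtext("title"))
    return titles

bib = ET.fromstring(BIB)
before = cheap_titles(bib)

# The update: insert a price element into the second book.
second_book = bib.findall("book")[1]
price = ET.SubElement(second_book, "price")
price.text = "55.00"

after = cheap_titles(bib)
```

The newly qualifying title has to appear before the one already in the view, because the view follows the document order of the books. An order-insensitive maintenance scheme that simply appended the new title would produce the wrong view.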

7 Background on XML Algebra XAT
XQuery is translated into an XAT algebra tree [ZR02]
XAT operators:
  - XAT SQL operators: Select, Project, ...
  - XAT XML operators: Navigate Unnest, Navigate Collection, Tagger, Combine, ...
[Figure: XAT tree for the query
  for $b in document("bib.xml")/bib/book where $b/price/text() < 60 return $b/title
From the source up: S "bib.xml" -> $s6; navigation $s6, /book -> $b; navigation $b, title -> $col3; navigation $b, price/text() -> $col5; selection ($col5 < 60.0); Tagger T $col3 -> $col2; Combine C $col2; Tagger T $col2 -> $col1; root operator on $col1 yielding the view]

8 Background on XML Algebra XAT: Data Model
XAT data model (XAT table)
  - Order-sensitive table of tuples
  - Columns denote user-specified or internally generated variable bindings
  - A cell in a tuple holds an XML node or a sequence of XML nodes
The XAT algebra has ordered bag semantics.
[Figure: the XAT tree from slide 7, with the input and output tables of the navigation $b, price/text() -> $col5. Each table has one tuple per book (Advanced Programming..., TCP/IP Illustrated, Data on the Web); the output adds the column $col5.]

9 Order in XAT Context: View Maintenance
[Figure: the selection ($col5 < 60.0) applied to a table with columns $col5 and $col3 and one tuple per book (Advanced Prog..., TCP/IP Illustrated, Data on the Web), before and after the update. Before the update the output $col3 holds Data on the Web. After the update, non-order-sensitive maintenance yields Data on the Web, TCP/IP Illustrated; order-sensitive maintenance yields TCP/IP Illustrated, Data on the Web.]

10 Our Approach to Maintaining Order
Use node identity. Why?
  - Already present as a concept in the XQuery reference
  - Can be a reference to the base XML data set
  - Can encode structure and order

11 Lexicographical Keys: LexKeys
Multi-level lexicographical keys
Comparison
  - b.h < b.t
  - bab < bd.cc
  - b.b < b.b.c
Advantages
  - It is always possible to generate a key between two keys
  - The deletion of a LexKey in a sequence does not affect other LexKeys
[Figure: bib.xml tree labeled with LexKeys: bib = b; books b.h, b.n, b.t; titles b.h.r, b.n.m, b.t.r; prices b.h.k, b.n.f, b.t.k]
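
A minimal Python sketch of these two properties follows. It is not the Rainbow implementation: it assumes each level of a LexKey is a lowercase a-z string that never ends in 'a', and the helpers `parse` and `key_between` are hypothetical names. Python tuples of strings happen to compare level by level with a shorter prefix sorting first, which matches the three comparison examples on the slide.

```python
def parse(key):
    """'b.n.m' -> ('b', 'n', 'm'); tuples compare level by level."""
    return tuple(key.split("."))

def key_between(lo, hi):
    """Return a level string s with lo < s < hi.

    lo may be "" (no lower bound) and hi may be None (no upper bound).
    Assumes lowercase a-z level strings that never end in 'a', and lo < hi.
    """
    prefix = ""
    i = 0
    while True:
        a = ord(lo[i]) - ord("a") if i < len(lo) else 0
        b = ord(hi[i]) - ord("a") if hi is not None and i < len(hi) else 26
        if b - a > 1:
            # There is room for a letter strictly between the two.
            return prefix + chr(ord("a") + (a + b) // 2)
        prefix += chr(ord("a") + a)
        if b > a:
            hi = None   # already below hi; only lo constrains the rest
        i += 1

# Insert a new book between b.h and b.t without touching existing keys.
new_book = parse("b") + (key_between("h", "t"),)
siblings = sorted([parse("b.t"), parse("b.h"), new_book])

# Deleting a key leaves the other keys and their relative order unchanged.
siblings.remove(parse("b.h"))
```

Because a new level string can always be made longer, `key_between` never runs out of room, which is what allows insertions without renumbering the existing nodes.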

12 LexKeys: References to source XML nodes
[Figure: the navigation $b, title -> $col3 over bib.xml. Instead of the XML nodes themselves (Advanc.., TCP/IP ..., Data on..), the XAT tables hold LexKeys that reference nodes kept by the storage manager:]
  $b     $col3
  b.h    b.h.r
  b.n    b.n.m
  b.t    b.t.r

13 LexKeys: References to constructed nodes
[Figure: the Tagger T $col3 -> $col2 in the XAT tree from slide 7 maps keys of source nodes to keys of constructed nodes, whose skeletons are kept by the storage manager:]
  $col3    $col2   constructed node (skeleton)
  b.n.m    y.c     cheap_book containing b.n.m
  b.t.r    y.b     cheap_book containing b.t.r

14 Order Among XAT Tuples
Notion: designate an order schema for XAT tables
  - Ordering by the LexKeys in the columns of the order schema yields the correct tuple order.
  - Comparison operation '<' on tuples.
[Figure: the navigation $b, title -> $col3 maps the table $b = (b.h, b.n, b.t) to the tuples ($col3, $b) = (b.h.r, b.h), (b.n.m, b.n), (b.t.r, b.t); the tuples are ordered by $b, with b.h < b.n < b.t]
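
The idea can be sketched in Python as follows. This is an illustration under assumptions, not the system's code: an XAT table is modeled as a list of dicts from column names to LexKey strings, and `sort_by_order_schema` and `tuple_less` are hypothetical names.

```python
def parse(key):
    """'b.n.m' -> ('b', 'n', 'm'); tuples compare level by level."""
    return tuple(key.split("."))

def order_key(tup, order_schema):
    """Project a tuple onto the order-schema columns as comparable LexKeys."""
    return tuple(parse(tup[col]) for col in order_schema)

def tuple_less(t1, t2, order_schema):
    """The '<' comparison on tuples induced by the order schema."""
    return order_key(t1, order_schema) < order_key(t2, order_schema)

def sort_by_order_schema(table, order_schema):
    """Recover the correct tuple order from an unordered table."""
    return sorted(table, key=lambda t: order_key(t, order_schema))

# Tuples stored in arbitrary order.
table = [
    {"$b": "b.t", "$col3": "b.t.r"},
    {"$b": "b.h", "$col3": "b.h.r"},
    {"$b": "b.n", "$col3": "b.n.m"},
]
ordered = sort_by_order_schema(table, ["$b"])
```

Since the order can always be recovered from the keys, the table itself does not need to be kept physically sorted between operators.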

15 Order Schema Computation
Calculated in a postorder traversal of the tree.
Schema computation rules:
  Operator op(R)                    Order schema OS_Q, Q = op(R)
  Tagger T^{$col'}_{pattern}(R)     OS_R

16 Order Among Nodes in a Cell
Most collections of XML nodes are in document order
  - Navigate Collection, XML Union, ...
Combine creates a collection in which nodes may be in an order different than the one encoded in node identity.
Concept of overriding order: a LexKey with overriding order has two parts
  - Key (LexKey): the node identity part; by default it also represents order
  - Overriding order (LexKey): optional; it alone represents order when present
Notation: key[order]
Examples
  - b.c.b[h]
  - b.c.b
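
A small Python sketch of the key[order] notation follows. The class layout and method names are assumptions made for illustration; only the notation and the rule "the overriding order wins when present" come from the slide.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class LexKey:
    key: str                      # node identity; also the order by default
    order: Optional[str] = None   # overriding order, used only when present

    @classmethod
    def parse(cls, text):
        """Parse the slide notation 'key[order]' or plain 'key'."""
        if text.endswith("]"):
            key, order = text[:-1].split("[", 1)
            return cls(key, order)
        return cls(text)

    def sort_key(self):
        """Order by the overriding part if present, else by the identity."""
        effective = self.order if self.order is not None else self.key
        return tuple(effective.split("."))

# Constructed nodes y.b and y.c carry the order of the books they came from.
cell = [LexKey.parse("y.b[b.t]"), LexKey.parse("y.c[b.n]")]
in_order = sorted(cell, key=LexKey.sort_key)
```

Here y.c sorts before y.b even though its identity is the larger key, because its overriding order b.n precedes b.t.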

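17 The Impact of Using LexKeys on View Maintenance
  - The XML algebra now has (non-ordered) bag semantics
  - Gained distributiveness with regard to bag union and difference
  - Compact intermediate results
[Figure: the selection ($col5 < 60.0) over a table of LexKeys with columns $col5, $col3, $b and tuples (b.h.k.m, b.h.r, b.h), (b.n.m, b.n) without a $col5 value, (b.t.k.m, b.t.r, b.t). Its output ($col3, $b) holds (b.t.r, b.t); once b.n.f.m is inserted it holds (b.t.r, b.t) and (b.n.m, b.n), stored without regard to order.]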

18 Update Propagation Strategy
[Figure: within Rainbow, an update XQuery on an XML source is propagated through the XAT tree, with the help of the storage manager, to produce the update to the XML view]

19 Update Propagation Rules
  - Use distributiveness with regard to bag union
  - Reuse rules from relational view maintenance for XAT SQL operators
  - Provide rules for XAT XML operators
[Figure: at execution time an operator maps R to Q; at view maintenance time it maps the update to R to the update to Q]

20 Update Propagation Rules: Example
Navigate Unnest on insertion of tuples ($+$ represents bag union):

$$Q^{old} = \phi^{\$col'}_{\$col,\,path}(R^{old})$$
$$R^{new} = R^{old} + \Delta R$$
$$Q^{new} = \phi^{\$col'}_{\$col,\,path}(R^{old} + \Delta R) = \phi^{\$col'}_{\$col,\,path}(R^{old}) + \phi^{\$col'}_{\$col,\,path}(\Delta R) = Q^{old} + \Delta Q$$

[Figure: at execution time the operator maps R to Q; at view maintenance time it maps u(ΔR) to u(ΔQ), and u(ΔQ) is propagated further]
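
The rule can be checked on a toy example in Python, using `collections.Counter` as a bag. This is a sketch under assumptions: the `CHILDREN` lookup is a hypothetical stand-in for the storage manager, and `navigate_unnest` and `bag` are illustrative names, not the system's API.

```python
from collections import Counter

# Hypothetical stand-in for the storage manager: (parent key, tag) -> child keys.
CHILDREN = {
    ("b.h", "price"): ["b.h.k"],
    ("b.n", "price"): ["b.n.f"],
    ("b.t", "price"): ["b.t.k"],
}

def bag(*rows):
    """Build a bag of tuples; each row is a dict of column bindings."""
    return Counter(tuple(row.items()) for row in rows)

def navigate_unnest(table, col, path, new_col):
    """For each input tuple, emit one output tuple per node reached by path."""
    out = Counter()
    for tup, count in table.items():
        bindings = dict(tup)
        for child in CHILDREN.get((bindings[col], path), []):
            out[tup + ((new_col, child),)] += count
    return out

r_old = bag({"$b": "b.h"}, {"$b": "b.t"})
delta_r = bag({"$b": "b.n"})          # tuples inserted into R

q_old = navigate_unnest(r_old, "$b", "price", "$col5")
delta_q = navigate_unnest(delta_r, "$b", "price", "$col5")  # only the delta is processed
q_new = q_old + delta_q               # Counter '+' acts as bag union

# Same result as re-running the operator on the whole updated input.
q_recomputed = navigate_unnest(r_old + delta_r, "$b", "price", "$col5")
```

The operator only ever sees ΔR at maintenance time, so the cost of propagating the update depends on the size of the update rather than on the size of R.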

21 View Maintenance Example
Insert element into second book: price b.n.f, with text b.n.f.m, under book b.n
Update: Δ+ b.n.f, book[b.n]/price[b.n.f] b
Updates propagated through the XAT tree from slide 7 (Storage Manager, Rainbow):
  u(Δ+ b.n.f, book[b.n]/price[b.n.f] b, $s6, 1)
  u(Δ+ b.n.f, price[b.n.f] b.n, $b, 2)
  u(Δc, $col5, 2) | Δc = b.n.f.m
  u(Δs) | Δs = (b.n, b.n.m)
  u(Δs) | Δs = (b.n, y.c)
  u(Δc, $col2, 1) | Δc = y.c[b.n]
  u(Δ+ y.c[b.n], result[1]/$col2 x, $col1, 1)
Intermediate table below the selection, after the update:
  $col5     $col3   $b    tid
  b.h.k.m   b.h.r   b.h   1
  b.n.f.m   b.n.m   b.n   2
  b.t.k.m   b.t.r   b.t   3
Constructed XDOMs (skeletons by LexKey): y.b = cheap_book containing b.t.r; y.c = cheap_book containing b.n.m (new); x = result containing y.b[b.t] and y.c[b.n]
[Figure: bib.xml tree with LexKeys b; b.h, b.n, b.t; titles b.h.r, b.n.m, b.t.r; prices b.h.k, b.n.f, b.t.k; price texts b.h.k.m, b.n.f.m, b.t.k.m]

22 View Maintenance Example
Insert element into second book
[Figure: state after the update (Storage Manager, Rainbow). bib.xml contains the new price b.n.f with text b.n.f.m under book b.n. Constructed XDOMs: y.b = cheap_book containing b.t.r, y.c = cheap_book containing b.n.m, x = result containing y.b[b.t] and y.c[b.n]. The view exposed from $col1 above the operators S "bib.xml" -> $s6 and T $col2 -> $col1 is a result element with two cheap_book children, whose titles are TCP/IP Illustrated and Data on the Web.]

23 Experimental Evaluation
Implemented in Java on top of the Rainbow system
[Figures: basic performance comparison; varying size of insert. Chart annotations: elements of interest, selectivity 50%]

24 Related work
Relational
  - [GMS93] Survey
  - [GL95] Algebraic approach to maintain relational views with duplicates
  - [BLT86], [CW91], [ZGHW95], [Q96], [MK00], [PSCP02], ...
Object-oriented
  - [KR96] MultiView. Object algebra; exploits OO features like inheritance and path indexes.
  - [AFP02] Algebraic approach. Stores OIDs rather than actual data.
XML-like data models
  - [ZM98] Select-Project graph-structured views as collections of objects.
  - [AMRVW98] Semistructured data model OEM, query language LOREL. Only atomic updates. Does not handle order.
  - [QLR02] Dynamic web data. Based on XPath. Maintains path index structure.
  - [LD00] Hierarchical semistructured data. View defined with WHAX-QL. Does not handle order.
  - [EWDR02] Motivation for this work. Algebraic approach. Does not handle order. Large intermediate results.

25 Conclusions
  - Proposed an order-encoding scheme that migrates the XML algebra from ordered to non-ordered bag semantics
  - Gave the first solution to order-sensitive XQuery view maintenance
      - Handles the core of XQuery
      - Handles complex updates
  - Proved correctness of the approach
  - Implemented the solution within Rainbow
  - Experimental evaluation confirms feasibility of the solution

26 For more information
The Rainbow project
Related publications
  - K. Dimitrova, M. El-Sayed and E. Rundensteiner. Order-sensitive View Maintenance of Materialized XQuery Views. Technical Report WPI-CS-TR-03-17, May
  - M. El-Sayed, K. Dimitrova and E. Rundensteiner. Efficiently Supporting Order in XML Query Processing. WIDM'03, New Orleans, Nov
  - X. Zhang, K. Dimitrova, L. Wang, B. Pielech, L. Ding, B. Murphy, M. El-Sayed and E. Rundensteiner. RainbowII: Multi-XQuery Optimization Using Materialized XML Views. SIGMOD Demo, Jun
  - M. El-Sayed, L. Wang, L. Ding and E. Rundensteiner. An Algebraic Approach for Incremental Maintenance of Materialized XQuery Views. In Proceedings of WIDM02, page 88, 2002.
  - X. Zhang, B. Pielech and E. Rundensteiner. Honey, I Shrunk the XQuery! An XML Algebra Optimization Approach. In Proceedings of WIDM02,
  - X. Zhang, M. Mulchandani, S. Christ, B. Murphy and E. Rundensteiner. Rainbow: Mapping-Driven XQuery Processing System. Proceedings of SIGMOD02, Demo Session, page 614, 2002.

27 Thank you!