HUX: Handling Updates in XML. Database Systems Research Group, Department of Computer Science, Worcester Polytechnic Institute, Worcester, MA 01609, USA

Presentation transcript:

HUX: Handling Updates in XML
Database Systems Research Group, Department of Computer Science, Worcester Polytechnic Institute, Worcester, MA 01609, USA
Ling Wang, Elke A. Rundensteiner, Murali Mani and Ming Jiang

Updating XML Views for Data Integration
[Figure: virtual XML views (news, biologist, Google map, online billing, e-ticket, shopping) defined over XML, RDB and OODB sources such as Time Magazine, Protein Sequence Database (PSD), Citi Bank, American Airline, Amazon and Google Map.]

Examples
Columns: queries, views, observation, reason.
Base relations: Region (R_Key, …), Nation (N_key, …).
Example 1. View schema: View contains Region* and Regionnew*.
  Observation: view elements of Region are never updatable.
  Reason: the base tuples used also contribute to a view element of Regionnew.
Example 2. View schema: View contains Region*, which nests Nation* (R.regionkey=N.regionkey).
  Observation: view elements of Nation are always updatable.
  Reason: each base tuple used joins with one view element of Region only.

Challenges
Update Translatability Checking
- Does at least one correct translation exist?
- If yes, what are the candidate translations?
- If not, where could the view side effect happen?
Update Translation Strategy
- What are the correct translations?
- How to find the correct translations?
- Which one is the best translation?

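Criteria of correct translations
- View side-effect free
- No "extra" updates
[Diagram: (1) the view query maps the database D to the view V; (2) the view update u maps V to u(V); (3) the translated update U maps D to U(D); (4) the view query applied to U(D) should yield u(V).]
Accept the update if there exists a correct translation. Otherwise:
- Option 1: face unexpected side effects
- Option 2: expensive rollback to fix problems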
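The side-effect-free criterion can be illustrated with a small sketch. It is not HUX code: the toy view, the relation layout and every function name below are assumptions made for illustration. A translation U is accepted only if running the view query over U(D) gives exactly u(V).

```python
# Toy illustration of the "view side-effect free" criterion.
# All names and data are hypothetical; they are not taken from HUX.

def view_query(db):
    """Toy view: each region element nests the keys of the nations that join with it."""
    return {
        rkey: sorted(nkey for (nkey, nation_rkey) in db["nation"] if nation_rkey == rkey)
        for rkey in db["region"]
    }

def delete_nation_element(view, nkey):
    """View update u: remove one nation element from the view."""
    return {rkey: [n for n in nations if n != nkey] for rkey, nations in view.items()}

def delete_nation_tuple(db, nkey):
    """Candidate translation U: delete the matching tuple from the nation relation."""
    return {
        "region": set(db["region"]),
        "nation": {(n, r) for (n, r) in db["nation"] if n != nkey},
    }

def delete_region_of_nation(db, nkey):
    """Another candidate U: delete the region tuple that the nation joins with."""
    doomed = {r for (n, r) in db["nation"] if n == nkey}
    return {"region": db["region"] - doomed, "nation": set(db["nation"])}

def is_side_effect_free(db, u, U):
    """The diagram commutes when view(U(D)) equals u(view(D))."""
    return view_query(U(db)) == u(view_query(db))

db = {"region": {1, 2}, "nation": {(10, 1), (11, 1), (20, 2)}}
u = lambda view: delete_nation_element(view, 10)

good = is_side_effect_free(db, u, lambda d: delete_nation_tuple(d, 10))
bad = is_side_effect_free(db, u, lambda d: delete_region_of_nation(d, 10))
print(good, bad)
```

Deleting the nation tuple leaves the rest of the view intact, while deleting the region tuple also removes the region element and its other nation, which is a view side effect.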

Naive Approach: Pure Data-driven Check
Core idea: when updating a view element, base tuples that contribute to other view elements should remain untouched.
Pure data-driven check
- Guaranteed to be "safe"
- Guaranteed to be "complete"
But:
- Very inefficient for XML view updating: data must be examined for all view nodes
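A minimal sketch of the core idea, assuming a hypothetical provenance map from each view element to the base tuples that contribute to it. It ignores the nesting of XML elements and is not the algorithm from the talk. Note that the loop visits every other view element, which hints at why a pure data-driven check is costly.

```python
# Hypothetical provenance-based check; element and tuple names are made up.

def safe_tuples_to_delete(provenance, target):
    """Return the base tuples of `target` that no other view element depends on."""
    used_elsewhere = set()
    for element, tuples in provenance.items():  # examines every view element
        if element != target:
            used_elsewhere |= tuples
    return provenance[target] - used_elsewhere

# View 1: Region and Regionnew elements are both built from the same region tuple.
view1 = {
    "Region[1]": {("region", 1)},
    "Regionnew[1]": {("region", 1)},
}

# View 2: a Nation element is built from a region tuple joined with a nation tuple.
view2 = {
    "Region[1]": {("region", 1)},
    "Nation[10]": {("region", 1), ("nation", 10)},
}

print(safe_tuples_to_delete(view1, "Region[1]"))   # nothing can be deleted safely
print(safe_tuples_to_delete(view2, "Nation[10]"))  # the nation tuple is safe to delete
```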

HUX I: Exploiting Schema Knowledge
[Diagram: a view update is classified as valid or invalid; at the schema level a valid update is translatable, untranslatable or uncertain; at the data level an uncertain update is translatable or untranslatable.]
Schema-driven translatable: for every update on any element of the schema node, there is at least one correct translation.
Schema-driven untranslatable: for every update on any element of the schema node, there does not exist any correct translation.

View Elements Classification
- Classify schema nodes into Self, Ancestor, Descendant, Others (SADO)
- No side effects on any element of a SADO schema node
[Diagram: schema tree with nodes labelled S, A, D and O.]
View Side-effects Classification
When updating a relation r to update a view element ve_i:
(1) r may contribute to the existence of ve_j, ve_j ≠ ve_i.
(2) r_j may get deleted, where r_j is a relation that refers to r and contributes to ve_j, ve_j ≠ ve_i.
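The Self/Ancestor/Descendant/Others split can be sketched over a view schema tree. The parent map below mirrors the Region/Nation/Customer/Orders/LineItem/Regionnew view used elsewhere in these slides; the function names are hypothetical and this covers only the classification step, not the side-effect reasoning.

```python
# Hypothetical schema tree: child schema node -> parent schema node.
PARENT = {
    "Region": "View",
    "Regionnew": "View",
    "Nation": "Region",
    "Customer": "Nation",
    "Orders": "Customer",
    "LineItem": "Orders",
}

def ancestors(node):
    """Walk parent links up to the root and collect every node on the way."""
    result = []
    while node in PARENT:
        node = PARENT[node]
        result.append(node)
    return result

def classify(target, other):
    """Classify `other` relative to the schema node `target` being updated."""
    if other == target:
        return "self"
    if other in ancestors(target):
        return "ancestor"
    if target in ancestors(other):
        return "descendant"
    return "other"

sado = {node: classify("Nation", node) for node in PARENT}
print(sado)
```

For an update on Nation, Region is an ancestor, Customer, Orders and LineItem are descendants, and Regionnew falls into Others.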

Exploiting Schema Knowledge (Cont'd)
Core idea: when updating a view element, relations that contribute to other view elements should remain untouched.
Pros: efficient, since only schema knowledge is used.
Cons: conservative, since the worst case is always assumed.
[View schema: View contains Region* and Regionnew*; Region nests Nation*, Customer*, Orders* and LineItem* in turn, joined on R.regionkey=N.regionkey, N.nationkey=C.nationkey, C.customerkey=O.customerkey, O.orderkey=LI.orderkey.]
Can we delete a view element whose schema node is LineItem? Nation? Regionnew?

HUX II: Schema-directed Data-driven Check
Data-driven translatable: for the given update on an element of the schema node, there is at least one correct translation.
Data-driven untranslatable: for the given update on an element of the schema node, there does not exist any correct translation.
[Diagram: a view update is classified as valid or invalid; at the schema level a valid update is translatable, untranslatable or uncertain; at the data level an uncertain update is translatable or untranslatable.]

Schema-directed Data-driven Check (Cont'd)
Core idea: when updating a view element, base tuples that contribute to other view elements should remain untouched.
This is not a pure data-driven check: data is checked only where the schema check cannot decide.
[View schema: View contains Region* and Regionnew*; Region nests Nation*, Customer*, Orders* and LineItem* in turn, joined on R.regionkey=N.regionkey, N.nationkey=C.nationkey, C.customerkey=O.customerkey, O.orderkey=LI.orderkey.]
Can we delete a view element whose schema node is Regionnew?

HUX: Handling Updates in XML
[Architecture diagram: the view query and the XML/RDB schema feed the Annotated Schema Graph (ASG) Generator. A valid user update query goes to STAR (Schema-driven Translatability Reasoning); uncertain updates go on to SDC (Schema-directed Data Checking). On success (translatable), the SQL Update Generator issues SQL updates to the data storage (Oracle, DB2, SQL-Server, Sybase); on failure (untranslatable), an error message is returned.]
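The control flow suggested by the architecture can be sketched as follows. This is an assumption-laden outline, not the HUX implementation: the three callables stand in for STAR, SDC and the SQL Update Generator, and their signatures are invented for illustration.

```python
from enum import Enum

class Verdict(Enum):
    TRANSLATABLE = "translatable"
    UNTRANSLATABLE = "untranslatable"
    UNCERTAIN = "uncertain"

def handle_update(update, schema_check, data_check, generate_sql):
    """Run the schema-level check first; consult the data only when it is uncertain."""
    verdict = schema_check(update)       # stands in for STAR
    if verdict is Verdict.UNCERTAIN:
        verdict = data_check(update)     # stands in for SDC
    if verdict is Verdict.TRANSLATABLE:
        return ("ok", generate_sql(update))
    return ("error", "update is untranslatable")

# Hypothetical stubs: verdicts are keyed by made-up schema node names.
SCHEMA_VERDICTS = {
    "A": Verdict.TRANSLATABLE,
    "B": Verdict.UNTRANSLATABLE,
    "C": Verdict.UNCERTAIN,
}
data_check_calls = []

def schema_check(update):
    return SCHEMA_VERDICTS[update]

def data_check(update):
    data_check_calls.append(update)      # record how often the data is touched
    return Verdict.TRANSLATABLE

def generate_sql(update):
    return f"-- SQL updates for {update}"

results = [handle_update(u, schema_check, data_check, generate_sql) for u in "ABC"]
print(results, data_check_calls)
```

Only the uncertain update reaches the data check, which is the point of putting the schema-level reasoning in front.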

Experiments: HUX vs. Data-based
Setup: TPC-H benchmark. The data-based XML view updating system (DXVU) extends the relational view update system [CWW2000] to perform the XML view check.
[View schema: View contains Region* and Regionnew*; Region nests Nation*, Customer*, Orders* and LineItem* in turn, joined on R.regionkey=N.regionkey, N.nationkey=C.nationkey, C.customerkey=O.customerkey, O.orderkey=LI.orderkey.]
[Chart categories: schema translatable, schema untranslatable, data check.]
Observations:
(1) Schema cases: HUX performs much better than DXVU.
(2) Data cases: the two systems are comparable.

Contribution
- Proposed a theoretical foundation for XML view updates.
- Proposed a schema-centric view updating algorithm.
- Proved its correctness and completeness.
- Implemented it as an extension of the Rainbow query engine.
- Demonstrated its efficiency through experiments.

Rainbow Project Recent Publications
- L. Wang, E. A. Rundensteiner, M. Mani and M. Jiang. HUX: A Schemacentric Approach for Updating XML Views. In CIKM, 2006, to appear.
- L. Wang, E. A. Rundensteiner and M. Mani. UFilter: A Lightweight XML View Update Checker. In ICDE, 2006.
- L. Wang, E. A. Rundensteiner and M. Mani. Updating XML Views Published Over Relational Databases: Towards the Existence of a Correct Update Mapping. In DKE Journal.
- L. Wang, S. Wang, B. Murphy and E. A. Rundensteiner. Order Sensitive XQuery Processing over Relational Sources: An Algebraic Approach. In IDEAS.
- L. Wang and E. A. Rundensteiner. On the Updatability of XQuery Views Published over Relational Data. In ER, pages 795–809, 2004.

Problem Space

HUX vs. Relational View Update System
[View schema: View contains Region*, which nests Nation*, Customer*, Orders* and LineItem* in turn, joined on R.regionkey=N.regionkey, N.nationkey=C.nationkey, C.customerkey=O.customerkey, O.orderkey=LI.orderkey.]
[Chart labels: L, LO, LOC, LOCN, LOCNR.]
Data-driven relational view update (RVU) system [CWW2000]:
- Best case: finds the translation at the first probe.
- Worst case: finds it at the last probe.
HUX: schema-driven check only.
Observation: HUX is better than RVU even in RVU's best case.

Experiments
- HUX vs. non-guaranteed (unsafe) system
- HUX vs. relational view update system
- HUX vs. pure data-driven XML view update system
- Performance of HUX
- Usefulness of HUX (user study)
Experimental setup
- TPC-H benchmark
- Stop criterion: First Correct Translation (FCT)
- Exhaustive search criterion: Find All Correct Translations (ACT)
- Parameters: database size, keys and foreign keys, element to be deleted in the view, view size
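The difference between the two search criteria can be sketched with a generator over candidate translations. The candidate names and the correctness predicate are placeholders, not experimental data; the probe count is only meant to show that FCT stops early while ACT examines every candidate.

```python
# Hypothetical sketch of the FCT and ACT stop criteria.

def correct_translations(candidates, is_correct):
    """Lazily yield (probe number, candidate) for each correct translation."""
    for probe, candidate in enumerate(candidates, start=1):
        if is_correct(candidate):
            yield probe, candidate

def first_correct_translation(candidates, is_correct):
    """FCT: stop as soon as one correct translation is found."""
    return next(correct_translations(candidates, is_correct), None)

def all_correct_translations(candidates, is_correct):
    """ACT: exhaustively collect every correct translation."""
    return list(correct_translations(candidates, is_correct))

candidates = ["delete-region", "delete-nation", "delete-customer"]
is_correct = lambda c: c in {"delete-nation", "delete-customer"}

fct = first_correct_translation(candidates, is_correct)
act = all_correct_translations(candidates, is_correct)
print(fct, act)
```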