Self-Maintenance of Materialized XML Views with Non-Cooperative Data Sources. DBDBD 2006. Virginie Sans, ETIS/CNRS Laboratory, MIDI Team.


2 Summary
1) Issue and context
   1) Pre-requisite
   2) The issue
   3) Context
   4) State of the art
2) Contributions
   1) View computation with the XAlgebra
   2) Detection and identification of source updates
   3) View maintenance
   4) Applications and performances
Conclusion

3 Mediation architecture
Introduced by Wiederhold
The architecture:
- mediator
- wrappers
- sources
- query language
1.1 Pre-requisite

4 Mediation architecture
Mediator
- Handles the user request: canonization, atomization
- Sends each atomic request to a source via its wrapper
Wrappers
- Translate a query coming from the mediator into a query in the native language of the web source
- Return the answer to the mediator in XML
Data sources
- Heterogeneous
- Distributed
- In a web context: partially unavailable
[Figure: Mediator, Wrapper, Source; labels: atomic request, SQL, tuples, XML]
1.1 Pre-requisite

5 Views
What about views?
- Data integration
- Access control, security
- Data warehouses
Why?
- Interoperability
- Heterogeneous data
Materializing views
- Fast access to complex queries
- Better availability
- Request optimization
[Figure: sources (RDB, SQL, HTML), wrappers, mediator, materialized views]
1.1 Pre-requisite

6 Issue: view maintenance
When data sources are updated, the view must be kept consistent.
Maintenance process
- Recomputation: recompute the whole view from scratch
- Incremental maintenance: compute the changes to the view in response to changes to the base sources
[Figure: Source t, Source t+1, View t, View t+1; labels: view computation, update, recomputation, incremental maintenance]
1.2 Issue
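The difference between the two strategies can be illustrated with a toy sketch. Everything below is an assumption made for illustration (a flat list of rows, a simple price filter as the view, made-up rows); it is not the XAlgebra-based process described later in the slides.

```python
# Hypothetical toy model: a source is a list of (title, price) rows and the
# view keeps the titles of the rows cheaper than 60.

def compute_view(source):
    """Recomputation: rebuild the whole view from the source."""
    return [title for title, price in source if price < 60]

def maintain_view(view, inserted, deleted):
    """Incremental maintenance: apply only the source changes to the view."""
    result = list(view)
    for title, price in deleted:
        if price < 60:
            result.remove(title)   # drop one occurrence of the deleted row
    for title, price in inserted:
        if price < 60:
            result.append(title)
    return result

source_t = [("TCP/IP Illustrated", 65.95), ("Data on the Web", 39.95)]
view_t = compute_view(source_t)

# Source update between t and t+1 (illustrative rows).
inserted = [("XML Basics", 20.0)]
deleted = [("Data on the Web", 39.95)]
source_t1 = [row for row in source_t if row not in deleted] + inserted

view_t1 = maintain_view(view_t, inserted, deleted)
```

Both paths reach the same view at t+1, but the incremental one never rereads the source; it only needs the changes.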

7 Context: semi-structured XML data
- XML views are materialized at the mediator level
- Hierarchical data
- No schema, except the query schema
[Figure: XML example listing book titles: Advanced Programming in the Unix environment; TCP/IP Illustrated; Data on the Web; Données sur le Web]
1.3 Context

8 Context: XQuery
XQuery
- Dedicated to XML data
- Relational operators (projection, selection, join, union, ...)
- XML operators (tagging, unnesting, aggregation, ...)
- FLWOR syntax (pronounced "flower")
Example:
  for $b in document("bib.xml")/bib/book
  let $a := $b/author
  where $b/price/text() < 60
  order by $b/year
  return $b/title
FLWOR syntax:
  for $var in forest [, $var in forest]*
  let $var := subtree
  where condition
  return result
1.3 Context
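For readers unfamiliar with FLWOR, the example query can be mirrored step by step in Python with the standard library. This is only a sketch: the inline document, its prices and years are made up, and `year` is assumed to be a child element as in the query above.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bib.xml (values are illustrative).
BIB = """
<bib>
  <book><title>Data on the Web</title><author>A</author>
        <year>2000</year><price>39.95</price></book>
  <book><title>TCP/IP Illustrated</title><author>B</author>
        <year>1994</year><price>65.95</price></book>
  <book><title>Advanced Programming in the Unix environment</title><author>B</author>
        <year>1992</year><price>55.00</price></book>
</bib>
"""

root = ET.fromstring(BIB)

# for $b in document("bib.xml")/bib/book
books = root.findall("book")
# where $b/price/text() < 60
cheap = [b for b in books if float(b.findtext("price")) < 60]
# order by $b/year
cheap.sort(key=lambda b: int(b.findtext("year")))
# return $b/title
titles = [b.findtext("title") for b in cheap]
```

The `let $a := $b/author` clause is omitted because the query never uses `$a` in its result.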

9 Context: other specificities
- Views are computed using the XAlgebra (cf. View computation)
- Wrappers have limited resources: few computation possibilities
- A component named the logger stores the last modification date and a checksum of each source
- Non-cooperative web sources
  - No information about their updates
  - Not always available
  - Not enough granularity
1.3 Context

10 State of the art (1/2)
Relational views
- Not fit for semi-structured data
Abiteboul et al.
- OEM (Object Exchange Model)
- LOREL language
- Some operators are missing
VOX (Rainbow team)
- Needs to know the exact position in the XML tree where the update was made
1.4 State of the art

11 State of the art (2/2)
Cobena et al.
- XDiff: an algorithm for comparing XML files
- Needs a copy of the source at the wrapper level
Bonnet et al. / Papadimos et al.
- Parachute queries
- Mutant query plans
What about when sources are really unavailable?
Our goal:
- Reduce source accesses to the minimum
- Use the information stored in the view
1.4 State of the art

12 View maintenance: the process
View computation
- An algebraic approach using the XAlgebra
- Extension of the XAlgebra (identifiers)
Update detection
- Comparison of the source information with that stored in the logger
Update identification
- Recovering process
- Diff algorithm
View maintenance
- Propagation rules for each operator
2.1 View computation

13 View computation Steps : 2.1 View computation

14 The XAlgebra data model
Data structures:
- XRelation, XTuple, XAttributes
Operators:
- XSource, XConstruct, XUnion, ...
2.1 View computation

15 XSource operator: step 1
XQuery analysis. We obtain:
- A context
- A set of patterns
  for $f in doc("informations.xml")/personnes/personne
  let $a := $f/nom
  where $f/age < 27 and $a = "Durand"
  return {$a} {$f/prenom}
Path extraction:
- Optional
- Mandatory
- Hidden
2.1 View computation

16 XSource operator: steps 2 and 3
From XML subtrees to the tabular structure
- 1 subtree => 1 XTuple
- XRelation = set of XTuples
2.1 View computation
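The mapping from subtrees to a tabular structure can be sketched as follows. The function name `xsource`, the dictionary representation of an XTuple and the sample document are assumptions made for illustration; the slides do not show the actual data layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical content for informations.xml.
DOC = """<personnes>
  <personne><nom>Durand</nom><prenom>Marcel</prenom><age>25</age></personne>
  <personne><nom>Leclerc</nom><prenom>John</prenom><age>31</age></personne>
</personnes>"""

def xsource(xml_text, context, patterns):
    """Build an XRelation (list of XTuples): one XTuple per context subtree."""
    root = ET.fromstring(xml_text)
    xrelation = []
    for subtree in root.findall(context):
        # One column per pattern; a column is a list because a subtree may
        # contain several elements matching the same pattern.
        xtuple = {p: [e.text for e in subtree.findall(p)] for p in patterns}
        xrelation.append(xtuple)
    return xrelation

xr = xsource(DOC, "personne", ["nom", "prenom", "age"])
```

Here the context is the `personne` path and the patterns are the paths used by the query (`nom`, `prenom`, `age`).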

17 XSource operator: extending the algebra
Adding identifiers: XTids
An XTid is a set of pairs: {(idsource, idfragment), ...}
2.1 View computation

18 View computation - XOperator XProject 2.1 View computation

19 View computation - XOperator XJoin XTids propagation : card (XTID)  1 for some nodes 2.1 View computation
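One way to picture XTids and their propagation through a join is sketched below. The representation (a value paired with a frozenset of (source id, fragment id) pairs) and the rule that a join unions the XTids of the values it merges are assumptions made for illustration; the slide figure that showed the actual propagation is not available.

```python
def make_xvalue(value, source_id, fragment_id):
    """An XValue read from a source carries an XTid with a single pair."""
    return (value, frozenset({(source_id, fragment_id)}))

def join_xvalues(left, right):
    """Merge two XValues that match in a join; their XTids are unioned."""
    left_value, left_xtid = left
    right_value, right_xtid = right
    if left_value != right_value:
        return None                      # no match, nothing to merge
    return (left_value, left_xtid | right_xtid)

a = make_xvalue("Durand", 1, 3)          # fragment 3 of source 1
b = make_xvalue("Durand", 2, 7)          # fragment 7 of source 2
joined = join_xvalues(a, b)
```

After the join the value remembers every fragment it came from, which is what later makes a partial recovery of each source possible.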

20 Update detection and identification
Detection
- Comparison of the source information with that stored in the logger:
  - the last modification date
  - the checksum of the source
Identification
- Partial recovery of the source information based on XTids
- Comparison of the recovered XRelation with the updated source
- Δ computation
2.2 Update detection and identification
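The detection step can be sketched as a comparison against the logger record. The record layout, the use of SHA-256 and the order of the two tests (date first, checksum as confirmation) are assumptions; the slides only say that both pieces of information are compared.

```python
import hashlib

def checksum(content: bytes) -> str:
    """Checksum of a source; SHA-256 is an assumption, not the authors' choice."""
    return hashlib.sha256(content).hexdigest()

def source_updated(logged: dict, last_modified: str, content: bytes) -> bool:
    """Compare the current source state with what the logger stored."""
    if last_modified == logged["last_modified"]:
        return False                     # cheap test: date unchanged
    # The date changed: confirm with the checksum that the content differs.
    return checksum(content) != logged["checksum"]

# Hypothetical logger record written when the view was last maintained.
logged = {"last_modified": "2006-01-01", "checksum": checksum(b"<a/>")}

unchanged = source_updated(logged, "2006-01-01", b"<a/>")
touched_only = source_updated(logged, "2006-02-01", b"<a/>")
really_updated = source_updated(logged, "2006-02-01", b"<b/>")
```

Only the last case would trigger the identification step.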

21 XRecover Step 1 : Project XR v on XR 1 patterns 2.2 Update detection and identification

22 XRecover Step 2 : filtering XTuples values 2.2 Update detection and identification

23 XRecover step 3: re-ordering XTuples (XTidUnnest)
XTuples are unnested depending on their XTids
2.2 Update detection and identification

24 XRecover step 3: re-ordering XTuples (XTidNest)
- XTuples are nested by their XTids
- XTuples are re-ordered
2.2 Update detection and identification
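A possible reading of the two re-ordering operators is sketched below: unnesting splits a value into one entry per XTid pair, and nesting regroups the entries by pair and sorts the groups to recover the source order. The function names, the data layout and the sort key are assumptions; the slides only name the operators and their effect.

```python
from collections import defaultdict

def xtid_unnest(xvalues):
    """One (value, pair) entry for every pair of every XValue's XTid."""
    return [(value, pair) for value, xtid in xvalues for pair in sorted(xtid)]

def xtid_nest(entries):
    """Group values by XTid pair, ordered by (source id, fragment id)."""
    groups = defaultdict(list)
    for value, pair in entries:
        groups[pair].append(value)
    return [(pair, groups[pair]) for pair in sorted(groups)]

# Hypothetical view column: "Alain" comes from fragments 3 and 4 of source 1.
view_values = [
    ("Durand", frozenset({(1, 4)})),
    ("Alain", frozenset({(1, 3), (1, 4)})),
    ("Leclerc", frozenset({(1, 3)})),
]

flat = xtid_unnest(view_values)
recovered = xtid_nest(flat)
```

The result has one group per source fragment, in fragment order, which is the shape needed before comparing with the updated source.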

25 Update identification: comparison algorithm
Comparison of XR 1 t+1 with XR t'
- XR 1 t+1 is the XRelation obtained by applying XSource to source 1 at t+1
- XR t' is the partial recovery of the XRelation of source 1 at t
Remark: XR 1 t+1 can also be filtered using predicates before comparison
The Diff algorithm is based on Unix diff (Hunt & McIlroy); the compared symbol is the XTuple instead of the line
2.2 Update detection and identification

26 Update identification: Diff algorithm
Delta Δ with hunks:
- Insert(pos; XTuple)
- Delete(pos; XTuple)
- Replace(pos; XTuple_old, XTuple_new)
Examples:
  Insert(2,{Leclerc,Avide,{(1,3)}} {John,Avide,{(1,3)}} }
  Delete(4,{Durand,Avide,{(1,11)}}, {Marcel,Avide,{(1,11)}} {Eric,Avide,{(1,11)}}}
  etc.
2.2 Update detection and identification
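A tuple-level diff producing such hunks can be sketched with Python's `difflib`. Note the substitution: `difflib.SequenceMatcher` is not the Hunt and McIlroy algorithm, it merely plays the same role here, with the XTuple as the compared symbol. The hunk layout and the sample XTuples are assumptions.

```python
from difflib import SequenceMatcher

def xtuple_diff(old, new):
    """Return insert/delete/replace hunks turning `old` into `new`."""
    hunks = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            hunks.append(("insert", i1, new[j1:j2]))
        elif tag == "delete":
            hunks.append(("delete", i1, old[i1:i2]))
        elif tag == "replace":
            hunks.append(("replace", i1, old[i1:i2], new[j1:j2]))
    return hunks

# XTuples are modelled as plain tuples so that they are hashable.
recovered = [("Durand", "Marcel"), ("Martin", "Eric")]
updated = [("Durand", "Marcel"), ("Leclerc", "John"), ("Martin", "Eric")]

delta = xtuple_diff(recovered, updated)
reverse_delta = xtuple_diff(updated, recovered)
```

Positions are 0-based indexes into the old sequence, whereas the slide examples appear to use their own numbering.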

27 Maintenance rules: from Delta to view maintenance
Case of a deletion: delete(pos; XTuple)
The deleted XTuple is associated with an XTid {(x)} such that card = 1. Each XValue of the view has XTids, noted XTID.
1) We delete the pair x from the XTid of each XValue such that x ∈ XTID
   Example: the XTuple whose XTid is x = (1,3) has been deleted; the XValue {Alain}1,3;1,4 becomes the XValue {Alain}1,4
2) We delete each XValue such that card(XTID) = 0
   Example: if the XValue {Alain}1,3 becomes the XValue {Alain}, we delete the XValue entirely
3) If the XValue was concerned by the predicate, we delete the XTuple (join and restriction case)
2.3 View maintenance
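Rules 1 and 2 of the deletion case can be sketched as follows, reusing the slide's {Alain} example. The representation of an XValue as a (value, XTid) pair is an assumption, the second value "Paul" is made up, and rule 3 (predicate handling) is not covered.

```python
def propagate_delete(xvalues, deleted_pair):
    """Apply a source deletion to the XValues of the view."""
    kept = []
    for value, xtid in xvalues:
        remaining = xtid - {deleted_pair}    # rule 1: remove the pair x
        if remaining:                        # rule 2: drop if card(XTID) = 0
            kept.append((value, remaining))
    return kept

view_values = [
    ("Alain", frozenset({(1, 3), (1, 4)})),  # still supported by (1,4)
    ("Paul", frozenset({(1, 3)})),           # only supported by (1,3)
]

after_delete = propagate_delete(view_values, (1, 3))
```

No source access is needed: the XTids stored in the view are enough to decide what disappears.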

28 Maintenance rules: from Delta to view maintenance
Case of an insertion: insert(pos; XTuple)
1) A new XTid is created
   Goal: preserve the XTuple order for a later recovery
2) Depending on the operator, we obtain various maintenance instructions
   - Projection: insert the projection of the XTuple
   - Select: if the XTuple satisfies the predicate, insert it
   - Join XR 1 * XR 2: compute XT = XTuple * XR 2; if XT ≠ ∅, insert XT
   - Union and Intersect: duplicates are kept
     - Union: treated as a Select whose predicate is always true
     - Intersect: treated as a Join
   Depending on the predicate, we can query either XR 2 or its recovery
2.3 View maintenance
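The per-operator insertion rules can be sketched as three small functions. XTuples are modelled as flat dictionaries, new rows are appended at the end (the position and the new XTid are ignored), and all names and sample values are assumptions made for illustration.

```python
def insert_project(view, xtuple, attributes):
    """Projection: insert the projection of the new XTuple."""
    view.append({name: xtuple[name] for name in attributes})

def insert_select(view, xtuple, predicate):
    """Select: insert the new XTuple only if it satisfies the predicate."""
    if predicate(xtuple):
        view.append(dict(xtuple))

def insert_join(view, xtuple, xr2, key):
    """Join: compute XT = xtuple * XR2 and insert it when it is not empty."""
    xt = [{**xtuple, **other} for other in xr2 if other[key] == xtuple[key]]
    view.extend(xt)

new_xtuple = {"nom": "Leclerc", "prenom": "John", "age": 31}
# Hypothetical second relation (or its recovery from the view).
xr2 = [{"nom": "Leclerc", "ville": "Cergy"}, {"nom": "Durand", "ville": "Paris"}]

projected, selected, joined = [], [], []
insert_project(projected, new_xtuple, ["nom"])
insert_select(selected, new_xtuple, lambda t: t["age"] < 27)
insert_join(joined, new_xtuple, xr2, "nom")
```

For the join, `xr2` may be either the second source itself or its recovery, which is the choice the slide leaves to the predicate.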

29 Maintenance rules: from Delta to view maintenance
Case of a modification: Replace(pos; XTuple_old, XTuple_new)
XTuple modification = XValue modification OR XValue deletion followed by insertion
- Project and Union: modification of the concerned XValues
- Select and Intersect: if the modification applies to an XValue that must satisfy the condition, delete the XTuple; else modify the XValues
- Intersect: treated as a Select
- Join: deletion followed by insertion
2.3 View maintenance

30 Maintenance Rules From Delta to view maintenance 2.3 View maintenance

31 Maintenance rules: missing information
Missing information (join?)
- Source recovery
- Multi-view strategy
- Source request
Goal: limited access to the sources
Example: View = S 1 * S 2
- XTuple x is inserted in S 1
- Computation of S 2'
- Insertion: x * S 2'
[Figure: SQL and HTML sources, wrapper, mediator, materialized views]
2.3 View maintenance

32 Applications
On the web
- When necessary sources are unavailable
- Goal: limited access to them
With sensors (ANR project)
- With wireless sensors
- Goal: preserve power resources
2.4 Applications and performances

33 Performances Comparison between XRecover and Recomputation 2.4 Applications and performances

34 Performances Comparison between XRecover and Recomputation 2.4 Applications and performances

35 Contributions
- Maintenance process in the context of non-cooperative web sources
- Contribution to the XAlgebra
  - New operators: XRecover, XTidUnnest, XTidNest
  - New data structure: XTids
Future work
- Order-sensitive view maintenance
- A better Diff algorithm
Conclusion

36 Thanks for your attention! Any questions?