Self Maintenance of materialized XML views with non-cooperative data sources DBDBD – 2006 Virginie Sans –ETIS/CNRS Laboratory– MIDI Team.


1 Self Maintenance of materialized XML views with non-cooperative data sources DBDBD – 2006 Virginie Sans –ETIS/CNRS Laboratory– MIDI Team

2 2 Summary
1) Issue and context
   1) Pre-requisite
   2) The issue
   3) Context
   4) State of the art
2) Contributions
   1) View computation with the XAlgebra
   2) Detection and identification of source updates
   3) View maintenance
   4) Applications and performances
Conclusion

3 3 Mediation architecture
Introduced by Wiederhold
The architecture:
- Mediator
- Wrappers
- Sources
- Query language
1.1 Pre-requisite

4 4 Mediation architecture
Mediator
- Handles the user request: canonization, atomization
- Sends each atomic request to a source via its wrapper
Wrappers
- Translate a query coming from the mediator into a query in the native language of the web source
- Return an answer in XML to the mediator
Data sources
- Heterogeneous
- Distributed
- In a web context: partially unavailable
[Diagram labels: Source, SQL, Wrapper, Mediator, XML, Atomic request, Tuples]
1.1 Pre-requisite

5 5 Views
What about views?
- Data integration
- Access control, security
- Data warehouses
Why?
- Interoperability
- Heterogeneous data
Materializing views
- Fast access to complex queries
- Better availability
- Request optimization
[Diagram labels: RDB, SQL, HTML, Wrapper, Wrapper, Mediator, Materialized views]
1.1 Pre-requisite

6 6 Issue: View maintenance
When data sources are updated, the view must be kept consistent.
Maintenance process
Recomputation
- Recompute the whole view from scratch
Incremental maintenance
- Compute changes to the view in response to changes to the base sources
[Diagram labels: Source t, View t, View computation, Source t+1, View t+1, Recomputation, Update, Incremental maintenance]
1.2 Issue
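
The difference between the two maintenance strategies can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the view is a plain selection over a list of records, and the function names `compute_view` and `maintain_view` are hypothetical, not part of the presented system.

```python
def compute_view(source, predicate):
    """Recomputation: rebuild the whole view from the source."""
    return [row for row in source if predicate(row)]


def maintain_view(view, inserted, deleted, predicate):
    """Incremental maintenance: apply only the source changes to the view."""
    kept = [row for row in view if row not in deleted]
    return kept + [row for row in inserted if predicate(row)]


def cheap(book):
    # Selection predicate of the view (price threshold taken from the XQuery example).
    return book["price"] < 60


source_t = [
    {"title": "Advanced Programming in the Unix environment", "price": 65.95},
    {"title": "Data on the Web", "price": 39.95},
]
view_t = compute_view(source_t, cheap)

# Hypothetical update of the source between t and t+1.
inserted = [{"title": "Hypothetical New Book", "price": 20.0}]
deleted = []
source_t1 = source_t + inserted

view_t1 = maintain_view(view_t, inserted, deleted, cheap)
```

Both paths lead to the same view at t+1, but the incremental one never rereads the unchanged part of the source.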

7 7 Context: semi-structured XML data
- XML views are materialized at the mediator level
Hierarchical data
No schema, except the query schema
[XML example, markup lost in extraction. Remaining values: 65.95 Advanced Programming in the Unix environment; TCP/IP Illustrated; 65.95 Advanced Programming in the Unix environment; 39.95 Data on the Web; Données sur le Web]
1.3 Context

8 8 Context: XQuery
- XQuery
- Dedicated to XML data
Relational operators (projection, selection, join, union, ...)
XML operators (tagging, unnesting, aggregation, ...)
- FLWOR syntax (pronounced "flower")
Example:
    for $b in doc("bib.xml")/bib/book
    let $a := $b/author
    where $b/price/text() < 60
    order by $b/year
    return $b/title
FLWOR syntax:
    for $var in forest [, $var in forest]*
    let $var := subtree
    where condition
    return result
1.3 Context
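
To make the FLWOR clauses concrete, here is a rough Python equivalent of the example query, using the standard `xml.etree.ElementTree` module. The inline document is an assumption made for the sketch: the structure of `bib.xml` is not shown in the slides, so the `<year>` and `<price>` child elements and their values are hypothetical, as is the function name `cheap_titles`.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bib.xml; element layout and values are assumed.
BIB = """<bib>
  <book><title>Advanced Programming in the Unix environment</title><year>1992</year><price>65.95</price></book>
  <book><title>Data on the Web</title><year>2000</year><price>39.95</price></book>
  <book><title>Hypothetical Old Book</title><year>1990</year><price>10.00</price></book>
</bib>"""


def cheap_titles(xml_text, max_price=60.0):
    root = ET.fromstring(xml_text)          # root is the <bib> element
    books = root.findall("book")            # for $b in .../bib/book
    cheap = [b for b in books
             if float(b.findtext("price")) < max_price]   # where $b/price < 60
    cheap.sort(key=lambda b: int(b.findtext("year")))     # order by $b/year
    return [b.findtext("title") for b in cheap]           # return $b/title


titles = cheap_titles(BIB)
```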

9 9 Context: Other specificities
- Views are computed using XAlgebra (cf. View computation)
- Wrappers have limited resources
  - Few computation possibilities
  - A component named logger stores the last modification date and a checksum of the sources
- Non-cooperative web sources
  - No information about their updates
  - Not always available
  - Not enough granularity
1.3 Context

10 10 State of the art (1/2)
Relational views
- Not fit for semi-structured data
Abiteboul et al.
- OEM (Object Exchange Model)
- LOREL language
- Some operators are missing
VOX (Rainbow Team)
- Needs to know the exact position in the XML tree where the update was made
1.4 State of the art

11 11 State of the art (2/2)
Cobena et al.
- XDiff: an algorithm for XML file comparison
- Needs a copy of the source at the wrapper level
Bonnet et al. / Papadimos et al.
- Parachute queries
- Mutant query plans
What happens when sources are really unavailable?
Our goal:
- Reduce source accesses to a minimum
- Use the information stored in the view
1.4 State of the art

12 12 View maintenance: The process
View computation
- An algebraic approach using XAlgebra
- Extension of the XAlgebra (identifiers)
Update detection
- Comparison of the source information with the information stored in the logger
Update identification
- Recovering process
- Diff algorithm
View maintenance
- Propagation rules for each operator
2.1 View computation

13 13 View computation Steps : 2.1 View computation

14 14 The XAlgebra data model Data structures :  XRelation, XTuple, XAttributes Operators :  XSource, XConstruct, XUnion, …. 2.1 View computation

15 15 XSource Operator: Step 1
XQuery analysis
We obtain:
- A context
- A set of patterns
    for $f in doc("informations.xml")/personnes/personne
    let $a := $f/nom
    where $f/age < 27 and $a = "Durand"
    return {$a} {$f/prenom}
Path extraction:
- Optional
- Mandatory
- Hidden
2.1 View computation

16 16 XSource Operator: Steps 2 and 3
From XML sub-trees to the tabular structure
1 sub-tree => 1 XTuple
XRelation = set of XTuples
2.1 View computation
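
The mapping from sub-trees to XTuples can be illustrated with a short Python sketch. It is not the XAlgebra implementation: the function `xsource`, the dictionary representation of an XTuple, and the sample document (modelled on the `personnes/personne` query of the previous slide) are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical content for informations.xml.
DOC = """<personnes>
  <personne><nom>Durand</nom><prenom>Marcel</prenom><prenom>Eric</prenom><age>25</age></personne>
  <personne><nom>Leclerc</nom><prenom>John</prenom><age>40</age></personne>
</personnes>"""


def xsource(xml_text, context, patterns):
    """Build one XTuple per sub-tree matched by `context`.

    Each XTuple maps a pattern (a relative path) to the list of text
    values found under that sub-tree; the XRelation is the list of XTuples.
    """
    root = ET.fromstring(xml_text)
    xrelation = []
    for subtree in root.findall(context):
        xtuple = {p: [node.text for node in subtree.findall(p)] for p in patterns}
        xrelation.append(xtuple)
    return xrelation


xr = xsource(DOC, "personne", ["nom", "prenom", "age"])
```

Keeping a list of values per pattern preserves repeated elements such as several `prenom` children inside one sub-tree.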

17 17 XSource Operator: Extending the Algebra
Adding identifiers: XTids
An XTid is a set of pairs: {(idsource, idfragment), …}
2.1 View computation
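
One possible way to represent a value annotated with its XTid is shown below. The class name `XValue` and the helper `sources_of` are hypothetical; only the idea of a set of (idsource, idfragment) pairs comes from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class XValue:
    """A view value together with its XTid (assumed representation)."""
    value: str
    xtids: frozenset  # set of (id_source, id_fragment) pairs


def sources_of(xvalue):
    """Return the identifiers of the sources that contributed to this value."""
    return {id_source for id_source, _ in xvalue.xtids}


# The value "Alain" coming from fragments 3 and 4 of source 1.
alain = XValue("Alain", frozenset({(1, 3), (1, 4)}))
```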

18 18 View computation - XOperator XProject 2.1 View computation

19 19 View computation - XOperator XJoin XTids propagation : card (XTID)  1 for some nodes 2.1 View computation
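
As an illustration of XTid propagation through a join, the sketch below merges the XTids of the join node by set union, so that a joined value can carry pairs from both sources. This union rule is an assumption about how propagation works, and the function `xjoin`, the tuple layout, and the `ville` field are hypothetical.

```python
def xjoin(xr1, xr2, key):
    """Join two XRelations on equal `key` values.

    Each XTuple maps a field name to a (value, xtids) pair. The join
    node keeps the union of the XTids found on both sides.
    """
    result = []
    for t1 in xr1:
        for t2 in xr2:
            value1, ids1 = t1[key]
            value2, ids2 = t2[key]
            if value1 == value2:
                joined = {**t1, **t2}
                joined[key] = (value1, ids1 | ids2)  # XTids of both sources
                result.append(joined)
    return result


xr1 = [{"nom": ("Durand", frozenset({(1, 11)})),
        "prenom": ("Marcel", frozenset({(1, 11)}))}]
xr2 = [{"nom": ("Durand", frozenset({(2, 5)})),
        "ville": ("Paris", frozenset({(2, 5)}))},
       {"nom": ("Martin", frozenset({(2, 6)})),
        "ville": ("Lyon", frozenset({(2, 6)}))}]

view = xjoin(xr1, xr2, "nom")
```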

20 20 Update detection and identification
Detection
- Comparison of the source information with the information stored in the logger:
  - The last modification date
  - The checksum of the source
Identification
- Partial recovery of the source information based on XTids
- Comparison of the recovered XRelation with the updated source
- Δ computation
2.2 Update detection and identification
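
The detection step can be sketched as follows. The slides do not say which checksum is used or how the logger stores its entries, so SHA-256, the dictionary layout, and the function names are assumptions.

```python
import hashlib


def checksum(content: bytes) -> str:
    """Checksum of a source's content (SHA-256 is an assumed choice)."""
    return hashlib.sha256(content).hexdigest()


def source_updated(logged, last_modified, content):
    """Compare the current state of a source with the logger entry.

    `logged` is a dict with the keys 'last_modified' and 'checksum'.
    The date is checked first because it is cheap; the checksum then
    confirms that the content really changed.
    """
    if last_modified == logged["last_modified"]:
        return False
    return checksum(content) != logged["checksum"]


logged = {"last_modified": "2006-01-01", "checksum": checksum(b"<personnes/>")}
unchanged = source_updated(logged, "2006-01-01", b"<personnes/>")
touched_only = source_updated(logged, "2006-02-01", b"<personnes/>")
changed = source_updated(logged, "2006-02-01", b"<personnes><personne/></personnes>")
```

Only when this check reports an update does the more expensive identification step need to run.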

21 21 XRecover Step 1: project XR_v onto the patterns of XR_1
2.2 Update detection and identification

22 22 XRecover Step 2 : filtering XTuples values 2.2 Update detection and identification

23 23 XRecover Step 3: re-ordering XTuples
XTidUnnest
XTuples are unnested depending on their XTids
2.2 Update detection and identification

24 24 XRecover Step 3: re-ordering XTuples
XTidNest
XTuples are nested by their XTids
XTuples are re-ordered
2.2 Update detection and identification

25 25 Update Identification: Comparison Algorithm
Comparison of XR_1^{t+1} with XR_t':
- XR_1^{t+1} is the XRelation obtained by applying XSource to source 1 at t+1
- XR_t' is the partial recovery of the XRelation of source 1 at t
Remark: XR_1^{t+1} can also be filtered using predicates before comparison
The Diff algorithm is based on Unix diff (Hunt & McIlroy). The compared symbol is the XTuple instead of the line.
2.2 Update detection and identification

26 26 Update identification: Diff algorithm
Delta (Δ) with hunks:
- Insert(pos; XTuple)
- Delete(pos; XTuple)
- Replace(pos; XTuple_old, XTuple_new)
Example:
Insert(2,{Leclerc,Avide,{(1,3)}} {John,Avide,{(1,3)}} }
Delete(4,{Durand,Avide,{(1,11)}}, {Marcel,Avide,{(1,11)}} {Eric,Avide,{(1,11)}}}
Etc…
2.2 Update detection and identification
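
A diff whose symbols are XTuples rather than lines can be sketched with Python's standard `difflib.SequenceMatcher`. Note that `difflib` uses its own matching algorithm, not the Hunt and McIlroy one mentioned in the talk; it is swapped in here only because it ships with Python. The function `xdiff`, the hunk tuples, and the sample data are hypothetical, and positions are 0-based indexes into the old XRelation.

```python
from difflib import SequenceMatcher


def xdiff(old, new):
    """Compute insert/delete/replace hunks between two lists of XTuples.

    XTuples must be hashable (plain tuples here).
    """
    hunks = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            hunks.append(("insert", i1, new[j1:j2]))
        elif tag == "delete":
            hunks.append(("delete", i1, old[i1:i2]))
        elif tag == "replace":
            hunks.append(("replace", i1, old[i1:i2], new[j1:j2]))
        # "equal" blocks produce no hunk
    return hunks


# Recovered XRelation at t and XRelation of the updated source at t+1 (sample data).
old = [("Dupont", "Alain"), ("Durand", "Marcel"), ("Martin", "Paul")]
new = [("Dupont", "Alain"), ("Leclerc", "John"), ("Durand", "Marcel")]

delta = xdiff(old, new)
```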

27 27 Maintenance Rules
From Delta to view maintenance
- Case of a deletion: delete(pos, xtuple)
An XTuple is associated with an XTid {(x)} such that card = 1.
Each XValue of the view has XTids, noted XTID.
1) We delete from the XValues each pair of the XTid such that x ∈ XTID.
   Example: the XTuple whose XTid is x = 1,3 has been deleted.
   The XValue {Alain}1,3;1,4 becomes the XValue {Alain}1,4.
2) We delete each XValue such that card(XTID) = 0.
   If the XValue {Alain}1,3 becomes the XValue {Alain}, we delete the XValue entirely.
3) If the XValue was concerned by the predicate, we delete the XTuple (join and restriction case).
2.3 View maintenance
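
The three deletion rules can be sketched as a single pass over the view. This is one reading of the slide, not the authors' code: the function `apply_delete`, the representation of an XTuple as a dict of (value, xtids) pairs, and the `predicate_fields` parameter are assumptions.

```python
def apply_delete(view, deleted_xtid, predicate_fields=()):
    """Propagate the deletion of the source XTuple identified by `deleted_xtid`.

    `view` is a list of XTuples; each XTuple maps a field name to a
    (value, xtids) pair where xtids is a frozenset of (source, fragment) pairs.
    """
    new_view = []
    for xtuple in view:
        new_tuple = {}
        for name, (value, xtids) in xtuple.items():
            remaining = xtids - {deleted_xtid}      # rule 1: drop the pair
            if remaining:                           # rule 2: drop empty XValues
                new_tuple[name] = (value, remaining)
        # rule 3: a lost XValue that took part in the predicate removes the XTuple
        lost_predicate = any(f in xtuple and f not in new_tuple
                             for f in predicate_fields)
        if new_tuple and not lost_predicate:
            new_view.append(new_tuple)
    return new_view


view = [
    {"prenom": ("Alain", frozenset({(1, 3), (1, 4)}))},
    {"prenom": ("John", frozenset({(1, 3)}))},
]
after = apply_delete(view, (1, 3))
```

In the sample, the value Alain survives with the single pair (1, 4), while the XTuple that held only John disappears because its last XValue was removed.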

28 28 Maintenance Rules
From Delta to view maintenance
- Case of an insertion: insert(pos; xtuple)
1) A new XTid is created.
   Goal: preserve the order of the XTuples for a later recovery.
2) Depending on the operator, we obtain various maintenance instructions:
   Projection: insertion of the projection of the xtuple.
   Select: if the xtuple satisfies the predicate, it is inserted.
   Join XR_1 * XR_2: computation of XT = xtuple * XR_2. If XT ≠ ∅, insertion of XT.
   Union and Intersect: duplicates are kept.
   - Union: handled as a Select whose predicate is always true
   - Intersect: handled as a join
   Depending on the predicate, we can query either XR_2 or its recovery.
2.3 View maintenance
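
The per-operator insertion rules can be sketched as three small functions. The sketch ignores positions and XTid creation and uses plain dictionaries as XTuples; the function names, the `ville` field, and the equality join on a single key are assumptions made for illustration.

```python
def insert_project(view, xtuple, fields):
    """Projection: insert the projection of the new xtuple."""
    view.append({f: xtuple[f] for f in fields if f in xtuple})


def insert_select(view, xtuple, predicate):
    """Select: insert the xtuple only if it satisfies the predicate."""
    if predicate(xtuple):
        view.append(xtuple)


def insert_join(view, xtuple, xr2, key):
    """Join: compute XT = xtuple * XR2 and insert it when it is not empty."""
    xt = [{**xtuple, **other} for other in xr2
          if other.get(key) == xtuple.get(key)]
    view.extend(xt)  # extends with nothing when xt is empty
    return xt


new = {"nom": "Leclerc", "prenom": "John", "age": 25}

proj_view, sel_view, join_view = [], [], []
insert_project(proj_view, new, ["nom"])
insert_select(sel_view, new, lambda t: t["age"] < 27)

# XR2 may be the second source itself or its recovery from the view.
xr2 = [{"nom": "Leclerc", "ville": "Paris"}, {"nom": "Durand", "ville": "Lyon"}]
insert_join(join_view, new, xr2, "nom")
```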

29 29 Maintenance Rules
From Delta to view maintenance
- Case of a modification: Replace(pos; XTuple_old, XTuple_new)
XTuple modification = XValue modification OR XValue deletion followed by insertion
Project and Union: modification of the concerned XValues.
Select and Intersect: if the modification is applied to an XValue that must verify the condition, deletion of the XTuple; else modification of the XValues.
Intersect: handled as a select. Join: deletion followed by insertion.
2.3 View maintenance

30 30 Maintenance Rules From Delta to view maintenance 2.3 View maintenance

31 31 Maintenance rules: Missing information
- Missing information (join?)
- Source recovery
- Multi-view strategy
- Source request
Goal: limited access to the sources!
Example: View = S_1 * S_2
- An xtuple x is inserted in S_1
- Computation of S_2'
- Insertion: x * S_2'
[Diagram labels: SQL, HTML, Materialized views, Mediator, Wrapper]
2.3 View maintenance

32 32 Applications
On the web
- When necessary sources are unavailable
- Goal: limited access to them
With sensors (ANR project)
- With wireless sensors
- Goal: preserve power resources
2.4 Applications and performances

33 33 Performances Comparison between XRecover and Recomputation 2.4 Applications and performances

34 34 Performances Comparison between XRecover and Recomputation 2.4 Applications and performances

35 35 Contributions
Maintenance process in the context of non-cooperative web sources
Contribution to the XAlgebra
- New operators: XRecover, XTidUnnest, XTidNest
- New data structure: XTids
Future work
- Order-sensitive view maintenance
- A better Diff algorithm
Conclusion

36 36 Thanks for your attention! Any questions?

