IVOX: Incremental View Maintenance for Ordered XML. DSRG Talk, WPI, February 20th, 2003. Students: Katica Dimitrova & Maged El Sayed. Advisor: Prof. Elke.

1 IVOX: Incremental View Maintenance for Ordered XML
DSRG Talk, WPI, February 20th, 2003
Students: Katica Dimitrova & Maged El Sayed
Advisor: Prof. Elke Rundensteiner

2 Outline
- Motivation
- Problem Description
- Background
  - XML Algebra
  - Order in XML Algebra
- The IVOX Approach
  - Order Encoding
  - Overall strategy
- System Architecture
- Related Work
- Future Work

4 Motivation
Views in general
- Data warehouses
- Information integration
- Access control, privacy, etc.
XML views (extra useful)
- Information inter-portability
- Crossing gaps between different data models
Materialized views
- Speed up data retrieval
- Query optimization
- Increased availability
[Figure: a view, defined by a view definition query, over RDB, XML, and other sources]

5 Maintaining Materialized Views
When sources are updated, a materialized view may become inconsistent.
Methods of view maintenance
- Recomputation: recompute the view from scratch from the base data
- Incremental view maintenance: compute the changes to the view in response to changes to the base sources
Heuristic: incremental view maintenance is usually cheaper than full recomputation.

7 The Problem
Previous work for:
- Relational [GMS93], bag semantics [GL95], [ZGHW95], [PSCP02]
- Object-relational [LVM00]
- Object-oriented [AFP02]
- Structured data models [AMRVW98], [ZM98]
- XML data model not handling order [LD00]
Can techniques for other data models be reused for XML?

8 Is Maintaining XML Views Different?
XML features
- Hierarchical
- Optional elements
- Self-typed
- References
- Ordered
Expressiveness of view definition language
- Complex operations: tagging, unnesting, aggregation, ...
- Expected large auxiliary information

9 Example
List all books that cost less than $60, including their title and price.
View definition query:
    for $b in document("bib.xml")/bib/book
    where $b/price/text() < 60
    return $b/title, $b/price
Bib.xml (element markup lost in extraction), three books in document order:
- price 65.95, title "Advanced Programming in the Unix environment"
- title "TCP/IP Illustrated" (no price)
- price 39.95, title "Data on the Web"
View extent:
- "Data on the Web", 39.95

10 Example
Update: insert an element with value 55.48 into the second book.
Bib.xml after the update (element markup lost in extraction):
- 65.95, "Advanced Programming in the Unix environment"
- "TCP/IP Illustrated", 55.48
- 39.95, "Data on the Web"
View definition query: unchanged
    for $b in document("bib.xml")/bib/book
    where $b/price/text() < 60
    return $b/title, $b/price
View extent after the update:
- "TCP/IP Illustrated", 55.48
- "Data on the Web", 39.95
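The difference between recomputation and incremental maintenance for this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `books` list is a hypothetical in-memory stand-in for bib.xml, `recompute_view` and `insert_price` are made-up names, and document position is used as a naive ordering device (not the order encoding IVOX itself proposes).

```python
import bisect

# Hypothetical in-memory stand-in for bib.xml: one dict per book, in document order.
books = [
    {"title": "Advanced Programming in the Unix environment", "price": 65.95},
    {"title": "TCP/IP Illustrated", "price": None},  # no price yet
    {"title": "Data on the Web", "price": 39.95},
]

def recompute_view(books):
    """Recomputation: evaluate the view query over every book again."""
    return [(pos, b["title"], b["price"])
            for pos, b in enumerate(books)
            if b["price"] is not None and b["price"] < 60]

def insert_price(view, books, pos, price):
    """Incremental maintenance: apply the update and touch only the affected book."""
    books[pos]["price"] = price
    if price < 60:
        # The document position (first tuple field) keeps the view in source order.
        bisect.insort(view, (pos, books[pos]["title"], price))

view = recompute_view(books)         # initial materialization
insert_price(view, books, 1, 55.48)  # the update from the slide
```

After the update, the incrementally maintained `view` holds the same rows, in the same order, as a fresh `recompute_view(books)` would produce.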

11 Our Goal
Design an incremental view maintenance strategy for XQuery views that:
- Correctly updates the view
- Is order sensitive
  - Returns the view in proper order
  - Allows for updates that specify order
- Covers at least the "core" of XQuery language views
- Minimizes auxiliary information requirements

12 Basics of IVOX Approach: Algebraic
Update propagation rules for each algebra operator and each update type.
[Figure: the XQuery view definition becomes an algebra tree over the XML sources that produces the XML view. At execution time an operator maps D1 to D2; at view maintenance time the same operator maps an update on D1 to an update on D2.]

13 Why Algebraic?
- Robust: easily adaptable to operator semantic changes
- Extensible: new operators can be added
- Allows for reuse of techniques for known operators
- Language independent: independent of syntax changes (of XQuery by W3C)
- Formal: basis for provable correctness

15 Background on XML Algebra XAT
XAT operators
- SQL operators: Select, Project, ...
- Special operators: Source, FOR, ...
- XML operators: Navigate, Tagger, ...
XAT data model (XAT table)
- Order-sensitive table of tuples
- Columns denote user-specified or internally generated variable bindings
- A cell in a tuple holds an XML node or a sequence of XML nodes
[Figure: an operator with parameters "$col1, price" and output column $col3. Its input XAT table has column $b (the book fragments); its output table has columns $col3 and $b, with $col3 holding 65.95 and 39.95.]

16 Order in XAT Context
- Order among tuples
- Order among XML nodes in a cell
[Figure: the operator and XAT tables from slide 15]

17 Order in the XAT Context
- Order among the tuples
- Order among XML nodes in a single cell
[Figure: an Agg operator on $col5. The input table has one $col5 tuple per result (TCP/IP ... 55.48; Data ... 39.95); the output is a single cell holding both as a sequence.]

18 Order in XAT Context: View Maintenance
On update, worry about:
- Order among tuples
- Order among XML nodes in a cell
[Figure: the tables from slide 15 after the update, with a new tuple for 55.48 between the tuples for 65.95 and 39.95]

19 Order in XAT Context & View Maintenance
On update, worry about:
- Order among the tuples
- Order among XML nodes in a single cell
[Figure: the Agg operator from slide 17]

20 Duplicate Information in XAT Context
- Complex operations require auxiliary information
- Auxiliary information can be too large in XAT context
- May be expensive to maintain it
[Figure: the input and output XAT tables of the operator from slide 15 both store the same XML fragments. Duplicated storage!]

22 Possible Solutions to Order Preservation (I)
Sequential storage (XPROP approach by Maged, Ling & Luping)
- Assume intermediate results stored sequentially
- Inserts and deletes are performed in physical order
- No order encoding
Drawbacks:
- Special support required for secondary storage
- May require iteration over many tuples to determine order
[Figure: the XAT tables from slide 15, with the tuple for 55.48 physically inserted between the existing tuples]

23 Possible Solutions to Order Preservation (II)
Naïve order encoding for tuples and sequences of XML nodes
- Assign order numbers to tuples and to XML nodes in a sequence
Drawback:
- Requires frequent renumbering on inserts
[Figure: the XAT tables from slide 15 with an extra Ord column. Inserting the 55.48 tuple at Ord 2 changes the numbering of the existing tuples from 1, 2 to 1, 3.]
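The renumbering cost of the naïve encoding can be made concrete with a small sketch. The row layout (`ord`, `value`) and the function name are assumptions made for illustration only; they are not taken from the talk.

```python
def insert_with_renumbering(rows, position, value):
    """Insert `value` at 1-based `position`; return how many existing rows were renumbered."""
    renumbered = 0
    for row in rows:
        if row["ord"] >= position:
            row["ord"] += 1          # every later row must be touched
            renumbered += 1
    rows.append({"ord": position, "value": value})
    rows.sort(key=lambda r: r["ord"])
    return renumbered

rows = [{"ord": 1, "value": "65.95"}, {"ord": 2, "value": "39.95"}]
touched = insert_with_renumbering(rows, 2, "55.48")
```

Here a single insert touches one existing row; in general an insert at position p in a table of n rows touches n - p + 1 rows, which is the renumbering cost the slide refers to.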

24 Using Node Identity
Idea: use node identity
Usage:
- For encoding order and structure
- As a reference to base data

25 What Encoding For Node Identity?
Existing techniques for encoding order for XML: Global Order (UW), Local Order (UW), Dewey Order (UW), Lexicographical Order (MASS)
Shown: Global Order (UW)
[Figure: the bib tree (bib; three book children; price and title leaves) numbered 1 to 9 in document order. The inserted price node takes number 6 and the nodes after it are renumbered 7 to 10.]

26 What Encoding For Node Identity?
Shown: Local Order (UW)
[Figure: the same tree, each node numbered by its position among its siblings: bib 1; books 1, 2, 3; leaves 1, 2 / 1 / 1, 2. The inserted price takes local number 1 and its sibling title is renumbered 2.]

27 What Encoding For Node Identity?
Shown: Dewey Order (UW)
[Figure: the same tree with Dewey paths: bib 1; books 1.1, 1.2, 1.3; leaves 1.1.1, 1.1.2 / 1.2.1 / 1.3.1, 1.3.2. The inserted price takes 1.2.1 and its sibling title is renumbered 1.2.2.]

28 What Encoding For Node Identity?
Shown: Lexicographical Order (MASS). The winner.
[Figure: the same tree with lexicographical keys: bib b; books b.b, b.d, b.f; leaves b.b.b, b.b.cd / b.d.f / b.f.cm, b.f.l. The inserted price takes b.d.b; no existing key changes.]

29 Lexicographical Keys: LexKeys
What are LexKeys?
- Multi-level lexicographical keys
- Example: c, ba.c.b
Examples of comparison
- b < b.c
- bab < bd.cc
- b.b < b.b.c
Advantages
- All LexKeys form a totally ordered set with respect to <
- It is always possible to generate a key between two keys
- The deletion of a LexKey in a sequence does not affect other LexKeys
Usage
- Reference to XML nodes
- Encoding order
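The comparison examples and the "key between two keys" property can be tried out with a minimal sketch. It assumes that a LexKey compares level by level as a tuple of lowercase strings, and it uses a base-26 midpoint to generate a new level; both the function names and the midpoint scheme are illustrative assumptions, not the key generation actually used by MASS.

```python
def parse(key: str) -> tuple:
    """Split a multi-level key such as 'b.d.f' into its levels."""
    return tuple(key.split("."))

def less(a: str, b: str) -> bool:
    """Compare level by level; a key sorts before any key that extends it."""
    return parse(a) < parse(b)

def level_between(lo: str, hi: str) -> str:
    """Return one level strictly between `lo` and `hi`.

    Assumes lowercase a-z only, no trailing 'a' (it acts as digit zero),
    and lo < hi; lo may be '' to mean "before everything".
    """
    if not lo < hi:
        raise ValueError("lo must sort before hi")
    n = max(len(lo), len(hi)) + 1  # one extra digit makes the midpoint exact

    def to_int(s: str) -> int:
        value = 0
        for ch in s.ljust(n, "a"):
            value = value * 26 + (ord(ch) - ord("a"))
        return value

    mid = (to_int(lo) + to_int(hi)) // 2
    digits = []
    for _ in range(n):
        mid, rem = divmod(mid, 26)
        digits.append(chr(ord("a") + rem))
    return "".join(reversed(digits)).rstrip("a")

# A new first child of book b.d, placed before the existing child b.d.f.
new_child = "b.d." + level_between("", "f")
```

The new key sorts after its parent `b.d` and before its sibling `b.d.f`, and no existing key has to change. The concrete key produced here differs from the `b.d.b` shown in the slides, since the real generation scheme is not given in the transcript.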

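30 LexKeys in XAT Tables
[Figure: the operator with parameters "$b, price" and output column $col2, shown once with XML fragments in the cells and once with LexKeys in the cells.]
With LexKeys, the input table holds:
    $b
    b.b
    b.d
    b.f
and the output table holds:
    $col2    $b
    b.b.b    b.b
    b.f.cm   b.f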

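31 Order Among XAT Tuples
Notion: designate an order schema for XAT tables
- Ordering the tuples by the LexKeys in the order schema columns yields the correct tuple order.
Example table:
    $d      $c      $b
    c.m     b.b.b   b.f
    d.c     b.f.cm  b.b
    d.c.b   b.f.cm  b.b
[Figure residue: order schema positions 1, 2 and tuple order 1, 3, 2; which columns and rows they annotate is not recoverable from the transcript.]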
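A minimal sketch of how an order schema can drive tuple ordering is given below. The rows reuse the LexKeys from the slide, but the choice of `["$b", "$d"]` as the order schema is an assumption made for illustration, since the slide's annotation did not survive extraction.

```python
def lexkey(key: str) -> tuple:
    """Level-by-level sort key for a LexKey such as 'd.c.b'."""
    return tuple(key.split("."))

def sort_by_order_schema(rows, order_schema):
    """Order tuples by the LexKeys found in the order schema columns, in schema order."""
    return sorted(rows, key=lambda row: tuple(lexkey(row[col]) for col in order_schema))

# Tuples stored in arbitrary physical order.
rows = [
    {"$b": "b.f", "$c": "b.b.b", "$d": "c.m"},
    {"$b": "b.b", "$c": "b.f.cm", "$d": "d.c.b"},
    {"$b": "b.b", "$c": "b.f.cm", "$d": "d.c"},
]
ordered = sort_by_order_schema(rows, ["$b", "$d"])  # assumed order schema
```

Because the order is derived from the keys on demand, the physical position of a tuple in the table does not matter.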

32 Calculating Order Schema
- Rules for each operator
- Calculated in a postorder traversal of the tree
Sample rules:
    Operator                                      Order schema odc(out)
    Tagger T, pattern, output $col', input s      odc(s)
    Source S, desc, output $col'                  none
    Navigate Unnest, $col, path, output $col',    if col is last in odc(s):
      input s                                       Concat(odc(s) – col, col')
                                                  else:
                                                    Concat(odc(s), col')
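The three sample rules can be written down directly. In this sketch the order schema is a plain list of column names, "none" is modelled as an empty list, and the function and operator names are illustrative assumptions; only the rule logic comes from the slide.

```python
def order_schema_out(operator, odc_s, col=None, col_out=None):
    """Order schema of an operator's output, given the order schema of its input."""
    if operator == "Source":
        return []                          # rule: none
    if operator == "Tagger":
        return list(odc_s)                 # rule: odc(s)
    if operator == "NavigateUnnest":
        if odc_s and odc_s[-1] == col:
            return odc_s[:-1] + [col_out]  # Concat(odc(s) - col, col')
        return list(odc_s) + [col_out]     # Concat(odc(s), col')
    raise ValueError(f"no rule sketched for operator {operator!r}")

# Postorder walk over a chain of operators, leaf first.
odc = order_schema_out("Source", [])
odc = order_schema_out("NavigateUnnest", odc, col="$S1", col_out="$col1")
odc = order_schema_out("NavigateUnnest", odc, col="$col1", col_out="$b")
after_price = order_schema_out("NavigateUnnest", odc, col="$b", col_out="$col2")
after_title = order_schema_out("NavigateUnnest", after_price, col="$b", col_out="$col4")
```

Navigating from the last order column replaces it with the finer-grained output column; navigating from any other column appends the output column instead.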

33 Order Among Tuples Example
[Figure: the XAT tables from slide 30, annotated with order schema positions and the resulting tuple order]

34 Order in Collection within a cell?
[Figure: an Agg operator on $col5, shown once with XML fragments (TCP/IP ... 55.48; Data ... 39.95) and once with keys.]
With keys, the input table holds:
    $col5   $col4   $col2
    tbb     b.f.l   b.f.cm
    tbc     b.d.b   b.d.f
and the output is a single $col5 cell holding the collection {tbb, tbc}.

35 Smart Keys
What is a SmartKey?
- Key (LexKey): the key part; by default it also represents order
- Overriding order (LexKey): optional; only represents order when present
Notation: key(order)
Examples
- b.c.b (h)
- b.c.b
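One way to picture a SmartKey is as a small record with a key and an optional overriding order. The class below is a hedged sketch: the field names, the `sort_key` method, and the simplified single-LexKey order values are assumptions for illustration, not the talk's implementation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class SmartKey:
    key: str                     # identifies the node
    order: Optional[str] = None  # overriding order, used only when present

    def sort_key(self) -> tuple:
        """Order by the overriding order if present, else by the key itself."""
        effective = self.order if self.order is not None else self.key
        return tuple(effective.split("."))

    def __str__(self) -> str:
        return self.key if self.order is None else f"{self.key}({self.order})"

# Two constructed nodes whose own keys (tbb < tbc) disagree with source order.
cell = [SmartKey("tbb", "b.f.cm"), SmartKey("tbc", "b.d.f")]
ordered = sorted(cell, key=SmartKey.sort_key)
```

Sorting by `sort_key` places `tbc` before `tbb`, because the overriding order, not the key of the constructed node, decides the position within the cell.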

36 SmartKeys in XAT Tables
[Figure: the Agg operator and input table from slide 34. With SmartKeys, the output $col5 cell holds {tbb(b.f.cm..b.f.l), tbc(b.d.f..b.d.b)}, so each member carries its own overriding order.]

37 The Impact of SmartKeys on View Maintenance

38 Order Among XAT Tuples during View Maintenance
- Not touching other tuples in the XAT table
- No reordering ever needed
- Gaining distributiveness with regard to bag union on the tuple level
[Figure: the operator with parameters "$col1, price" and output $col3. The new tuples for b.d (input) and (b.d.b, b.d) (output) are added without moving the tuples for b.b and b.f; their position follows from their LexKeys.]

39 Order in a Sequence during View Maintenance
- Not touching other members of the sequence
- No reordering ever needed
- Gaining distributiveness with regard to bag union on the cell level
[Figure: the Agg operator on $col5. The member tb..b.d.f..b.d.b is added to the cell that already holds tb..b.f.l..b.f.cm.]

40 Update Propagation Rules
- Use distributiveness with regard to bag union
- Reuse rules from relational for most SQL XAT operators
[Figure: at execution time an operator maps XAT table 1 to XAT table 2; at view maintenance time it maps an update to XAT table 1 to an update to XAT table 2.]

41 Update Propagation Rules Example (Navigate Unnest on Insert Tuple)
$$
\begin{aligned}
T2^{old} &= \phi^{\$col'}_{\$col,\,path}(T1^{old}) \\
T1^{new} &= T1^{old} + \Delta T1 \\
T2^{new} &= \phi^{\$col'}_{\$col,\,path}(T1^{old} + \Delta T1) \\
         &= \phi^{\$col'}_{\$col,\,path}(T1^{old}) + \phi^{\$col'}_{\$col,\,path}(\Delta T1) \\
         &= T2^{old} + \Delta T2
\end{aligned}
$$
Here + represents bag union.
[Figure: at execution time the operator maps T1 to T2; at view maintenance time the same operator maps ΔT1 to ΔT2.]
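The derivation can be checked on a toy table. In this sketch the `CHILDREN` lookup is a hypothetical stand-in for the storage manager, tables are lists of dicts, bag union is list concatenation, and `navigate_unnest` is an illustrative reading of the operator rather than its actual definition.

```python
from collections import Counter

# Hypothetical base data: children of each book node, by element name.
CHILDREN = {
    "b.b": {"price": ["b.b.b"]},
    "b.d": {"price": ["b.d.b"]},
    "b.f": {"price": ["b.f.cm"]},
}

def navigate_unnest(table, col, path, col_out):
    """Emit one output tuple per node reached from tuple[col] via `path`."""
    out = []
    for row in table:
        for child in CHILDREN.get(row[col], {}).get(path, []):
            out.append({**row, col_out: child})
    return out

def as_bag(table):
    """View a table as a bag (multiset) of tuples, ignoring physical order."""
    return Counter(tuple(sorted(row.items())) for row in table)

t1_old = [{"$b": "b.b"}, {"$b": "b.f"}]
delta_t1 = [{"$b": "b.d"}]                       # inserted tuple

t2_old = navigate_unnest(t1_old, "$b", "price", "$col2")
delta_t2 = navigate_unnest(delta_t1, "$b", "price", "$col2")
t2_new = navigate_unnest(t1_old + delta_t1, "$b", "price", "$col2")
```

Running the operator on the delta alone yields exactly the tuples that must be added to the old output, so the old input never has to be re-read.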

42 Update Propagation Strategy
[Figure: components shown are the XML Source, the Update XQuery, the Translator, the Storage Manager, the XAT, and the XML View. Labels on the arrows: xmlup, keyup, xatup, Update.]

43 Update Primitives (The Format of Delta)
XML update primitives (xup): apply to the original XML document
- Insert (xmlFragment, path)
- Delete (path)
- InsertAtt (name, value, path)
- DeleteAtt (name, path)
- Replace (oldValue, newValue, path)
XML key update primitives (keyup): express the update on the original XML data in terms of LexKeys
- Insert (el, path)
- Delete (path)
- Replace (el, pos)
XAT update primitives (xatup): apply to XAT tables
- InsertTuple (tuple)
- DeleteTuple (tupleId)
- ChangeTuple (Keyup, columnName, tupleId)

44 A Complete Example

45 Execution
[Figure: the XAT algebra tree for the example view, with the intermediate XAT table of each operator. Operator symbols were lost in extraction; the parameters and table contents below are as extracted.]
Operators, from the source upward:
- S "bib.xml", output $S1: bib.xml
- $S1, bib, output $col1: b
- $col1, book, output $b: b.b, b.d, b.f
- $b, price, output $col2: ($b, $col2) = (b.b, b.b.b), (b.f, b.f.cm)
- $b, title, output $col4: ($col2, $col4) = (b.b.b, b.b.cd), (b.f.cm, b.f.l)
- $col3 < 60: ($col2, $col4) = (b.f.cm, b.f.l)
- T, $col4 $col2, output $col5: tb..b.f.l..b.f.cm
- Agg $col5: { tb..b.f.l..b.f.cm(b.f.l..b.f.cm) }
- T, $col5, output $col6: tr
Storage Manager:
- bib.xml with LexKeys: bib b; books b.b, b.d, b.f; leaves b.b.b, b.b.cd / b.d.f / b.f.cm, b.f.l
- Constructed XDOMs: XDOMKey tb..b.f.l..b.f.cm (book with b.f.l, b.f.cm); XDOMKey tr (result with tb..b.f.l..b.f.cm)

46 View Maintenance
[Figure: the same algebra tree, with the update propagated from the source to the view. Update primitives are listed as extracted.]
Source update:
- Insert (price, bib[1].book[2])
- Insert (price[b.d.b], bib[b].book[b.d])
Propagated updates:
- ChangeTuple(insert(price[b.d.b], bib[b].book[b.d]), $col1, b)
- ChangeTuple(insert(price[b.d.b], book[b.d]), $b, b.d)
- ChangeTuple(insert(price[b.d.b], bib[b].book[b.d]), $col2, b.f, b.f.m)
- InsertTuple({b.d, b.d.b})
- InsertTuple({b.d.b, b.d.f})
- InsertTuple({b.d.b, b.d.f})
- InsertTuple({tb..b.d.f..b.d.b})
- ChangeTuple(insert(tb..b.d.f..b.d.b, null), $col5, )
- ChangeTuple(insert(tb..b.d.f..b.d.b, result[tr]), $col6, tr)
State after maintenance:
- Storage Manager: new price node b.d.b under book b.d; new constructed XDOM tb..b.d.f..b.d.b (book with b.d.f, b.d.b)
- Agg $col5: { tb..b.f.l..b.f.cm(b.f.l..b.f.cm), tb..b.d.f..b.d.b(..b.d.f..b.d.b) }
- $col6: tr (result with tb..b.d.f..b.d.b and tb..b.f.l..b.f.cm)

48 System Architecture
[Figure: architecture diagram with an Execution side and a View Maintenance side.]
Components shown:
- XML Query Engine (Rainbow): takes the view definition XQuery, builds the XML algebra tree, and uses the Executer
- XML View Maintainer: VM Initializer, Update Primitive Generator, Update Propagation Rules Repository
- Storage Manager
- Data: XML sources, materialized auxiliary views, materialized XML view
- Inputs: view definition XQuery, user update XQuery; label XTUP
Legend: process, data, persistent data storage; one-time occurrence, on-update occurrence

50 Related Work
- A. Gupta, I. S. Mumick. Maintenance of Materialized Views: Problems, Techniques, and Applications. In Bulletin of the Technical Committee on Data Engineering, 1995.
- T. Griffin, L. Libkin. Incremental Maintenance of Views with Duplicates. In SIGMOD 1995.
- H. Liefke, S. Davidson. View Maintenance for Hierarchical Semistructured Data. In DAWAK 2000.
- S. Abiteboul, J. McHugh, M. Rys, V. Vassalos, J. Wiener. Incremental Maintenance for Materialized Views over Semistructured Data. In VLDB 1998.

52 Future Work
Near future ...
- Launch the system
- Batch update coming
- Experiments and evaluation: compare the system's performance to recomputation
... and beyond
- Batching updates coming from different sources
- Integrity constraints
- Algebra tree rewrite rules

