1 On Querying Versions of Multiversion Data Warehouse Tadeusz Morzy Robert Wrembel Poznań University of Technology Institute of Computing Science Poznań, Poland Robert.Wrembel@cs.put.poznan.pl

2 Presentation Outline
Morzy T., Wrembel R.: On Querying Versions of Multiversion Data Warehouse, DOLAP 2004
- Context and motivation of our work
- Related work
- Contribution
  - The concept of a multiversion data warehouse
  - Querying a multiversion data warehouse
- Ongoing and future work

3 Context and Motivation (1)
- The research area encompasses
  - handling the dynamics of EDSs in a DW (by applying a MVDW)
  - querying a MVDW
- Dynamic nature of EDSs
  - data dynamics
    - user data processing in EDSs
    - DW refreshing
  - schema dynamics
    - new user requirements
    - dynamic nature of the real world
    - tuning purposes
    - these require changes to EDS schemas and, in turn, changes to the DW structure

4 Context and Motivation (2)
- European Union extension
  - compare the sum of membership fees paid by the countries in 2002, 2003, and 2004
  - in 2004: a substantial increase
  - did the countries pay more in 2004?
  - without knowledge of the EU extensions we could end up with confusing conclusions
[Figure: members of the EU dimension in 2002 and 2003 (Belgium, ..., UK) and in 2004 (Belgium, ..., UK, Poland, Slovenia, ...)]

5 Context and Motivation (3)
- Sales analysis: reclassification of products to categories
  - e.g., the VAT rate on building elements changed from 7% to 22% (Poland)
  - compute and compare the sum of income from brick sales in 2003 and 2004
  - did sales of bricks increase by 15% in 2004?
  - by simply updating the VAT rate on bricks from 7% to 22% we lose the information that the rate used to be 7%

6 Challenges
- The dynamic nature of EDSs and of the real world should be reflected in a DW
- New functionality of a data warehouse
  - supporting changes of:
    - fact and level tables
    - dimension instance structures
  - providing tools for appropriate analysis of data coming from different time periods

7 Related Approaches (1)
- Schema and data evolution
  - [Blaschka et al., DaWaK99], [Hurtado et al., ICDE99, DOLAP99], [Koeller et al., DOLAP98]
- Temporal extensions
  - [Chamoni et al., DaWaK99], [Eder et al., DaWaK01, CAISE02], [Mendelzon et al., VLDB00]
- Versioning
  - implicit: [Kang et al., VLDB02], [Quass et al., SIGMOD97], [Kulkarni et al., IDEAS99], [Teschke et al., DEXA98]
  - explicit: [Bellahsene, DEXA98], [Body et al., DOLAP02]
  - virtual: [Balmin et al., VLDB00]

8 Other Limitations
- The approaches assume that time is linear (DW states are ordered by time)
  - true for the past
  - not always true for the future
    - what-if analysis

9 Our Approach (1)
- Multiversion Data Warehouse
  - a MVDW is composed of a set of its versions
  - changes in a DW structure and data are reflected in a new, explicitly derived version of a DW
- DW version
  - a schema version (facts, dimensions, levels, level instances)
  - an instance version (stores the set of data consistent with its schema version; measures/cell values)

10 Our Approach (2)
- Types of DW versions
  - real
    - reflects changes in the real world
    - linearly ordered by the time they are valid within
    - derived from another real version
  - alternative
    - created for simulation purposes (what-if analysis)
    - form a DAG
    - derived from another real or alternative version

11 Data Model and Concepts
- Formal model of MVDW
  - International Conference on Enterprise Information Systems (ICEIS), France, 2003
- Time integrity constraints for DW versions
  - ACM Symposium on Applied Computing (SAC), Cyprus, 2004
- Data sharing concept and evaluation
  - 6th Baltic Conference on Databases & Information Systems, Riga, 2004
  - Conference on Current Trends in Theory and Practice of Informatics (SOFSEM), Slovakia, 2005 (to appear)
- Transaction concept
  - International Conference on Enterprise Information Systems (ICEIS), Portugal, 2004

12 Querying MVDW
- Step 1
  - query decomposition
  - PQ execution
  - PR retrieval and presentation
- Step 2
  - PR integration
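The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the prototype's implementation: PQ and PR are read as "partial query" and "partial result", and the names `DWVersion`, `decompose`, `execute`, and `integrate` are hypothetical. Each DW version is modelled as a plain list of rows, and a query is reduced to a row predicate.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Predicate = Callable[[Row], bool]


@dataclass
class DWVersion:
    """Hypothetical stand-in for one DW version: a name plus its fact rows."""
    name: str
    rows: List[Row]


def decompose(versions: List[DWVersion], predicate: Predicate) -> List[Tuple[DWVersion, Predicate]]:
    """Step 1: derive one partial query (PQ) per addressed DW version."""
    return [(version, predicate) for version in versions]


def execute(partial_queries: List[Tuple[DWVersion, Predicate]]) -> Dict[str, List[Row]]:
    """Step 1: run every PQ in its own version; partial results (PR) stay separate."""
    return {
        version.name: [row for row in version.rows if predicate(row)]
        for version, predicate in partial_queries
    }


def integrate(partial_results: Dict[str, List[Row]]) -> List[Row]:
    """Step 2: combine the PRs into one result set, tagging rows with their source version."""
    merged: List[Row] = []
    for version_name, rows in partial_results.items():
        merged.extend({**row, "version": version_name} for row in rows)
    return merged
```

Keeping the partial results keyed by version name mirrors the presentation step, where each partial result can be shown on its own before any integration takes place.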

13 MVQ User Interface (1)

14 MVQ User Interface (2)

15 Modes of Querying (1)
- Querying the current DW version
  - by default, a user addresses the latest real DW version
- Querying the set of real DW versions
  - by specifying the time period of interest; the query addresses the real versions that are valid within it
  - begin validity time - end validity time

    select... from... where... group by...
    version from date 'begin date' to date 'end date'
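How a time period might be resolved to a set of real versions can be sketched as follows. The sketch assumes that a version is addressed when its validity interval overlaps the requested period, which is one possible reading of "valid within"; the names `RealVersion`, `versions_in_period`, and `current_version` are hypothetical and not taken from the prototype.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class RealVersion:
    """Hypothetical record of a real DW version and its validity interval."""
    name: str
    begin_validity: date
    end_validity: Optional[date]  # None means the version is still valid


def versions_in_period(versions: List[RealVersion], begin: date, end: date) -> List[RealVersion]:
    """Return the real versions whose validity overlaps [begin, end], oldest first.

    Overlap semantics are an assumption made for this sketch.
    """
    selected = []
    for version in versions:
        version_end = version.end_validity or date.max
        if version.begin_validity <= end and version_end >= begin:
            selected.append(version)
    return sorted(selected, key=lambda v: v.begin_validity)


def current_version(versions: List[RealVersion]) -> RealVersion:
    """Default mode: the latest real version, i.e. the one with the latest begin time."""
    return max(versions, key=lambda v: v.begin_validity)
```

Because real versions are linearly ordered by time, sorting by `begin_validity` is enough to present the selected versions in chronological order.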

16 Modes of Querying (2)
- Querying the set of alternative DW versions
  - a user has to explicitly provide a set of alternative versions of interest

    select... from... where... group by...
    alternative version in (ver_id | ver_name,..., )

17 Modes of Querying (3)
- Merging results of partial queries
  - by default, every result set of a partial query is presented to a user separately
  - in some cases the results of partial queries can be merged into one result set
  - merging the results obtained by partial queries is requested by including the MERGE INTO clause

    select... from... where... group by...
    version from date 'begin date' to date 'end date'
    merge into {ver_id | ver_name}

  - original partial results have to be transformed into a common schema
  - transformation methods have to exist in the MVDW data dictionary

18 Modes of Querying (4)
- Merging results of partial queries into a common DW version will be possible if
  - a multiversion query addresses attributes that are present in all versions of interest
  - there exist transformation methods between adjacent DW versions
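The two conditions above translate directly into a check. The sketch below is illustrative only: `can_merge` and its parameters are hypothetical, schemas are reduced to sets of attribute names, and a transformation method is assumed to be registered as an ordered pair (older version, newer version) along the chain of versions of interest.

```python
from typing import Dict, List, Set, Tuple


def can_merge(
    queried_attrs: Set[str],
    version_chain: List[str],
    schema_attrs: Dict[str, Set[str]],
    transformations: Set[Tuple[str, str]],
) -> bool:
    """Check both merge conditions for a chronologically ordered chain of versions."""
    # Condition 1: every queried attribute exists in every version of interest.
    attrs_ok = all(queried_attrs <= schema_attrs[version] for version in version_chain)
    # Condition 2: a transformation method exists between each pair of adjacent versions
    # (the direction older -> newer is an assumption of this sketch).
    adjacent_ok = all(
        (older, newer) in transformations
        for older, newer in zip(version_chain, version_chain[1:])
    )
    return attrs_ok and adjacent_ok
```

A query for which the check fails would fall back to the default behaviour, with each partial result presented separately.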

19 Heterogeneous Schema Versions
- Every version of a schema may have a different structure
  - problems in querying
- Cases causing schema heterogeneity handled by our prototype
  - reclassification of level instances
  - merging level instances
  - splitting level instances
  - level detachment
  - level inclusion
  - changing table name
  - changing attribute name
  - changing attribute domain
  - dropping an attribute
  - adding an attribute

20 Reclassification of level instances
- Result sets from RV2 and RV3 annotated with metadata information

    Dimension PRODUCT: Level PRODUCT: change association: Ytong bricks (vat 7% -> vat 22%)

21 Merging level instances
- Result sets from RV3 and RV4 annotated with metadata information

    Merge (Castorama, Marx Pipes) -> Castorama

22 Level Detachment
- Result sets from RV4 and RV5 annotated with metadata information

    Dimension Shop: level detachment City
    Dimension Shop: source attribute: Shop.city_name -> City.city_name

23 Changing table/attribute name
- Result sets from RV5 and RV6 annotated with metadata information

    Table name changing: Sale -> Poland_Sale
    Attribute name changing: City.city_name -> City.city

24 Changing attribute domain
- Handling changes of attribute domains between DW versions
- A forward and a backward conversion method have to be provided and registered in the data dictionary
  - Attr_Mappings.am_forward_meth_name
  - Attr_Mappings.am_backward_meth_name
- Conversion methods are implemented by a DW admin
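A registered pair of conversion methods could be applied along the following lines. Only the two dictionary column names, `am_forward_meth_name` and `am_backward_meth_name`, come from the slide; the `AttrMapping` record, the `METHODS` lookup table, the `convert` helper, and the cents-to-units domain change are hypothetical examples chosen for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class AttrMapping:
    """Hypothetical row of Attr_Mappings: names of the two registered methods."""
    am_forward_meth_name: str   # converts values from the older to the newer domain
    am_backward_meth_name: str  # converts values from the newer to the older domain


def to_units(value_in_cents: int) -> float:
    """Hypothetical forward method: an integer amount in cents becomes a decimal amount."""
    return value_in_cents / 100


def to_cents(value_in_units: float) -> int:
    """Hypothetical backward method: a decimal amount becomes an integer amount in cents."""
    return round(value_in_units * 100)


# Hypothetical lookup from registered method names to their implementations.
METHODS: Dict[str, Callable] = {"to_units": to_units, "to_cents": to_cents}


def convert(value, mapping: AttrMapping, forward: bool = True):
    """Apply the forward or backward conversion method named in the mapping."""
    name = mapping.am_forward_meth_name if forward else mapping.am_backward_meth_name
    return METHODS[name](value)
```

Registering both directions lets partial results be brought into the domain of either the older or the newer DW version, depending on which version the results are merged into.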

25 Prototype Limitations
- All predicates of the SELECT command apply to all DW versions pointed to in the VERSION FROM and VERSION IN clauses
  - it is not possible to express a predicate on a single DW version
- The query parser is unable to infer appropriate versions of interest from the WHERE clause
- The query parser is able to compute an integrated result set of a multiversion query using basic aggregate functions: SUM, MIN, MAX, AVG
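Integrating these four aggregates across versions is straightforward for SUM, MIN, and MAX, but AVG needs care: averaging the per-version averages gives the wrong answer when versions hold different numbers of rows. The sketch below shows one common way to handle this by carrying a sum and a count per partial result. It is an assumption about how such integration can be done, not a description of the prototype; `PartialAgg`, `partial`, and `integrate_aggregates` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class PartialAgg:
    """Aggregates computed by one partial query over one DW version."""
    total: float
    count: int
    minimum: float
    maximum: float


def partial(values: Iterable[float]) -> PartialAgg:
    """Compute the per-version aggregates for a non-empty collection of measure values."""
    vals = list(values)
    return PartialAgg(sum(vals), len(vals), min(vals), max(vals))


def integrate_aggregates(parts: Iterable[PartialAgg]) -> Dict[str, float]:
    """Combine per-version aggregates into one integrated result."""
    parts = list(parts)
    total = sum(p.total for p in parts)
    count = sum(p.count for p in parts)
    return {
        "SUM": total,
        "MIN": min(p.minimum for p in parts),
        "MAX": max(p.maximum for p in parts),
        # AVG is derived from the combined sum and count, not from per-version averages.
        "AVG": total / count,
    }
```

For example, with values 10 and 20 in one version and 30 in another, the integrated AVG is 60 / 3 = 20, whereas the mean of the two per-version averages (15 and 30) would be 22.5.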

26 MVDW Metamodel

27 Ongoing Work
- Building a Multiversion DWMS
  - a project supported by the Polish State Committee for Scientific Research
  - data integration and buffering
  - detection of schema and data changes
  - propagation of schema and data changes
[Figure: system architecture with data sources DS1, DS2, DS3, an ODS, and the components ODSM, MVDWM, and MVQI]

28 Future Work
- Open issues
  - indexing multiversion data
  - handling quality of information in a MVDW

