Materialized View Maintenance for the XML Documents. Yuan Fa, Yabing Chen, Tok Wang Ling, Ting Chen.

Presentation transcript:

1 Materialized View Maintenance for the XML Documents
Yuan Fa, Yabing Chen, Tok Wang Ling, Ting Chen (National University of Singapore)
Presenter: Qing Li (City University of Hong Kong)

2 Agenda
• Background of Materialized View Maintenance
• ORA-SS Data Model
• XML View
• Incremental XML View Maintenance
• Related Works
• Conclusion

3 Background

4 Introduction to View
• Views
  - Relational view
  - XML view
• Materialized views
• Maintaining the materialized views
  - Re-computation
  - Incremental approach

5 Overview of Architecture
[Diagram: data source, updated data source, materialized view and updated materialized view, connected by f, δ and δ’]
• δ: changes on the source data
• f: function to compute the view content from scratch
• δ’: changes on the view
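The relationship between f, δ and δ’ can be illustrated with a minimal sketch. It assumes a flattened source of (supplier, part, project, quantity) tuples and a view that sums quantity per (project, part); the data values and the helper names `propagate` and `apply_delta` are hypothetical, not taken from the slides. The point is only that applying δ’ to the old view should give the same result as recomputing f on the updated source.

```python
from collections import Counter

def f(source):
    """Compute the view from scratch: total quantity per (project, part)."""
    view = Counter()
    for _supplier, part, project, quantity in source:
        view[(project, part)] += quantity
    return view

def propagate(delta):
    """Translate source insertions (delta) into view changes (delta')."""
    # For a sum-only view, delta' is simply the view of the inserted rows.
    return f(delta)

def apply_delta(view, delta_prime):
    """Apply delta' to a copy of the materialized view."""
    updated = Counter(view)
    updated.update(delta_prime)  # Counter.update adds the counts
    return updated

# Hypothetical data: two existing supplies and one inserted supply.
source = [("s1", "p1", "j1", 15), ("s2", "p1", "j1", 5)]
delta = [("s3", "p1", "j1", 10)]

recomputed = f(source + delta)                           # f(source + delta)
incremental = apply_delta(f(source), propagate(delta))   # f(source) + delta'
```

Both routes yield the same view here; the incremental route touches only the inserted rows.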

6 Incremental Approach
• Why choose the incremental approach?
  - Re-computing the materialized view from scratch is usually too costly when only part of the view needs to change
  - The incremental approach absorbs incoming updates and incrementally modifies the materialized views without halting query processing
• We prefer the incremental approach

7 XML View Maintenance
• What is important for incremental XML view maintenance?
  - A good XML data model for defining flexible views with swap, join and aggregation
  - An efficient incremental view maintenance method

8 Contributions
• XML view
  - Defined views with swap, join and aggregation using ORA-SS
  - Extended the XML view transformation to support these flexible views
• Materialized view maintenance for XML documents
  - Developed a relevance checking process for each source XML update, so that updates that do not affect the view are detected
  - Developed an incremental method to maintain views with swap, join and aggregation

9 ORA-SS DATA MODEL

10 ORA-SS Data Model
• Object-Relationship-Attribute model for Semi-Structured data [4]
• Basic concepts:
  - object classes
  - relationship types
  - attributes
• Captures rich semantic information

11 ORA-SS: Object Class
• Represented as a labeled rectangle
• Attributes are labeled circles connected to the object class by edges

12 ORA-SS: Relationship Type
• Represented as a labeled edge
• Label: (name, n, p, c)
  - name: relationship name
  - n: degree
  - p: parent participation constraint
  - c: child participation constraint
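As a rough illustration of the (name, n, p, c) label, the sketch below models a relationship type as a small record. The class name, field names, the comma-separated label string and the "1:n" notation for participation constraints are all assumptions made for this example; the slides only give the four components of the label.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RelationshipType:
    """Hypothetical record for an ORA-SS relationship type label (name, n, p, c)."""
    name: str                   # relationship name
    degree: int                 # n: degree of the relationship
    parent_participation: str   # p: parent participation constraint
    child_participation: str    # c: child participation constraint

    @classmethod
    def parse(cls, label: str) -> "RelationshipType":
        """Parse an assumed 'name, n, p, c' label string."""
        name, degree, parent, child = (part.strip() for part in label.split(","))
        return cls(name, int(degree), parent, child)

# A binary relationship type named jp, as in the view example (constraints assumed).
jp = RelationshipType.parse("jp, 2, 1:n, 1:n")
```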

13 ORA-SS: Attribute
• Represented as a labeled circle
• Object attributes are distinguished from relationship attributes

14 Source XML Document DOC1 - SPJ

15 ORA-SS Schema Diagram of DOC1

16 Source XML Document DOC2 - JD

17 ORA-SS Schema Diagram of DOC2

18 ORA-SS: Summary
• A semantically rich, labeled, directed graph schema
• Captures much semantic information:
  - distinguishes attributes from object classes
  - expresses the degree of relationship types
  - specifies the participation constraints on the object classes in a relationship type
  - distinguishes object attributes from relationship attributes

19 XML VIEW

20 XML View Definition
• A view is defined using an ORA-SS schema diagram
  - Selection
  - Projection
  - Swap
  - Join
  - Aggregation

21 XML View Example
• The view shows the projects of department dn1 and the parts of each project
• Object class supplier is dropped from source schema 1
• part and project are swapped
• A new relationship type jp is created between project and part
• A new attribute called total_quantity is created for jp; it is the sum of the quantities of a specific part that the suppliers supply to the project

22 XML View Example (cont.)

23 XML View Materialization
• Materialized view
  - The view is materialized using a view transformation technique
• Previous work
  - Daofeng Luo, Ting Chen, Tok Wang Ling, and Xiaofeng Meng. On View Transformation Support for a Native XML DBMS. DASFAA 2004, Jeju Island, Korea, March 2004
  - It performs accurate and efficient view transformation based on ORA-SS, but it only transforms a single source ORA-SS schema into a view schema
• Our extended work
  - We enrich the method to handle complex views that span multiple source XML schemas, have selection conditions, and have aggregation functions

24 Extended XML View Materialization Outline
• Projection (on object type or relationship type)
  - Selects instances of object classes and relationship types from the source XML documents
• Selection (on attribute of object class or relationship type)
  - Prunes the instances retrieved by the projection procedure by checking the selection conditions in the view schema
• Join (different object classes)
  - Joins elements with the same name and key attributes from different source XML documents
• Aggregation (on attributes)
  - Applies the aggregation function to the values of an aggregate attribute if an aggregation function is associated with that attribute
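The four procedures in this outline can be sketched on flattened records. This is a simplified illustration, not the authors' tree-based algorithm: it assumes DOC1 (SPJ) flattens to supplier/part/project/quantity rows and DOC2 (JD) to project/department rows, and all data values and function names are hypothetical. The swap of part and project in the hierarchy is not modeled.

```python
from collections import defaultdict

# Hypothetical flattened instances of the two source documents.
doc1_spj = [
    {"supplier": "s1", "part": "p1", "project": "j1", "quantity": 15},
    {"supplier": "s2", "part": "p1", "project": "j1", "quantity": 5},
    {"supplier": "s1", "part": "p2", "project": "j2", "quantity": 7},
]
doc2_jd = [
    {"project": "j1", "department": "dn1"},
    {"project": "j2", "department": "dn2"},
]

def project_rows(rows, keep):
    """Projection: keep only the listed object classes and attributes."""
    return [{k: row[k] for k in keep} for row in rows]

def select_rows(rows, predicate):
    """Selection: prune rows that fail the view's selection condition."""
    return [row for row in rows if predicate(row)]

def join_rows(left, right, key):
    """Join: combine rows from different sources that share the same key value."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

def aggregate_rows(rows, group_by, attr, out):
    """Aggregation: sum attr within each group and store it under out."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[k] for k in group_by)] += row[attr]
    return [dict(zip(group_by, key), **{out: total}) for key, total in totals.items()]

def materialize(doc1, doc2):
    projected = project_rows(doc1, ["part", "project", "quantity"])  # supplier dropped
    departments = select_rows(doc2, lambda r: r["department"] == "dn1")
    joined = join_rows(projected, departments, "project")
    return aggregate_rows(joined, ["project", "part"], "quantity", "total_quantity")

view = materialize(doc1_spj, doc2_jd)
```

With the sample rows, only project j1 survives the selection on department dn1, and its two supplies of part p1 are summed into one total_quantity.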

25 XML Materialized View Example

26 VIEW MAINTENANCE

27 Incremental Materialized XML View Maintenance Outline
1. Obtain the source update tree from the update specification, the source document and the source schema
2. Check the relevance of the source update to see whether it affects the view. If the source update is relevant, proceed to step 3; otherwise stop
3. Generate the view update tree, which contains the update information for the view
4. Merge the view update tree into the view to produce the updated materialized view
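The steps of this outline could be wired together roughly as follows. This is only a control-flow sketch: `maintain_view` and its four callable parameters are hypothetical stand-ins for the procedures named on the slide, and the toy lambdas operate on a flat dictionary rather than on XML trees.

```python
def maintain_view(view, update, build_update_tree, is_relevant,
                  derive_view_update, merge):
    """Run the maintenance steps in order; every callable is a stand-in."""
    source_update_tree = build_update_tree(update)              # obtain source update tree
    if not is_relevant(source_update_tree):                     # relevance check
        return view                                             # irrelevant: view unchanged
    view_update_tree = derive_view_update(source_update_tree)   # generate view update tree
    return merge(view, view_update_tree)                        # merge into the view

# Toy usage: the view maps a project to its total quantity (hypothetical data).
view = {"j1": 20}
steps = dict(
    build_update_tree=lambda u: {"project": u[0], "quantity": u[1]},
    is_relevant=lambda t: t["project"] in view,
    derive_view_update=lambda t: {t["project"]: t["quantity"]},
    merge=lambda v, d: {k: v.get(k, 0) + d.get(k, 0) for k in v.keys() | d.keys()},
)

updated = maintain_view(view, ("j1", 10), **steps)
unchanged = maintain_view(view, ("j9", 5), **steps)
```

The early return is what makes the relevance check pay off: an irrelevant update never reaches the more expensive generation and merge steps.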

28 Source Update Tree Example
• Source update
  - Suppose supplier s3 is going to supply part p1 to project j1 with a quantity of 10
  - This inserts part p1, with child project j1, as a child element of supplier s3 in the source XML document DOC1
  - The source update tree for this case is shown on the next slide; it contains the path from supplier s3 to project j1
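A source update tree for this example might be built as below, using Python's standard `xml.etree.ElementTree`. The attribute names `s_no`, `p_no` and `j_no`, and the placement of quantity as a child element of project, are assumptions for illustration; the transcript does not show the actual layout of DOC1.

```python
import xml.etree.ElementTree as ET

def build_source_update_tree(supplier_no, part_no, project_no, quantity):
    """Build the path supplier -> part -> project that the update inserts."""
    supplier = ET.Element("supplier", {"s_no": supplier_no})
    part = ET.SubElement(supplier, "part", {"p_no": part_no})
    project = ET.SubElement(part, "project", {"j_no": project_no})
    # Assumed placement: the relationship attribute is stored under the lowest object.
    ET.SubElement(project, "quantity").text = str(quantity)
    return supplier

# Supplier s3 supplies part p1 to project j1 with a quantity of 10.
update_tree = build_source_update_tree("s3", "p1", "j1", 10)
print(ET.tostring(update_tree, encoding="unicode"))
```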

29 Source Update Tree Example (cont.)

30 Check Source Update Tree Relevance
• Benefit
  - Avoids generating and evaluating unnecessary maintenance statements
• Insertion/Deletion
  - [STEP 1] Check whether the object classes or relationship types in the source update tree are in the view schema
    Requires querying the schema only
  - [STEP 2] Check whether each path in the source update tree satisfies the selection conditions in the view schema
    Requires querying the schema using the source update tree
  - [STEP 3] Check whether each path in the source update tree joins with any of the source XML documents
    Requires querying the schema, the source update tree and the source XML documents

31 Check Source Update Tree Relevance (cont.)
• Modification
  - [STEP 1] Check whether the modified attribute appears in the view schema
    Requires querying the schema only
  - [STEP 2] Check whether the new and old values of the modified attribute satisfy the selection condition
    Requires querying the schema using the source update tree
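The relevance checks for insertion/deletion and for modification can be sketched as two small predicates. This is one interpretation under stated assumptions: a path of the source update tree is flattened to a dictionary, the view schema is a hypothetical dictionary with `names`, `selection` and `join_key` entries, and a modification is treated as relevant when either the old or the new value satisfies the selection condition (the element may leave or enter the view).

```python
# Hypothetical view schema: names kept in the view, an equality selection, a join key.
VIEW_SCHEMA = {
    "names": {"project", "part", "quantity", "department"},
    "selection": {"department": "dn1"},
    "join_key": "project",
}

def satisfies_selection(path, selection):
    """Check only those selection conditions whose attribute occurs in the path."""
    return all(path[a] == v for a, v in selection.items() if a in path)

def is_relevant_insert_or_delete(path, schema, other_source):
    # STEP 1: schema only. Does the path mention anything the view keeps?
    if schema["names"].isdisjoint(path):
        return False
    # STEP 2: schema + update tree. Does the path satisfy the selection conditions?
    if not satisfies_selection(path, schema["selection"]):
        return False
    # STEP 3: schema + update tree + sources. Does the path join with a qualifying element?
    key = schema["join_key"]
    return any(
        row.get(key) == path.get(key) and satisfies_selection(row, schema["selection"])
        for row in other_source
    )

def is_relevant_modification(attr, old, new, schema):
    # STEP 1: the modified attribute must appear in the view schema.
    if attr not in schema["names"]:
        return False
    # STEP 2: compare the old and new values against the selection condition, if any.
    if attr in schema["selection"]:
        required = schema["selection"][attr]
        return old == required or new == required
    return True

doc2_jd = [{"project": "j1", "department": "dn1"},
           {"project": "j2", "department": "dn2"}]
inserted = {"supplier": "s3", "part": "p1", "project": "j1", "quantity": 10}
relevant = is_relevant_insert_or_delete(inserted, VIEW_SCHEMA, doc2_jd)
```

The checks are ordered from cheapest to most expensive, matching the slide: the first needs only the schema, the last has to consult the other source document.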

32 Generate View Update Tree
• Almost the same process as view materialization
  - The one exception is that the source update tree is used as input instead of the updated source XML document itself
• General process:
  - Projection (on object type or relationship type)
  - Selection (on attribute of object class or relationship type)
  - Join (different object classes)
  - Aggregation (on attributes)

33 Sample View Update Tree

34 Merge View Update Tree
• After the view update tree is computed, we merge the change into the materialized view
• We merge each path in the view update tree one by one
  - Insertion
  - Deletion
  - Modification
  - Handling aggregation
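A minimal sketch of the merge step, including the aggregate, is given below. It assumes the view is a dictionary keyed by (project, part) whose entries hold total_quantity plus a supporting count, and that the view update tree has been flattened to a list of (operation, key, payload) paths. The count field is an addition of this sketch, used to decide when a deleted entry should disappear from the view; the slides do not say how this is handled.

```python
def merge_view_update(view, view_update):
    """Merge each path of the (flattened) view update into a copy of the view."""
    merged = {key: dict(entry) for key, entry in view.items()}
    for op, key, payload in view_update:
        if op == "insert":
            entry = merged.setdefault(key, {"total_quantity": 0, "count": 0})
            entry["total_quantity"] += payload   # payload: inserted quantity
            entry["count"] += 1
        elif op == "delete":
            entry = merged[key]
            entry["total_quantity"] -= payload   # payload: deleted quantity
            entry["count"] -= 1
            if entry["count"] == 0:              # no supporting instance remains
                del merged[key]
        elif op == "modify":
            old, new = payload                   # payload: (old, new) quantity
            merged[key]["total_quantity"] += new - old
        else:
            raise ValueError(f"unknown operation: {op}")
    return merged

# Hypothetical materialized view: part p1 of project j1, supplied twice.
view = {("j1", "p1"): {"total_quantity": 20, "count": 2}}
after_insert = merge_view_update(view, [("insert", ("j1", "p1"), 10)])
```

Adjusting the stored sum by the changed quantity avoids re-reading all supplies of the part, which is the purpose of handling aggregation incrementally.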

35 Updated Materialized View

36 RELATED WORKS

37 Related Works
• Abiteboul et al., “Incremental Maintenance for Materialized Views over Semistructured Data”, VLDB ’98
  - The work assumes that updates are identified by object IDs (OIDs)
  - Updates are restricted to single element or attribute updates
  - Updates to XML documents may be subtrees, in which case the OIDs are unlikely to be available
  - The work handles views that are a portion of the source semi-structured data
  - Complex views that swap XML elements in the hierarchy cannot be handled

38 Related Works (cont.)
• Zhuge et al., “Graph Structured Views and Their Incremental Maintenance”, ICDE ’98
  - The view retrieves a set of specific objects with their children from the source semi-structured data
  - The only hierarchical structure in the view is therefore a binary relationship, and the view contains only the objects and their children that are originally in the source semi-structured data and satisfy the view specification
  - Only the parent-child relationship needs to be checked against the view definition to determine whether an updated element affects the view

39 Related Works Comparison
• Existing works
  - Updates are limited to atomic value updates
  - Any single insertion, deletion or change of an atomic value triggers the view maintenance process
  - Views with swap, join and aggregation are not addressed
• Our work addresses the above issues

40 CONCLUSION

41 Conclusion
• Extended the XML view transformation to support flexible views with swap, join and aggregation
• Proposed a new incremental view maintenance method for XML documents
  - Flexible views with swap, join and aggregation can be handled

42 Future Work
• Transaction update
  - To handle transactions, we will allow multiple changes to be specified in a single update tree, so that the view update tree can be derived in one pass
  - Updates whose effects cancel each other out need to be removed
• Implement XML order support
  - Store order information in the source update tree

43 References
1. S. Abiteboul, D. Quass, J. McHugh, J. Widom, and J. Wiener. The Lorel Query Language for Semistructured Data. Journal of Digital Libraries, 1(1), Nov.
2. S. Abiteboul, J. McHugh, M. Rys, V. Vassalos, and J. Wiener. Incremental Maintenance for Materialized Views over Semistructured Data. In VLDB, pages 38-49, 1998.
3. D. Luo, T. Chen, T. W. Ling, and X. Meng. On View Transformation Support for a Native XML DBMS. In 9th International Conference on Database Systems for Advanced Applications, Korea, March 2004.
4. G. Dobbie, X. Y. Wu, T. W. Ling, and M. L. Lee. ORA-SS: An Object-Relationship-Attribute Model for Semistructured Data. Technical Report TR21/00, School of Computing, National University of Singapore.
5. Y. Papakonstantinou, H. Garcia-Molina, and J. Widom. Object Exchange across Heterogeneous Information Sources. In Proceedings of the 11th International Conference on Data Engineering, Taipei, Taiwan, Mar.
6. D. Suciu. Query Decomposition and View Maintenance for Query Language for Unstructured Data. In VLDB, Bombay, India, September.
7. Y. Zhuge and H. Garcia-Molina. Graph Structured Views and Their Incremental Maintenance. In Proceedings of the 14th International Conference on Data Engineering (ICDE), 1998.
8. World Wide Web Consortium. “XQuery: A Query Language for XML”, W3C Working Draft.

44 Thanks for Your Attention