Storing and Maintaining Semistructured Data Efficiently in an Object-Relational Database. Mo Yuanying and Ling Tok Wang.

Presentation transcript:

1 Storing and Maintaining Semistructured Data Efficiently in an Object-Relational Database Mo Yuanying and Ling Tok Wang

2 Contents 1. Main accomplishment 2. Related Works 3. ORA-SS 4. Storing Algorithm 5. Comparison with Related Works 6. Conclusion

3 Main Accomplishment This study provides efficient and consistent storage for semistructured data by developing algorithms that map an XML document to the logical ORA-SS model and then to an object-relational data store.

4 Contents 1. Main accomplishment 2. Related Works 3. ORA-SS 4. Storing Algorithm 5. Comparison with Related Works 6. Conclusion

5 Related Works (1) Using the file system
Each XML document is stored as a separate operating system file, and a DOM or SAX parser is used whenever a query accesses the document.
Disadvantages:
-- XML files in ASCII format must be parsed every time they are accessed, whether for browsing or querying.
-- With DOM, the entire parsed file must be memory-resident during query processing.
-- It is hard to build and maintain indices on documents stored this way.
-- Update operations are difficult to implement.
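The first two disadvantages can be seen in a minimal Python sketch of the file-system approach. The document content and the function name `query_prices` are hypothetical, chosen only for illustration; the sketch uses the standard library's `xml.dom.minidom` and is not taken from the presentation.

```python
import xml.dom.minidom

# Hypothetical XML document standing in for a file on disk.
DOC = (
    "<project><supplier id='s1'>"
    "<part id='p1'><price>10</price></part>"
    "</supplier></project>"
)

def query_prices(xml_text):
    """Answer one query by parsing the whole document from scratch."""
    # Every call rebuilds the complete DOM tree in memory before
    # any element can be inspected.
    dom = xml.dom.minidom.parseString(xml_text)
    return [node.firstChild.data for node in dom.getElementsByTagName("price")]

prices = query_prices(DOC)
```

Each call to `query_prices` repeats the full parse, and no index survives between calls.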

6 Related Works (2) Using a relational DBMS
XML data is stored in relations, and queries in an XML query language (for example, XQuery) are translated to SQL and executed by the underlying relational database system.
Approaches:
-- The Edge Approach
-- The Attribute Approach
-- Universal Table
-- Normalized Universal Approach
-- STORED
Disadvantages:
-- A great deal of redundancy
-- Searches and updates are difficult
-- Handling multi-valued attributes is expensive
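As a rough illustration of the relational style, the sketch below shreds a document into one row per parent-child edge, which is how the Edge approach is usually described. The column layout `(source, ordinal, name, target, value)` is a simplified assumption, and the function name `shred_to_edges` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from itertools import count

def shred_to_edges(xml_text):
    """Return one (source, ordinal, name, target, value) row per element."""
    root = ET.fromstring(xml_text)
    ids = count(1)          # generated node IDs; the root's parent is 0
    rows = []

    def visit(elem, parent_id, ordinal):
        node_id = next(ids)
        text = (elem.text or "").strip() or None
        rows.append((parent_id, ordinal, elem.tag, node_id, text))
        # Children are numbered in document order under their parent.
        for i, child in enumerate(elem, start=1):
            visit(child, node_id, i)

    visit(root, 0, 1)
    return rows

edges = shred_to_edges(
    "<supplier><name>ABC</name><part>bolt</part><part>nut</part></supplier>"
)
```

Every element needs a generated node ID, and the multi-valued `part` occupies one row per value, so reassembling a single supplier takes a self-join on the edge table.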

7 Related Works (3) Using a storage manager
The XML query is parsed, translated to a suitable operator tree representation, optimized, and then executed by an XML query engine.
-- Shore
-- B-tree
Disadvantage:
-- Searches and updates are inconvenient

8 Related Works (4) Our approach: store ORA-SS in nested relations
Problems in existing storage approaches:
-- Flat files: querying and updating are time-consuming and difficult.
-- Relational DBMS: these approaches cannot capture the semantic information.
ORA-SS reflects the nested structure of semi-structured data and distinguishes between object classes, relationship types and attributes. It can specify the degree of n-ary relationship types and indicate whether an attribute belongs to a relationship type or to an object class. Such information is essential for designing an efficient and non-redundant storage organization for semi-structured data.
Nested relations also handle multi-valued attributes better.

9 Contents 1. Main accomplishment 2. Related Works 3. ORA-SS 4. Storing Algorithm 5. Comparison with Related Works 6. Conclusion

10 ORA-SS A semantically richer data model for semi-structured data. 3 main concepts: object class, relationship type, attribute.

11 ORA-SS Example: binary relationship type

12 ORA-SS Example (cont.): ternary relationship type

13 ORA-SS Example (cont.) The distinction between binary and ternary relationship types cannot be made in other semi-structured data models.

14 ORA-SS ORA-SS can specify the degree of n-ary relationship types. ORA-SS can indicate whether an attribute belongs to a relationship type or to an object class. Existing semi-structured data models cannot specify such information, although it is essential for storage.

15 Contents 1. Main accomplishment 2. Related Works 3. ORA-SS 4. Storing Algorithm 5. Comparison with Related Works 6. Conclusion

16 Storing Algorithm ORA-SS to OR database An object-relational database can handle multi-valued attributes efficiently. Multi-valued attributes are treated as repeating groups in nested relations.

17 Storing Algorithm ORA-SS to OR database: main rules
-- Each object class, together with its attributes, forms a nested relation in which multi-valued attributes are repeating groups (an object relation).
-- Each relationship type (that is, the object classes involved in it), together with its attributes, forms a nested relation in which multi-valued attributes are repeating groups (a relationship relation).

18 Storing Algorithm (1) Object class translation algorithm
O1 The identifier and candidate keys of the object class become the primary key and candidate keys of the generated relation.
O2 Each single-valued attribute of the object class is a single-valued attribute of the generated relation.
O3 Composite attributes of the object class are represented directly: they are replaced by their components in the generated relation.

19 Storing Algorithm Object class translation algorithm (cont.)
O4 Each multi-valued attribute of the object class forms a repeating group in the relation.
O5 Each reference is a foreign key in the relation.
O6 Each disjunctive attribute is treated as two attributes.
O7 For an ID dependency relationship type, the rule for the ID-dependent object class is the same as for a regular object class: the ID-dependent object class, together with its attributes, forms a nested relation within its parent object class.
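Rules O1 to O5 can be sketched in Python as follows. This is an illustrative reading of the rules, not the authors' implementation: the `ObjectClass` structure, the function `translate_object_class`, the output dictionary layout, and the sample `supplier` class are all hypothetical, and rules O6 and O7 are left out for brevity.

```python
from dataclasses import dataclass, field

@dataclass
class ObjectClass:
    """Hypothetical in-memory description of an ORA-SS object class."""
    name: str
    identifier: str
    single_valued: list = field(default_factory=list)
    composite: dict = field(default_factory=dict)   # attribute -> components
    multi_valued: list = field(default_factory=list)
    references: list = field(default_factory=list)

def translate_object_class(oc):
    """Build a nested-relation schema description for one object class."""
    attributes = [oc.identifier] + list(oc.single_valued)   # O1, O2
    for components in oc.composite.values():                 # O3
        attributes.extend(components)
    attributes.extend(oc.references)                         # O5 (stored columns)
    return {
        "relation": oc.name,
        "primary_key": [oc.identifier],                      # O1
        "attributes": attributes,
        "repeating_groups": list(oc.multi_valued),           # O4
        "foreign_keys": list(oc.references),                 # O5
    }

supplier = ObjectClass(
    name="supplier",
    identifier="S#",
    single_valued=["name"],
    composite={"address": ["street", "city"]},
    multi_valued=["phone"],
    references=["region_id"],
)
schema = translate_object_class(supplier)
```

In this sketch the multi-valued `phone` stays inside the `supplier` relation as a repeating group instead of being split into a separate flat table.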

20 Storing Algorithm Translation Example 1

21 Storing Algorithm (2) Relationship type translation algorithm
R1 The identifiers of all object classes participating in the relationship type form single-valued attributes of the nested relation. The key of the relationship type is determined by the participation constraint of the relationship type.
R2 Each single-valued attribute of the relationship type is a single-valued attribute of the generated relation.

22 Storing Algorithm Relationship type translation algorithm (cont.)
R3 Composite attributes of the relationship type are represented directly: they are replaced by their components in the generated relation.
R4 Each multi-valued attribute of the relationship type forms a repeating group in the relation.
R5 A disjunctive relationship type is treated as two relationship types.
R6 There is no need to translate an ID dependency relationship type.
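Rules R1 to R4 admit a similar Python sketch. The function `translate_relationship_type`, its parameters, and the sample relationship `sp` are hypothetical. Because the transcript does not spell out how the key follows from the participation constraint, the key is passed in explicitly here instead of being derived.

```python
def translate_relationship_type(name, participant_ids, key,
                                single_valued=(), composite=None,
                                multi_valued=()):
    """Build a nested-relation schema description for one relationship type."""
    # R1: identifiers of the participating object classes come first.
    # R2: single-valued attributes of the relationship type follow.
    attributes = list(participant_ids) + list(single_valued)
    # R3: composite attributes are replaced by their components.
    for components in (composite or {}).values():
        attributes.extend(components)
    return {
        "relation": name,
        "key": list(key),   # assumed to be supplied from the participation constraint
        "attributes": attributes,
        "repeating_groups": list(multi_valued),   # R4
    }

# Hypothetical binary relationship between supplier (S#) and part (P#).
sp = translate_relationship_type(
    "sp", ["S#", "P#"], key=["S#", "P#"], single_valued=["price"]
)
```

An attribute of the relationship type, such as `price` here, is stored once per relationship instance and not inside either participating object relation.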

23 Storing Algorithm Translation Example1

24 Storing Algorithm Translation for Ordering and ANY
(3) Translation for Ordering: we define an additional attribute named ordinal within the ordered object class (i.e., the ordered attribute).
(4) Translation for ANY: an attribute whose structure is unknown, or which may have a different structure in different instances, is denoted ANY. We define a separate table (Identifier, ANY, ANY-value):
-- Identifier is the identifier of the object class or relationship type to which the ANY belongs.
-- ANY is the structure name (the tag) for the particular instance.
-- ANY-value is its value.
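Both translations are small enough to sketch in Python. The function names `add_ordinals` and `any_rows` and the sample data are hypothetical, and the ANY content of an instance is assumed to arrive as a simple tag-to-value mapping.

```python
def add_ordinals(instances):
    """Ordering: attach a 1-based 'ordinal' attribute in document order."""
    return [dict(inst, ordinal=i) for i, inst in enumerate(instances, start=1)]

def any_rows(identifier, any_content):
    """ANY: flatten one instance's ANY content into
    (Identifier, ANY, ANY-value) rows."""
    return [(identifier, tag, value) for tag, value in any_content.items()]

# Hypothetical ordered instances and ANY content.
authors = add_ordinals([{"name": "Mo"}, {"name": "Ling"}])
extra = any_rows("s1", {"fax": "123", "url": "example.org"})
```

The original order can be recovered by sorting on `ordinal`, and instances with differently structured ANY content share the same three-column table.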

25 Storing Algorithm Translation Results Following these algorithms, a normal form ORA-SS schema results in normal form nested relations. The undesirable update anomalies in semi-structured databases are removed, and redundancy due to many-to-many relationships and n-ary relationships is controlled.

26 Contents 1. Main accomplishment 2. Related Works 3. ORA-SS 4. Storing Algorithm 5. Comparison with Related Works 6. Conclusion

27 Comparison Other models Supply(J#, S#, P#, price, Qty)
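The slide's figures are not part of the transcript, so the following Python sketch rests on an assumption: price is taken to be an attribute of the supplier-part relationship, and Qty an attribute of the ternary project-supplier-part relationship. Under that assumption, the flat Supply relation repeats the price for every project, while separate relationship relations store it once. All data values are invented.

```python
# Flat Supply(J#, S#, P#, price, Qty) rows, as in the "other models" schema.
supply = [
    ("j1", "s1", "p1", 10, 100),
    ("j2", "s1", "p1", 10, 200),
    ("j3", "s1", "p1", 10, 50),
]

# Assumed binary relationship relation: price per (S#, P#).
sp = {(s, p): price for (_, s, p, price, _) in supply}

# Assumed ternary relationship relation: Qty per (J#, S#, P#).
jsp = {(j, s, p): qty for (j, s, p, _, qty) in supply}

# Number of times the price is stored in each design.
price_copies_flat = len(supply)
price_copies_split = len(sp)
```

Updating the price then touches a single entry in `sp`, where the flat relation would require changing three rows.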

28 Conclusion Our approach uses ORA-SS as the data model and an object-relational database as the database management system. We can store and access semi-structured data correctly, more efficiently, and without avoidable redundancy. No node IDs are needed in our approach.

29 Conclusion (cont.) Our approach can capture the semantic information that is essential for storage. It can represent the degree of n-ary relationship types, and it can represent whether an attribute belongs to an object class or to a relationship type.