1 Storing and Maintaining Semistructured Data Efficiently in an Object-Relational Database. Mo Yuanying and Ling Tok Wang.

2 Contents 1. Main Accomplishment 2. Related Works 3. ORA-SS 4. Storing Algorithm 5. Comparison with Related Works 6. Conclusion

3 Main Accomplishment This study provides efficient and consistent storage for semistructured data by developing algorithms that map an XML document to the logical ORA-SS model and then to an object-relational data store.

5 Related Works (1) Using the file system: store each XML document as a separate operating system file and use a DOM or SAX parser whenever a query accesses the document. Disadvantages: XML files in ASCII format must be parsed every time they are accessed, whether for browsing or querying; with DOM, the entire parsed file must be memory-resident during query processing; it is hard to build and maintain indices on documents stored this way; update operations are difficult to implement.
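The parse-on-every-access cost can be seen in a minimal Python sketch. The document content and the element and attribute names (`project`, `supplier`, `jno`, `sno`) are hypothetical, and the XML is held in a string instead of an operating system file only to keep the example self-contained.

```python
import xml.dom.minidom

# Hypothetical document; element and attribute names are illustrative only.
XML_TEXT = """<projects>
  <project jno="j1"><supplier sno="s1"/><supplier sno="s2"/></project>
  <project jno="j2"><supplier sno="s1"/></project>
</projects>"""


def suppliers_of(xml_text, jno):
    """Return the supplier numbers listed under the project with the given jno."""
    # The whole document is parsed into an in-memory DOM tree on every call;
    # nothing is indexed or kept between queries.
    dom = xml.dom.minidom.parseString(xml_text)
    result = []
    for project in dom.getElementsByTagName("project"):
        if project.getAttribute("jno") == jno:
            for supplier in project.getElementsByTagName("supplier"):
                result.append(supplier.getAttribute("sno"))
    return result


print(suppliers_of(XML_TEXT, "j1"))
print(suppliers_of(XML_TEXT, "j2"))
```

Each call repeats the full parse and a full scan of the tree, which illustrates why indexing and updates are awkward when documents are kept as plain files.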

6 Related Works (2) Using a relational DBMS: XML data is stored in relations, and the XML query language (for example, XQuery) is translated to SQL and executed by the underlying relational database system. Approaches: -- The Edge Approach -- The Attribute Approach -- Universal Table -- Normalized Universal Approach -- STORED. Disadvantages: a great deal of redundancy; search and update are difficult; handling multi-valued attributes is expensive.
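As a rough illustration of the relational style, the sketch below shreds a small document into a single edge table. It is a simplified reading of the general edge-table idea, not the exact schema of any of the systems listed; the column layout `(source, ordinal, name, target)` and the sample document are assumptions.

```python
import xml.etree.ElementTree as ET
from itertools import count


def shred_to_edges(xml_text):
    """Flatten an XML document into (source, ordinal, name, target) tuples.

    target is either a generated node id (int) for a child element
    or a string value for an attribute. Column names are assumptions.
    """
    root = ET.fromstring(xml_text)
    ids = count(1)          # generated node identifiers
    edges = []

    def visit(elem, elem_id):
        ordinal = 0
        for name, value in elem.attrib.items():
            ordinal += 1
            edges.append((elem_id, ordinal, name, value))
        for child in elem:
            ordinal += 1
            child_id = next(ids)
            edges.append((elem_id, ordinal, child.tag, child_id))
            visit(child, child_id)

    root_id = next(ids)
    edges.append((0, 1, root.tag, root_id))   # 0 stands for the document itself
    visit(root, root_id)
    return edges


doc = '<project jno="j1"><supplier sno="s1"/><supplier sno="s2"/></project>'
for edge in shred_to_edges(doc):
    print(edge)
```

Every element needs a generated node id, and reassembling one project with its suppliers requires joining the table with itself, which hints at why search and update become expensive in this style.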

7 Related Works (3) Using a storage manager: the XML query is parsed, translated to a suitable operator tree representation, optimized, and then executed by an XML query engine. -- Shore -- B-tree Disadvantage: search and update are inconvenient.

8 Related Works (4) Our approach: store ORA-SS in nested relations. Problems in existing storage approaches: flat files are time-consuming and difficult to query or update; relational DBMS approaches cannot capture the semantic information. ORA-SS reflects the nested structure of semistructured data and distinguishes between object classes, relationship types, and attributes. It can specify the degree of n-ary relationship types and indicate whether an attribute belongs to a relationship type or to an object class. Such information is essential for designing an efficient and non-redundant storage organization for semistructured data. Multi-valued attributes are also handled better in nested relations.

10 ORA-SS A semantically richer data model for semistructured data. Three main concepts: object class, relationship type, and attribute.

11 ORA-SS Example: binary relationship type (figure not reproduced)

12 ORA-SS Example (cont.): ternary relationship type (figure not reproduced)

13 ORA-SS Example (cont.) Other semistructured data models cannot make the distinction between binary and ternary relationship types.

14 ORA-SS ORA-SS can specify the degree of n-ary relationship types and can indicate whether an attribute belongs to a relationship type or to an object class. Existing semistructured data models cannot specify such information, although it is essential for storage.
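One way to picture the three concepts is as a small in-memory schema. The Python sketch below is only an illustration: the class names, the project/supplier/part schema, and the placement of `price` and `qty` are assumptions, not the figures from the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ObjectClass:
    name: str
    identifier: str
    attributes: List[str] = field(default_factory=list)


@dataclass
class RelationshipType:
    name: str
    participants: List[str]                      # names of participating object classes
    attributes: List[str] = field(default_factory=list)

    @property
    def degree(self) -> int:
        # Degree of the relationship type: 2 = binary, 3 = ternary, ...
        return len(self.participants)


def owner_of(attr_name, object_classes, relationship_types):
    """Report whether an attribute belongs to an object class or a relationship type."""
    for oc in object_classes:
        if attr_name == oc.identifier or attr_name in oc.attributes:
            return ("object class", oc.name)
    for rt in relationship_types:
        if attr_name in rt.attributes:
            return ("relationship type", rt.name)
    return None


# Hypothetical schema: price is assumed to belong to the binary
# supplier-part relationship, qty to the ternary relationship.
project = ObjectClass("project", "jno", ["jname"])
supplier = ObjectClass("supplier", "sno", ["sname"])
part = ObjectClass("part", "pno", ["pname"])
sp = RelationshipType("sp", ["supplier", "part"], ["price"])
jsp = RelationshipType("jsp", ["project", "supplier", "part"], ["qty"])

print(sp.degree, jsp.degree)
print(owner_of("price", [project, supplier, part], [sp, jsp]))
```

Keeping the degree and the attribute owner explicit is what lets a storage algorithm decide which relation each attribute should live in.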

16 Storing Algorithm ORA-SS to OR database: an object-relational database can handle multi-valued attributes efficiently. Multi-valued attributes are treated as repeating groups in nested relations.

17 Storing Algorithm ORA-SS to OR database, main rules: Each object class together with its attributes forms a nested relation, with multi-valued attributes as repeating groups of this relation (object relation). Each relationship type (that is, the object classes involved in it) together with its attributes forms a nested relation, with multi-valued attributes as repeating groups of this relation (relationship relation).

18 Storing Algorithm (1) Object class translation algorithm. O1: The identifier and candidate keys of the object class are the primary key and candidate keys of the generated relation. O2: Each single-valued attribute of the object class is a single-valued attribute of the generated relation. O3: Composite attributes of the object class are represented directly. They are replaced by their components in the generated relation.

19 Storing Algorithm Object class translation algorithm (cont.) O4: Each multi-valued attribute of the object class forms a repeating group in this relation. O5: Each reference is a foreign key in this relation. O6: Each disjunctive attribute is treated as two attributes. O7: For an ID dependency relationship type, the rule for the ID dependent object class is the same as the rule for a regular object class. The ID dependent object class together with its attributes forms a nested relation within its parent object class.
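Rules O1 to O5 can be sketched in Python as a function that turns an object class description into a nested-relation schema. This is a minimal sketch, not the authors' implementation: the `Attr` and `ObjectClass` types, the dictionary layout of the result, and the `student` example are all assumptions, and O6 and O7 are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Attr:
    name: str
    multi_valued: bool = False
    components: tuple = ()     # non-empty for a composite attribute
    references: str = ""       # referenced object class, if the attribute is a reference


@dataclass
class ObjectClass:
    name: str
    identifier: str
    attrs: list = field(default_factory=list)


def translate_object_class(oc):
    """Build a nested-relation schema (as a dict) for one object class."""
    relation = {
        "name": oc.name,
        "primary_key": [oc.identifier],        # O1: identifier becomes the primary key
        "columns": [oc.identifier],
        "repeating_groups": {},
        "foreign_keys": {},
    }
    for a in oc.attrs:
        # O3: a composite attribute is replaced by its components
        cols = list(a.components) if a.components else [a.name]
        if a.multi_valued:
            relation["repeating_groups"][a.name] = cols   # O4: repeating group
        else:
            relation["columns"].extend(cols)              # O2: single-valued column
        if a.references:
            relation["foreign_keys"][a.name] = a.references   # O5: foreign key
    return relation


# Hypothetical object class used only for illustration.
student = ObjectClass("student", "sno", [
    Attr("name"),
    Attr("address", components=("street", "city")),
    Attr("hobby", multi_valued=True),
    Attr("advisor", references="lecturer"),
])
print(translate_object_class(student))
```

The multi-valued `hobby` stays inside the `student` relation as a repeating group, so no separate table or join is needed to retrieve it.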

20 Storing Algorithm Translation Example 1 (figure not reproduced)

21 Storing Algorithm (2) Relationship type translation algorithm. R1: The identifiers of all object classes participating in the relationship type form single-valued attributes of the nested relation. The key of the relationship type is determined by its participation constraints. R2: Each single-valued attribute of the relationship type is a single-valued attribute of the generated relation.

22 Storing Algorithm Relationship type translation algorithm (cont.) R3: Composite attributes of the relationship type are represented directly. They are replaced by their components in the generated relation. R4: Each multi-valued attribute of the relationship type forms a repeating group in this relation. R5: A disjunctive relationship type is treated as two relationship types. R6: There is no need to translate an ID dependency relationship type.
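Rules R1 to R4 can be sketched in the same style. The function below is a hedged illustration: the `Attr` type, the result layout, and both example relationship types are assumptions. Because the slides do not spell out how participation constraints determine the key, the sketch takes the key as an optional argument and falls back to all participant identifiers, which is the many-to-many case.

```python
from dataclasses import dataclass


@dataclass
class Attr:
    name: str
    multi_valued: bool = False
    components: tuple = ()     # non-empty for a composite attribute


def translate_relationship_type(name, participant_ids, attrs, key=None):
    """Build a nested-relation schema (as a dict) for one relationship type."""
    relation = {
        "name": name,
        # R1: identifiers of the participating object classes become columns
        "columns": list(participant_ids),
        # Assumption: without a narrower key from the participation
        # constraints, all participant identifiers together form the key.
        "key": list(key) if key is not None else list(participant_ids),
        "repeating_groups": {},
    }
    for a in attrs:
        # R3: a composite attribute is replaced by its components
        cols = list(a.components) if a.components else [a.name]
        if a.multi_valued:
            relation["repeating_groups"][a.name] = cols   # R4: repeating group
        else:
            relation["columns"].extend(cols)              # R2: single-valued column
    return relation


# Hypothetical ternary relationship type with one single-valued and
# one multi-valued attribute.
jsp = translate_relationship_type(
    "jsp", ["jno", "sno", "pno"],
    [Attr("qty"), Attr("delivery_date", multi_valued=True)],
)
# Hypothetical binary relationship type whose key is narrowed by the caller.
advises = translate_relationship_type("advises", ["lno", "sno"], [], key=["sno"])
print(jsp)
print(advises)
```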

23 Storing Algorithm Translation Example 1 (figure not reproduced)

24 Storing Algorithm Translation for Ordering and ANY. (3) Translation for ordering: we define an additional attribute named ordinal within the ordered object class (i.e., the ordered attribute). (4) Translation for ANY: an attribute whose structure is unknown, or which may have a different structure for different instances, is denoted as ANY. We define a separate table (Identifier, ANY, ANY-value). Identifier is the identifier of the object class or relationship type to which the ANY belongs. ANY is the structure name (the tag) for the particular instance. ANY-value is its value.
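Both translations can be sketched briefly in Python. The helper names, the sample data, and the choice to flatten only the direct children of the ANY element are assumptions made for illustration; the slides do not say how deeper nesting inside ANY is handled.

```python
import xml.etree.ElementTree as ET


def add_ordinals(identifiers):
    """Attach an ordinal attribute recording each instance's position (1-based)."""
    return [(ident, pos) for pos, ident in enumerate(identifiers, start=1)]


def any_rows(identifier, any_element):
    """Produce (Identifier, ANY, ANY-value) rows for an element of unknown structure.

    Only direct children are flattened; deeper nesting is not covered here.
    """
    return [
        (identifier, child.tag, (child.text or "").strip())
        for child in any_element
    ]


# Hypothetical ordered instances: authors of a book, in document order.
print(add_ordinals(["a2", "a1", "a7"]))

# Hypothetical ANY content attached to the object with identifier "b1".
extra = ET.fromstring("<extra><isbn>123</isbn><note>draft</note></extra>")
print(any_rows("b1", extra))
```

Storing the tag name as data lets instances with different structures share one table, at the cost of typing every ANY-value as text.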

25 Storing Algorithm Translation Results. Following these algorithms, a normal form ORA-SS schema results in normal form nested relations. The undesirable update anomalies in semistructured databases are removed, and any redundancy due to many-to-many relationships and n-ary relationships is controlled.

27 Comparison Other models Supply(J#, S#, P#, price, Qty)
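The redundancy risk in a single flat Supply relation can be illustrated with a small Python sketch. It rests on an assumption the slide does not state: that price depends only on the supplier and part, while Qty depends on project, supplier, and part together. The sample rows are made up.

```python
# Flat relation in the style of Supply(J#, S#, P#, price, Qty); rows are hypothetical.
flat_supply = [
    ("j1", "s1", "p1", 10.0, 100),
    ("j2", "s1", "p1", 10.0, 250),   # price of (s1, p1) is repeated here
    ("j1", "s1", "p2", 7.5, 40),
]


def split_supply(rows):
    """Separate price (assumed to depend on S#, P#) from Qty (J#, S#, P#)."""
    sp_price = {}
    jsp_qty = {}
    for jno, sno, pno, price, qty in rows:
        sp_price[(sno, pno)] = price       # stored once per supplier-part pair
        jsp_qty[(jno, sno, pno)] = qty     # stored once per project-supplier-part triple
    return sp_price, jsp_qty


sp_price, jsp_qty = split_supply(flat_supply)
print(len(flat_supply), "flat rows carry a price")
print(len(sp_price), "prices are stored after the split")
```

Under that assumption, a price change for one supplier-part pair touches a single entry after the split, whereas the flat relation needs one update per project that uses the pair.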

28 Conclusion Our approach uses ORA-SS as the data model and an object-relational database as the database management system. We can store and access semistructured data correctly, more efficiently, and without avoidable redundancy. No node IDs are needed in our approach.

29 Conclusion (cont.) Our approach can capture the semantic information that is essential for storage. It can represent the degree of n-ary relationship types, and it can represent an attribute as belonging to an object class or to a relationship type.
