
1 The ORA-SS Approach for Designing Semistructured Databases
Xiaoying Wu, Tok Wang Ling, Mong Li Lee (National University of Singapore)
Gillian Dobbie (University of Auckland, New Zealand)

2 Outline
1. Motivation
2. Introduction to the ORA-SS (Object-Relationship-Attribute) model
3. From ORA-SS to XML DTD
4. Normal form for ORA-SS schema diagram
5. Converting ORA-SS schema diagrams into normal form
6. Comparison with related proposals
7. Summary

3 1. Motivation
• Example 1.1: Redundancy in an XML document
[XML listing, element values only: cs; 12 Smith, 230 Database; 22 Jones, 230 Database]
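The redundancy in Example 1.1 can be made concrete with a small sketch. The values below come from the slide; the field names (`students`, `courses`, `code`, `title`) and the nesting of courses under students are assumptions, since the original markup is not available.

```python
from collections import defaultdict

# Values from Example 1.1; field names and nesting are assumed for illustration.
department = {
    "name": "cs",
    "students": [
        {"number": "12", "name": "Smith",
         "courses": [{"code": "230", "title": "Database"}]},
        {"number": "22", "name": "Jones",
         "courses": [{"code": "230", "title": "Database"}]},
    ],
}

def repeated_courses(dept):
    """Return every (code, title) pair that is stored more than once."""
    counts = defaultdict(int)
    for student in dept["students"]:
        for course in student["courses"]:
            counts[(course["code"], course["title"])] += 1
    return {pair: n for pair, n in counts.items() if n > 1}

duplicates = repeated_courses(department)
```

Here `duplicates` reports that the pair 230, Database is stored twice, once per enrolled student, so a change to the course title would have to be applied in several places.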

4 1. Motivation (Cont.)
• Example 1.1 (Cont.)

5 1. Motivation (Cont.)
• Example 1.1 (Cont.)
Figure: corresponding ORA-SS instance diagram and schema diagram

6 1. Motivation (Cont.)
• Example 1.1 (Cont.)
Figure: a better-designed ORA-SS schema diagram

7 1. Motivation (Cont.)
• Example 1.1 (Cont.)
Figure: a better-designed ORA-SS instance diagram

8 1. Motivation (Cont.)
• Example 1.2: Ambiguity in an OEM database and its DataGuide

9 1. Motivation (Cont.)
• Example 1.2 (Cont.): Ternary relationship type representation

10 1. Motivation (Cont.)
• Example 1.2 (Cont.): Binary relationship type representation
Note that the DataGuide for this schema diagram is the same as for the previous schema!

11 2. Introduction to ORA-SS Model
• Four concepts:
– object classes
– relationship types
– attributes
– references
• Four diagrams:
– schema diagram
– instance diagram
– functional dependency diagram
– inheritance diagram

12 2. Introduction to ORA-SS Model (Cont.)
• Object class
– attributes of object class: single-valued, multi-valued
– ordering on object class
Figure: object class employee with attributes in an ORA-SS schema diagram

13 2. Introduction to ORA-SS Model (Cont.)
• Relationship type
– attributes of relationship type: single-valued, multi-valued
– degree of n-ary relationship type
– participation constraints of objects in relationship type
– disjunctive relationship type
– recursive relationship type

14 2. Introduction to ORA-SS Model (Cont.)
• Relationship type (Cont.)
Figure: representing a binary relationship type

15 2. Introduction to ORA-SS Model (Cont.)
• Relationship type (Cont.)
Figure: representing a ternary relationship type

16 2. Introduction to ORA-SS Model (Cont.)
• Attributes
– key attribute and identifier
– composite attribute
– disjunctive attribute
– attribute with unknown structure (ANY)
– ordering on attribute
– attributes of object class / relationship type
– single-valued / multi-valued attribute
– fixed and default values of attribute
– derived attribute

17 2. Introduction to ORA-SS Model (Cont.)
• Attributes (Cont.)
Figure: object classes with relationship type and attributes in an ORA-SS schema diagram

18 2. Introduction to ORA-SS Model (Cont.)
• Attributes (Cont.)
Figure: disjunctive attribute and relationship in an ORA-SS schema diagram

19 2. Introduction to ORA-SS Model (Cont.)
• References
Figure: referencing an object class in an ORA-SS schema diagram

20 2. Introduction to ORA-SS Model (Cont.)
• References (Cont.)
Figure: recursive relationship type in an ORA-SS schema diagram
Figure: symmetric relationship sets in an ORA-SS schema diagram

21 3. Mapping ORA-SS schema diagram to XML DTD
Algorithm 1: Mapping ORA-SS schema diagram to XML DTD
Input: an ORA-SS schema diagram SD
Output: an XML DTD
Begin
For each object class O in SD do:
Step 1. sub-object classes of O.
Step 2. For each attribute A of O
Case (1) A is a single-valued simple attribute
Case (2) A is a single-valued composite attribute, replace A with its components and add them to
Case (3) A is a multivalued simple attribute.
Case (4) A is a multivalued composite attribute, A’s components

22 3. Mapping ORA-SS schema diagram to XML DTD (Cont.)
• Algorithm 1: Mapping ORA-SS schema diagram to XML DTD (Cont.)
Step 3. For each relationship attribute A under O
Case (1) A is a simple attribute, add A to O’s subelementsList.
Case (2) A is a multi-valued simple attribute, add A to O’s subelementsList.
Case (3) A is a single-valued composite attribute. A’s components.
Case (4) A is a multi-valued composite attribute. A’s components. Add A to O’s subelementsList.
Step 4. For each reference O-Ref
Case (1) O is a child object class of O1, and has no extra attributes and child object classes
Case (2) O is a root object class or it has nested attributes or child object classes
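The DTD fragments that each step of Algorithm 1 emits are not preserved in this transcript, so the following is only a rough sketch of Steps 1 and 2 under stated assumptions: single-valued simple attributes become sub-elements, multivalued ones get a `*`, a single-valued composite is replaced by its components, a multivalued composite becomes a repeating element containing its components, and sub-object classes become repeating child elements. Relationship attributes (Step 3) and references (Step 4) are left out. The classes `Attribute` and `ObjectClass` and the function `to_dtd` are hypothetical names, not part of ORA-SS.

```python
from dataclasses import dataclass, field

@dataclass
class Attribute:
    name: str
    multivalued: bool = False
    components: list = field(default_factory=list)  # non-empty means composite

@dataclass
class ObjectClass:
    name: str
    attributes: list = field(default_factory=list)
    children: list = field(default_factory=list)

def to_dtd(root):
    """Emit element declarations for an object class tree (assumed mapping)."""
    lines = []

    def visit(oc):
        subelements, leaf_decls = [], []
        for a in oc.attributes:
            if a.components and a.multivalued:
                # Multivalued composite: repeating element holding its components.
                subelements.append(a.name + "*")
                leaf_decls.append(f"<!ELEMENT {a.name} ({', '.join(a.components)})>")
                leaf_decls += [f"<!ELEMENT {c} (#PCDATA)>" for c in a.components]
            elif a.components:
                # Single-valued composite: replaced by its components.
                subelements += a.components
                leaf_decls += [f"<!ELEMENT {c} (#PCDATA)>" for c in a.components]
            else:
                # Simple attribute, repeated if multivalued.
                subelements.append(a.name + ("*" if a.multivalued else ""))
                leaf_decls.append(f"<!ELEMENT {a.name} (#PCDATA)>")
        # Sub-object classes become repeating child elements.
        subelements += [child.name + "*" for child in oc.children]
        content = ", ".join(subelements) if subelements else "#PCDATA"
        lines.append(f"<!ELEMENT {oc.name} ({content})>")
        lines.extend(leaf_decls)
        for child in oc.children:
            visit(child)

    visit(root)
    return "\n".join(lines)

student = ObjectClass("student", [Attribute("number"), Attribute("hobby", multivalued=True)])
course = ObjectClass("course", [Attribute("code"), Attribute("title")], [student])
dtd = to_dtd(course)
```

The sketch does not rename attributes that share a name across object classes, which a valid DTD would require, and it ignores identifiers and ordering.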

23 3. Mapping ORA-SS schema diagram to XML DTD (Cont.)
• Example 3.1
Figure: referencing an object class in an ORA-SS schema diagram

24 3. Mapping ORA-SS schema diagram to XML DTD (Cont.)
• Example 3.1 (Cont.)
Figure: an XML DTD for the ORA-SS schema diagram

25 4. Normal form for ORA-SS schema diagram
• Observation: ORA-SS is similar to nested relations
– tree-like structure
– repeating groups or multiple occurrences of objects
E.g., the corresponding nested relation for the ORA-SS schema diagram shown on the slide is
Dept (dept-name, course (code, title, student (number, s-name, grade)*)*)
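The correspondence between the tree-shaped schema and its nested relation can be sketched as a short recursive function. The tuple layout `(name, attributes, children)` and the function name `nested_relation` are assumptions made for this illustration.

```python
def nested_relation(node, top=True):
    """Render a (name, attributes, children) tree as a nested relation.

    Every nested object class is a repeating group, marked with '*'.
    """
    name, attrs, children = node
    parts = list(attrs) + [nested_relation(c, top=False) for c in children]
    text = f"{name} ({', '.join(parts)})"
    return text if top else text + "*"

student = ("student", ["number", "s-name", "grade"], [])
course = ("course", ["code", "title"], [student])
dept = ("Dept", ["dept-name"], [course])

relation = nested_relation(dept)
```

For the Dept tree above, `relation` is the same string as the nested relation given on the slide.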

26 4. Normal form for ORA-SS schema diagram (Cont.)
• Objective: to ensure that the set of nested relations corresponding to the ORA-SS schema diagram is in the normal form for sets of nested relations (NF-NR) [5,6]
We will define
• Object class normal form (O-NF)
• Relationship type normal form (R-NF)
• ORA-SS normal form schema (ORA-SS NF)

27 4. Normal form for ORA-SS schema diagram (Cont.)
• Defn: object class normal form (O-NF)
An object class O of an ORA-SS schema diagram is said to be in object class normal form (O-NF) if the nested relation constructed with O’s single-valued attributes as its atomic attributes and O’s multivalued attributes as its repeating groups is in normal form NF-NR.

28 4. Normal form for ORA-SS schema diagram (Cont.)
• Example 4.1: Assume we have the following functional dependencies for the ORA-SS schema diagram: {S# → dept, dept → faculty}.
The corresponding nested relation for the schema diagram is Staff(S#, dept, faculty). It is not in 3NF, since faculty is transitively dependent on S#; hence the relation is not in NF-NR.
Figure: a better-designed ORA-SS schema diagram, with the transitive functional dependency removed
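The transitive dependency in Example 4.1 can be detected mechanically with the standard attribute-closure computation. This is a minimal sketch, not the authors' procedure; the function names and the tuple encoding of functional dependencies are assumptions.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) tuples."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def transitive_dependencies(key, fds):
    """Attributes that depend on the key only through a non-key determinant."""
    key_set = set(key)
    found = []
    for lhs, rhs in fds:
        lhs_set = set(lhs)
        if lhs_set <= key_set:
            continue  # determinant is part of the key: not transitive
        reachable = lhs_set <= closure(key, fds)
        determines_key = key_set <= closure(lhs, fds)
        if reachable and not determines_key:
            found.extend(sorted(set(rhs) - lhs_set - key_set))
    return found

fds = [(("S#",), ("dept",)), (("dept",), ("faculty",))]
violations = transitive_dependencies(("S#",), fds)
```

With the dependencies of Example 4.1, `violations` contains faculty, matching the slide's argument that Staff(S#, dept, faculty) is not in 3NF.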

29 4. Normal form for ORA-SS schema diagram (Cont.)
• Defn: relationship type normal form (R-NF)
A relationship type R of an ORA-SS schema diagram D is said to be in relationship type normal form (R-NF) if the nested relation constructed with the identifiers of the participating object classes and R’s atomic attributes as its atomic attributes, and R’s multivalued attributes and composite attributes as its repeating groups, is in normal form NF-NR.

30 4. Normal form for ORA-SS schema diagram (Cont.)
• Example 4.2: The ORA-SS schema attempts to show that a lecturer can teach all the courses using all the textbooks described in the curriculum, i.e. it should satisfy the MVD constraint course-code →→ isbn | staff#.
The nested relation for the relationship type ctl is ctl(course-code, isbn, staff#). It is not in 4NF, so it is not in NF-NR; hence the relationship type ctl is not in R-NF.
Figure: a better design, with the MVD removed
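To illustrate what the MVD in Example 4.2 means on data, the sketch below checks whether a set of ctl tuples satisfies course-code →→ isbn | staff#: within each course, every textbook must be paired with every staff member. The sample rows are hypothetical, and the split into two binary relations is one plausible reading of the better design, not taken from the slide's figure.

```python
from collections import defaultdict
from itertools import product

def satisfies_mvd(rows):
    """Check course-code ->> isbn | staff# on (course, isbn, staff) tuples."""
    groups = defaultdict(set)
    for course, isbn, staff in rows:
        groups[course].add((isbn, staff))
    for pairs in groups.values():
        isbns = {i for i, _ in pairs}
        staff_ids = {s for _, s in pairs}
        # The MVD holds iff each group is the full cross product.
        if pairs != set(product(isbns, staff_ids)):
            return False
    return True

# Hypothetical instance: two textbooks and two lecturers for one course.
ctl = {
    ("cs101", "isbn-1", "s1"), ("cs101", "isbn-1", "s2"),
    ("cs101", "isbn-2", "s1"), ("cs101", "isbn-2", "s2"),
}

# Assumed decomposition into two binary relations.
course_textbook = {(c, i) for c, i, _ in ctl}
course_lecturer = {(c, s) for c, _, s in ctl}
```

The ternary relation stores every textbook and lecturer combination per course, whereas the two binary relations store each fact once, which is the redundancy that 4NF rules out.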

31 4. Normal form for ORA-SS schema diagram (Cont.)
• Defn: ORA-SS normal form schema
An ORA-SS schema diagram D is in normal form (NF) iff it satisfies the following conditions:
1. Every object class in D is in O-NF.
2. For every relationship type R in D:
(a) R is in R-NF.
(b) Case (1): R is a binary relationship type from object class A to object class B. Then all of B’s attributes can stay with B only if R is a one-to-many or one-to-one binary relationship type from A to B. All the attributes of R (if any) should be attached to B.
Case (2): R is an n-ary relationship type with n (n > 2) participating object classes O1, O2, …, On, and the path going downward from the top of D linking those object classes is /O1/O2/…/On. Then for each object class Oi (2 ≤ i ≤ n):
(i) Oi should have an i-ary relationship Ri with its ancestors O1, O2, …, Oi-1.
(ii) The attributes of Oi can stay with Oi only if the functional dependency Oi → O1, O2, …, Oi-1 can be derived from the functional dependency diagram for D. The attributes of Ri (if any) should be attached to Oi.
3. There is no relationship type nested under another many-to-many or many-to-one binary or n-ary (n > 2) relationship type.
4. No relationship type can be derived from other relationship types in D.

32 4. Normal form for ORA-SS schema diagram (Cont.)
• Example 4.4: The ORA-SS schema diagram is not in NF if a professor is also an employee in the department: the qualification of a professor can be derived from that of the employee, so this information will be repeated in the underlying databases.
Figure: an ORA-SS schema diagram that is not in NF
Figure: an ORA-SS schema diagram that is in NF

33 5. Converting ORA-SS Schema Diagrams into Normal Form
Two approaches for designing semistructured databases:
• Approach 1
– Based on the users’ requirements, come up with an initial ORA-SS schema diagram.
– Normalize the ORA-SS schema diagram to its normal form.
– Map it to an XML DTD or XML Schema.
• Approach 2
– Extract a schema from the instances using schema extraction techniques.
– Translate the schema into an ORA-SS schema diagram. This requires semantic enrichment, since not all the semantics needed are available from the extracted schema.
– Convert the ORA-SS schema diagram into its normal form.
– Translate the NF ORA-SS schema diagram back to an XML DTD or XML Schema.
– Restructure the initial data instance to conform to the generated XML DTD or XML Schema.

34 5. Converting ORA-SS Schema Diagrams into Normal Form (Cont.)
• Algorithm 2: Converting an ORA-SS schema diagram into an NF ORA-SS schema diagram
Input: an ORA-SS schema diagram SD, and its functional dependency diagram.
Output: an NF ORA-SS schema diagram.
{
Step 1. Convert any non-O-NF object class to O-NF.
Step 2. Put each relationship type R into R-NF.
Step 3. This step involves two sub-steps.
(1) Construct diagrams for each object class with its attributes.
(2) Represent each relationship type R. We make R satisfy item (b) of condition 2 as well as condition 3 of the NF definition by introducing referencing object classes, and by requiring each relationship type to start with an object class with attributes (i.e., a non-reference object class).
Step 4. Remove those relationship types, along with their associated attributes, that can be derived from other relationship types in the schema diagram, to satisfy condition 4 of the NF definition.
}

35 5. Converting ORA-SS Schema Diagrams into Normal Form (Cont.)
• Example 5.1: There is a many-to-many binary relationship pc between professor and course, and a many-to-many binary relationship ct between course and textbook. The schema is not in ORA-SS NF since it violates condition 3 of the NF definition.
Figure (a): initial ORA-SS schema diagram

36 5. Converting ORA-SS Schema Diagrams into Normal Form (Cont.)
• Example 5.1 (Cont.)
Step 1. The three given object classes are already in O-NF.
Step 2. The two relationship types pc and ct are already in R-NF.
Step 3. (1) Generate three diagrams for the object classes with attributes.
Figure (b): fragment diagrams for object classes

37 5. Converting ORA-SS Schema Diagrams into Normal Form (Cont.)
• Example 5.1 (Cont.)
Step 3 (Cont.). (2) Represent the binary relationship pc by creating a reference object class course1 referencing course, and nest course1 under professor.
Figure (c): diagrams after representing relationship pc

38 5. Converting ORA-SS Schema Diagrams into Normal Form (Cont.)
• Example 5.1 (Cont.)
Step 3 (Cont.). (2) Represent the binary relationship ct by creating a reference object class textbook1 referencing textbook, and nest textbook1 under course.
Step 4 (passed). The schema generated is in NF.
Figure (d): final ORA-SS schema diagram that is in NF
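Step 3 of Example 5.1 can be sketched as a small data transformation: each object class keeps its own fragment, and each binary relationship is represented by nesting a reference object class under the parent's fragment. The attribute names below are hypothetical, since the slide's diagrams are not available, and `add_references` is an illustrative helper, not part of Algorithm 2 as published.

```python
def add_references(fragments, relationships):
    """Nest a reference object class under the parent for each relationship.

    fragments: {object class name: [attribute names]}
    relationships: [(relationship name, parent class, child class)]
    """
    schema = {
        name: {"attributes": list(attrs), "children": []}
        for name, attrs in fragments.items()
    }
    for rel, parent, child in relationships:
        schema[parent]["children"].append(
            {"name": child + "1", "references": child, "relationship": rel}
        )
    return schema

# Attribute names are assumed for illustration only.
fragments = {"professor": ["name"], "course": ["code"], "textbook": ["isbn"]}
relationships = [("pc", "professor", "course"), ("ct", "course", "textbook")]

schema = add_references(fragments, relationships)
```

In the result, course1 sits under professor and textbook1 sits under the course fragment itself, so ct is no longer nested under the many-to-many relationship pc, which is what condition 3 requires.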

39 5. Converting ORA-SS Schema Diagrams into Normal Form (Cont.)
• Example 5.2: There is a binary relationship cs between course and student, and a ternary relationship cst between course, student and tutor. The grade is an attribute of the binary relationship cs, and feedback is an attribute of the ternary relationship cst. The schema is not in ORA-SS NF since it violates item (ii) of case 2 in condition 2(b) of the NF definition.
Figure (a): initial ORA-SS schema diagram

40 5. Converting ORA-SS Schema Diagrams into Normal Form (Cont.)
• Example 5.2 (Cont.)
Step 1. The three given object classes are already in O-NF.
Step 2. The two relationship types cs and cst are already in R-NF.
Step 3. (1) Generate three diagrams for the object classes with attributes.
Figure (b): fragment diagrams for object classes

41 5. Converting ORA-SS Schema Diagrams into Normal Form (Cont.)
• Example 5.2 (Cont.)
Step 3 (Cont.). (2) Represent the binary relationship cs: we create a reference object class student1 referencing student, and nest student1 under course. The relationship attribute grade is attached to student1.
Figure (c): diagram representing binary relationship cs

42 5. Converting ORA-SS Schema Diagrams into Normal Form (Cont.)
• Example 5.2 (Cont.)
Step 3 (Cont.). (2) Represent the relationship cst: we create a reference object class tutor1 referencing tutor, and nest tutor1 under student1. The relationship attribute feedback is attached to tutor1.
Step 4 (passed). The schema generated is in NF.
Figure (d): final ORA-SS schema diagram that is in NF
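The nesting that Example 5.2 ends with can be written down as a small tree, with the relationship attributes attached to the reference object classes as the steps describe. The dictionary layout and the helpers `ref` and `outline` are assumptions for illustration; the course object class's own attributes are omitted because they are not given in the transcript.

```python
def ref(name, target, attributes=(), children=()):
    """Build a reference object class node (hypothetical representation)."""
    return {
        "name": name,
        "references": target,
        "attributes": list(attributes),
        "children": list(children),
    }

# feedback belongs to the ternary relationship cst, grade to the binary cs.
tutor1 = ref("tutor1", "tutor", attributes=["feedback"])
student1 = ref("student1", "student", attributes=["grade"], children=[tutor1])
course = {"name": "course", "attributes": [], "children": [student1]}

def outline(node, depth=0):
    """Return an indented text outline of the schema tree."""
    label = node["name"]
    if "references" in node:
        label += f" -> {node['references']}"
    if node["attributes"]:
        label += f" [{', '.join(node['attributes'])}]"
    lines = ["  " * depth + label]
    for child in node["children"]:
        lines += outline(child, depth + 1)
    return lines

tree = outline(course)
```

Printing `"\n".join(tree)` shows course at the top, student1 (with grade) beneath it, and tutor1 (with feedback) beneath student1, mirroring the path course/student/tutor of the ternary relationship.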

43 6. Comparison with Related Proposals
• The first attempt to define a normal form for semistructured data [4]
– Defines a schema called S3-Graph, a labeled graph in which vertices correspond to objects and edges represent the object-subobject relationship. Its data instance is called a semistructured data graph.
– S3-Graph cannot show the degree of an n-ary relationship type, nor can it distinguish between attributes of object classes and attributes of relationship types.

44 6. Comparison with Related Proposals (Cont.)
• The first attempt to define a normal form for semistructured data [4] (Cont.)
– Defines a dependency constraint, the SS-dependency.
– Proposes S3-NF. An S3-Graph is in S3-NF if there is no transitive SS-dependency. Hence, only this kind of redundancy can be recognized by S3-NF.

45 6. Comparison with Related Proposals (Cont.)
• The first attempt to define a normal form for semistructured data [4] (Cont.)
– Presents two approaches to designing S3-NF databases:
1. The decomposition method can remove identified transitive SS-dependencies and achieve S3-NF, but may not be able to remove partial functional dependencies inside an entity type or object class, or the redundancy resulting from over-nesting.
2. The transformation of a normal form ER diagram into an S3-Graph. The result may not be unique, since it depends on the path constructed. Hence some results may not satisfy the application requirements or comply with the user’s viewpoint.

46 6. Comparison with Related Proposals (Cont.)
• The most recent proposal: XNF (XML Normal Form) [2]
– It mainly provides algorithms to translate a schema, represented in a conceptual model called CM hypergraph, to a scheme-tree forest in XNF.
– CM hypergraph has no concept of attribute (so there are too many objects) and no hierarchical structure.
– The given algorithms are non-deterministic and suffer from efficiency problems.
– Adding newly required information requires redesigning the schema.
– The algorithms generate a large number of solutions rather than verifying whether a semistructured schema is in normal form or not.
– ISA hierarchies are removed from the CM hypergraph before it is input to the algorithms.

47 6. Comparison with Related Proposals (Cont.)
• The advantages of our proposal:
– 2-level design: incremental and iterative
First, identify the object classes and relationship types from the user requirements. Then add attributes to the object classes and relationship types.
In contrast, XNF requires all the needed information to be presented at once. Even a small change in the information requirements requires redesigning the whole schema.

48 6. Comparison with Related Proposals (Cont.)
• The advantages of our proposal (Cont.):
– Preserves the hierarchical structure that satisfies the users’ requirements.
In contrast, since the CM hypergraph has no hierarchy, XNF needs to generate many solutions. That approach fails when the user already has a hierarchical structure, wants to preserve it, and wants to verify whether the design is good or not.

49 7. Summary
• The ORA-SS model helps to detect redundancy in semistructured data.
• We need a normal form for ORA-SS, since ORA-SS schema diagrams may contain redundancies and suffer from considerable updating anomalies.
• We define a normal form ORA-SS schema diagram. It ensures
– no unnecessary redundancy and
– no updating anomalies
for semistructured databases generated from the schema.
• We present an algorithm for mapping an ORA-SS schema diagram into an XML DTD/Schema.

50 7. Summary (Cont.)
• We give a design methodology and present a comprehensive algorithm for normalizing an ORA-SS schema diagram into its normal form. The steps presented can also be used as guidelines for designing semistructured databases using the ORA-SS model.
– As ORA-SS distinguishes objects from attributes, the design complexity is reduced.
– ORA-SS allows 2 levels of design: first object classes and relationship types, then attributes.
• We show that the ORA-SS design approach outperforms other related proposals.

51 References
1. G. Dobbie, X. Y. Wu, T. W. Ling and M. L. Lee. ORA-SS: An Object-Relationship-Attribute Model for Semistructured Data. Technical Report TR21/00, School of Computing, National University of Singapore, 2000.
2. D. W. Embley and W. Y. Mok. Developing XML Documents with Guaranteed “Good” Properties. ER 2001.
3. R. Goldman and J. Widom. DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases. Proceedings of the Twenty-Third International Conference on Very Large Data Bases, pages 436-445, Athens, Greece, August 1997.
4. S. Y. Lee, M. L. Lee, T. W. Ling and L. A. Kalinichenko. Designing Good Semi-structured Databases. ER 1999: 131-145.
5. T. W. Ling. A Normal Form for Entity-Relationship Diagrams. Proc. 4th International Conference on Entity-Relationship Approach, 1985.
6. T. W. Ling. A Normal Form for Sets of Not-Necessarily Normalized Relations. Proceedings of the 22nd Hawaii International Conference on System Sciences, pages 578-586, IEEE Computer Society Press, 1989.
7. X. Y. Wu, T. W. Ling, M. L. Lee and G. Dobbie. Designing Semistructured Databases Using ORA-SS Model. Proceedings of the 2nd International Conference on Web Information Systems Engineering (WISE), IEEE Computer Society, Kyoto, Japan, December 2001.

