Computing & Information Sciences, Kansas State University. CIS 560: Database System Concepts. Lecture 15 of 42, Monday, 25 February 2008.


1 Lecture 15 of 42, Monday, 25 February 2008
William H. Hsu, Department of Computing and Information Sciences, KSU
KSOL course page: http://snipurl.com/va60
Course web site: http://www.kddresearch.org/Courses/Spring-2008/CIS560
Instructor home page: http://www.cis.ksu.edu/~bhsu
Reading for Next Class: First half of Chapter 7, Silberschatz et al., 5th edition
Normal Forms
Notes: E-R and Exam 1 Review

2 Relationship Sets with Attributes: Review

3 Weak Entity Sets: Review
An entity set that does not have a primary key is referred to as a weak entity set.
The existence of a weak entity set depends on the existence of an identifying entity set.
- It must relate to the identifying entity set via a total, one-to-many relationship set from the identifying to the weak entity set.
- The identifying relationship is depicted using a double diamond.
The discriminator (or partial key) of a weak entity set is the set of attributes that distinguishes among all the entities of a weak entity set.
The primary key of a weak entity set is formed by the primary key of the strong entity set on which the weak entity set is existence dependent, plus the weak entity set’s discriminator.

4 Weak Entity Sets (Cont.)
We depict a weak entity set by double rectangles.
We underline the discriminator of a weak entity set with a dashed line.
- payment_number: discriminator of the payment entity set
- Primary key for payment: (loan_number, payment_number)

5 Weak Entity Sets (Cont.)
Note: the primary key of the strong entity set is not explicitly stored with the weak entity set, since it is implicit in the identifying relationship.
If loan_number were explicitly stored, payment could be made a strong entity, but then the relationship between payment and loan would be duplicated by an implicit relationship defined by the attribute loan_number common to payment and loan.

6 More Weak Entity Set Examples
In a university, a course is a strong entity and a course_offering can be modeled as a weak entity.
The discriminator of course_offering would be semester (including year) and section_number (if there is more than one section).
If we model course_offering as a strong entity, we would model course_number as an attribute. Then the relationship with course would be implicit in the course_number attribute.

7 Extended E-R Features: Specialization
Top-down design process: we designate subgroupings within an entity set that are distinctive from other entities in the set.
These subgroupings become lower-level entity sets that have attributes or participate in relationships that do not apply to the higher-level entity set.
Depicted by a triangle component labeled ISA (e.g., customer “is a” person).
Attribute inheritance: a lower-level entity set inherits all the attributes and relationship participation of the higher-level entity set to which it is linked.

8 Specialization Example

9 Extended E-R Features: Generalization
A bottom-up design process: combine a number of entity sets that share the same features into a higher-level entity set.
Specialization and generalization are simple inversions of each other; they are represented in an E-R diagram in the same way.
The terms specialization and generalization are used interchangeably.

10 Specialization and Generalization (Cont.)
Can have multiple specializations of an entity set based on different features.
E.g., permanent_employee vs. temporary_employee, in addition to officer vs. secretary vs. teller.
Each particular employee would be
- a member of one of permanent_employee or temporary_employee,
- and also a member of one of officer, secretary, or teller.
The ISA relationship is also referred to as a superclass-subclass relationship.

11 Design Constraints on a Specialization/Generalization
Constraint on which entities can be members of a given lower-level entity set.
- condition-defined
  - Example: all customers over 65 years are members of the senior-citizen entity set; senior-citizen ISA person.
- user-defined
Constraint on whether or not entities may belong to more than one lower-level entity set within a single generalization.
- Disjoint
  - an entity can belong to only one lower-level entity set
  - noted in the E-R diagram by writing disjoint next to the ISA triangle
- Overlapping
  - an entity can belong to more than one lower-level entity set

12 Design Constraints on a Specialization/Generalization (Cont.)
Completeness constraint: specifies whether or not an entity in the higher-level entity set must belong to at least one of the lower-level entity sets within a generalization.
- total: an entity must belong to one of the lower-level entity sets
- partial: an entity need not belong to one of the lower-level entity sets
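
The disjoint/overlapping and total/partial constraints can be checked mechanically on sample data. The following is a minimal sketch, not part of the original slides: the function name `check_specialization` and the entity ids `e1` to `e4` are made up for illustration, and entity sets are modeled as plain Python sets of ids.

```python
def check_specialization(higher, lower_sets):
    """Classify a specialization by disjointness and completeness.

    higher:     set of entity ids in the higher-level entity set
    lower_sets: dict mapping a lower-level set name to its set of entity ids
    (Both the function and the data layout are illustrative assumptions.)
    """
    seen = set()
    disjoint = True
    for members in lower_sets.values():
        # Overlapping if some entity already appeared in another lower-level set.
        if seen & members:
            disjoint = False
        seen |= members
    # Total if every higher-level entity belongs to at least one lower-level set.
    total = higher <= seen
    return ("disjoint" if disjoint else "overlapping",
            "total" if total else "partial")


# Hypothetical employee ids; e4 has no role, so the specialization is partial.
employees = {"e1", "e2", "e3", "e4"}
by_role = {"officer": {"e1"}, "secretary": {"e2"}, "teller": {"e3"}}
result = check_specialization(employees, by_role)
print(result)
```

A condition-defined constraint (such as "over 65 years") would instead be a predicate that computes each lower-level set from attribute values; here membership is simply given, as in the user-defined case.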

13 Aggregation
Consider the ternary relationship works_on, which we saw earlier.
Suppose we want to record managers for tasks performed by an employee at a branch.

14 Aggregation (Cont.)
Relationship sets works_on and manages represent overlapping information.
- Every manages relationship corresponds to a works_on relationship.
- However, some works_on relationships may not correspond to any manages relationships.
  - So we can’t discard the works_on relationship.
Eliminate this redundancy via aggregation.
- Treat relationship as an abstract entity.
- Allows relationships between relationships.
- Abstraction of relationship into new entity.
Without introducing redundancy, the following diagram represents:
- An employee works on a particular job at a particular branch.
- An employee, branch, job combination may have an associated manager.

15 E-R Diagram With Aggregation

16 E-R Design Decisions
The use of an attribute or entity set to represent an object.
Whether a real-world concept is best expressed by an entity set or a relationship set.
The use of a ternary relationship versus a pair of binary relationships.
The use of a strong or weak entity set.
The use of specialization/generalization: contributes to modularity in the design.
The use of aggregation: can treat the aggregate entity set as a single unit without concern for the details of its internal structure.

17 E-R Diagram for a Banking Enterprise

18 Summary of Symbols Used in E-R Notation

19 Summary of Symbols (Cont.)

20 Reduction to Relation Schemas
Primary keys allow entity sets and relationship sets to be expressed uniformly as relation schemas that represent the contents of the database.
A database which conforms to an E-R diagram can be represented by a collection of schemas.
For each entity set and relationship set there is a unique schema that is assigned the name of the corresponding entity set or relationship set.
Each schema has a number of columns (generally corresponding to attributes), which have unique names.

21 Representing Entity Sets as Schemas
A strong entity set reduces to a schema with the same attributes.
A weak entity set becomes a table that includes a column for the primary key of the identifying strong entity set:
payment = (loan_number, payment_number, payment_date, payment_amount)
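
To make the payment schema concrete, here is a small sketch using Python's built-in sqlite3 module. It is an illustration rather than anything from the lecture: the column types, the sample rows, and the helper `try_insert` are assumptions. It shows the composite primary key (loan_number, payment_number) and the existence dependency of payment on loan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE loan (
    loan_number TEXT PRIMARY KEY,
    amount      REAL
);
-- Weak entity set: identified by the owner's key plus the discriminator.
CREATE TABLE payment (
    loan_number    TEXT REFERENCES loan(loan_number),
    payment_number INTEGER,
    payment_date   TEXT,
    payment_amount REAL,
    PRIMARY KEY (loan_number, payment_number)
);
""")

# Made-up sample data.
conn.executemany("INSERT INTO loan VALUES (?, ?)",
                 [("L-100", 1000.0), ("L-200", 2000.0)])
conn.executemany("INSERT INTO payment VALUES (?, ?, ?, ?)", [
    ("L-100", 1, "2008-02-01", 50.0),
    ("L-100", 2, "2008-03-01", 50.0),
    ("L-200", 1, "2008-02-01", 80.0),  # payment_number 1 again, but for another loan
])


def try_insert(row):
    """Hypothetical helper: return True if the payment row is accepted."""
    try:
        conn.execute("INSERT INTO payment VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


# Same (loan_number, payment_number) pair twice violates the primary key.
duplicate_ok = try_insert(("L-100", 1, "2008-02-15", 75.0))
# A payment cannot exist without its identifying loan.
orphan_ok = try_insert(("L-999", 1, "2008-02-15", 75.0))
print(duplicate_ok, orphan_ok)
```

The discriminator payment_number alone is not unique across the table; it only distinguishes payments within one loan, which is why the key is the pair.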

22 Representing Relationship Sets as Schemas
A many-to-many relationship set is represented as a schema with attributes for the primary keys of the two participating entity sets, and any descriptive attributes of the relationship set.
Example: schema for relationship set borrower
borrower = (customer_id, loan_number)
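
As a rough illustration of the borrower schema, the sketch below builds the three tables in an in-memory SQLite database. The column types, the customer_name values, and the sample ids are assumptions chosen for the example; only the borrower attributes come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id TEXT PRIMARY KEY, customer_name TEXT);
CREATE TABLE loan     (loan_number TEXT PRIMARY KEY, amount REAL);
-- Many-to-many relationship set: the key is the union of both primary keys.
CREATE TABLE borrower (
    customer_id TEXT REFERENCES customer(customer_id),
    loan_number TEXT REFERENCES loan(loan_number),
    PRIMARY KEY (customer_id, loan_number)
);
""")

# Made-up sample data: loan L-100 is shared, and customer C1 holds two loans.
conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [("C1", "Ann"), ("C2", "Bob")])
conn.executemany("INSERT INTO loan VALUES (?, ?)",
                 [("L-100", 1000.0), ("L-200", 2000.0)])
conn.executemany("INSERT INTO borrower VALUES (?, ?)",
                 [("C1", "L-100"), ("C2", "L-100"), ("C1", "L-200")])

holders = [row[0] for row in conn.execute(
    "SELECT customer_id FROM borrower WHERE loan_number = ? ORDER BY customer_id",
    ("L-100",))]
loans_c1 = [row[0] for row in conn.execute(
    "SELECT loan_number FROM borrower WHERE customer_id = ? ORDER BY loan_number",
    ("C1",))]
print(holders, loans_c1)
```

Neither customer_id nor loan_number alone can serve as the key of borrower, because each value may appear in several rows.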

23 Redundancy of Schemas
Many-to-one and one-to-many relationship sets that are total on the many-side can be represented by adding an extra attribute to the “many” side, containing the primary key of the “one” side.
Example: Instead of creating a schema for relationship set account_branch, add an attribute branch_name to the schema arising from entity set account.

24 Redundancy of Schemas (Cont.)
For one-to-one relationship sets, either side can be chosen to act as the “many” side.
- That is, the extra attribute can be added to either of the tables corresponding to the two entity sets.
If participation is partial on the “many” side, replacing a schema by an extra attribute in the schema corresponding to the “many” side could result in null values.
The schema corresponding to a relationship set linking a weak entity set to its identifying strong entity set is redundant.
- Example: The payment schema already contains the attributes that would appear in the loan_payment schema (i.e., loan_number and payment_number).
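
The account_branch example and the null-value caveat can be sketched together. This is an illustrative assumption-laden example, not the lecture's own schema: the columns account_number and balance, the branch name, and the sample rows are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE branch (branch_name TEXT PRIMARY KEY);
-- No separate account_branch table: the "many" side carries the key of the "one" side.
CREATE TABLE account (
    account_number TEXT PRIMARY KEY,
    balance        REAL,
    branch_name    TEXT REFERENCES branch(branch_name)
);
""")

conn.execute("INSERT INTO branch VALUES ('Downtown')")
conn.execute("INSERT INTO account VALUES ('A-101', 500.0, 'Downtown')")
# If participation of account were only partial, an account with no branch
# would have to store NULL in the extra attribute.
conn.execute("INSERT INTO account VALUES ('A-102', 700.0, NULL)")

rows = conn.execute(
    "SELECT account_number, branch_name FROM account ORDER BY account_number"
).fetchall()
print(rows)
```

When participation is total, declaring branch_name as NOT NULL would rule out the second insert and keep the folded schema free of nulls.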

25 UML
UML: Unified Modeling Language
UML has many components to graphically model different aspects of an entire software system.
UML class diagrams correspond to E-R diagrams, but with several differences.

26 Summary of UML Class Diagram Notation

27 UML Class Diagrams (Cont.)
Entity sets are shown as boxes, and attributes are shown within the box, rather than as separate ellipses as in E-R diagrams.
Binary relationship sets are represented in UML by just drawing a line connecting the entity sets. The relationship set name is written adjacent to the line.
The role played by an entity set in a relationship set may also be specified by writing the role name on the line, adjacent to the entity set.
The relationship set name may alternatively be written in a box, along with attributes of the relationship set, and the box is connected, using a dotted line, to the line depicting the relationship set.
Non-binary relationships are drawn using diamonds, just as in E-R diagrams.

28 UML Class Diagram Notation (Cont.)
* Note reversal of position in cardinality constraint depiction.
* Generalization can use merged or separate arrows independent of disjoint/overlapping.
(Figure labels: overlapping, disjoint)

29 UML Class Diagrams (Contd.)
Cardinality constraints are specified in the form l..h, where l denotes the minimum and h the maximum number of relationships an entity can participate in.
Beware: the positioning of the constraints is exactly the reverse of the positioning of constraints in E-R diagrams.
The constraint 0..* on the E2 side and 0..1 on the E1 side means that each E2 entity can participate in at most one relationship, whereas each E1 entity can participate in many relationships; in other words, the relationship is many to one from E2 to E1.
Single values, such as 1 or *, may be written on edges. The single value 1 on an edge is treated as equivalent to 1..1, while * is equivalent to 0..*.

30 Chapter 7: Relational Database Design
Features of Good Relational Design
Atomic Domains and First Normal Form
Decomposition Using Functional Dependencies
Functional Dependency Theory
Algorithms for Functional Dependencies
Decomposition Using Multivalued Dependencies
More Normal Forms
Database-Design Process
Modeling Temporal Data

31 Combine Schemas?
Suppose we combine borrower and loan to get
bor_loan = (customer_id, loan_number, amount)
The result is possible repetition of information (L-100 in the slide's example).

32 A Combined Schema Without Repetition
Consider combining loan_branch and loan:
loan_amt_br = (loan_number, amount, branch_name)
No repetition (as suggested by the slide's example).

33 What About Smaller Schemas?
Suppose we had started with bor_loan. How would we know to split it up (decompose it) into borrower and loan?
Write a rule “if there were a schema (loan_number, amount), then loan_number would be a candidate key”.
Denote as a functional dependency: $\mathit{loan\_number} \rightarrow \mathit{amount}$
In bor_loan, because loan_number is not a candidate key, the amount of a loan may have to be repeated. This indicates the need to decompose bor_loan.
Not all decompositions are good. Suppose we decompose employee into
employee1 = (employee_id, employee_name)
employee2 = (employee_name, telephone_number, start_date)
The next slide shows how we lose information: we cannot reconstruct the original employee relation, and so this is a lossy decomposition.

34 A Lossy Decomposition
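
Since the slide's figure is not in the transcript, the effect can be reproduced with a small sketch. The two employee tuples below are invented for illustration (two people who happen to share a name); relations are modeled as Python sets of tuples, and the natural join on employee_name is written out as a set comprehension.

```python
# Made-up instance of employee(employee_id, employee_name, telephone_number, start_date).
employee = {
    (1, "Kim", "555-0101", "2007-01-15"),
    (2, "Kim", "555-0202", "2008-02-01"),
}

# Project onto the two decomposed schemas.
employee1 = {(eid, name) for (eid, name, _phone, _start) in employee}
employee2 = {(name, phone, start) for (_eid, name, phone, start) in employee}

# Natural join of employee1 and employee2 on the shared attribute employee_name.
rejoined = {
    (eid, name1, phone, start)
    for (eid, name1) in employee1
    for (name2, phone, start) in employee2
    if name1 == name2
}

# Tuples that appear after the join but were never in the original relation.
spurious = rejoined - employee
print(len(employee), len(rejoined), sorted(spurious))
```

The join returns every original tuple plus extra ones that pair each id with the other person's phone and start date. Having more tuples means having less information: we can no longer tell which phone number belongs to which employee_id.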

35 First Normal Form
A domain is atomic if its elements are considered to be indivisible units.
- Examples of non-atomic domains:
  - Set of names, composite attributes
  - Identification numbers like CS101 that can be broken up into parts
A relational schema R is in first normal form if the domains of all attributes of R are atomic.
Non-atomic values complicate storage and encourage redundant (repeated) storage of data.
- Example: Set of accounts stored with each customer, and set of owners stored with each account
- We assume all relations are in first normal form (and revisit this in Chapter 9).

36 First Normal Form (Cont’d)
Atomicity is actually a property of how the elements of the domain are used.
- Example: Strings would normally be considered indivisible.
- Suppose that students are given roll numbers which are strings of the form CS0012 or EE1127.
- If the first two characters are extracted to find the department, the domain of roll numbers is not atomic.
- Doing so is a bad idea: it leads to encoding of information in the application program rather than in the database.

37 Goal: Devise a Theory for the Following
Decide whether a particular relation R is in “good” form.
In the case that a relation R is not in “good” form, decompose it into a set of relations $\{R_1, R_2, \ldots, R_n\}$ such that
- each relation is in good form
- the decomposition is a lossless-join decomposition
Our theory is based on:
- functional dependencies
- multivalued dependencies

38 Functional Dependencies
Constraints on the set of legal relations.
Require that the value for a certain set of attributes determines uniquely the value for another set of attributes.
A functional dependency is a generalization of the notion of a key.

39 Functional Dependencies (Cont.)
Let R be a relation schema, with $\alpha \subseteq R$ and $\beta \subseteq R$.
The functional dependency $\alpha \rightarrow \beta$ holds on R if and only if for any legal relations r(R), whenever any two tuples $t_1$ and $t_2$ of r agree on the attributes $\alpha$, they also agree on the attributes $\beta$. That is,
$t_1[\alpha] = t_2[\alpha] \Rightarrow t_1[\beta] = t_2[\beta]$
Example: Consider r(A, B) with the following instance of r.
A  B
1  4
1  5
3  7
On this instance, $A \rightarrow B$ does NOT hold, but $B \rightarrow A$ does hold.
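
The definition translates directly into a check on a single relation instance. The sketch below is illustrative only: the function name `satisfies_fd` and the representation of tuples as dictionaries are assumptions. It uses the three-row instance of r(A, B) from the slide.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if the instance `rows` satisfies the dependency lhs -> rhs.

    rows: list of dicts mapping attribute name to value
    lhs, rhs: lists of attribute names
    (Hypothetical helper; checks one instance, not all legal relations.)
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(key, val) != val:
            return False
    return True


r = [{"A": 1, "B": 4}, {"A": 1, "B": 5}, {"A": 3, "B": 7}]
a_to_b = satisfies_fd(r, ["A"], ["B"])  # A = 1 appears with B = 4 and B = 5
b_to_a = satisfies_fd(r, ["B"], ["A"])  # every B value appears with a single A value
print(a_to_b, b_to_a)
```

Note that this only tests whether one instance satisfies a dependency. Whether the dependency holds on the schema is a statement about every legal instance and cannot be established from sample data.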

40 Functional Dependencies (Cont.)
K is a superkey for relation schema R if and only if $K \rightarrow R$.
K is a candidate key for R if and only if
- $K \rightarrow R$, and
- for no $\alpha \subset K$, $\alpha \rightarrow R$
Functional dependencies allow us to express constraints that cannot be expressed using superkeys. Consider the schema:
bor_loan = (customer_id, loan_number, amount)
We expect this functional dependency to hold:
$\mathit{loan\_number} \rightarrow \mathit{amount}$
but would not expect the following to hold:
$\mathit{amount} \rightarrow \mathit{customer\_name}$
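
One way to test the superkey and candidate key conditions against a given set of functional dependencies is attribute closure, a technique from later in Chapter 7 rather than from this slide. The sketch below assumes that $\mathit{loan\_number} \rightarrow \mathit{amount}$ is the only dependency given for bor_loan; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under `fds`.

    fds: list of (lhs, rhs) pairs, each a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(k, schema, fds):
    """K is a superkey if K -> R, i.e. the closure of K covers the schema."""
    return closure(k, fds) >= set(schema)


def is_candidate_key(k, schema, fds):
    """A superkey is a candidate key if no proper subset is a superkey.

    Dropping one attribute at a time suffices, because any superset
    of a superkey is itself a superkey.
    """
    k = set(k)
    if not is_superkey(k, schema, fds):
        return False
    return all(not is_superkey(k - {a}, schema, fds) for a in k)


bor_loan = {"customer_id", "loan_number", "amount"}
fds = [({"loan_number"}, {"amount"})]  # assumed to be the only given dependency

loan_only = is_superkey({"loan_number"}, bor_loan, fds)
pair_is_key = is_candidate_key({"customer_id", "loan_number"}, bor_loan, fds)
all_is_key = is_candidate_key(bor_loan, bor_loan, fds)
print(loan_only, pair_is_key, all_is_key)
```

Under that assumption, loan_number determines amount but not customer_id, so it is not a superkey of bor_loan. This is exactly the kind of constraint that cannot be stated as a key of bor_loan.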

41 Use of Functional Dependencies
We use functional dependencies to:
- test relations to see if they are legal under a given set of functional dependencies.
  - If a relation r is legal under a set F of functional dependencies, we say that r satisfies F.
- specify constraints on the set of legal relations.
  - We say that F holds on R if all legal relations on R satisfy the set of functional dependencies F.
Note: A specific instance of a relation schema may satisfy a functional dependency even if the functional dependency does not hold on all legal instances.
- For example, a specific instance of loan may, by chance, satisfy $\mathit{amount} \rightarrow \mathit{customer\_name}$.

