Lecture 3 Functional Dependency and Normal Forms Prof. Sin-Min Lee Department of Computer Science.


1 Lecture 3 Functional Dependency and Normal Forms Prof. Sin-Min Lee Department of Computer Science

2 ©Silberschatz, Korth and Sudarshan 3.2 Database System Concepts
Database Design Process
Figure labels: Conceptual requirements (Applications 1-4); Conceptual Model; Logical Model; Internal Model; External Model (Applications 1-4)

3 Relational Database Model: Relations. Source: ESRI Advanced ArcInfo

4 Source: ESRI Advanced ArcInfo

5 Source: ESRI Advanced ArcInfo

6 Source: ESRI Advanced ArcInfo

7 Georelational Database Model

8 Attribute Relationships
Functional Dependency: a relationship between attributes within a relation. If the value of attribute A determines the value of attribute B, then attribute B is functionally dependent on attribute A.

9 Source: ESRI Advanced ArcInfo

10 Functional Dependencies
X -> Y means:
- X functionally determines Y
- Y depends on X
- The values of the Y component depend on, and are determined by, the values of the X component

11 Functional Dependencies
Given any two tuples t1 and t2 of a relation:
    if t1[X] = t2[X] then t1[Y] = t2[Y]    (1)
In other words, if the X values are equal, then the Y values are equal.
The values of the X component uniquely (functionally) determine the values of the Y component iff (1) holds.
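Condition (1) can be checked mechanically against a concrete set of rows. The following is a minimal Python sketch, not part of the lecture; the function name `fd_holds` and the sample `employees` rows are made up for illustration. Note that a sample of rows can only refute a dependency: passing the check does not prove the dependency holds for the schema in general.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def fd_holds(rows: List[Row], lhs: Iterable[str], rhs: Iterable[str]) -> bool:
    """Return True if lhs -> rhs holds in this particular set of rows."""
    lhs, rhs = tuple(lhs), tuple(rhs)
    seen: Dict[Tuple, Tuple] = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen and returns the stored value,
        # so a mismatch means two rows agree on X but differ on Y.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical sample data (assumption, not from the slides).
employees = [
    {"ssn": 1, "name": "Ann", "did": 10, "dname": "Sales"},
    {"ssn": 2, "name": "Bob", "did": 10, "dname": "Sales"},
    {"ssn": 3, "name": "Ann", "did": 20, "dname": "Audit"},
]

print(fd_holds(employees, ["did"], ["dname"]))  # did -> dname
print(fd_holds(employees, ["name"], ["did"]))   # name -> did
```

In this sample, `did -> dname` is satisfied, while `name -> did` is refuted by the two rows named "Ann".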

12 Data Normalization
- Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data.
- The process of decomposing relations with anomalies to produce smaller, well-structured relations.
- Primary objective: reduce redundancy, reduce nulls, and improve "modify" activities (insert, update, delete), but not read.
- Price: degraded query, display and reporting performance.

13 Normal Forms
- First Normal Form (1NF)
- Second Normal Form (2NF)
- Third Normal Form (3NF)
- Boyce-Codd Normal Form (BCNF)
- Fourth Normal Form (4NF)
- Fifth Normal Form (5NF)

14 Normalization
- 1NF: functional dependency of nonkey attributes on the primary key; atomic values only
- 2NF: full functional dependency of nonkey attributes on the primary key
- 3NF: no transitive dependency between nonkey attributes
- Boyce-Codd and higher: all determinants are candidate keys; single multivalued dependency

15 Unnormalized Relations
- The first step in normalization is to convert the data into a two-dimensional table.
- In unnormalized relations, data can repeat within a column.

16 Unnormalized Relation

17 First Normal Form
To be in First Normal Form, a relation must contain only atomic values at each row and column.
- No repeating groups.
- A column or set of columns is called a candidate key when its values uniquely identify each row in the relation and no column can be dropped from the set without losing that property.

18 First Normal Form

19 Second Normal Form
A relation is in Second Normal Form when every nonkey attribute is fully functionally dependent on the primary key.
- That is, every nonkey attribute needs the full primary key for unique identification.

20 Second Normal Form

21 Second Normal Form

22 Second Normal Form

23 Third Normal Form
A relation is in Third Normal Form if there is no transitive functional dependency between nonkey attributes.
- When one nonkey attribute can be determined by one or more other nonkey attributes, there is a transitive functional dependency.
- The side effect column in the Surgery table is determined by the drug administered.
- Side effect is therefore transitively dependent on the primary key through drug, so Surgery is not in 3NF.
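One common way to remove such a transitive dependency is to project the table into two smaller ones. The Python sketch below illustrates the idea under stated assumptions: the column names (`patient`, `surgery_date`, `surgeon`, `drug`, `side_effect`) and all the rows are hypothetical, since the transcript does not show the actual Surgery table.

```python
def project(rows, columns):
    """Relational projection: keep only `columns` and drop duplicate rows."""
    seen, result = set(), []
    for row in rows:
        values = tuple(row[c] for c in columns)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(columns, values)))
    return result


# Hypothetical Surgery rows; side_effect depends only on drug.
surgery = [
    {"patient": "P1", "surgery_date": "2004-01-05", "surgeon": "S1",
     "drug": "DrugA", "side_effect": "rash"},
    {"patient": "P2", "surgery_date": "2004-01-07", "surgeon": "S2",
     "drug": "DrugA", "side_effect": "rash"},
    {"patient": "P3", "surgery_date": "2004-01-09", "surgeon": "S1",
     "drug": "DrugB", "side_effect": "fever"},
]

# Surgery keeps the drug as a reference; the drug's side effect moves out.
surgery_3nf = project(surgery, ["patient", "surgery_date", "surgeon", "drug"])
drug = project(surgery, ["drug", "side_effect"])

for row in drug:
    print(row)
```

After the split, each drug's side effect is stored once in `drug` instead of once per surgery, and joining the two tables on `drug` recovers the original rows.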

24 Third Normal Form

25 Third Normal Form

26 Functional Dependency and Keys
- Functional Dependency: the value of one attribute (the determinant) determines the value of another attribute.
- Candidate Key: each non-key field is functionally dependent on every candidate key.

27 Steps in Normalization

28 Normalization: most used
- The four most commonly used normal forms are first (1NF), second (2NF) and third (3NF) normal forms, and Boyce-Codd normal form (BCNF).
- They are based on functional dependencies among the attributes of a relation.
- A relation can be normalized to a specific form to prevent the possible occurrence of update anomalies.

29 First Normal Form
- No multi-valued attributes.
- Every attribute value is atomic.
Why are the following tables not in 1NF?
    Employee (ssn, Name, Salary, Address, ListOfSkills)
    Department (Did, Dname, ssn)
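As an illustration of one fix, the Python sketch below moves a list-valued attribute such as ListOfSkills into its own relation keyed by ssn. The helper `split_multivalued`, the new attribute name `Skill`, and the sample rows are assumptions made for this example, not part of the lecture.

```python
def split_multivalued(rows, key, multi_attr, single_name):
    """Move a list-valued attribute into its own relation keyed by `key`."""
    base, detail = [], []
    for row in rows:
        # The base relation keeps every attribute except the multi-valued one.
        base.append({k: v for k, v in row.items() if k != multi_attr})
        # The detail relation gets one row per (key, single value) pair.
        for value in row[multi_attr]:
            detail.append({key: row[key], single_name: value})
    return base, detail


# Hypothetical sample data.
employees = [
    {"ssn": "111", "Name": "Ann", "Salary": 50000,
     "Address": "1 Main St", "ListOfSkills": ["SQL", "Java"]},
    {"ssn": "222", "Name": "Bob", "Salary": 45000,
     "Address": "2 Oak Ave", "ListOfSkills": ["SQL"]},
]

employee, employee_skill = split_multivalued(
    employees, "ssn", "ListOfSkills", "Skill"
)

for row in employee_skill:
    print(row)
```

Each cell of `employee_skill` now holds a single value, and the pair (ssn, Skill) identifies a row.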

30 Second Normal Form
- 1NF and every non-key attribute is fully functionally dependent on the primary key.
- Every non-key attribute must be defined by the entire key, not by only part of the key.
- No partial functional dependencies.
Assuming that we have a composite PK (LicensePlate, OwnerSSN) for the Vehicle table below, why is the table not in 2NF?
    Vehicle (LicensePlate, Brand, Model, PurchasePrice, Year, OwnerSSN, OwnerName)
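Partial dependencies can be found by computing what each proper subset of the key determines on its own. The sketch below is one way to do this in Python. The slide does not list the functional dependencies of Vehicle, so the `fds` list here is an assumption chosen for illustration (the plate determines the car's description, the SSN determines the owner's name, and the price depends on the whole key).

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: everything determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, fds):
    """Map each proper subset of `key` to the non-key attributes it determines."""
    key = frozenset(key)
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            dependents = closure(subset, fds) - key
            if dependents:
                found[subset] = dependents
    return found


# Assumed functional dependencies (not given on the slide).
fds = [
    (frozenset({"LicensePlate"}), frozenset({"Brand", "Model", "Year"})),
    (frozenset({"OwnerSSN"}), frozenset({"OwnerName"})),
    (frozenset({"LicensePlate", "OwnerSSN"}), frozenset({"PurchasePrice"})),
]

violations = partial_dependencies({"LicensePlate", "OwnerSSN"}, fds)
for part, attrs in violations.items():
    print(part, "->", sorted(attrs))
```

Under these assumed dependencies, every attribute reported depends on only part of the key, which is exactly what 2NF forbids.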

31 Third Normal Form & BCNF
- 3NF: 2NF and no transitive dependencies (no functional dependency between non-key attributes).
- BCNF: in addition, every determinant is a candidate key.
Why are the following tables not in 3NF or BCNF?
    Employee [ssn, name, salary, did, dname]
    Customer

32 3NF & BCNF
It is very rare for a table to be in 3NF but not in BCNF.
Given a relation R with attributes A, B and C, where A and B together form the composite PK:
    IF A, B -> C and C -> B THEN R is in 3NF but not in BCNF
Example:
    Student, Course -> Instructor
    Instructor -> Course
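The Student/Course/Instructor example can be checked with a short Python sketch. It uses the usual textbook tests: a non-trivial dependency X -> Y violates BCNF when X is not a superkey, and violates 3NF only if, in addition, some attribute of Y - X is not part of any candidate key. The function names are made up for this example, and the candidate keys {Student, Course} and {Student, Instructor} are supplied by hand rather than computed.

```python
def closure(attrs, fds):
    """Attribute closure: everything determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Non-trivial FDs whose left side is not a superkey."""
    attributes = set(attributes)
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and closure(lhs, fds) != attributes]


def third_nf_violations(attributes, fds, candidate_keys):
    """BCNF violations that also determine at least one non-prime attribute."""
    prime = set().union(*candidate_keys)
    return [(lhs, rhs) for lhs, rhs in bcnf_violations(attributes, fds)
            if not (rhs - lhs) <= prime]


R = {"Student", "Course", "Instructor"}
fds = [
    (frozenset({"Student", "Course"}), frozenset({"Instructor"})),
    (frozenset({"Instructor"}), frozenset({"Course"})),
]
# Candidate keys worked out by hand (assumption stated in the text above).
keys = [{"Student", "Course"}, {"Student", "Instructor"}]

print("BCNF violations:", bcnf_violations(R, fds))
print("3NF violations:", third_nf_violations(R, fds, keys))
```

Instructor -> Course is reported as a BCNF violation because Instructor alone is not a superkey, but it is not a 3NF violation because Course belongs to a candidate key.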

33 Steps in Normalization
- 1NF: a table, without multivalued attributes
  - if not, then decompose
- 2NF: 1NF and every non-key attribute is fully functionally dependent on the primary key
  - if not, then decompose
- 3NF: 2NF and no transitive dependencies
  - if not, then decompose
GENERAL:
- Each table should describe a single theme
- Modification anomalies are minimized
Hint: THE KEY, THE WHOLE KEY AND NOTHING BUT THE KEY

34 EXAMPLE - OBTAIN CANDIDATE KEYS
Consider the following scheme from an airline database system:
    ( P (pilot), F (flight#), D (date), T (scheduled time to depart) )
We have the following FDs:
    F ---> T
    PDT ---> F
    FD ---> P
Provide some superkeys:
- PDT is a superkey, and FD is a superkey.
- Is PDT a candidate key?
  - PD is not a superkey, nor is DT, nor is PT.
  - So, PDT is a candidate key.
- FD is also a candidate key, since neither F nor D is a superkey.
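The reasoning above can be reproduced by brute force: try every subset of attributes from smallest to largest, and keep those whose closure covers the whole scheme and that do not contain a smaller key. The Python sketch below does this for the airline scheme; the function names are illustrative, and the exhaustive search is only practical for small schemes.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: everything determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal superkeys, each as a sorted string of attribute letters."""
    all_attrs = set(attributes)
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == all_attrs:
                keys.append(candidate)
    return ["".join(sorted(key)) for key in keys]


# FDs from the slide: F -> T, PDT -> F, FD -> P (one letter per attribute).
fds = [("F", "T"), ("PDT", "F"), ("FD", "P")]

print(candidate_keys("PFDT", fds))
```

The search finds the same two candidate keys as the slide: FD (printed as "DF") and PDT (printed as "DPT").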

35 CLOSURE OF A SET OF FD'S
- If F is a set of functional dependencies for a relation R, the set of all functional dependencies that can be derived from F, denoted by F+, is called the CLOSURE of F.
- We can use Armstrong's axioms, and the 3 derived rules, to compute the closure of F, F+.

36 WORKING TO GET THE CLOSURE F+
- GIVEN: scheme (A, B, C, G, H, I)
- GIVEN: FD set (A--->B, A--->C, CG--->H, CG--->I, B--->H)
- Some members of F+ are
  - A--->H {Transitivity Rule applied to A--->B and B--->H}
  - CG--->HI {Union Rule applied to CG--->H and CG--->I}
  - AG--->I {By Augmentation Rule applied to A--->C, AG--->CG; then Transitivity with CG--->I}
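The three derivations above can be replayed step by step. The Python sketch below encodes the augmentation, transitivity and union rules as small functions over (left side, right side) pairs. The representation and the function names are choices made for this illustration only.

```python
def fd(lhs, rhs):
    """Build an FD from two strings of one-letter attribute names."""
    return (frozenset(lhs), frozenset(rhs))


def augmentation(dep, extra):
    """From X -> Y derive XZ -> YZ."""
    lhs, rhs = dep
    return (lhs | frozenset(extra), rhs | frozenset(extra))


def transitivity(first, second):
    """From X -> Y and Y -> Z derive X -> Z."""
    if first[1] != second[0]:
        raise ValueError("right side of first FD must equal left side of second")
    return (first[0], second[1])


def union(first, second):
    """From X -> Y and X -> Z derive X -> YZ."""
    if first[0] != second[0]:
        raise ValueError("both FDs must have the same left side")
    return (first[0], first[1] | second[1])


# The three members of F+ listed on the slide.
a_h = transitivity(fd("A", "B"), fd("B", "H"))
cg_hi = union(fd("CG", "H"), fd("CG", "I"))
ag_cg = augmentation(fd("A", "C"), "G")
ag_i = transitivity(ag_cg, fd("CG", "I"))

print(a_h == fd("A", "H"), cg_hi == fd("CG", "HI"), ag_i == fd("AG", "I"))
```

Each helper refuses to apply a rule whose precondition does not hold, so a derivation that runs without an exception is a valid chain of inference steps.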

37 THE CLOSURE OF A SET OF ATTRIBUTES
- GIVEN: FD set F and a given attribute A (or set of attributes A)
- FIND: The set of attributes functionally dependent on A, called the closure of A, and denoted by A+
- IMPORTANT USE FOR THIS: To determine if A is a superkey, we compute A+, the set of attributes functionally dependent on A. If A+ consists of ALL the attributes in the relation, then A is a superkey
- HOW DO WE FIND A+? The following algorithm does the trick!

38 ALGORITHM TO FIND THE CLOSURE OF ATTRIBUTE A, DENOTED BY A+
    result := A;
    while { result changes }
        for each functional dependency B--->C
        begin
            if B is contained in result, then result := result U C
        end
    endwhile
    A+ := result
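The pseudocode translates almost line for line into Python. The sketch below is one possible rendering, with attributes written as single letters. The helper `is_superkey` is an addition for illustration, and the usage example reuses the scheme (A, B, C, G, H, I) and FD set from the earlier closure slide.

```python
def closure(attrs, fds):
    """Compute A+ for the attribute set `attrs` under the FDs in `fds`."""
    result = set(attrs)                      # result := A
    changed = True
    while changed:                           # while result changes
        changed = False
        for lhs, rhs in fds:                 # for each FD B -> C
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)           # result := result U C
                changed = True
    return result                            # A+ := result


def is_superkey(attrs, all_attributes, fds):
    """A is a superkey exactly when A+ contains every attribute of the relation."""
    return closure(attrs, fds) == set(all_attributes)


fds = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

print(sorted(closure("AG", fds)))
print(is_superkey("AG", "ABCGHI", fds))
print(is_superkey("A", "ABCGHI", fds))
```

The loop stops after the first full pass in which no dependency adds a new attribute, which mirrors the "while result changes" condition in the pseudocode.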

39 EXAMPLE TO FIND THE CLOSURE A+ OF AN ATTRIBUTE A
- GIVEN: Relation R with attributes W, X, Y, Z and FDs
    W ---> Z
    YZ ---> X
    WZ ---> Y
- FIND: WZ+
- PSEUDO TRACE OF THE ALGORITHM:
  - result := WZ
  - from the first 2 FDs, no change to "result"
  - from WZ ---> Y, since WZ is contained in result, we get result := WZY
  - since YZ is contained in result, we get result := WZYX
  - Thus, every attribute in R is in WZ+, so WZ is a superkey!

40 Normalization
- Normalization of data: a method for analyzing schemas
- Unsatisfactory schemas are decomposed into smaller ones with desirable properties
- Objectives of normalization:
  - good relation schemas disallowing update anomalies

41 Formal framework
- A database can be normalized to any degree (1, 2, 3, 4, 5, etc.)
- Normalization is not done in isolation
- Need:
  - lossless join
  - dependency preservation
  - additional normal forms that meet other desirable criteria

42 Normal Forms
- 1st, 2nd, 3rd and BCNF consider only FD and key constraints
- Constraints must not be hard to understand or detect
- Need not normalize to the highest form (e.g. for performance reasons)

43 1NF - 1st normal form
- Part of the formal definition of a relation
- Disallows multivalued attributes, composite attributes and their combinations
- In 1NF, every attribute holds a single (atomic, indivisible) value

44 Normalize into 1NF?
How to normalize nested relations into 1NF?
- Move the nested relation attributes into a new relation
- Propagate the PK into the new relation
- Combine the PK and the partial PK of the nested relation to form the new key
- Recursively unnest when there is multilevel nesting
- Useful in converting hierarchical schemas into 1NF

45 Difficulties with 1NF
- Anomalies on insert, delete and update
- Determine whether each attribute describes the entity identified by the PK. If not, the dependency is a non-full FD
- We need full FDs for good inserts, deletes and updates

46 Second Normal Form - 2NF
Uses the concepts of FDs, PKs and this definition:
- An FD Y -> Z is a full functional dependency if removing any attribute from Y means the FD no longer holds

47 2NF
A relation schema R is in 2NF if:
- The relation is in 1NF
- Every non-prime attribute A in R is fully functionally dependent on the primary key
Prime attribute: an attribute that is a member of the primary key K
R can be decomposed into 2NF relations via the process of 2NF normalization:
- Remove partial dependencies
- Create new relations where the partial dependencies become full

48 Simplifying Functional Dependencies through Normalization
Normalization: the identification of functional dependencies and the modifications required to structurally change the database to remove undesirable dependencies

49 Source: ESRI Advanced ArcInfo

50 Source: ESRI Advanced ArcInfo

51 Source: ESRI Advanced ArcInfo

52 Source: ESRI Advanced ArcInfo

53 Source: ESRI Advanced ArcInfo

54 Source: ESRI Advanced ArcInfo

55 September 2, 2004
Read the following article on IBM's early relational database scientists:
http://www.mcjones.org/System_R/SQL_Reunion_95/sqlr95.html
Read Chapter 3, Section 3.1, and Chapter 7, Sections 7.1-7.3.2.
Work on problems 7.12, 7.13, 7.14 and 7.15.

