Design Theory for Relational Databases 2015, Fall Pusan National University Ki-Joune Li.


1 Design Theory for Relational Databases 2015, Fall Pusan National University Ki-Joune Li

2 Properties of Table
When we design a relational DB,
o It is a set of relations.
o Relations can be derived from a UML diagram.
But NOT all relations are correct.
o We should carefully observe the properties of a table:
o Functional Dependency
o Key
o Decomposition of Table

3 Definition of Functional Dependency
FD (Functional Dependency) on a Relation R
o A1 A2 A3 … An → B, where A1, A2, A3, …, An, B are attributes of R
o The set of attributes A1 A2 A3 … An functionally determines B: if two tuples of R agree on A1, A2, A3, …, An, they must also agree on B
o More than one B:
  - A1 A2 A3 … An → B1
  - A1 A2 A3 … An → B2
  - …
  - A1 A2 A3 … An → Bk
  - Combined: A1 A2 A3 … An → B1 B2 … Bk
[Figure: attributes A1 A2 A3 … An determining B1 B2 B3 … Bk]

4 Functional Dependency: Example
A Relation
o Movies (title, year, length, filmType, studioName, starName)
  - (title year) → length
  - (title year) → filmType
  - (title year) → studioName
  - Combined: (title year) → length filmType studioName
  - (title year) → starName ? Does not hold: there can be more than one star in a film
It is important to discover the FDs in a relation
o It helps to decide the correctness of a relation design.
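A minimal sketch of how a candidate FD can be tested against sample rows. The helper name `fd_holds` is hypothetical; the rows are the Movies rows shown later in the Bad Design slide. Note that sample data can only refute an FD: an FD is a statement about every legal instance of the relation, so passing this test on one table does not prove it.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on `lhs` but differ on `rhs`."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # setdefault stores the first right side seen for this left side
        if seen.setdefault(left, right) != right:
            return False
    return True


columns = ("title", "year", "length", "filmType", "studioName", "starName")
movies = [dict(zip(columns, r)) for r in [
    ("Star Wars", 1977, 124, "Color", "Fox", "Carrie Fisher"),
    ("Star Wars", 1977, 124, "Color", "Fox", "Mark Hamill"),
    ("Star Wars", 1977, 124, "Color", "Fox", "Harrison Ford"),
    ("Mighty Ducks", 1991, 104, "Color", "Disney", "Emilio Estevez"),
    ("Wayne's World", 1992, 95, "Color", "Paramount", "Dana Carvey"),
    ("Wayne's World", 1992, 95, "Color", "Paramount", "Mike Meyers"),
]]

print(fd_holds(movies, ("title", "year"), ("length", "filmType", "studioName")))
print(fd_holds(movies, ("title", "year"), ("starName",)))
```

The first call is consistent with (title year) → length filmType studioName; the second shows that (title year) → starName is violated by the Star Wars rows.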

5 Key
Given a relation R
o A set of one or more attributes {A1, A2, A3, …, An} is a KEY iff
  - the set functionally determines all other attributes of R, and
  - no proper subset of {A1, A2, A3, …, An} functionally determines all other attributes (minimality)
o Primary Key
  - If a relation has more than one key, one of them is designated as the primary key
o Super Key
  - A set of attributes containing a key
  - No minimality condition
Example
o Movies (title, year, length, filmType, studioName, starName)
o What are the keys?

6 How to discover keys
From E-R Diagram: Underlined Attributes
o Keys are defined based on the understanding of the real world
o Example: Movies (title, year, length, filmType, studioName, starName)
  - (year, starName) is not a key if a star can make more than one film per year
  - (year, starName) is a key if a star is allowed to make only one film per year
Relation (A1, A2, B) for a relationship between R1 and R2
o One-One
o One-Many
o Many-One
o Many-Many

7 Rules about Functional Dependencies
Functional Dependency
o An important property of a Relation (or Table)
o Some interesting properties or rules of FD
Transitive Rule
o If A → B and B → C, then A → C
Splitting/Combining Rule
o A1 A2 A3 … An → B1, A1 A2 A3 … An → B2, …, A1 A2 A3 … An → Bk iff A1 A2 A3 … An → B1 B2 … Bk
Trivial FD Rule: Given a FD A1 A2 A3 … An → B
o The FD is trivial if B is one of {A1 A2 A3 … An}
o The FD is completely non-trivial if B is not in {A1 A2 A3 … An}

8 Rules about Functional Dependencies
Trivial Dependency Rule
o A1 A2 … An → B1 B2 … Bm is equivalent to A1 A2 … An → C1 C2 … Ck, where the C's are exactly those B's that are not among the A's:
  - {C1 C2 … Ck} ⊆ {B1 B2 … Bm}, and
  - for any C ∈ {C1 C2 … Ck}, C ∉ {A1 A2 … An}
o Example: (year, title) → (studioName, year) is equivalent to (year, title) → studioName; year on the right side is unnecessary
[Figure: attribute sets A1 A2 A3 … An, B1 B2 B3 … Bm and C1 C2 C3 … Ck, with the part of the B's that overlaps the A's marked "Unnecessary"]

9 Armstrong's Axioms
Reflexivity (Trivial FD)
o If {C1 C2 … Ck} ⊆ {B1 B2 … Bm}, then B1 B2 … Bm → C1 C2 … Ck
Augmentation
o If A1 A2 … An → B1 B2 … Bm, then A1 A2 … An C1 C2 … Ck → B1 B2 … Bm C1 C2 … Ck
Transitivity
o If A1 A2 … An → B1 B2 … Bm and B1 B2 … Bm → C1 C2 … Ck, then A1 A2 … An → C1 C2 … Ck
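As a worked illustration of how the axioms combine, the combining rule from the earlier slide can be derived from augmentation and transitivity. The shorthand X, Y, Z for attribute sets is introduced here and is not on the slides; XY denotes the union of X and Y, so XX = X.

```latex
\begin{align*}
1.\;& X \to Y   && \text{given} \\
2.\;& X \to Z   && \text{given} \\
3.\;& X \to XY  && \text{augmentation of (1) with } X \text{, using } XX = X \\
4.\;& XY \to YZ && \text{augmentation of (2) with } Y \\
5.\;& X \to YZ  && \text{transitivity on (3) and (4)}
\end{align*}
```

The splitting direction follows from reflexivity (YZ → Y) and transitivity.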

10 Closure of Attributes
Closure: {A1, A2, …, An}+
o {A1 A2 … An} is a set of attributes and S is a set of FDs
o Closure of {A1 A2 … An} under the FDs in S: the set of attributes B such that A1 A2 … An → B follows from S
o That is, if the FDs that we can derive under all the functional dependencies are
  - A1 A2 … An → B1
  - A1 A2 … An → B2
  - …
  - A1 A2 … An → Bk
  then {A1 A2 … An}+ = {B1, B2, …, Bk}

11 Algorithm to Find Closure
Input: Set of attributes {A1, A2, …, An} and a set S of FDs
Output: {A1, A2, …, An}+
Process
1. Split the FDs so that each FD has a single attribute on the right, e.g. split A1 A2 → B C into A1 A2 → B and A1 A2 → C
2. Initialize X = {A1, A2, …, An}
3. Search for some FD B1 B2 … Bm → C such that B1, B2, …, Bm are in X but C is not in X, and add C to X
4. Repeat step 3 until no more attributes can be added to X; X is then the closure
Example
o Given attributes A, B, C, D, E, and F
o S: A B → C, B C → A D, D → E, and C F → B
o What is {A, B}+ ?
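The closure algorithm can be sketched in a few lines of Python. This is an illustrative version, not the lecturer's code: the function name `closure` and the representation of an FD as a `(left, right)` pair of attribute strings are assumptions. It skips the explicit splitting step and adds a whole right side at once, which gives the same result.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) attribute collections."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the FD when its whole left side is in X and it adds something new
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x


# FDs from the slide: AB -> C, BC -> AD, D -> E, CF -> B
S = [("AB", "C"), ("BC", "AD"), ("D", "E"), ("CF", "B")]
print(sorted(closure("AB", S)))  # ['A', 'B', 'C', 'D', 'E']
```

F is never added because no FD in S has F on its right side, and C F → B cannot fire without F.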

12 Closure and Key
If {A1, A2, …, An}+ is the set of all attributes of relation R, then {A1, A2, …, An} is a super key
o Example: R (A, B, C, D, E) and S: A B → C, B C → A D, D → E
  - {A, B}+ = {A, B, C, D, E}: all attributes of R
  - Therefore {A, B} is a super key of R
If no attribute can be removed from a super key without losing this property, then it is a key
o Example: if we remove B from {A, B}, then {A}+ = {A}, which is not {A, B, C, D, E}; if we remove A, then {B}+ = {B}. Therefore {A, B} is a key
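A small sketch of the super key and key tests built on attribute closure. The helper names `is_superkey` and `is_key` are hypothetical, and FDs are again written as `(left, right)` pairs of attribute strings. Checking the removal of one attribute at a time is enough for minimality, because closure only grows when attributes are added.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) attribute collections."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x


def is_superkey(attrs, all_attrs, fds):
    """A super key is a set whose closure covers every attribute of R."""
    return closure(attrs, fds) >= set(all_attrs)


def is_key(attrs, all_attrs, fds):
    """A key is a super key from which no single attribute can be dropped."""
    attrs = set(attrs)
    if not is_superkey(attrs, all_attrs, fds):
        return False
    return not any(is_superkey(attrs - {a}, all_attrs, fds) for a in attrs)


R = "ABCDE"
S = [("AB", "C"), ("BC", "AD"), ("D", "E")]
print(is_superkey("AB", R, S), is_key("AB", R, S))    # True True
print(is_superkey("ABC", R, S), is_key("ABC", R, S))  # True False
```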

13 Closing Set of Functional Dependencies
Closing set of FD set S
o Basis T of S: if we can derive every FD of S from T, and every FD of T from S, then T is a basis of S
o Remove all duplicated FDs
o A minimal basis B satisfies three conditions
  - All the FDs in B have one attribute on the right side
  - If any FD is removed from B, the result is no longer a basis
  - If, for any FD in B, we remove one or more attributes from the left side, the result is no longer a basis
Example
o For S = {A → B, A → C, B → A, B → C, C → A, C → B}, what is a minimal basis of S? Is it {AB → C, AC → B, BC → A}?
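One common way to compute a minimal basis follows the three conditions directly: split right sides, shrink left sides, then drop FDs implied by the rest. The sketch below assumes that procedure; the function names and the `(left, right)` string representation of FDs are illustrative. A minimal basis is not unique, so the result depends on the order in which FDs are examined.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) pairs of frozensets."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= x and not rhs <= x:
                x |= rhs
                changed = True
    return x


def minimal_basis(fds):
    # Step 1: one attribute per right side; trivial FDs are dropped.
    basis = []
    for lhs, rhs in fds:
        lhs = frozenset(lhs)
        for b in sorted(rhs):
            fd = (lhs, frozenset([b]))
            if b not in lhs and fd not in basis:
                basis.append(fd)
    # Step 2: remove left-side attributes that are not needed.
    for i, (lhs, rhs) in enumerate(basis):
        for a in sorted(lhs):
            if len(lhs) > 1 and rhs <= closure(lhs - {a}, basis):
                lhs = lhs - {a}
        basis[i] = (lhs, rhs)
    basis = list(dict.fromkeys(basis))  # drop duplicates, keep order
    # Step 3: remove FDs that follow from the remaining ones.
    for fd in list(basis):
        rest = [g for g in basis if g != fd]
        if fd[1] <= closure(fd[0], rest):
            basis = rest
    return basis


S = [("A", "B"), ("A", "C"), ("B", "A"), ("B", "C"), ("C", "A"), ("C", "B")]
for lhs, rhs in minimal_basis(S):
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
```

The same `closure` helper can test the candidate from the slide: under {AB → C, AC → B, BC → A} the closure of {A} is just {A}, so A → B cannot be derived and the candidate is not a basis of S.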

14 Bad Design: Anomalies
Bad Design: Example
o Redundancy
o Update Anomaly
o Deletion Anomaly

Title          Year  Length  FilmType  StudioName  StarName
Star Wars      1977  124     Color     Fox         Carrie Fisher
Star Wars      1977  124     Color     Fox         Mark Hamill
Star Wars      1977  124     Color     Fox         Harrison Ford
Mighty Ducks   1991  104     Color     Disney      Emilio Estevez
Wayne’s World  1992  95      Color     Paramount   Dana Carvey
Wayne’s World  1992  95      Color     Paramount   Mike Meyers

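15 Decomposing Relations
Decomposition of Bad Relation
o A good way to remove the problems of bad relations
Decomposition: Lossless Decomposition
o {A1 A2 … An} ⇒ {B1 B2 … Bm}, {C1 C2 … Ck} such that
  - {B1 B2 … Bm} ∪ {C1 C2 … Ck} = {A1 A2 … An} and
  - {B1 B2 … Bm} ∩ {C1 C2 … Ck} ≠ {}
o The decomposition is lossless when the natural join of the two parts reproduces the original relation; this holds if the common attributes functionally determine all attributes of at least one of the parts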

16 Decomposing Relations: Example
R = {title, year, length, filmType, studioName, starName}
⇒ {title, year, length, filmType, studioName} (= R1), {title, year, starName} (= R2)
o Redundancy
o Update Anomaly
o Deletion Anomaly

R1:
Title          Year  Length  FilmType  StudioName
Star Wars      1977  124     Color     Fox
Mighty Ducks   1991  104     Color     Disney
Wayne’s World  1992  95      Color     Paramount

R2:
Title          Year  StarName
Star Wars      1977  Carrie Fisher
Star Wars      1977  Mark Hamill
Star Wars      1977  Harrison Ford
Mighty Ducks   1991  Emilio Estevez
Wayne’s World  1992  Dana Carvey
Wayne’s World  1992  Mike Meyers
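To make the decomposition concrete, the sketch below projects the Movies rows onto R1 and R2 and checks that their natural join gives back the original rows. The helpers `project`, `natural_join` and `as_set` are hypothetical names written for this illustration, and the check is on this one sample instance only.

```python
def project(rows, attrs):
    """Project rows onto `attrs`, removing duplicate tuples."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in seen]


def natural_join(left, right):
    """Join rows that agree on every attribute the two sides share."""
    joined = []
    for l in left:
        for r in right:
            common = l.keys() & r.keys()
            if all(l[a] == r[a] for a in common):
                joined.append({**l, **r})
    return joined


def as_set(rows, attrs):
    """Order-independent view of a table, for comparison."""
    return {tuple(row[a] for a in attrs) for row in rows}


columns = ("title", "year", "length", "filmType", "studioName", "starName")
movies = [dict(zip(columns, r)) for r in [
    ("Star Wars", 1977, 124, "Color", "Fox", "Carrie Fisher"),
    ("Star Wars", 1977, 124, "Color", "Fox", "Mark Hamill"),
    ("Star Wars", 1977, 124, "Color", "Fox", "Harrison Ford"),
    ("Mighty Ducks", 1991, 104, "Color", "Disney", "Emilio Estevez"),
    ("Wayne's World", 1992, 95, "Color", "Paramount", "Dana Carvey"),
    ("Wayne's World", 1992, 95, "Color", "Paramount", "Mike Meyers"),
]]

r1 = project(movies, ("title", "year", "length", "filmType", "studioName"))
r2 = project(movies, ("title", "year", "starName"))
rejoined = natural_join(r1, r2)
print(len(r1), len(r2), as_set(rejoined, columns) == as_set(movies, columns))
```

R1 stores each film's length, type and studio once instead of once per star, which is where the redundancy goes away.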

17 Normal Form: Conditions for Good Relation
o 1st Normal Form (1NF)
o 2nd Normal Form (2NF)
o 3rd Normal Form (3NF)
o Boyce-Codd Normal Form (BCNF)

18 1st Normal Form
1NF: Every component of a relation should be ATOMIC
o No table in a component
o No set
o No list, etc.

19 2nd Normal Form
2NF
o 1NF and
o None of the non-prime attributes of the relation is functionally dependent on a part (a proper subset) of a candidate key
  - That is, no partial dependency of a non-prime attribute on a candidate key
Example
o Player (Team, Number, TeamAddress, Name, Position)
o 1NF but not 2NF

20 Example
Player (Team, Number, TeamAddress, Name, Position)
o FD1: Team, Number → Name, Position
o FD2: Team → TeamAddress
o Key: {Team, Number}, since {Team, Number}+ = {Team, Number, TeamAddress, Name, Position}
o In FD2, TeamAddress (a non-prime attribute) is dependent on {Team}, which is a proper subset of the key
o 2NF violation
Should be decomposed
o R1(Team, Number, Name, Position) and R2(Team, TeamAddress)
o R1 ⋈ R2 = R

21 Example

Employee  Skill           Current Work Location
Jones     Typing          114 Main Street
Jones     Shorthand       114 Main Street
Jones     Whittling       114 Main Street
Roberts   Light Cleaning  73 Industrial Way
Ellis     Alchemy         73 Industrial Way
Ellis     Juggling        73 Industrial Way
Harrison  Light Cleaning  73 Industrial Way

Candidate Key: {Employee, Skill}
Not 2NF
o Partial FD: Employee → Current Work Location
o Should be decomposed into (Employee, Skill), (Employee, Current Work Location)
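A partial dependency can be found mechanically by taking the closure of each proper subset of the key. The sketch below assumes the relation has a single candidate key, so that every attribute outside the key is non-prime; `partial_dependencies` is a hypothetical helper name.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) attribute sets."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x


def partial_dependencies(key, fds):
    """Map each proper subset of `key` to the non-prime attributes it determines.

    Assumes `key` is the only candidate key of the relation.
    """
    key = frozenset(key)
    found = {}
    for size in range(1, len(key)):
        for part in combinations(sorted(key), size):
            dependents = closure(part, fds) - key
            if dependents:
                found[frozenset(part)] = dependents
    return found


fds = [({"Employee"}, {"Current Work Location"})]
violations = partial_dependencies({"Employee", "Skill"}, fds)
for part, attrs in violations.items():
    print(sorted(part), "->", sorted(attrs))
```

An empty result means no non-prime attribute depends on part of the key, which is the 2NF condition under the single-key assumption.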

22 3rd Normal Form
3NF: 2NF, and every non-prime attribute of the relation must be non-transitively dependent on every candidate key
Example
o Team (TeamName, Address, ManagerID, ManagerHireDate)
o FD:
  - TeamName → Address, TeamName → ManagerID
  - (TeamName →) ManagerID → ManagerHireDate
  - Key: {TeamName}
  - 2NF but not 3NF
o To be decomposed
  - (TeamName, Address, ManagerID), (ManagerID, ManagerHireDate)

23 Example: 2NF but NOT 3NF

Tournament            Year  Winner          Winner Date of Birth
Indiana Invitational  1998  Al Fredrickson  21 July 1975
Cleveland Open        1999  Bob Albertson   28 September 1968
Des Moines Masters    1999  Al Fredrickson  21 July 1975
Indiana Invitational  1999  Chip Masterson  14 March 1977

Candidate Key: {Tournament, Year}
2NF: No Partial Dependency
Not 3NF
o Transitive Functional Dependency
  - {Tournament, Year} → Winner → Winner Date of Birth
o Should be decomposed into (Tournament, Year, Winner), (Winner, Winner Date of Birth)

24 Boyce-Codd Normal Form (BCNF)
BCNF: For every one of its non-trivial functional dependencies X → Y, X is a super key
o Remember: non-trivial means Y is not a subset of X.
o Remember: a superkey is any superset of a key (not necessarily a proper superset).
BCNF is slightly stronger than 3NF

25 Relationship between 1NF, 2NF, 3NF and BCNF
[Figure: diagram with regions labeled 1NF, 2NF, 3NF and BCNF]

26 Example: 3NF but NOT BCNF
A table to show the assignment of students

Prof. ID  Prof. SS ID  Student ID
1078      088-51-0074  31850
1078      088-51-0074  37921
1293      096-77-4146  46224
1480      072-21-2223  31850

Candidate Keys
o {Prof. ID, Student ID}
o {Prof. SS ID, Student ID}
1NF
2NF: no partial FD of a non-prime attribute on a candidate key
3NF: no transitive FD
NOT BCNF
o Prof. ID → Prof. SS ID is a non-trivial functional dependency, but Prof. ID is not a super key
o Should be decomposed into (Prof. ID, Student ID), (Prof. ID, Prof. SS ID)
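The BCNF condition can be checked by testing each listed FD: the FD is a violation if it is non-trivial and the closure of its left side does not cover the whole relation. The sketch below uses hypothetical names (`bcnf_violations`, attribute names without spaces) and takes only the FD stated on the slide. For the decomposed relations it simply reuses the FDs that fit inside each one, which is a simplification of proper FD projection.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) attribute sets."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x


def bcnf_violations(attrs, fds):
    """Return the listed FDs X -> Y that are non-trivial with X not a super key."""
    attrs = set(attrs)
    bad = []
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        nontrivial = not rhs <= lhs
        if nontrivial and not closure(lhs, fds) >= attrs:
            bad.append((lhs, rhs))
    return bad


fds = [({"ProfID"}, {"ProfSSID"})]
print(bcnf_violations({"ProfID", "ProfSSID", "StudentID"}, fds))
print(bcnf_violations({"ProfID", "ProfSSID"}, fds))   # [] after decomposition
print(bcnf_violations({"ProfID", "StudentID"}, []))   # [] no FD applies here
```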

27 Decomposition
Three Conditions
o Elimination of Anomalies
  - Update
  - Redundancy
  - Deletion
o Lossless Decomposition
  - The original relation can be recovered by natural join
o Preservation of Dependencies
Relation with two attributes: Always in BCNF (why?)

28 BCNF Decomposition Algorithm
Algorithm
o Input: Relation R0 and set S0 of FDs
o Output: R1, R2, …, Rn, each in BCNF, such that R0 = R1 ⋈ R2 ⋈ … ⋈ Rn
o Process
1. Check whether R0 is in BCNF; if so, return R0
2. If there is a BCNF violation X → Y, compute X+. Then R1 = X+ and R2 = X together with the rest of the attributes (those not in X+)
3. Decompose (project) the FD set S0 into S1 for R1 and S2 for R2
4. Repeat steps 1-3 on R1 and R2 until there is no more BCNF violation
Example
o Team (TeamName, Address, ManagerID, ManagerHireDate)
o FD:
  - TeamName → Address, TeamName → ManagerID
  - ManagerID → ManagerHireDate
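A compact sketch of the decomposition, applied to the Team example. Instead of projecting the FD set explicitly in step 3, this version looks for a violation in a sub-relation by computing closures under the original FDs and intersecting them with the sub-relation's attributes. That is an implementation choice made here, not something stated on the slide, and it enumerates attribute subsets, so it only suits small relations. The function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) attribute sets."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x


def find_violation(rel, fds):
    """Return some X inside `rel` that determines more than itself but not all of `rel`."""
    for size in range(1, len(rel)):
        for combo in combinations(sorted(rel), size):
            x = frozenset(combo)
            determined = closure(x, fds) & rel
            if determined != x and determined != rel:
                return x
    return None


def bcnf_decompose(rel, fds):
    """Split `rel` on violations until every part is in BCNF."""
    rel = frozenset(rel)
    x = find_violation(rel, fds)
    if x is None:
        return [rel]
    r1 = closure(x, fds) & rel   # X+ restricted to this relation
    r2 = x | (rel - r1)          # X plus the remaining attributes
    return bcnf_decompose(r1, fds) + bcnf_decompose(r2, fds)


team = {"TeamName", "Address", "ManagerID", "ManagerHireDate"}
fds = [
    ({"TeamName"}, {"Address"}),
    ({"TeamName"}, {"ManagerID"}),
    ({"ManagerID"}, {"ManagerHireDate"}),
]
for part in bcnf_decompose(team, fds):
    print(sorted(part))
```

For the Team example the violating FD is ManagerID → ManagerHireDate, so the relation splits into (ManagerID, ManagerHireDate) and (TeamName, Address, ManagerID), the same pair given on the 3rd Normal Form slide.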

