Design Theory for Relational Databases


1 Design Theory for Relational Databases
2018, Fall Pusan National University Ki-Joune Li

2 Properties of Table
When we design a relational DB, the design is a set of relations.
Relations can be derived from a UML diagram, but NOT all relations are correct.
We should carefully observe the properties of each table:
  Functional Dependency
  Key
  Decomposition of Table

3 Definition of Functional Dependency
FD (Functional Dependency) on a relation R:
  A1 A2 A3 … An → B, where A1, A2, A3, …, An, B are attributes of R,
  holds iff any two tuples of R that agree on A1, A2, A3, …, An also agree on B.
  We say that the set of attributes A1 A2 A3 … An functionally determines B.
More than one B:
  A1 A2 A3 … An → B1
  A1 A2 A3 … An → B2
  …
  A1 A2 A3 … An → Bk
  can be written as A1 A2 A3 … An → B1 B2 … Bk
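The definition can be tried out on a concrete table. The following is a minimal sketch, assuming rows are stored as Python dictionaries; the function name fd_holds and the sample rows are hypothetical and not part of the slides. Note that a table instance can only refute an FD: passing the check on one instance does not prove the FD holds for the relation in general.

```python
from typing import Dict, Iterable, List

Row = Dict[str, object]


def fd_holds(rows: Iterable[Row], lhs: List[str], rhs: List[str]) -> bool:
    """Return True if lhs -> rhs is not violated by this particular set of rows."""
    seen = {}  # left-side values -> right-side values first seen with them
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[b] for b in rhs)
        if seen.setdefault(key, value) != value:
            return False  # two rows agree on lhs but differ on rhs
    return True


# Hypothetical sample rows, made up for illustration only.
rows = [
    {"A1": 1, "A2": "x", "B": 10},
    {"A1": 1, "A2": "x", "B": 10},
    {"A1": 2, "A2": "x", "B": 20},
]

print(fd_holds(rows, ["A1", "A2"], ["B"]))  # True
print(fd_holds(rows, ["A2"], ["B"]))        # False: A2 = "x" appears with B = 10 and B = 20
```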

4 Functional Dependency: Example
A relation Movies (title, year, length, filmType, studioName, starName)
  (title year) → length
  (title year) → filmType
  (title year) → studioName
  so (title year) → length filmType studioName
  (title year) → starName ? No: there can be more than one star in a film.
It is important to discover the FDs in a relation:
  they help to decide whether the relation design is correct.

5 Key
Given a relation R:
A set of one or more attributes {A1, A2, A3, …, An} is a KEY iff
  the set functionally determines all other attributes of R, and
  no proper subset of {A1, A2, A3, …, An} functionally determines all other attributes (minimality).
Primary Key: if a relation has more than one key, one of them is chosen as the primary key.
Super Key: a set of attributes containing a key; there is no minimality condition.
Example: Movies (title, year, length, filmType, studioName, starName)
  What are the keys?

6 How to discover keys
From the E-R Diagram: underlined attributes.
  It means that keys are defined based on the understanding of the real world.
Example: Movies (title, year, length, filmType, studioName, starName)
  (year, starName) is not a key if a star can make more than one film per year.
  (year, starName) is a key if a star is allowed to make only one film per year.
Relation (A1, A2, B) for a relationship between R1 and R2; consider each case:
  One-One
  One-Many
  Many-One
  Many-Many

7 Rules about Functional Dependencies
Functional Dependency: an important property of a relation (or table).
Some useful rules of FDs:
  Transitive Rule: if A → B and B → C, then A → C.
  Splitting/Combining Rule:
    A1 A2 A3 … An → B1, A1 A2 A3 … An → B2, …, A1 A2 A3 … An → Bk
    iff A1 A2 A3 … An → B1 B2 … Bk
  Trivial FD Rule: given an FD A1 A2 A3 … An → B,
    the FD is trivial if B is one of {A1 A2 A3 … An};
    the FD is completely non-trivial if B is not in {A1 A2 A3 … An}.

8 Rules about Functional Dependencies
Trivial Dependency Rule:
  A1 A2 … An → B1 B2 … Bm is equivalent to A1 A2 … An → C1 C2 … Ck
  where {C1 C2 … Ck} ⊆ {B1 B2 … Bm} consists of exactly those B's that are not in {A1 A2 … An}
  (for any C ∈ {C1 C2 … Ck}, C ∉ {A1 A2 … An}).
Example: (year, title) → (studioName, year) is equivalent to (year, title) → studioName;
  year on the right side is unnecessary.
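The splitting rule and the trivial-dependency rule are mechanical enough to write down directly. This is only an illustrative sketch: the helper names split_fd, strip_trivial, and is_trivial are hypothetical, and an FD is assumed to be a pair of attribute-name tuples.

```python
def split_fd(lhs, rhs):
    """Splitting rule: one FD per right-side attribute."""
    return [(tuple(lhs), (b,)) for b in rhs]


def strip_trivial(lhs, rhs):
    """Trivial-dependency rule: drop right-side attributes that already appear on the left."""
    return tuple(lhs), tuple(b for b in rhs if b not in lhs)


def is_trivial(lhs, rhs):
    """An FD is trivial when its whole right side is contained in its left side."""
    return set(rhs) <= set(lhs)


# The example from the slide: year on the right side is unnecessary.
reduced = strip_trivial(("year", "title"), ("studioName", "year"))
print(reduced)  # (('year', 'title'), ('studioName',))

# Splitting the combined Movies FD back into single-attribute FDs.
parts = split_fd(("title", "year"), ("length", "filmType", "studioName"))
print(len(parts))  # 3
```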

9 Armstrong's Axioms
Reflexivity (Trivial FD): if {C1 C2 … Ck} ⊆ {B1 B2 … Bm}, then B1 B2 … Bm → C1 C2 … Ck
Augmentation: if A1 A2 … An → B1 B2 … Bm, then A1 A2 … An C1 C2 … Ck → B1 B2 … Bm C1 C2 … Ck
Transitivity: if A1 A2 … An → B1 B2 … Bm and B1 B2 … Bm → C1 C2 … Ck, then A1 A2 … An → C1 C2 … Ck

10 Closure of Attributes
Closure: {A1, A2, …, An}+
{A1, A2, …, An} is a set of attributes and S is a set of FDs.
Closure of {A1, A2, …, An} under the FDs in S: the set of attributes B such that A1 A2 … An → B follows from S.
That is, if the FDs that we can derive from S are exactly
  A1 A2 … An → B1
  A1 A2 … An → B2
  …
  A1 A2 … An → Bk
then {A1, A2, …, An}+ = {B1, B2, …, Bk}

11 Algorithm to Find Closure
Input: set of attributes {A1, A2, …, An} and set S of FDs
Output: {A1, A2, …, An}+
Process:
  1. Split the FDs so that each FD has a single attribute on the right.
     e.g. A1 A2 → B C is split into A1 A2 → B and A1 A2 → C
  2. Initialize X = {A1, A2, …, An}
  3. Search for some FD B1 B2 … Bm → C such that B1, B2, …, Bm are in X but C is not in X, and add C to X
  4. Repeat 3 until no more attributes can be added to X; X is then the closure
Example: given attributes A, B, C, D, E, and F
  S: A B → C, B C → A D, D → E, and C F → B
  What is {A, B}+ ?
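The closure algorithm above can be sketched in a few lines. This is an illustrative version, not the lecturer's code: the function name closure is an assumption, each FD is a (left side, right side) pair, and single-letter attributes are written as strings so that "AB" means {A, B}. The explicit splitting step is skipped because adding a whole right side at once gives the same result.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    result = set(attrs)                      # step 2: X starts as the given attributes
    changed = True
    while changed:                           # step 4: repeat until nothing can be added
        changed = False
        for lhs, rhs in fds:                 # step 3: left side inside X, right side not yet
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# The example from the slide: A B -> C, B C -> A D, D -> E, C F -> B
S = [("AB", "C"), ("BC", "AD"), ("D", "E"), ("CF", "B")]
print(sorted(closure("AB", S)))  # ['A', 'B', 'C', 'D', 'E']
```

F is never added because no FD has F on its right side, so C F → B is never used.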

12 Closure and Key
If {A1, A2, …, An}+ is the set of all attributes of relation R, then {A1, A2, …, An} is a super key.
Example: R (A, B, C, D, E) and S: A B → C, B C → A D, D → E
  {A, B}+ = {A, B, C, D, E}: all attributes of R
  → {A, B} is a super key of R.
If no attribute can be removed while still covering all the attributes, then it is a key.
Example: if we remove B from {A, B}, then {A}+ = {A}, which is not {A, B, C, D, E};
  likewise, if we remove A, then {B}+ = {B}.
  Therefore {A, B} is a key.
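The super key and key tests follow directly from the closure. The sketch below is an illustration under the same assumptions as the slide's example (single-letter attributes, FDs as pairs of strings); is_superkey, is_key, and candidate_keys are hypothetical helper names, and the brute-force enumeration is only practical for small relations.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attrs, all_attrs, fds):
    return closure(attrs, fds) == set(all_attrs)


def is_key(attrs, all_attrs, fds):
    """A key is a super key from which no single attribute can be removed."""
    if not is_superkey(attrs, all_attrs, fds):
        return False
    return not any(is_superkey(set(attrs) - {a}, all_attrs, fds) for a in attrs)


def candidate_keys(all_attrs, fds):
    """Enumerate all keys by increasing size, skipping supersets of keys already found."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            if any(set(k) <= set(combo) for k in keys):
                continue
            if is_superkey(combo, all_attrs, fds):
                keys.append(combo)
    return keys


S = [("AB", "C"), ("BC", "AD"), ("D", "E")]
print(is_key("AB", "ABCDE", S))      # True
print(candidate_keys("ABCDE", S))    # [('A', 'B'), ('B', 'C')]
```

The enumeration also reports {B, C} as a key of this relation, since B C → A D and D → E give {B, C}+ = {A, B, C, D, E}.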

13 Closing Set of Functional Dependencies
Closing set of an FD set S
Basis T of S: if we can derive every FD of S from T, then T is a basis of S.
  Remove all duplicated FDs.
Minimal Basis B satisfies three conditions:
  Every FD in B has one attribute on the right side.
  If any FD is removed from B, the result is no longer a basis.
  If, for any FD in B, we remove one or more attributes from the left side, the result is no longer a basis.
Example: for S = {A→B, A→C, B→A, B→C, C→A, C→B}, what is the minimal basis of S?
  {AB→C, AC→B, BC→A}?
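One way to compute a minimal basis is to enforce the three conditions in turn. The following is a sketch under stated assumptions, not the lecturer's procedure: minimal_basis is a hypothetical name, attributes are single letters written as strings, and the result depends on the order in which FDs are examined, so it is one of possibly several minimal bases.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_basis(fds):
    # One attribute on each right side; dict.fromkeys drops duplicated FDs and keeps order.
    basis = list(dict.fromkeys(
        (frozenset(lhs), frozenset([b])) for lhs, rhs in fds for b in rhs))
    # Remove left-side attributes that are not needed to derive the right side.
    trimmed = []
    for lhs, rhs in basis:
        lhs = set(lhs)
        for a in sorted(lhs):
            if len(lhs) > 1 and rhs <= closure(lhs - {a}, basis):
                lhs.discard(a)
        trimmed.append((frozenset(lhs), rhs))
    result = list(dict.fromkeys(trimmed))
    # Remove every FD that still follows from the remaining ones.
    for fd in list(result):
        rest = [other for other in result if other != fd]
        if fd[1] <= closure(fd[0], rest):
            result = rest
    return result


S = [("A", "B"), ("A", "C"), ("B", "A"), ("B", "C"), ("C", "A"), ("C", "B")]
basis = minimal_basis(S)
print(len(basis))  # 4: A->C, B->C, C->A, C->B

# The candidate from the slide cannot derive A -> B, so it is not a basis of S.
candidate = [("AB", "C"), ("AC", "B"), ("BC", "A")]
print(closure("A", candidate))  # {'A'}
```

Other minimal bases of the same S exist, for example {A→B, B→C, C→A}, which is why the sketch speaks of "a" minimal basis rather than "the" minimal basis.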

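14 Bad Design: Anomalies
Bad Design: Example

Title          Year  Length  Film Type  StudioName  Starring
Star Wars      1977  124     Color      Fox         Carrie Fisher
Star Wars      1977  124     Color      Fox         Mark Hamill
Star Wars      1977  124     Color      Fox         Harrison Ford
Star Wars      1980  124     Color      Fox         Billy Dee Williams
Mighty Ducks   1991  104     Color      Disney      Emilio Estevez
Wayne’s World  1992  95      Color      Paramount   Dana Carvey
Wayne’s World  1992  95      Color      Paramount   Mike Meyers

Redundancy: the length, film type, and studio are repeated for every star of a film.
Update Anomaly: to update 124 to 123, every row that stores this length must be changed.
Deletion Anomaly: if we delete “Emilio Estevez”, all information about Mighty Ducks is lost.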

15 Decomposing Relations: Example
R = {title, year, length, filmType, studioName, starring}
  → {title, year, length, filmType, studioName} (= R1), {title, year, starring} (= R2)
Redundancy, Update Anomaly, Deletion Anomaly

R1:
Title          Year  Length  Film Type  StudioName
Star Wars      1977  124     Color      Fox
Star Wars      1980  124     Color      Fox
Mighty Ducks   1991  104     Color      Disney
Wayne’s World  1992  95      Color      Paramount

R2:
Title          Year  Starring
Star Wars      1977  Carrie Fisher
Star Wars      1977  Mark Hamill
Star Wars      1977  Harrison Ford
Star Wars      1980  Billy Dee Williams
Mighty Ducks   1991  Emilio Estevez
Wayne’s World  1992  Dana Carvey
Wayne’s World  1992  Mike Meyers

16 Normal Form: Conditions for Good Relation
1st Normal Form (1NF)
2nd Normal Form (2NF)
3rd Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)

17 1st Normal Form
1NF: every component of a relation should be ATOMIC
  No table in a component
  No set
  No list
  etc.

18 2nd Normal Form
2NF: 1NF, and none of the non-prime attributes of the relation is functionally dependent on a part of a candidate key.
  Prime attribute: an attribute belonging to a key.
  Partial dependency: a part of the prime attributes determines a non-prime attribute.
Example: Player (Team, Number, TeamAddress, Name, Position)
  1NF but not 2NF

19 Example - 1
Player (Team, Number, TeamAddress, Name, Position)
  FD1: Team, Number → Name, Position
  FD2: Team → TeamAddress
  Key: {Team, Number}, since {Team, Number}+ = {Team, Number, TeamAddress, Name, Position}
In FD2, TeamAddress (a non-prime attribute) is dependent on {Team}, which is a proper subset of the key: a 2NF violation.
  Redundancy (Why?)
  Update Anomaly and Delete Anomaly

20 Example - 1
Should be decomposed:
  R1(Team, Number, Name, Position) and R2(Team, TeamAddress)
  R1 ⋈ R2 = R
After decomposition, there is no more redundancy or update anomaly.
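A 2NF check can be sketched as a search for partial dependencies: take every proper part of every candidate key and see whether its closure reaches a non-prime attribute. This is an illustration only. The name partial_dependencies is hypothetical, and the key of Player is assumed to be {Team, Number}, matching R1(Team, Number, Name, Position) above.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(all_attrs, keys, fds):
    """Return (part of a key, non-prime attribute) pairs that violate 2NF."""
    prime = set().union(*map(set, keys))
    non_prime = set(all_attrs) - prime
    found = []
    for key in keys:
        for size in range(1, len(key)):          # proper, non-empty parts of the key
            for part in combinations(key, size):
                for attr in sorted(closure(part, fds) & non_prime):
                    if (part, attr) not in found:
                        found.append((part, attr))
    return found


player = ("Team", "Number", "TeamAddress", "Name", "Position")
fd1 = (("Team", "Number"), ("Name", "Position"))
fd2 = (("Team",), ("TeamAddress",))

print(partial_dependencies(player, [("Team", "Number")], [fd1, fd2]))
# [(('Team',), 'TeamAddress')]

# After decomposition neither relation has a partial dependency.
print(partial_dependencies(("Team", "Number", "Name", "Position"), [("Team", "Number")], [fd1]))  # []
print(partial_dependencies(("Team", "TeamAddress"), [("Team",)], [fd2]))                           # []
```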

21 Example - 2

Employee  Skill           Current Work Location
Jones     Typing          114 Main Street
Jones     Shorthand       114 Main Street
Jones     Whittling       114 Main Street
Roberts   Light Cleaning  73 Industrial Way
Ellis     Alchemy         73 Industrial Way
Ellis     Juggling        73 Industrial Way
Harrison  Light Cleaning  73 Industrial Way

Candidate Key: {Employee, Skill}
Employee → Current Work Location holds, but Employee → Skill does NOT hold.
Not 2NF: partial FD Employee → Current Work Location
Should be decomposed to (Employee, Skill), (Employee, Current Work Location)

22 Example - 2

Employee  Skill           Current Work Location
Jones     Typing          114 Main Street
Jones     Shorthand       114 Main Street
Jones     Whittling       114 Main Street
Roberts   Light Cleaning  73 Industrial Way
Ellis     Alchemy         73 Industrial Way
Ellis     Juggling        73 Industrial Way
Harrison  Light Cleaning  73 Industrial Way

Redundancy and Update Anomaly

Employee  Skill
Jones     Typing
Jones     Shorthand
Jones     Whittling
Roberts   Light Cleaning
Ellis     Alchemy
Ellis     Juggling
Harrison  Light Cleaning

Employee  Current Work Location
Jones     114 Main Street
Roberts   73 Industrial Way
Ellis     73 Industrial Way
Harrison  73 Industrial Way

No more Redundancy and Update Anomaly

23 3rd Normal Form
3NF: 2NF, and every non-prime attribute of the relation must be non-transitively dependent on every candidate key.
Example: Team (TeamName, Address, ManagerID, ManagerHireDate)
  FD: TeamName → Address, TeamName → ManagerID
      (TeamName →) ManagerID → ManagerHireDate
  Key: {TeamName}
  2NF but not 3NF
  To be decomposed: (TeamName, Address, ManagerID), (ManagerID, ManagerHireDate)

24 Example: 2NF but NOT 3NF

Tournament            Year  Winner          Winner Date of Birth
Indiana Invitational  1998  Al Fredrickson  21 July 1975
Cleveland Open        1999  Bob Albertson   28 September 1968
Des Moines Masters    1999  Al Fredrickson  21 July 1975
Indiana Invitational  1999  Chip Masterson  14 March 1977

Candidate Key: {Tournament, Year}
2NF: no partial dependency
Not 3NF: transitive functional dependency
  {Tournament, Year} → Winner → Winner Date of Birth
Should be decomposed: (Tournament, Year, Winner), (Winner, Winner Date of Birth)

25 Example: 2NF but NOT 3NF

Tournament            Year  Winner          Winner Date of Birth
Indiana Invitational  1998  Al Fredrickson  21 July 1975
Cleveland Open        1999  Bob Albertson   28 September 1968
Des Moines Masters    1999  Al Fredrickson  21 July 1975
Indiana Invitational  1999  Chip Masterson  14 March 1977

Redundancy and Update Anomaly (why?)
Deletion Anomaly (why?)

Tournament            Year  Winner
Indiana Invitational  1998  Al Fredrickson
Cleveland Open        1999  Bob Albertson
Des Moines Masters    1999  Al Fredrickson
Indiana Invitational  1999  Chip Masterson

Winner          Winner Date of Birth
Al Fredrickson  21 July 1975
Bob Albertson   28 September 1968
Chip Masterson  14 March 1977

No Redundancy and Anomalies

26 Boyce-Codd Normal Form (BCNF)
BCNF: for every one of its non-trivial functional dependencies X → Y, X is a super key.
  Remember: non-trivial means Y is not contained in X.
  Remember: a super key is any superset of a key (not necessarily a proper superset).
BCNF is slightly stronger than 3NF.
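The BCNF condition can be checked by computing the closure of each left side. The sketch below is illustrative: bcnf_violations is a hypothetical name, and it is applied to the Team relation from the 3NF slide. It checks only the given FDs, which is enough for testing a whole relation against its own FD set.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Non-trivial FDs X -> Y whose left side X is not a super key."""
    all_attrs = set(all_attrs)
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs) and closure(lhs, fds) != all_attrs]


team = ("TeamName", "Address", "ManagerID", "ManagerHireDate")
team_fds = [
    (("TeamName",), ("Address",)),
    (("TeamName",), ("ManagerID",)),
    (("ManagerID",), ("ManagerHireDate",)),
]

print(bcnf_violations(team, team_fds))
# [(('ManagerID',), ('ManagerHireDate',))]
```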

27 BCNF Example

empID  empNationality  empDept     DeptType  NoDeptEmp
1001   Austrian        Production  D001      200
1001   Austrian        Stores      D001      250
1002   American        Design      D134      100
1002   American        Purchasing  D134      600

We suppose an employee can work in multiple departments.
FD: empID → empNationality, empDept → DeptType, NoDeptEmp
Candidate Key: {empID, empDept}
→ NOT BCNF, since neither empID nor empDept is a super key

Decomposed into:

empID  empNationality  empDept
1001   Austrian        Production
1001   Austrian        Stores
1002   American        Design
1002   American        Purchasing

empDept     DeptType  NoDeptEmp
Production  D001      200
Stores      D001      250
Design      D134      100
Purchasing  D134      600

28 BCNF Example

empID  empNationality  empDept
1001   Austrian        Production
1001   Austrian        Stores
1002   American        Design
1002   American        Purchasing

empDept     DeptType  NoDeptEmp
Production  D001      200
Stores      D001      250
Design      D134      100
Purchasing  D134      600

FD: empID → empNationality, empDept → DeptType, NoDeptEmp
Candidate Keys: {empID, empDept} for the first table and {empDept} for the second
→ the first table is NOT in BCNF, since empID is NOT a super key

Decompose the first table again:

empID  empNationality
1001   Austrian
1002   American

empID  empDept
1001   Production
1001   Stores
1002   Design
1002   Purchasing

FD: empID → empNationality, empDept → DeptType, NoDeptEmp
Candidate Keys: {empID}, {empDept}, and {empID, empDept}
→ BCNF, since empID is a key and empDept is a key

29 Relationship between 1NF, 2NF, 3NF and BCNF
2NF ⊃ BCNF
3NF ⊃ BCNF
1NF ⊃ 2NF ⊃ 3NF ⊃ BCNF

30 Example: 3NF but NOT BCNF
A table to show the assignment of advisors to students (more than one advisor for each student)

Prof. ID  Prof. SS ID  Student ID
1078 31850 37921 1293 46224 1480

Candidate Keys: {Prof. ID, Student ID}, {Prof. SS ID, Student ID}
1NF
2NF: no partial FD of non-prime attributes on a candidate key
3NF: no transitive FD
NOT BCNF: Prof. ID → Prof. SS ID is a functional dependency, but Prof. ID is not a super key
Should be decomposed: (Prof. ID, Student ID), (Prof. ID, Prof. SS ID)

31 Decomposing Relations
Decomposition of a bad relation: a good way to remove the problems of bad relations.
Decomposition: Lossless Decomposition
  {A1 A2 … An} → {B1 B2 … Bm}, {C1 C2 … Ck} such that
    {B1 B2 … Bm} ∪ {C1 C2 … Ck} = {A1 A2 … An} and
    {B1 B2 … Bm} ∩ {C1 C2 … Ck} ≠ {}

32 Lossless Decomposition: Bad Example

R2’:
Title          Starring
Star Wars      Carrie Fisher
Star Wars      Mark Hamill
Star Wars      Harrison Ford
Star Wars      Billy Dee Williams
Mighty Ducks   Emilio Estevez
Wayne’s World  Dana Carvey
Wayne’s World  Mike Meyers

R1:
Title          Year  Length  Film Type  StudioName
Star Wars      1977  124     Color      Fox
Star Wars      1980  124     Color      Fox
Mighty Ducks   1991  104     Color      Disney
Wayne’s World  1992  95      Color      Paramount

R2:
Title          Year  Starring
Star Wars      1977  Carrie Fisher
Star Wars      1977  Mark Hamill
Star Wars      1977  Harrison Ford
Star Wars      1980  Billy Dee Williams
Mighty Ducks   1991  Emilio Estevez
Wayne’s World  1992  Dana Carvey
Wayne’s World  1992  Mike Meyers

R ≠ R1 ⋈ R2’: joining on Title alone pairs every Star Wars star with both the 1977 and the 1980 row.
R = R1 ⋈ R2
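The difference between the two decompositions can be reproduced with a small natural-join sketch. Everything here is illustrative: project and natural_join are hypothetical helpers, the rows are a shortened version of the Movies table, and the 1980 row is assumed to carry the title Star Wars with the same length as the 1977 film.

```python
def project(rows, attrs):
    """Projection onto attrs; building a set removes duplicate tuples."""
    return {frozenset((a, row[a]) for a in attrs) for row in rows}


def natural_join(left, right):
    """Natural join of two sets of tuples (each tuple is a frozenset of (attribute, value) pairs)."""
    joined = set()
    for l in left:
        for r in right:
            l_row, r_row = dict(l), dict(r)
            shared = l_row.keys() & r_row.keys()
            if all(l_row[a] == r_row[a] for a in shared):
                joined.add(frozenset({**l_row, **r_row}.items()))
    return joined


# Shortened, partly assumed sample data (see the note above).
movies = [
    {"title": "Star Wars", "year": 1977, "length": 124, "star": "Carrie Fisher"},
    {"title": "Star Wars", "year": 1977, "length": 124, "star": "Mark Hamill"},
    {"title": "Star Wars", "year": 1980, "length": 124, "star": "Billy Dee Williams"},
    {"title": "Mighty Ducks", "year": 1991, "length": 104, "star": "Emilio Estevez"},
]

r = project(movies, ["title", "year", "length", "star"])
r1 = project(movies, ["title", "year", "length"])
r2 = project(movies, ["title", "year", "star"])      # shares the key (title, year) with r1
r2_bad = project(movies, ["title", "star"])          # shares only title with r1

print(natural_join(r1, r2) == r)        # True: the original relation is recovered
print(len(natural_join(r1, r2_bad)))    # 7: three spurious tuples appear besides the original 4
```

The join on (title, year) is lossless because those attributes form a key of r1; the join on title alone is not, because title does not determine year.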

33 Decomposition
Three Conditions:
  Elimination of Anomalies: Update, Redundancy, Deletion
  Lossless Decomposition: the original relation can be recovered by natural join
  Preservation of Dependencies
A relation with two attributes is always in BCNF (why?)

34 BCNF Decomposition Algorithm
Input: relation R0 and set S0 of FDs
Output: R1, R2, …, Rn such that R0 = R1 ⋈ R2 ⋈ … ⋈ Rn
Process:
  1. If R0 is in BCNF, then return R0.
  2. If there is any BCNF violation X → Y, then compute X+.
     Then R1 = X+, and R2 has X and the rest of the attributes.
  3. Decompose the FD set S0 into S1 and S2.
  4. Repeat 1-3 until there is no more BCNF violation.
Example: Team (TeamName, Address, ManagerID, ManagerHireDate)
  FD: TeamName → Address, TeamName → ManagerID, ManagerID → ManagerHireDate
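The decomposition steps can be sketched recursively. This is one possible reading of the algorithm, not the lecturer's implementation: bcnf_decompose is a hypothetical name, and instead of explicitly splitting S0 into S1 and S2 the sketch restricts closures computed from the original FDs to the attributes of each sub-relation, which has the same effect but is exponential in the number of attributes.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_decompose(attrs, fds):
    """Split attrs until no sub-relation has a BCNF violation."""
    attrs = frozenset(attrs)
    for size in range(1, len(attrs)):
        for x in combinations(sorted(attrs), size):
            x_plus = closure(x, fds) & attrs          # X+ restricted to this relation
            if set(x) < x_plus < attrs:               # non-trivial FD, X is not a super key
                r1 = x_plus                           # R1 = X+
                r2 = (attrs - x_plus) | set(x)        # R2 = X and the rest of the attributes
                return bcnf_decompose(r1, fds) + bcnf_decompose(r2, fds)
    return [attrs]                                    # already in BCNF


team = ("TeamName", "Address", "ManagerID", "ManagerHireDate")
team_fds = [
    (("TeamName",), ("Address",)),
    (("TeamName",), ("ManagerID",)),
    (("ManagerID",), ("ManagerHireDate",)),
]

for relation in bcnf_decompose(team, team_fds):
    print(sorted(relation))
# ['ManagerHireDate', 'ManagerID']
# ['Address', 'ManagerID', 'TeamName']
```

For the Team example the violation is ManagerID → ManagerHireDate, so the result matches the decomposition (TeamName, Address, ManagerID), (ManagerID, ManagerHireDate) given on the 3NF slide.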

