Normalization continued CMSC 461 Michael Wilson. Normalization clarification  Normalization is simply a way of reducing anomalous database behavior 

1 Normalization continued CMSC 461 Michael Wilson

2 Normalization clarification  Normalization is simply a way of reducing anomalous database behavior  It’s not a required or programmatically necessary concept  A database will function perfectly fine without normalized tables  The table design will just suck

3 First normal form (1NF)  Each attribute only has atomic values  No attribute of a relation in 1NF holds a set of values  The values cannot be broken down further  An attribute that violates 1NF, with an example value:  phoneNumberAndFirstName: 555-555-5555,Jason

4 First normal form (1NF)  Furthermore  There are no duplicate rows  This means that there must be a key  This is important for higher normal forms
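
A minimal sketch of both 1NF points, using Python's built-in sqlite3 module. The table name `person`, its column names, and the helper `split_phone_and_name` are assumptions made for illustration; only the example value comes from the slide. The combined value is split into two atomic attributes, and a primary key makes the database reject a duplicate row.

```python
import sqlite3

def split_phone_and_name(value):
    """Split the slide's non-atomic value into two atomic attributes."""
    phone, first_name = value.split(",", 1)
    return phone.strip(), first_name.strip()

conn = sqlite3.connect(":memory:")
# Hypothetical table: one atomic value per column, plus a key.
conn.execute("""
    CREATE TABLE person (
        phone_number TEXT NOT NULL,
        first_name   TEXT NOT NULL,
        PRIMARY KEY (phone_number, first_name)
    )
""")

row = split_phone_and_name("555-555-5555,Jason")
conn.execute("INSERT INTO person VALUES (?, ?)", row)

# Inserting the same row again violates the key, so it is rejected.
try:
    conn.execute("INSERT INTO person VALUES (?, ?)", row)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
```

Without the `PRIMARY KEY` clause, SQLite would accept the second insert, which is the "functions fine, but badly designed" situation described earlier.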

5 Second normal form (2NF)  Must be in 1NF  Non-prime attributes are dependent on the whole of a candidate key  If a non-prime attribute depends on only part of a candidate key, the table is not in 2NF  Non-prime = attributes not part of any candidate key  One thing to keep in mind  Multiple candidate keys may occur within one table  The rule applies to each of them: no non-prime attribute may depend on just part of any candidate key

6 Reminder  Candidate key = minimal uniquely identifying set of attributes
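
The definition can be made concrete with a brute-force sketch: compute the closure of an attribute set under a list of functional dependencies, then keep the minimal sets whose closure covers the whole relation. The relation R(A, B, C) and its dependencies below are hypothetical, and the search is exponential, so this is only meant for small classroom examples.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Return all minimal attribute sets that determine the whole relation."""
    attributes = set(attributes)
    keys = []
    # Try smaller sets first so that any superset of a found key is skipped.
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys

# Hypothetical relation R(A, B, C) with A -> B and B -> A.
fds = [({"A"}, {"B"}), ({"B"}, {"A"})]
keys = candidate_keys({"A", "B", "C"}, fds)
```

Here `keys` contains both {A, C} and {B, C}, which also illustrates that one table can have more than one candidate key.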

7 2NF example
Employee | Skill          | Work Location
---------|----------------|------------------
Brown    | Light Cleaning | 73 Industrial Way
Brown    | Typing         | 73 Industrial Way
Harrison | Light Cleaning | 73 Industrial Way
Jones    | Shorthand      | 114 Main Street
Jones    | Typing         | 114 Main Street
Jones    | Whittling      | 114 Main Street

8 2NF example  Jacked shamelessly from Wikipedia  Good example, though  Neither Employee nor Skill can be a key on its own here  Key must be {Employee, Skill}  Here, the work location depends on the employee alone  How to solve this?
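
One common answer, sketched here in plain Python rather than taken from the slides, is to split the table in two: one relation keyed by Employee alone that holds the work location, and one keyed by {Employee, Skill}. The variable names are assumptions; the rows are the ones from the example table.

```python
rows = [
    ("Brown", "Light Cleaning", "73 Industrial Way"),
    ("Brown", "Typing", "73 Industrial Way"),
    ("Harrison", "Light Cleaning", "73 Industrial Way"),
    ("Jones", "Shorthand", "114 Main Street"),
    ("Jones", "Typing", "114 Main Street"),
    ("Jones", "Whittling", "114 Main Street"),
]

# Employee -> Work Location, so this relation is keyed by Employee alone.
employee_location = {employee: location for employee, _, location in rows}

# The key of this relation is still {Employee, Skill}.
employee_skill = sorted({(employee, skill) for employee, skill, _ in rows})

# Joining the two relations on Employee gives back the original rows.
rejoined = sorted(
    (employee, skill, employee_location[employee])
    for employee, skill in employee_skill
)
```

Each work location is now stored once per employee, so changing where Jones works means updating one entry instead of three rows.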

9 Third normal form (3NF)  Must be in 2NF  Every non-prime attribute must be directly (non-transitively) dependent on every superkey in a relation  X→A where X is a superkey and A is a non-prime attribute  Must hold for every superkey and every non-prime attribute

10 Reminder  Superkey – uniquely identifying set of attributes

11 Third normal form (3NF)  Another definition:  For every functional dependency X→A, one of the following must hold:  X→A is trivial  X is a superkey  Every element of the set difference between A and X is a prime attribute – part of a candidate key
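
The three conditions translate almost directly into a check over a list of functional dependencies. The sketch below assumes the prime attributes are already known and passed in; the relation R(A, B, C) with A→B and B→C is hypothetical, and the function names are not from the slides.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def third_nf_violations(attributes, fds, prime):
    """Return the dependencies X -> A that fail all three 3NF conditions."""
    violations = []
    for lhs, rhs in fds:
        trivial = rhs <= lhs                           # X -> A is trivial
        superkey = closure(lhs, fds) == attributes     # X is a superkey
        rest_is_prime = (rhs - lhs) <= prime           # A - X is all prime
        if not (trivial or superkey or rest_is_prime):
            violations.append((lhs, rhs))
    return violations

# Hypothetical relation R(A, B, C): A -> B and B -> C.
# The only candidate key is {A}, so A is the only prime attribute.
attributes = {"A", "B", "C"}
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
violations = third_nf_violations(attributes, fds, prime={"A"})
```

Only B→C is reported: B is not a superkey and C is not prime, so C depends on the key A transitively through B.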

12 3NF example
Tournament           | Year | Winner         | Winner DOB
---------------------|------|----------------|------------------
Indiana Invitational | 1998 | Al Frederickson | 21 July 1975
Cleveland Open       | 1999 | Bob Albertson   | 28 September 1968
Des Moines Masters   | 1999 | Al Frederickson | 21 July 1975
Indiana Invitational | 1999 | Chip Masterson  | 14 March 1977

13 3NF example  Also jacked from Wikipedia  This table is in 2NF  What are the candidate keys?  What are the superkeys?

14 3NF example  The winner functionally determines the winner's date of birth  Transitive dependency of a non-prime attribute: {Tournament, Year} → Winner → Winner DOB  Therefore, 3NF violation  How do we fix this?
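
A usual fix, sketched here with Python's built-in sqlite3 module, is to move the date of birth into its own table keyed by the winner. The table and column names are assumptions chosen for this illustration; the data is from the example table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Winner -> Winner DOB now lives in a table keyed by winner.
    CREATE TABLE winner_dob (
        winner TEXT PRIMARY KEY,
        dob    TEXT NOT NULL
    );
    -- {Tournament, Year} -> Winner stays in the tournament table.
    CREATE TABLE tournament_winner (
        tournament TEXT    NOT NULL,
        year       INTEGER NOT NULL,
        winner     TEXT    NOT NULL REFERENCES winner_dob(winner),
        PRIMARY KEY (tournament, year)
    );
""")

conn.executemany("INSERT INTO winner_dob VALUES (?, ?)", [
    ("Al Frederickson", "21 July 1975"),
    ("Bob Albertson", "28 September 1968"),
    ("Chip Masterson", "14 March 1977"),
])
conn.executemany("INSERT INTO tournament_winner VALUES (?, ?, ?)", [
    ("Indiana Invitational", 1998, "Al Frederickson"),
    ("Cleveland Open", 1999, "Bob Albertson"),
    ("Des Moines Masters", 1999, "Al Frederickson"),
    ("Indiana Invitational", 1999, "Chip Masterson"),
])

# A join reconstructs the original four-column table.
rejoined = conn.execute("""
    SELECT t.tournament, t.year, t.winner, w.dob
    FROM tournament_winner AS t
    JOIN winner_dob AS w ON w.winner = t.winner
    ORDER BY t.year, t.tournament
""").fetchall()

dob_rows = conn.execute("SELECT COUNT(*) FROM winner_dob").fetchone()[0]
```

Al Frederickson's date of birth is now stored once, so it can no longer disagree between two tournament rows.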

15 Boyce-Codd Normal Form  Often called 3.5NF  Only states two things  For every functional dependency of the form X→A, one of the following must hold:  X→A is trivial  X is a superkey for the relation

16 Difference between 3NF and BCNF?  It’s actually pretty straightforward  3NF says that non-prime attributes must be dependent on a key  However, it does not say anything about prime attributes  In 3NF, a prime attribute (part of a candidate key) can still depend on something that is not a superkey  BCNF rules that out as well  BCNF tables satisfy 3NF, but not necessarily the reverse

17 3NF and BCNF  BCNF is only slightly more strict than 3NF  The only time you run into issues is when candidate keys overlap  Possible to have a 3NF relation that is not BCNF when candidate keys overlap
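
To see the overlap case concretely, here is a sketch using a hypothetical enrollment relation R(Student, Course, Instructor), which is not from the slides. It assumes {Student, Course} → Instructor and Instructor → Course, which gives two overlapping candidate keys, {Student, Course} and {Student, Instructor}, so every attribute is prime.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Dependencies X -> A that are neither trivial nor have a superkey X."""
    return [
        (lhs, rhs) for lhs, rhs in fds
        if not (rhs <= lhs or closure(lhs, fds) == attributes)
    ]

def third_nf_violations(attributes, fds, prime):
    """3NF forgives a BCNF violation when A - X is made of prime attributes."""
    return [
        (lhs, rhs) for lhs, rhs in bcnf_violations(attributes, fds)
        if not ((rhs - lhs) <= prime)
    ]

# Hypothetical relation and dependencies (assumptions for illustration).
ATTRIBUTES = {"Student", "Course", "Instructor"}
FDS = [
    ({"Student", "Course"}, {"Instructor"}),
    ({"Instructor"}, {"Course"}),
]
PRIME = {"Student", "Course", "Instructor"}  # all attributes are prime

not_bcnf = bcnf_violations(ATTRIBUTES, FDS)
not_3nf = third_nf_violations(ATTRIBUTES, FDS, PRIME)
```

`not_bcnf` contains Instructor → Course because Instructor is not a superkey, while `not_3nf` is empty because Course is a prime attribute. The relation is in 3NF but not in BCNF.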

18 What to use?  3NF is very popular, most common  BCNF is also very popular  Recommendation  Shoot for 3NF to begin with  Very sensible way of organizing your data  Tables only have information that describes the key

19 Denormalization  Though normalization helps us rely on our data, denormalization is sometimes required for performance reasons  Often, one will need to re-add redundant data  Minimizes joins, selects, views, etc.  In high performance applications, one extra select could cause crippling response issues
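
As a rough sketch of what re-adding redundant data looks like, the sqlite3 example below keeps a normalized pair of tables next to a denormalized copy that repeats the customer name on every purchase. All table names, column names, and rows are hypothetical, and the example makes no claim about actual speed; it only shows that the second query gets the same answer without a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE purchase (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        total       REAL NOT NULL
    );
    -- Denormalized copy: customer_name is redundant with customer.name.
    CREATE TABLE purchase_denorm (
        id            INTEGER PRIMARY KEY,
        customer_id   INTEGER NOT NULL,
        customer_name TEXT NOT NULL,
        total         REAL NOT NULL
    );
""")
conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [(1, "Ada"), (2, "Grace")])
conn.executemany("INSERT INTO purchase VALUES (?, ?, ?)",
                 [(10, 1, 25.0), (11, 2, 40.0), (12, 1, 5.5)])
conn.execute("""
    INSERT INTO purchase_denorm
    SELECT p.id, p.customer_id, c.name, p.total
    FROM purchase AS p JOIN customer AS c ON c.id = p.customer_id
""")

# Normalized read: needs a join.
with_join = conn.execute("""
    SELECT p.id, c.name, p.total
    FROM purchase AS p JOIN customer AS c ON c.id = p.customer_id
    ORDER BY p.id
""").fetchall()

# Denormalized read: single table, no join.
without_join = conn.execute("""
    SELECT id, customer_name, total FROM purchase_denorm ORDER BY id
""").fetchall()
```

The cost is the one normalization was meant to avoid: renaming a customer now requires updating every matching row in `purchase_denorm`, or the copies drift apart.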

20 When to denormalize?  Not at first!  If you don’t know that you’re going to run into performance issues, then don’t denormalize  Always try to keep things in a normalized form if possible  Later  Once you’ve identified issues through testing and statistics, denormalize if necessary

