
# Dependency preservation, 3NF revisited and BCNF

## Presentation on theme: "Dependency preservation, 3NF revisited and BCNF"— Presentation transcript:

Dependency preservation, 3NF revisited and BCNF

Decomposition - more than one possibility
normalisation → decomposition (non-loss)

Modules(M_id, M_name, Type, Value)

- solution #1 (3NF): Modules_Descr(M_id, M_name, Type), Type_Val(Type, Value)
- solution #2 (3NF): Modules_Descr(M_id, M_name, Type), Module_Val(M_id, Value)

Questions:

- are they both non-loss? (apply Heath’s theorem)
- is one better than the other?
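The non-loss question can be checked mechanically. The sketch below uses the usual binary test that follows from Heath's theorem: a split into R1 and R2 is non-loss if the common attributes functionally determine R1 − R2 or R2 − R1. The FD set is an assumption read off the slides (M_id is taken as the key of Modules, and Type → Value is the FD discussed later); the function names are illustrative.

```python
def closure(attrs, fds):
    """Return the closure of a set of attributes under a list of (lhs, rhs) FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def heath_non_loss(r1, r2, fds):
    """Sufficient test for a non-loss binary decomposition (Heath's theorem)."""
    common_closure = closure(r1 & r2, fds)
    return (r1 - r2) <= common_closure or (r2 - r1) <= common_closure


# Assumed FDs for Modules(M_id, M_name, Type, Value).
FDS = [
    (frozenset({"M_id"}), frozenset({"M_name", "Type"})),
    (frozenset({"Type"}), frozenset({"Value"})),
]

MODULES_DESCR = {"M_id", "M_name", "Type"}
TYPE_VAL = {"Type", "Value"}
MODULE_VAL = {"M_id", "Value"}

solution_1_ok = heath_non_loss(MODULES_DESCR, TYPE_VAL, FDS)
solution_2_ok = heath_non_loss(MODULES_DESCR, MODULE_VAL, FDS)
print(solution_1_ok, solution_2_ok)  # True True
```

Under these assumptions both solutions pass the test, so the non-loss property alone does not separate them.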

Decomposition - update anomalies
updates

- u1: insert the fact that a 3 semester module is worth 1.5cu
- u2: modify 1 semester modules; they are no longer worth 0.5cu, they are worth 0.75cu
- u3: change the type of a module but forget to change its value

solution #2

- u1 and u2 are impossible or very difficult to perform
- u3 is allowed

solution #1

- u1 and u2 are straightforward
- u3 is not allowed
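A minimal sketch of u1 and u2 under both solutions, with each relation modelled as a Python dict keyed on its primary key. The module ids, module names and the 2 semester value are hypothetical sample data, not taken from the slides.

```python
# Shared relation Modules_Descr(M_id, M_name, Type); sample rows are hypothetical.
modules_descr = {
    "M1": ("Databases", "1 semester"),
    "M2": ("Logic", "1 semester"),
    "M3": ("Project", "2 semester"),
}

# Solution #1: Type_Val(Type, Value) stores each type's value exactly once.
type_val = {"1 semester": 0.5, "2 semester": 1.0}
type_val["3 semester"] = 1.5   # u1: possible even though no 3 semester module exists
type_val["1 semester"] = 0.75  # u2: a single row changes
rows_touched_solution_1 = 1

# Solution #2: Module_Val(M_id, Value) repeats the value for every module.
module_val = {"M1": 0.5, "M2": 0.5, "M3": 1.0}
# u1 has no place to go: a value cannot be stored without a module id.
# u2 must find and update every 1 semester module.
one_semester = [m for m, (_, mtype) in modules_descr.items() if mtype == "1 semester"]
for m in one_semester:
    module_val[m] = 0.75
rows_touched_solution_2 = len(one_semester)

print(rows_touched_solution_1, rows_touched_solution_2)  # 1 2
```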

Solution #1 vs solution #2
- solution #1 is more expressive: certain facts cannot be expressed in solution #2, e.g. the value of a new type
- in solution #1, updates can be performed independently on the two component relations (i.e. all constraints are properly expressed)
- in solution #2, Type → Value is lost, so this constraint must be enforced by the user through procedural code
- independent projections: updates can be performed independently on each projection, without the danger of ending up with inconsistent data
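The kind of procedural check that solution #2 pushes onto the user might look like the sketch below. It joins the two projections and tests whether Type → Value still holds after update u3. The rows and the helper name `violates_fd` are hypothetical.

```python
def violates_fd(rows, lhs, rhs):
    """Return True if two rows agree on the lhs attributes but differ on the rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return True
    return False


# Solution #2 with hypothetical sample data.
modules_descr = {
    "M1": {"M_name": "Databases", "Type": "1 semester"},
    "M2": {"M_name": "Logic", "Type": "1 semester"},
    "M3": {"M_name": "Project", "Type": "2 semester"},
}
module_val = {"M1": 0.5, "M2": 0.5, "M3": 1.0}


def joined():
    """Natural join of Modules_Descr and Module_Val on M_id."""
    return [
        {"M_id": m, **descr, "Value": module_val[m]}
        for m, descr in modules_descr.items()
    ]


consistent_before = not violates_fd(joined(), ["Type"], ["Value"])

# u3: change the type of M2 but forget to change its value.
modules_descr["M2"]["Type"] = "2 semester"
consistent_after = not violates_fd(joined(), ["Type"], ["Value"])

print(consistent_before, consistent_after)  # True False
```

Neither projection rejects the update on its own; the inconsistency only shows up once the join is inspected.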

Independent projections
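[Figure: FD diagrams over M_id, M_name, Type and Value. Solution #1, with projections (M_id, M_name, Type) and (Type, Value): all direct FDs are intra-relation, all transitive FDs are inter-relation. Solution #2, with projections (M_id, M_name, Type) and (M_id, Value): one transitive FD is intra-relation, one direct FD is lost.]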

Independent projections - Rissanen

- R1 and R2 are two projections of R; R1 and R2 are independent if and only if
  - every FD in R is a logical consequence of the FDs in R1 and R2, and
  - the common attributes of R1 and R2 form a candidate key for at least one of R1 or R2
- an atomic relation is one that cannot be decomposed into independent projections
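Rissanen's two conditions can be tested directly on small schemas. The sketch below projects the FDs onto each component by brute force over attribute subsets (exponential, so only suitable for classroom-sized relations) and then applies both conditions to the two Modules decompositions. The FD set for Modules is an assumption (M_id as key, plus Type → Value), and all function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of a set of attributes under a list of (lhs, rhs) FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def project_fds(fds, relation):
    """FDs of the original relation that mention only attributes of `relation`."""
    projected = []
    attrs = sorted(relation)
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            x = frozenset(combo)
            rhs = (closure(x, fds) & relation) - x
            if rhs:
                projected.append((x, frozenset(rhs)))
    return projected


def is_candidate_key(attrs, relation, fds):
    """attrs determines the whole relation and no attribute can be dropped."""
    if not relation <= closure(attrs, fds):
        return False
    return all(not relation <= closure(attrs - {a}, fds) for a in attrs)


def independent(r1, r2, fds):
    """Rissanen's test: FDs preserved, and common attributes key one projection."""
    local = project_fds(fds, r1) + project_fds(fds, r2)
    preserved = all(rhs <= closure(lhs, local) for lhs, rhs in fds)
    common = r1 & r2
    keyed = is_candidate_key(common, r1, fds) or is_candidate_key(common, r2, fds)
    return preserved and keyed


# Assumed FDs for Modules(M_id, M_name, Type, Value).
FDS = [
    (frozenset({"M_id"}), frozenset({"M_name", "Type"})),
    (frozenset({"Type"}), frozenset({"Value"})),
]

independent_1 = independent({"M_id", "M_name", "Type"}, {"Type", "Value"}, FDS)
independent_2 = independent({"M_id", "M_name", "Type"}, {"M_id", "Value"}, FDS)
print(independent_1, independent_2)  # True False
```

Solution #2 fails on the first condition: Type → Value does not follow from the FDs that hold inside its two projections.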

Dependency preservation
- R was decomposed (normalisation) into R1, …, Rn
- S: the set of FDs for R
- S1, …, Sn: the sets of FDs for R1, …, Rn (each Si refers only to the attributes of Ri)
- S’ = S1 ∪ … ∪ Sn (usually, S’ ≠ S)
- the decomposition is dependency preserving if (not iff) S’+ = S+
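A sketch of a dependency-preservation check that never builds S’ explicitly: the closure under S’ is grown by repeatedly taking closures inside each Ri and keeping only the attributes of Ri. It assumes each Si is the projection of S+ onto Ri, and it reuses the Modules example with an assumed FD set (M_id as key, plus Type → Value). Function names are illustrative.

```python
def closure(attrs, fds):
    """Return the closure of a set of attributes under a list of (lhs, rhs) FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def closure_under_projections(attrs, relations, fds):
    """Closure of attrs under S' = S1 ∪ … ∪ Sn, computed relation by relation."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for rel in relations:
            gained = closure(result & rel, fds) & rel
            if not gained <= result:
                result |= gained
                changed = True
    return result


def lost_fds(relations, fds):
    """FDs of S that are not implied by S'."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= closure_under_projections(lhs, relations, fds)
    ]


# Assumed FDs for Modules(M_id, M_name, Type, Value).
FDS = [
    (frozenset({"M_id"}), frozenset({"M_name", "Type"})),
    (frozenset({"Type"}), frozenset({"Value"})),
]

lost_1 = lost_fds([{"M_id", "M_name", "Type"}, {"Type", "Value"}], FDS)
lost_2 = lost_fds([{"M_id", "M_name", "Type"}, {"M_id", "Value"}], FDS)
print(lost_1)  # []
print(lost_2)  # [(frozenset({'Type'}), frozenset({'Value'}))]
```

An empty result means every FD in S follows from S’, so S’+ = S+ and the decomposition is dependency preserving.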

2NF and 3NF - more than one CK
- a relation is in 2NF if and only if it is in 1NF and every non-key attribute is irreducibly dependent on each candidate key
- 3NF (Zaniolo): R is a relation; X is any set of attributes of R; A is any single attribute of R; consider the following conditions:
  1. X contains A
  2. X contains a candidate key of R
  3. A is contained in a candidate key of R
- if at least one of the three conditions holds for every FD X → A, then R is in 3NF

Example

Assume the supply department in a company is in charge of bringing parts from different manufacturers. A part is uniquely identified by its name and manufacturer; for convenience, a part is also given an id. A separate delivery is necessary for each type of part, from each manufacturer. At most one delivery is made in one day for one type of part from one manufacturer. A “transport” (e.g. van23) is associated with each delivery. Each transport has a unique driver. A driver can drive more than one transport.

Relevant FDs

- CK: (Type, Manufacturer, Date)
- CK: (Id, Date)
- (Type, Manufacturer) → Id
- Id → (Type, Manufacturer)
- Transport → Driver
- Manufacturer → Address
- Type → Handling_req

2NF

[Figure: the relation is tested for 2NF and decomposed; the resulting 2NF projections include (Type, HR) and (Man, Add).]

3NF

[Figure: the relation is tested for 3NF and decomposed; the resulting 3NF projections include (Transp, Driver).]

3NF

- 3NF is not free from update anomalies, e.g. because of (Type, Manufacturer) → Id
- exercise: find the anomalies for insert, delete and update

BCNF

- a relation is in Boyce/Codd normal form (BCNF) if and only if every non-trivial irreducible FD has a candidate key as its determinant
- any relation can be non-loss decomposed into an equivalent set of BCNF relations
- BCNF ⇒ 3NF ⇒ 2NF ⇒ 1NF
- BCNF is still not guaranteed to be free of all update anomalies caused by FDs (example later)
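The claim that a non-loss BCNF decomposition always exists is usually shown constructively: pick a determinant X that is not a superkey, split the relation into X+ and X ∪ (R − X+), and recurse. The sketch below is one brute-force variant of that textbook procedure, not necessarily the one intended by the slides. It is applied to Modules under an assumed FD set (M_id as key, plus Type → Value).

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of a set of attributes under a list of (lhs, rhs) FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_decompose(relation, fds):
    """Split `relation` on a BCNF-violating determinant until none is left."""
    attrs = sorted(relation)
    for size in range(1, len(attrs)):
        for combo in combinations(attrs, size):
            x = set(combo)
            x_plus = closure(x, fds) & relation
            # x determines something new inside the relation, but not all of it.
            if x_plus != x and x_plus != relation:
                r1 = x_plus
                r2 = x | (relation - x_plus)
                return bcnf_decompose(r1, fds) + bcnf_decompose(r2, fds)
    return [frozenset(relation)]


# Assumed FDs for Modules(M_id, M_name, Type, Value).
FDS = [
    (frozenset({"M_id"}), frozenset({"M_name", "Type"})),
    (frozenset({"Type"}), frozenset({"Value"})),
]

result = bcnf_decompose({"M_id", "M_name", "Type", "Value"}, FDS)
print([sorted(r) for r in result])  # [['Type', 'Value'], ['M_id', 'M_name', 'Type']]
```

Each split is non-loss by Heath's theorem, because the common attributes X determine everything in X+. For Modules the procedure returns the same two projections as solution #1.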

BCNF - examples

- previous example: one candidate key only
- CKs: Id, Name, Photo (what do you think about this?), User_name; draw the corresponding FD diagram
- overlapping CKs: (Name, Contest), (Contest, Position)

Zaniolo’s definitions
R is a relation; X is any set of attributes of R; A is any single attribute of R; consider the following conditions:

1. X contains A
2. X contains a candidate key of R
3. A is contained in a candidate key of R

- if at least one of the three conditions holds for every FD X → A, then R is in 3NF
- if at least one of the first two conditions holds for every FD X → A, then R is in BCNF

BCNF again

Relation: (Patient, Doctor, Disease)

- a patient is treated by a single doctor for a certain disease
- each doctor only treats one kind of disease
- a doctor can treat more than one patient

Questions:

- is this relation in 3NF?
- is this relation in BCNF?
- can you identify update anomalies?
- consider also (Patient, Disease, Doctor, Treatment) with (Patient, Disease) → Treatment

[Figure: FD diagram over Disease, Doctor and Patient.]
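Zaniolo's conditions translate almost line by line into a test. The sketch below applies them to the (Patient, Doctor, Disease) relation, reading the first two statements of the slide as the FDs (Patient, Disease) → Doctor and Doctor → Disease. Checking only the listed FDs, rather than every FD in the closure, is sufficient here because the relation is the original one and not a projection. Function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of a set of attributes under a list of (lhs, rhs) FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(relation, fds):
    """All minimal attribute sets whose closure covers the relation."""
    keys = []
    attrs = sorted(relation)
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if relation <= closure(candidate, fds):
                keys.append(frozenset(candidate))
    return keys


def zaniolo(relation, fds):
    """Return (is_3nf, is_bcnf) according to Zaniolo's three conditions."""
    keys = candidate_keys(relation, fds)
    key_attributes = set().union(*keys)
    is_3nf = is_bcnf = True
    for lhs, rhs in fds:
        for a in rhs:
            cond1 = a in lhs                            # X contains A
            cond2 = relation <= closure(lhs, fds)       # X contains a candidate key
            cond3 = a in key_attributes                 # A is in a candidate key
            if not (cond1 or cond2):
                is_bcnf = False
                if not cond3:
                    is_3nf = False
    return is_3nf, is_bcnf


RELATION = {"Patient", "Doctor", "Disease"}
FDS = [
    (frozenset({"Patient", "Disease"}), frozenset({"Doctor"})),
    (frozenset({"Doctor"}), frozenset({"Disease"})),
]

keys = candidate_keys(RELATION, FDS)
is_3nf, is_bcnf = zaniolo(RELATION, FDS)
print(sorted(sorted(k) for k in keys))  # [['Disease', 'Patient'], ['Doctor', 'Patient']]
print(is_3nf, is_bcnf)                  # True False
```

The two candidate keys overlap on Patient, and Doctor → Disease is saved only by the third condition, which is why the relation is in 3NF but not in BCNF.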

Possible decompositions
[Figure: three possible decompositions of the relation. For the first two, decide whether they are non-loss; the third is justified by Heath’s theorem. Choose PKs for each.]

BCNF vs dependency preservation
The relations of the BCNF decomposition do not enforce an FD existing in the original specification, namely (Patient, Disease) → Doctor; e.g. a patient can be given two doctors that treat the same disease (the system will not disallow this); the constraint would have to be maintained by procedural code.
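The lost constraint can be made concrete with a small sketch. It assumes the BCNF decomposition is (Doctor, Disease) and (Patient, Doctor), which is the split Heath's theorem gives for Doctor → Disease; the patient, doctor and disease values are hypothetical.

```python
# Assumed BCNF decomposition of (Patient, Doctor, Disease).
doctor_disease = {("Dr_A", "flu"), ("Dr_B", "flu")}   # key: Doctor
patient_doctor = {("P1", "Dr_A")}                     # all-key relation

# Each projection only checks its own key, so this insert is accepted.
patient_doctor.add(("P1", "Dr_B"))

# Local constraints still hold: no doctor is listed with two diseases.
doctor_key_ok = len({doctor for doctor, _ in doctor_disease}) == len(doctor_disease)

# Rebuild the original relation by a natural join on Doctor.
joined = {
    (patient, doctor, disease)
    for patient, doctor in patient_doctor
    for other_doctor, disease in doctor_disease
    if doctor == other_doctor
}

# Procedural check for the lost FD (Patient, Disease) → Doctor.
doctors_for = {}
for patient, doctor, disease in joined:
    doctors_for.setdefault((patient, disease), set()).add(doctor)
conflicts = {pair: docs for pair, docs in doctors_for.items() if len(docs) > 1}

print(doctor_key_ok)  # True
print(conflicts)
```

Patient P1 ends up with two doctors for the same disease even though every key constraint in the decomposed schema is satisfied, so the check has to live in application code.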

BCNF vs dependency preservation
not every FD is expressible through normalisation

- when the relation was in 3NF
  - (Patient, Disease) → Doctor was expressed: a (patient, disease) pair could not be assigned more than one doctor
  - Doctor → Disease was not expressed, which generated update anomalies
- in BCNF
  - Doctor → Disease was expressed
  - (Patient, Disease) → Doctor was not expressed, which generated update anomalies (refer to previous slide)
  - this latter FD would not have been expressed even if the decomposition into all three 2-attribute relations had been considered

Conclusions

- normal forms: formalisation of common sense
- art → engineering; possibility for automation
- BCNF
  - always achievable
  - not always free of update anomalies (recall previous example), because it cannot always express all the FDs existing in the problem
