Decomposition - more than one possibility
  normalisation: decomposition (non-loss)
  Modules(M_id, M_name, Type, Value)
  solution #1 (3NF)
    Modules_Descr(M_id, M_name, Type)
    Type_Val(Type, Val)
  solution #2 (3NF)
    Modules_Descr(M_id, M_name, Type)
    Module_Val(M_id, Val)
  are they both non-loss? (apply Heath’s theorem)
  is one better than the other?
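The non-loss question above can be checked mechanically. The following is a minimal sketch, assuming the FDs of Modules are M_id → {M_name, Type} and Type → Value (the slide does not list them explicitly); the helper names `closure` and `heath_non_loss` are hypothetical. It applies Heath's theorem in its usual binary form: a decomposition into two projections is non-loss if their common attributes functionally determine the remaining attributes of at least one projection.

```python
def closure(attrs, fds):
    """Closure of a set of attributes under a list of (lhs, rhs) FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def heath_non_loss(r1, r2, fds):
    """True if the shared attributes determine the rest of r1 or the rest of r2."""
    common = r1 & r2
    reachable = closure(common, fds)
    return (r1 - common) <= reachable or (r2 - common) <= reachable


# Assumed FDs for Modules(M_id, M_name, Type, Value)
FDS = [({"M_id"}, {"M_name", "Type"}), ({"Type"}, {"Value"})]

solution_1 = ({"M_id", "M_name", "Type"}, {"Type", "Value"})
solution_2 = ({"M_id", "M_name", "Type"}, {"M_id", "Value"})

print(heath_non_loss(*solution_1, FDS))  # True: Type -> Value
print(heath_non_loss(*solution_2, FDS))  # True: M_id -> Value (transitively)
```

Under these assumptions both solutions pass, so non-loss alone does not separate them.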
Decomposition - update anomalies
  updates
    u1: insert the fact that a 3-semester module is worth 1.5cu
    u2: modify 1-semester modules; they are not worth 0.5cu any longer, they are 0.75cu
    u3: change the type of a module but forget to change its value
  solution #2
    u1 and u2 are impossible or very difficult to perform
    u3 is allowed
  solution #1
    u1 and u2 are straightforward
    u3 is not allowed
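The three updates can be played out on toy data. This is only an illustrative sketch using plain Python dictionaries in place of relations; the module ids, module names and the 2-semester value of 1.0cu are invented for the example, and only the 0.5cu, 0.75cu and 1.5cu figures come from the slide.

```python
# Solution #1: Modules_Descr(M_id, M_name, Type) and Type_Val(Type, Val)
s1_descr = {"M1": ("Databases", "1-semester"), "M2": ("Logic", "1-semester")}
s1_type_val = {"1-semester": 0.5, "2-semester": 1.0}  # 1.0 is an assumed value

# Solution #2: Modules_Descr(M_id, M_name, Type) and Module_Val(M_id, Val)
s2_descr = dict(s1_descr)
s2_module_val = {"M1": 0.5, "M2": 0.5}


def s1_value(m_id):
    """In solution #1 a module's value is always looked up through its type."""
    _, m_type = s1_descr[m_id]
    return s1_type_val[m_type]


# u1: a 3-semester module is worth 1.5cu
s1_type_val["3-semester"] = 1.5
# Solution #2 has no place for this fact until some 3-semester module exists.

# u2: 1-semester modules are now worth 0.75cu
s1_type_val["1-semester"] = 0.75          # one change in solution #1
for m_id, (_, m_type) in s2_descr.items():  # one change per module in solution #2
    if m_type == "1-semester":
        s2_module_val[m_id] = 0.75

# u3: change the type of M1 and forget to change its value
s1_descr["M1"] = ("Databases", "2-semester")
s2_descr["M1"] = ("Databases", "2-semester")

print(s1_value("M1"))       # 1.0: the value follows the type automatically
print(s2_module_val["M1"])  # 0.75: stale value, inconsistent with the new type
```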
Solution #1 vs solution #2
  solution #1 is more expressive
    certain facts cannot be expressed in solution #2; e.g. the value of a new type
  in solution #1, updates can be independently performed on the two component relations (i.e. all constraints are properly expressed)
    in solution #2, Type → Value is lost, so this constraint must be enforced by the user through procedural code
  independent projections
    updates can be performed independently on each projection, without the danger of ending up with inconsistent data
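Independent projections
  [FD diagrams over M_id, M_name, Type, Value for the two solutions]
  Solution #1: (M_id, M_name, Type) and (Type, Value)
    all direct FDs: intra-relation
    all transitive FDs: inter-relation
  Solution #2: (M_id, M_name, Type) and (M_id, Value)
    one transitive FD: intra-relation
    one direct FD: lost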
Independent projections - Rissanen
  R1 and R2 are two projections of R; R1 and R2 are independent if and only if
    every FD in R is a logical consequence of the FDs in R1 and R2
    the common attributes of R1 and R2 form a candidate key for at least one of R1 or R2
  atomic relation
    cannot be decomposed into independent projections
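Rissanen's two conditions can be tested directly on small schemas. The sketch below is an illustration, not an efficient implementation: it projects the FDs onto each relation by enumerating attribute subsets (exponential, fine for a handful of attributes), and it assumes the same FDs for Modules as the earlier slides suggest, M_id → {M_name, Type} and Type → Value. All function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of a set of attributes under a list of (lhs, rhs) FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def project(fds, rel):
    """Non-trivial FDs of the original relation that mention only attributes of rel."""
    out = []
    for k in range(1, len(rel) + 1):
        for combo in combinations(sorted(rel), k):
            lhs = set(combo)
            rhs = (closure(lhs, fds) & rel) - lhs
            if rhs:
                out.append((lhs, rhs))
    return out


def is_candidate_key(x, rel, fds):
    """x determines all of rel and no attribute can be dropped from x."""
    if not rel <= closure(x, fds):
        return False
    return all(not rel <= closure(x - {a}, fds) for a in x)


def independent(r1, r2, fds):
    local = project(fds, r1) + project(fds, r2)
    preserved = all(rhs <= closure(lhs, local) for lhs, rhs in fds)
    common = r1 & r2
    key_ok = is_candidate_key(common, r1, fds) or is_candidate_key(common, r2, fds)
    return preserved and key_ok


FDS = [({"M_id"}, {"M_name", "Type"}), ({"Type"}, {"Value"})]  # assumed FDs

print(independent({"M_id", "M_name", "Type"}, {"Type", "Value"}, FDS))   # True
print(independent({"M_id", "M_name", "Type"}, {"M_id", "Value"}, FDS))   # False
```

With these assumptions solution #1 consists of independent projections, while solution #2 fails the first condition because Type → Value cannot be derived from the FDs local to its projections.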
Dependency preservation
  R was decomposed (normalisation) into R1, …, Rn
  S - the set of FDs for R
  S1, …, Sn - the sets of FDs for R1, …, Rn (each Si refers only to the attributes of Ri)
  S’ = S1 ∪ … ∪ Sn (usually, S’ ≠ S)
  the decomposition is dependency preserving if (not iff) S’+ = S+
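Whether every FD of S follows from S’ can be checked without materialising S1, …, Sn. The sketch below uses the standard textbook iteration (grow the left-hand side by closing its intersection with each Ri under S and keeping only the attributes of Ri); it is one possible way to run the check, not necessarily the method intended on the slide. The FDs for Modules are the same assumed set as before, and `preserves` is a hypothetical name.

```python
def closure(attrs, fds):
    """Closure of a set of attributes under a list of (lhs, rhs) FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def preserves(fds, relations):
    """True if every FD in fds is derivable from the FDs local to the relations."""
    for lhs, rhs in fds:
        z = set(lhs)
        changed = True
        while changed:
            changed = False
            for rel in relations:
                # Attributes of rel derivable from what we already have inside rel
                gained = closure(z & rel, fds) & rel
                if not gained <= z:
                    z |= gained
                    changed = True
        if not rhs <= z:
            return False
    return True


S = [({"M_id"}, {"M_name", "Type"}), ({"Type"}, {"Value"})]  # assumed FDs

solution_1 = [{"M_id", "M_name", "Type"}, {"Type", "Value"}]
solution_2 = [{"M_id", "M_name", "Type"}, {"M_id", "Value"}]

print(preserves(S, solution_1))  # True
print(preserves(S, solution_2))  # False: Type -> Value is lost
```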
2NF and 3NF - more than one CK
  a relation is in 2NF if and only if it is in 1NF and all non-key attributes are irreducibly dependent on the candidate keys
  3NF (Zaniolo)
    R is a relation; X is any set of attributes of R; A is any single attribute of R; consider the following conditions:
      X contains A
      X contains a candidate key of R
      A is contained in a candidate key of R
    if at least one of the three is true for every FD X → A then R is in 3NF
Example
  Assume the supply department in a company is in charge of bringing parts from different manufacturers. A part is uniquely identified by its name and manufacturer; for convenience, a part is also given an id. A separate delivery is necessary for each type of part, from each manufacturer. At most one delivery is made in one day for one type of part from one manufacturer. A “transport” (e.g. van23) is associated with each delivery. Each transport has a unique driver. A driver can drive more than one “transport”.
3NF
  3NF is not free from update anomalies
    e.g. (Type, Manufacturer) → Id
  exercise
    insert
    delete
    update
BCNF
  a relation is in Boyce/Codd normal form (BCNF) if and only if every non-trivial irreducible FD has a candidate key as its determinant
  any relation can be non-loss decomposed into an equivalent set of BCNF relations
  BCNF ⊂ 3NF ⊂ 2NF ⊂ 1NF
  BCNF is still not guaranteed to be free of all update anomalies caused by FDs
    example - later
BCNF - examples
  previous example: one candidate key only
  CKs: Id, Name, Photo (what do you think about this?), User_name
    draw the corresponding FD diagram
  overlapping CKs: (Name, Contest), (Contest, Position)
Zaniolo’s definitions
  R is a relation; X is any set of attributes of R; A is any single attribute of R; consider the following conditions:
    X contains A
    X contains a candidate key of R
    A is contained in a candidate key of R
  if at least one of the three is true for every FD X → A then R is in 3NF
  if at least one of the first two is true for every FD X → A then R is in BCNF
BCNF again
  (Patient, Doctor, Disease)
    a patient is treated by a single doctor for a certain disease
    each doctor only treats one kind of disease
    a doctor can treat more than one patient
  is this relation 3NF?
  is this relation BCNF?
  can you identify update anomalies?
  consider also (Patient, Disease, Doctor, Treatment) with (Patient, Disease) → Treatment
  [FD diagram over Patient, Doctor, Disease]
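Zaniolo's conditions lend themselves to a direct check of the two questions above. The sketch below reads the slide's three statements as the FDs (Patient, Disease) → Doctor and Doctor → Disease, which is an interpretation rather than something the slide writes out. It tests only the listed FDs, with right-hand sides split into single attributes, and finds candidate keys by brute force; the function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of a set of attributes under a list of (lhs, rhs) FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attrs, fds):
    """All minimal attribute sets whose closure covers attrs (smallest first)."""
    keys = []
    for k in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), k):
            x = set(combo)
            if any(key <= x for key in keys):
                continue  # contains a smaller key, so not minimal
            if closure(x, fds) >= attrs:
                keys.append(x)
    return keys


def zaniolo(attrs, fds):
    """Return (in_3nf, in_bcnf) according to Zaniolo's three conditions."""
    keys = candidate_keys(attrs, fds)
    in_3nf = in_bcnf = True
    for lhs, rhs in fds:
        for a in rhs:
            trivial = a in lhs                           # X contains A
            superkey = any(key <= lhs for key in keys)   # X contains a candidate key
            prime = any(a in key for key in keys)        # A is in a candidate key
            if not (trivial or superkey):
                in_bcnf = False
            if not (trivial or superkey or prime):
                in_3nf = False
    return in_3nf, in_bcnf


ATTRS = {"Patient", "Doctor", "Disease"}
FDS = [({"Patient", "Disease"}, {"Doctor"}), ({"Doctor"}, {"Disease"})]

print(candidate_keys(ATTRS, FDS))  # (Disease, Patient) and (Doctor, Patient)
print(zaniolo(ATTRS, FDS))         # (True, False)
```

Under this reading the relation is in 3NF (Disease belongs to a candidate key) but not in BCNF (Doctor is a determinant that is not a candidate key).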
BCNF vs dependency preservation
  the BCNF projections (Patient, Doctor) and (Doctor, Disease) do not enforce an FD existing in the original specification, namely: (Patient, Disease) → Doctor
  e.g. a patient can be given two doctors that treat the same disease (the system will not disallow this); the constraint would have to be maintained by procedural code
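A small data example makes the lost constraint concrete. In this sketch the two projections are assumed to be (Patient, Doctor) and (Doctor, Disease), and the patient, doctor and disease names are invented. Each projection is valid on its own, yet their join violates (Patient, Disease) → Doctor.

```python
def satisfies_fd(rows, lhs, rhs):
    """True if no two rows agree on lhs while differing on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical contents of the two BCNF projections
patient_doctor = [
    {"Patient": "Ann", "Doctor": "Dr A"},
    {"Patient": "Ann", "Doctor": "Dr B"},
]
doctor_disease = [
    {"Doctor": "Dr A", "Disease": "flu"},
    {"Doctor": "Dr B", "Disease": "flu"},
]

# The only FD local to a projection holds: each doctor treats one disease
print(satisfies_fd(doctor_disease, ["Doctor"], ["Disease"]))  # True

# Natural join on Doctor reconstructs the original relation
joined = [
    {**pd, **dd}
    for pd in patient_doctor
    for dd in doctor_disease
    if pd["Doctor"] == dd["Doctor"]
]

# Ann now has two doctors for the same disease
print(satisfies_fd(joined, ["Patient", "Disease"], ["Doctor"]))  # False
```

Nothing in either projection rejects the second (Patient, Doctor) row, which is why the check has to live in procedural code.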
BCNF vs dependency preservation
  not every FD is expressible through normalisation
  when the relation was in 3NF
    (Patient, Disease) → Doctor was expressed
      more than one doctor could not be assigned to the same patient-disease pair
    Doctor → Disease was not expressed
      generated update anomalies
  in BCNF
    Doctor → Disease was expressed
    (Patient, Disease) → Doctor was not expressed
      generated update anomalies (refer to previous slide)
      this latter FD would not have been expressed even if the decomposition into all three 2-attribute relations had been considered
Conclusions
  normal forms: formalisation of common sense
    art → engineering
    possibility for automation
  BCNF
    always achievable
    not always free of update anomalies (recall previous example), because it cannot always express all the FDs existing in the problem