Presentation is loading. Please wait.

Presentation is loading. Please wait.

Lecture 13: Relational Decomposition and Relational Algebra February 5 th, 2003.

Similar presentations


Presentation on theme: "Lecture 13: Relational Decomposition and Relational Algebra February 5 th, 2003."— Presentation transcript:

1 Lecture 13: Relational Decomposition and Relational Algebra February 5 th, 2003

2 Summary of Previous Discussion FD’s are given as part of the schema. You derived keys from them. You can also specify keys directly (they’re just another FD) First, compute the keys and the FD’s that hold on the relation you’re considering. Then, look for violations of BCNF. There may be a few. Choose one.

3 Summary (continued) You may need to decompose the decomposed relations: –To decide that, you need to project the FD’s on the decomposed relations. –The end result may differ, depending on which violation you chose to decompose by. They’re all correct. Relations with 2 attributes are in BCNF.

4 Summary (continued –2) If you have only a single FD representing a key, then the relation is in BCNF. But, you can still have another FD that forces you to decompose. This is all really pretty simple if you think about it long enough.

5 Boyce-Codd Normal Form A simple condition for removing anomalies from relations: In English (though a bit vague): Whenever a set of attributes of R is determining another attribute, should determine all the attributes of R. A relation R is in BCNF if: Whenever there is a nontrivial dependency A 1,..., A n  B in R, {A 1,..., A n } is a key for R A relation R is in BCNF if: Whenever there is a nontrivial dependency A 1,..., A n  B in R, {A 1,..., A n } is a key for R

6 Example Decomposition Person(name, SSN, age, hairColor, phoneNumber) SSN  name, age, hairColor age  hairColor Decompose in BCNF (in class): Step 1: find all keys ssn, phoneNumber SSN, name, age, hairColor SSN, phoneNumber Step 2: now decompose

7 Other Example R(A,B,C,D) A B, B C Keys: Violations of BCNF:

8 Correct Decompositions A decomposition is lossless if we can recover: R(A,B,C) R1(A,B) R2(A,C) R’(A,B,C) should be the same as R(A,B,C) R’ is in general larger than R. Must ensure R’ = R Decompose Recover

9 Correct Decompositions Given R(A,B,C) s.t. A  B, the decomposition into R1(A,B), R2(A,C) is lossless

10 3NF: A Problem with BCNF Unit Company Product Unit Company Unit Product FD’s: Unit  Company; Company, Product  Unit So, there is a BCNF violation, and we decompose. Unit  Company No FDs

11 So What’s the Problem? Unit Company Product Unit CompanyUnit Product Galaga99 UW Galaga99 databases Bingo UW Bingo databases No problem so far. All local FD’s are satisfied. Let’s put all the data back into a single table again: Galaga99 UW databases Bingo UW databases Violates the dependency: company, product -> unit!

12 Solution: 3rd Normal Form (3NF) A simple condition for removing anomalies from relations: A relation R is in 3rd normal form if : Whenever there is a nontrivial dependency A 1, A 2,..., A n  B for R, then {A 1, A 2,..., A n } a super-key for R, or B is part of a key. A relation R is in 3rd normal form if : Whenever there is a nontrivial dependency A 1, A 2,..., A n  B for R, then {A 1, A 2,..., A n } a super-key for R, or B is part of a key.

13 Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language Algebra Implementation SQL, relational calculus Relational algebra Relational bag algebra

14 Relational Algebra Five operators: –Union:  –Difference: - –Selection:  –Projection:  –Cartesian Product:  Derived or auxiliary operators: –Intersection, complement –Joins (natural,equi-join, theta join, semi-join) –Renaming: 

15 1. Union and 2. Difference R1  R2 Example: –ActiveEmployees  RetiredEmployees R1 – R2 Example: –AllEmployees -- RetiredEmployees

16 What about Intersection ? It is a derived operator R1  R2 = R1 – (R1 – R2) Also expressed as a join (will see later) Example –UnionizedEmployees  RetiredEmployees

17 3. Selection Returns all tuples which satisfy a condition Notation:  c (R) Examples –  Salary > 40000 (Employee) –  name = “Smithh” (Employee) The condition c can be =,, , <>

18 Find all employees with salary more than $40,000.  Salary > 40000 (Employee)

19 4. Projection Eliminates columns, then removes duplicates Notation:  A1,…,An (R) Example: project social-security number and names: –  SSN, Name (Employee) –Output schema: Answer(SSN, Name)

20  SSN, Name (Employee)

21 5. Cartesian Product Each tuple in R1 with each tuple in R2 Notation: R1  R2 Example: –Employee  Dependents Very rare in practice; mainly used to express joins

22

23 Relational Algebra Five operators: –Union:  –Difference: - –Selection:  –Projection:  –Cartesian Product:  Derived or auxiliary operators: –Intersection, complement –Joins (natural,equi-join, theta join, semi-join) –Renaming: 

24 Renaming Changes the schema, not the instance Notation:  B1,…,Bn (R) Example: –  LastName, SocSocNo (Employee) –Output schema: Answer(LastName, SocSocNo)

25 Renaming Example Employee NameSSN John999999999 Tony777777777 LastNameSocSocNo John999999999 Tony777777777  LastName, SocSocNo (Employee)

26 Natural Join Notation: R1 ⋈ R2 Meaning: R1 ⋈ R2 =  A (  C (R1  R2)) Where: –The selection  C checks equality of all common attributes –The projection eliminates the duplicate common attributes

27 Natural Join Example Employee NameSSN John999999999 Tony777777777 Dependents SSNDname 999999999Emily 777777777Joe NameSSNDname John999999999Emily Tony777777777Joe Employee Dependents =  Name, SSN, Dname (  SSN=SSN2 (Employee x  SSN2, Dname (Dependents))

28 Natural Join R= S= R ⋈ S= AB XY XZ YZ ZV BC ZU VW ZV ABC XZU XZV YZU YZV ZVW

29 Natural Join Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of R ⋈ S ? Given R(A, B, C), S(D, E), what is R ⋈ S ? Given R(A, B), S(A, B), what is R ⋈ S ?

30 Theta Join A join that involves a predicate R1 ⋈  R2 =   (R1  R2) Here  can be any condition

31 Eq-join A theta join where  is an equality R1 ⋈ A=B R2 =  A=B (R1  R2) Example: –Employee ⋈ SSN=SSN Dependents Most useful join in practice

32 Semijoin R ⋉ S =  A1,…,An (R ⋈ S) Where A 1, …, A n are the attributes in R Example: –Employee ⋉ Dependents

33 Semijoins in Distributed Databases Semijoins are used in distributed databases SSNName... SSNDnameAge... Employee Dependents network Employee ⋈ ssn=ssn (  age>71 (Dependents)) T =  SSN  age>71 (Dependents) R = Employee ⋉ T Answer = R ⋈ Dependents

34 Complex RA Expressions Person Purchase Person Product  name=fred  name=gizmo  pid  ssn seller-ssn=ssnpid=pidbuyer-ssn=ssn  name

35 Operations on Bags A bag = a set with repeated elements All operations need to be defined carefully on bags {a,b,b,c}  {a,b,b,b,e,f,f}={a,a,b,b,b,b,b,c,e,f,f} {a,b,b,b,c,c} – {b,c,c,c,d} = {a,b,b,d}  C (R): preserve the number of occurrences  A (R): no duplicate elimination Cartesian product, join: no duplicate elimination Important ! Relational Engines work on bags, not sets ! Reading assignment: 5.3 – 5.4

36 Finally: RA has Limitations ! Cannot compute “transitive closure” Find all direct and indirect relatives of Fred Cannot express in RA !!! Need to write C program Name1Name2Relationship FredMaryFather MaryJoeCousin MaryBillSpouse NancyLouSister


Download ppt "Lecture 13: Relational Decomposition and Relational Algebra February 5 th, 2003."

Similar presentations


Ads by Google