Revision for Mid 1 (Database System Concepts, ©Silberschatz, Korth and Sudarshan)



2 Revision for Mid 1

3 Functional Dependencies (©Silberschatz, Korth and Sudarshan, Database System Concepts)
FDs are defined over two sets of attributes: X, Y ⊆ R.
Notation: X → Y, read as "X determines Y".
If X → Y, then all tuples that agree on X must also agree on Y.
[Figure: relation R with attributes X, Y, Z]
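
The definition above can be checked mechanically on a relation instance. The following is a minimal sketch, not part of the original slides: the function name `fd_holds` and the sample rows are illustrative assumptions.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if all rows that agree on lhs also agree on rhs."""
    seen = {}  # maps each lhs value to the first rhs value seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # two rows agree on lhs but differ on rhs
    return True


# Hypothetical instance of R(X, Y, Z), invented for illustration.
r = [
    {"X": 1, "Y": "a", "Z": 10},
    {"X": 1, "Y": "a", "Z": 20},
    {"X": 2, "Y": "b", "Z": 10},
]

x_determines_y = fd_holds(r, ["X"], ["Y"])  # both X=1 rows have Y="a"
x_determines_z = fd_holds(r, ["X"], ["Z"])  # the X=1 rows differ on Z
```

Note that an instance can only refute an FD; an FD that happens to hold on one instance is not thereby a constraint of the schema.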

4 Functional Dependencies (example)
[Figure: example relation instances over attributes X, Y, Z]

5 Candidate Keys
Candidate key: an attribute (or minimal set of attributes) that uniquely identifies a row.
The primary key is a special candidate key whose values cannot be null.
e.g. ENROLL (Student_ID, Name, Address, …)
- PK = Student_ID
- candidate key = Name, Address


7 2NF
A relation is in second normal form if it is in first normal form AND every nonkey attribute is fully functionally dependent on the primary key, i.e. remove partial functional dependencies, so that no nonkey attribute depends on just part of the key.

8 EMPLOYEE2 (Emp_ID, Course_Title, Name, Dept_Name, Salary, Date_Completed)
[Figure: dependency diagram over Emp_ID, Course_Title, Name, Dept_Name, Salary, Date_Completed, with some attributes marked "not fully functionally dependent on the primary key"]

9 Second Normal Form (2NF)
2NF is based on the concept of full functional dependency.
A functional dependency X → Y is a full functional dependency if removing any attribute from X breaks it; that is, for any attribute A ∈ X, (X − {A}) does not functionally determine Y.
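
A full functional dependency can be tested with attribute closure. The sketch below is an illustration rather than the course's own method; the FDs assumed for EMPLOYEE2 (Emp_ID determines Name, Dept_Name and Salary; Emp_ID together with Course_Title determines Date_Completed) are an assumption, since the slides show them only as a diagram.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_full_fd(lhs, rhs, fds):
    """X -> Y is full if it holds and no X - {A} still determines Y."""
    lhs, rhs = set(lhs), set(rhs)
    if not rhs <= closure(lhs, fds):
        return False  # X -> Y does not even hold
    return all(not rhs <= closure(lhs - {a}, fds) for a in lhs)


# Assumed FDs for EMPLOYEE2 (not stated explicitly in the slides).
employee2_fds = [
    (frozenset({"Emp_ID"}), frozenset({"Name", "Dept_Name", "Salary"})),
    (frozenset({"Emp_ID", "Course_Title"}), frozenset({"Date_Completed"})),
]
key = {"Emp_ID", "Course_Title"}

date_is_full = is_full_fd(key, {"Date_Completed"}, employee2_fds)
salary_is_full = is_full_fd(key, {"Salary"}, employee2_fds)
```

Under these assumed FDs, Salary depends on Emp_ID alone, so its dependency on the whole key is partial and EMPLOYEE2 would not be in 2NF.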

10 2NF (Example)
R (A B C D)
[Figure: dependency diagrams over attributes A, B, C, D]
R with key {AB} is NOT in 2NF.
R with key {AC} is NOT in 2NF.
2 candidate keys.

11 Second Normal Form
Let R' be a relation, and let F be the set of governing FDs. An attribute A ∈ R' is prime if a key of R' contains A. In other words, A is prime in R' if there exists K ⊆ R' such that (1) K → R' ∈ F+, (2) for all B ∈ K, (K − B) → R' ∉ F+, and (3) A ∈ K.


13 General Definitions of Second Normal Form
A relation schema R is in second normal form (2NF) if every nonprime attribute A in R is fully functionally dependent on every key of R.

14 Third Normal Form
A relation R is in 3NF if, for all X → A that hold over R, one of the following is true:
- A ∈ X (i.e., X → A is a trivial FD), or
- X is a superkey, or
- A is part of some key for R.
The definition of 3NF is similar to that of BCNF, with the only difference being the third condition.
Recall that a key for a relation is a minimal set of attributes that uniquely determines all other attributes.
- A must be part of a key (any key, if there are several).
- It is not enough for A to be part of a superkey, because this condition is satisfied by every attribute.
If R is in BCNF, obviously it is in 3NF.

15 Suppose that a dependency X → A causes a violation of 3NF. There are two cases:
- X is a proper subset of some key K. Such a dependency is sometimes called a partial dependency. In this case, we store (X, A) pairs redundantly.
- X is not a proper subset of any key. Such a dependency is sometimes called a transitive dependency, because it means we have a chain of dependencies K → X → A.

16 [Figure: partial dependencies and transitive dependencies among Key, Attributes X and Attributes A, with the transitive case shown for "A not in a key" and "A in a key"]

17 Motivation of 3NF
- By making an exception for certain dependencies involving key attributes, we can ensure that every relation schema can be decomposed into a collection of 3NF relations using only lossless-join, dependency-preserving decompositions.
- Such a guarantee does not exist for BCNF relations.
- 3NF weakens the BCNF requirements just enough to make this guarantee possible.
- Unlike BCNF, some redundancy is possible with 3NF.
- The problems associated with partial and transitive dependencies persist if there is a nontrivial dependency X → A and X is not a superkey, even if the relation is in 3NF because A is part of a key.

18 Reserves
Assume: sid → cardno (a sailor uses a unique credit card to pay for reservations).
Reserves is not in 3NF:
- sid is not a key and cardno is not part of a key.
- In fact, (sid, bid, day) is the only key.
- (sid, cardno) pairs are stored redundantly.

19 Reserves
Assume: sid → cardno, and cardno → sid (we know that credit cards also uniquely identify the owner).
Reserves is in 3NF:
- (cardno, bid, day) is also a key for Reserves.
- sid → cardno does not violate 3NF, because cardno is now part of a key.
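
The two Reserves cases can be replayed with a small 3NF checker. This is a sketch under stated assumptions: the schema Reserves(sid, bid, day, cardno) and the FD sid, bid, day → cardno (making that triple a key) are assumed, and the helper names are hypothetical. It tests only the listed FDs and finds keys by brute force.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            if any(k <= attrs for k in keys):
                continue  # contains a smaller key, so not minimal
            if closure(attrs, fds) >= schema:
                keys.append(attrs)
    return keys


def violations_3nf(schema, fds):
    """Listed FDs X -> A with A not in X, X not a superkey, A not prime."""
    prime = set().union(*candidate_keys(schema, fds))
    bad = []
    for lhs, rhs in fds:
        for a in sorted(rhs - lhs):
            if not closure(lhs, fds) >= schema and a not in prime:
                bad.append((set(lhs), a))
    return bad


# Assumed schema and key FD for Reserves.
schema = {"sid", "bid", "day", "cardno"}
key_fd = (frozenset({"sid", "bid", "day"}), frozenset({"cardno"}))
sid_card = (frozenset({"sid"}), frozenset({"cardno"}))
card_sid = (frozenset({"cardno"}), frozenset({"sid"}))

one_way = violations_3nf(schema, [key_fd, sid_card])
both_ways = violations_3nf(schema, [key_fd, sid_card, card_sid])
keys_both_ways = candidate_keys(schema, [key_fd, sid_card, card_sid])
```

With only sid → cardno the checker reports that dependency as a violation; once cardno → sid is added, (cardno, bid, day) becomes a second key, cardno becomes prime, and no violation remains.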

20 Lecture 12: Further relational algebra, further SQL

21 Today's lecture
- Where does SQL differ from the relational model?
- What are some other features of SQL?
- How can we extend the relational algebra to match SQL more closely?

22 Duplicate rows
Consider our relation instances from lecture 6: Reserves, Sailors and Boats.
Consider SELECT rating,age FROM Sailors;
We get a relation that doesn't satisfy our definition of a relation!
RECALL: We have the keyword DISTINCT to remove duplicates.

23 Multiset semantics
A relation in SQL is really a multiset or bag, rather than a set as in the relational model.
- A multiset has no order (unlike a list), but allows duplicates.
- E.g. {1,2,1,3} is a bag.
- Select, project and join work for bags as well as sets.
- They just work on a tuple-by-tuple basis.

24 Extended relational algebra
Add features needed for SQL:
1. Bag semantics
2. Duplicate elimination operator, δ
3. Sorting operator, τ
4. Grouping and aggregation operator, γ
5. Outerjoin operators, ⟕, ⟖, ⟗

25 Duplicate-elimination operator
δ(R) = relation R with any duplicated tuples removed.
[Tables: R and δ(R) over attributes A, B; δ(R) contains the tuples (1,2) and (3,4)]
This is used to model the DISTINCT feature of SQL.
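
The bag behaviour and the effect of DISTINCT can be observed with Python's built-in `sqlite3` module. This is a small sketch; the Sailors rows below are invented for illustration and are not the lecture's instance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, "
    "rating INTEGER, age REAL)"
)
# Hypothetical rows: two sailors share the same (rating, age) pair.
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [(1, "Ann", 7, 35.0), (2, "Bob", 7, 35.0), (3, "Cat", 9, 25.0)],
)

# Plain projection keeps duplicates (bag semantics).
bag = conn.execute(
    "SELECT rating, age FROM Sailors ORDER BY rating"
).fetchall()

# DISTINCT plays the role of the duplicate-elimination operator.
distinct = conn.execute(
    "SELECT DISTINCT rating, age FROM Sailors ORDER BY rating"
).fetchall()
```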

26 Sorting
τ_{L1, …, Ln}(R) returns a list of tuples of R, ordered according to the attributes L1, …, Ln.
Note: τ does not return a relation.
Example: for a relation R over attributes A, B, τ_B(R) = [(5,2),(1,3),(3,4)]
ORDER BY in SQL, e.g.
SELECT * FROM Sailors WHERE rating>7 ORDER BY age, sname;

27 Extended projection
SQL allows us to use arithmetic operators:
SELECT age*5 FROM Sailors;
We extend the projection operator to allow the columns in the projection to be functions of one or more columns in the argument relation, e.g. π_{A+B, A, A}(R).
[Tables: R over attributes A, B, and the result of the projection]

28 Arithmetic
Arithmetic (and other expressions) cannot be used at the top level, i.e. 2+2 is not a valid SQL query.
How would you get SQL to compute 2+2?
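
One possible answer to the question above is to wrap the expression in a SELECT. The sketch below uses SQLite through Python's `sqlite3` module, which accepts a SELECT with no FROM clause; other systems may expect a FROM clause naming some table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The bare expression is not a query, but it can appear in a select list.
(result,) = conn.execute("SELECT 2 + 2").fetchone()
```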

29 Aggregation
SQL provides us with operations to summarise a column in some way, e.g.
SELECT COUNT(rating) FROM Sailors;
SELECT COUNT(DISTINCT rating) FROM Sailors;
SELECT COUNT(*) FROM Sailors WHERE rating>7;
We also have SUM, AVG, MIN and MAX.

30 Grouping
These aggregation operators have been applied to all qualifying tuples. Sometimes we want to apply them to each of several groups of tuples, e.g.
- For each rating, find the average age of the sailors.
- For each rating, find the age of the youngest sailor.

31 GROUP BY in SQL
SELECT [DISTINCT] target-list FROM relation-list WHERE qualification GROUP BY grouping-list;
The target-list contains
1. a list of column names
2. aggregate terms
NOTE: The column names in (1) must be contained in grouping-list.

32 GROUP BY cont.
For each rating, find the average age of the sailors:
SELECT rating,AVG(age) FROM Sailors GROUP BY rating;
For each rating, find the age of the youngest sailor:
SELECT rating,MIN(age) FROM Sailors GROUP BY rating;

33 Grouping and aggregation
γ_L(R), where L is a list of elements that are either
- individual column names ("grouping attributes"), or
- of the form θ(A), where θ is an aggregation operator (MIN, SUM, …) and A is the column it is applied to.
For example, γ_{rating, AVG(age)}(Sailors)

34 Example
Compute γ_{beer, AVG(price)}(R), where R =
bar     beer     price
Anchor  6X       2.50
Anchor  Adnam's  2.40
Mill    6X       2.60
Mill    Fosters  2.80
Eagle   Fosters  2.90

35 Example cont.
1. Group according to the grouping attribute, beer:
bar     beer     price
Anchor  6X       2.50
Mill    6X       2.60
Anchor  Adnam's  2.40
Mill    Fosters  2.80
Eagle   Fosters  2.90
2. Compute the average of price within groups:
beer     price
6X       2.55
Adnam's  2.40
Fosters  2.85
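
The worked example corresponds to a GROUP BY query. The sketch below loads the example relation into an in-memory SQLite database through Python's `sqlite3` module; the table name `R` and the REAL column type are choices made here, and the averages are rounded to two decimals to sidestep floating-point noise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (bar TEXT, beer TEXT, price REAL)")
conn.executemany(
    "INSERT INTO R VALUES (?, ?, ?)",
    [
        ("Anchor", "6X", 2.50),
        ("Anchor", "Adnam's", 2.40),
        ("Mill", "6X", 2.60),
        ("Mill", "Fosters", 2.80),
        ("Eagle", "Fosters", 2.90),
    ],
)

# One output row per beer, with the average price inside each group.
avg_price = {
    beer: round(avg, 2)
    for beer, avg in conn.execute(
        "SELECT beer, AVG(price) FROM R GROUP BY beer"
    )
}
```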

36 NULL values
Sometimes field values are unknown (e.g. rating not known yet), or inapplicable (e.g. no spouse name).
SQL provides a special value, NULL, for both these situations.
This complicates several issues:
- Special operators are needed to check for NULL.
- Is NULL>8? Is (NULL OR TRUE)=TRUE?
- We need a three-valued logic.
- We need to carefully re-define semantics.

37 NULL values
Consider
INSERT INTO Sailors (sid,sname) VALUES (101,'Julia');
SELECT * FROM Sailors;
SELECT rating FROM Sailors;
SELECT sname FROM Sailors WHERE rating>0;
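
The statements above can be tried against an empty table to see how NULL behaves. This sketch uses SQLite through Python's `sqlite3` module, where NULL comes back as `None`; the table definition is an assumption, and other systems may display NULL and truth values differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, "
    "rating INTEGER, age REAL)"
)
conn.execute("INSERT INTO Sailors (sid, sname) VALUES (101, 'Julia')")

# Columns left out of the INSERT are NULL.
all_rows = conn.execute("SELECT * FROM Sailors").fetchall()
ratings = conn.execute("SELECT rating FROM Sailors").fetchall()

# NULL > 0 is unknown, so the row does not qualify.
rated = conn.execute(
    "SELECT sname FROM Sailors WHERE rating > 0"
).fetchall()

# IS NULL is the special operator for testing NULL.
unrated = conn.execute(
    "SELECT sname FROM Sailors WHERE rating IS NULL"
).fetchall()

# Three-valued logic: NULL > 8 is unknown, but NULL OR TRUE is true.
null_gt_8, null_or_true = conn.execute(
    "SELECT NULL > 8, NULL OR 1"
).fetchone()
```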

38 Entity integrity constraint
An entity integrity constraint states that no primary key value can be NULL.

39 Outer join
Note that with the usual join, a tuple that doesn't 'join' with any tuple from the other relation is removed from the resulting relation.
Instead, we can 'pad out' the columns with NULLs.
This operator is called a full outer join, written ⟗.

40 Example of full outer join
[Tables: R over attributes A, B and S over attributes B, C; R ⋈ S contains the tuple (3, 4, 5); R ⟗ S also contains NULL-padded tuples such as (1, 2, NULL)]

41 Outer joins in SQL
SQL/92 has three variants:
- LEFT OUTER JOIN (algebra: ⟕)
- RIGHT OUTER JOIN (algebra: ⟖)
- FULL OUTER JOIN (algebra: ⟗)
For example:
SELECT * FROM Reserves r LEFT OUTER JOIN Sailors s ON r.sid=s.sid;

42 Views
A view is a query with a name that can be used in further SELECT statements, e.g.
CREATE VIEW ExpertSailors(sid,sname,age) AS SELECT sid,sname,age FROM Sailors WHERE rating>9;
Note that ExpertSailors is not a stored relation.
(WARNING: MySQL versions before 5.0 do not support views.)

43 Querying views
So an example query
SELECT sname FROM ExpertSailors WHERE age>27;
is translated by the system to the following:
SELECT sname FROM Sailors WHERE rating>9 AND age>27;
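
The equivalence between the view query and its translation can be checked on sample data. This sketch uses SQLite through Python's `sqlite3` module; the Sailors rows are invented, and the view is created without an explicit column list (the column names are taken from the SELECT) for compatibility with older SQLite builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, "
    "rating INTEGER, age REAL)"
)
# Hypothetical rows for illustration.
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [(1, "Ann", 10, 30.0), (2, "Bob", 10, 25.0), (3, "Cat", 8, 40.0)],
)
conn.execute(
    "CREATE VIEW ExpertSailors AS "
    "SELECT sid, sname, age FROM Sailors WHERE rating > 9"
)

# Query against the view ...
via_view = conn.execute(
    "SELECT sname FROM ExpertSailors WHERE age > 27"
).fetchall()

# ... and the query it stands for on the base table.
direct = conn.execute(
    "SELECT sname FROM Sailors WHERE rating > 9 AND age > 27"
).fetchall()
```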

44 Relational Algebra
The relational algebra defines the operations that can be applied to relations (tables) to manipulate their data. It is the basis of SQL for relational databases, and illustrates the basic operations required of any DML. The algebra is composed of unary operations (involving a single table) and binary operations (involving multiple tables).

45 SQL
Structured Query Language (SQL)
- Standardised by ANSI
- Supported by modern RDBMSs
Commands fall into three groups:
- Data Definition Language (DDL): create tables, etc.
- Data Manipulation Language (DML): retrieve and modify data
- Data Control Language (DCL): control what users can do (grant and revoke privileges)

46 Selection
The selection or σ operation selects rows from a table that satisfy a condition: σ_{<condition>} <table>
Example: σ_{course = 'CM'} Students
Students
stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM
Result
stud#  name  course
200    Dave  CM
300    Bob   CM

47 Projection
The projection or π operation selects a list of columns from a table: π_{<column list>} <table>
Example: π_{stud#, name} Students
Students
stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM
Result
stud#  name
100    Fred
200    Dave
300    Bob

48 Selection / Projection
Selection and projection are usually combined: π_{stud#, name} (σ_{course = 'CM'} Students)
Students
stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM
Result
stud#  name
200    Dave
300    Bob
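
Selection and projection can be written as two small functions over rows. This is an illustrative sketch in plain Python, not how a DBMS evaluates queries; the function names `select` and `project` are choices made here, and the projection removes duplicates as in the set-based algebra.

```python
students = [
    {"stud#": 100, "name": "Fred", "course": "PH"},
    {"stud#": 200, "name": "Dave", "course": "CM"},
    {"stud#": 300, "name": "Bob", "course": "CM"},
]


def select(rows, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, columns):
    """Projection: keep the listed columns, dropping duplicate tuples."""
    seen, result = set(), []
    for row in rows:
        values = tuple(row[c] for c in columns)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(columns, values)))
    return result


# Projection of stud#, name over the selection course = 'CM'.
cm_students = project(
    select(students, lambda row: row["course"] == "CM"),
    ["stud#", "name"],
)
```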

49 Cartesian Product
Concatenation of every row in the first relation (R) with every row in the second relation (S): R × S

50 Cartesian Product - Example
Students
stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM
Courses
course#  name
PH       Pharmacy
CM       Computing
Students × Courses =
stud#  Students.name  course  course#  Courses.name
100    Fred           PH      PH       Pharmacy
100    Fred           PH      CM       Computing
200    Dave           CM      PH       Pharmacy
200    Dave           CM      CM       Computing
300    Bob            CM      PH       Pharmacy
300    Bob            CM      CM       Computing

51 Theta Join
A Cartesian product with a condition applied: R ⋈_{<condition>} S

52 Theta Join - Example
Students
stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM
Courses
course#  name
PH       Pharmacy
CM       Computing
Students ⋈_{stud# = 200} Courses
stud#  Students.name  course  course#  Courses.name
200    Dave           CM      PH       Pharmacy
200    Dave           CM      CM       Computing

53 Inner Join (Equijoin)
A theta join where the <condition> is the match (=) of the primary and foreign keys: R ⋈_{<R.primary_key = S.foreign_key>} S

54 Inner Join - Example
Students
stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM
Courses
course#  name
PH       Pharmacy
CM       Computing
Students ⋈_{course = course#} Courses
stud#  Students.name  course  course#  Courses.name
100    Fred           PH      PH       Pharmacy
200    Dave           CM      CM       Computing
300    Bob            CM      CM       Computing

55 Natural Join
Inner join produces redundant data (in the previous example: course and course#). To get rid of this duplication:
π_{stud#, Students.name, course, Courses.name} (Students ⋈_{course = course#} Courses)
Or
R1 = Students ⋈_{course = course#} Courses
R2 = π_{stud#, Students.name, course, Courses.name} R1
The result is called the natural join of Students and Courses.

56 Natural Join - Example
Students
stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM
Courses
course#  name
PH       Pharmacy
CM       Computing
R1 = Students ⋈_{course = course#} Courses
R2 = π_{stud#, Students.name, course, Courses.name} R1
stud#  Students.name  course  Courses.name
100    Fred           PH      Pharmacy
200    Dave           CM      Computing
300    Bob            CM      Computing
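
The join variants on the preceding slides differ only in the condition applied to the Cartesian product and in the final projection. The sketch below expresses that in plain Python over tuples; `theta_join` is a hypothetical helper, and columns are addressed by position rather than by name.

```python
from itertools import product

# Rows are (stud#, name, course) and (course#, name).
students = [(100, "Fred", "PH"), (200, "Dave", "CM"), (300, "Bob", "CM")]
courses = [("PH", "Pharmacy"), ("CM", "Computing")]


def theta_join(r, s, condition):
    """Cartesian product of r and s, filtered by condition(row_r, row_s)."""
    return [a + b for a, b in product(r, s) if condition(a, b)]


# Cartesian product: a join whose condition is always true.
cartesian = theta_join(students, courses, lambda a, b: True)

# Theta join with the condition stud# = 200.
theta = theta_join(students, courses, lambda a, b: a[0] == 200)

# Equijoin on course = course#.
equijoin = theta_join(students, courses, lambda a, b: a[2] == b[0])

# Natural join: project away the duplicated course# column.
natural = [
    (stud, sname, course, cname)
    for stud, sname, course, _course_no, cname in equijoin
]
```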

57 Outer Joins
Inner join + rows of one table which do not satisfy the <condition>.
Left Outer Join: R ⟕ S
All rows from R are retained; rows of R with no match in S are padded with NULL in the columns of S.
Right Outer Join: R ⟖ S
All rows from S are retained; rows of S with no match in R are padded with NULL in the columns of R.

58 Left Outer Join - Example
Students
stud#  name   course
100    Fred   PH
200    Dave   CM
400    Peter  EN
Courses
course#  name
PH       Pharmacy
CM       Computing
CH       Chemistry
Students ⟕_{course = course#} Courses
stud#  Students.name  course  course#  Courses.name
100    Fred           PH      PH       Pharmacy
200    Dave           CM      CM       Computing
400    Peter          EN      NULL     NULL

59 Right Outer Join - Example
Students
stud#  name   course
100    Fred   PH
200    Dave   CM
400    Peter  EN
Courses
course#  name
PH       Pharmacy
CM       Computing
CH       Chemistry
Students ⟖_{course = course#} Courses
stud#  Students.name  course  course#  Courses.name
100    Fred           PH      PH       Pharmacy
200    Dave           CM      CM       Computing
NULL   NULL           NULL    CH       Chemistry
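
Both outer join examples can be reproduced with SQLite through Python's `sqlite3` module. In this sketch the columns stud# and course# are renamed `stud` and `course_no` (an assumption, to keep the identifiers plain), and the right outer join is written as a left outer join with the tables swapped, because older SQLite versions do not accept RIGHT OUTER JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (stud INTEGER, name TEXT, course TEXT)")
conn.execute("CREATE TABLE Courses (course_no TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [(100, "Fred", "PH"), (200, "Dave", "CM"), (400, "Peter", "EN")],
)
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?)",
    [("PH", "Pharmacy"), ("CM", "Computing"), ("CH", "Chemistry")],
)

columns = "s.stud, s.name, s.course, c.course_no, c.name"

# Left outer join: every student survives; Peter gets NULL course columns.
left = conn.execute(
    f"SELECT {columns} FROM Students s "
    "LEFT OUTER JOIN Courses c ON s.course = c.course_no "
    "ORDER BY s.stud"
).fetchall()

# Right outer join, emulated by swapping the operands of a left join.
right = conn.execute(
    f"SELECT {columns} FROM Courses c "
    "LEFT OUTER JOIN Students s ON s.course = c.course_no "
    "ORDER BY c.course_no"
).fetchall()
```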

60 Combination of Unary and Join Operations
Students
stud#  name  address   course
100    Fred  Aberdeen  PH
200    Dave  Dundee    CM
300    Bob   Aberdeen  CM
Courses
course#  name
PH       Pharmacy
CM       Computing
Show the names of students (from Aberdeen) and the names of their courses:
R1 = Students ⋈_{course = course#} Courses
R2 = σ_{address = 'Aberdeen'} R1
R3 = π_{Students.name, Courses.name} R2
Students.name  Courses.name
Fred           Pharmacy
Bob            Computing

61 Union
- Takes the set of rows in each table and combines them, eliminating duplicates.
- Participating relations must be compatible, i.e. have the same number of columns, and the same column names, domains, and data types.
R
A   B
a1  b1
a2  b2
S
A   B
a2  b2
a3  b3
R ∪ S
A   B
a1  b1
a2  b2
a3  b3

62 Intersection
- Takes the set of rows that are common to each relation.
- Participating relations must be compatible.
R
A   B
a1  b1
a2  b2
S
A   B
a2  b2
a3  b3
R ∩ S
A   B
a2  b2

63 Difference
- Takes the set of rows in the first relation but not the second.
- Participating relations must be compatible.
R
A   B
a1  b1
a2  b2
S
A   B
a2  b2
a3  b3
R - S
A   B
a1  b1
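
Because relations in the algebra are sets of tuples, the three set operations map directly onto Python's built-in set operators. A brief sketch using the R and S of the examples:

```python
# Relations over (A, B), represented as sets of tuples.
R = {("a1", "b1"), ("a2", "b2")}
S = {("a2", "b2"), ("a3", "b3")}

union = R | S          # rows in either relation, duplicates eliminated
intersection = R & S   # rows common to both relations
difference = R - S     # rows in R but not in S
```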

64 Exercise (May 2004 Exam)
Employee
empid  name
E100   Fred
E200   Dave
E300   Bob
E400   Peter
WorkLoad
empid*  projid*  duration
E100    P001     17
E200    P001     12
E300    P002     15
Project
projid  name
P001    DB
P002    Access
P003    SQL
Determine the outcome of the following operations:
- A natural join between Employee and WorkLoad
- A left outer join between Employee and WorkLoad
- A right outer join between WorkLoad and Project

65 Unary Operations
Selection: σ_{course = 'Computing'} Students
In SQL: Select * From Students Where course = 'Computing';
Projection: π_{stud#, name} Students
In SQL: Select stud#, name From Students;
Selection & Projection: π_{stud#, name} (σ_{course = 'Computing'} Students)
In SQL: Select stud#, name From Students Where course = 'Computing';

66 Binary Operations/Joins
Cartesian Product: Students × Courses
In SQL: Select * From Students, Courses;
Theta Join: Students ⋈_{stud# = 200} Courses
In SQL: Select * From Students, Courses Where stud# = 200;

67 Binary Operations/Joins
Inner Join (Equijoin): Students ⋈_{course = course#} Courses
In SQL: Select * From Students, Courses Where course=course#;
Natural Join:
R1 = Students ⋈_{course = course#} Courses
R2 = π_{stud#, Students.name, course, Courses.name} R1
In SQL: Select stud#, Students.name, course, Courses.name From Students, Courses Where course=course#;

68 Outer Joins
Left Outer Join: Students ⟕_{course = course#} Courses
In SQL (Oracle's (+) notation): Select * From Students, Courses Where course = course#(+)
Right Outer Join: Students ⟖_{course = course#} Courses
In SQL (Oracle's (+) notation): Select * From Students, Courses Where course(+) = course#

69 Combination of Unary and Join Operations
R1 = Students ⋈_{course = course#} Courses
R2 = σ_{address = 'Aberdeen'} R1
R3 = π_{Students.name, Courses.name} R2
In SQL: Select Students.name, Courses.name From Students, Courses Where course=course# AND address='Aberdeen';

70 Set Operations
Union: R ∪ S
In SQL: Select * From R Union Select * From S;
Intersection: R ∩ S
In SQL: Select * From R Intersect Select * From S;
Difference: R - S
In SQL: Select * From R Minus Select * From S;
(Minus is Oracle's keyword; standard SQL uses Except.)

71 SQL Operators
SELECT * FROM Book WHERE catno BETWEEN 200 AND 400;
SELECT * FROM Product WHERE prod_desc BETWEEN 'C' AND 'S';
SELECT * FROM Book WHERE catno NOT BETWEEN 200 AND 400;

72 SQL Operators
SELECT Catno FROM Loan WHERE Date-Returned IS NULL;
SELECT Catno FROM Loan WHERE Date-Returned IS NOT NULL;

73 SQL Operators
SELECT Name FROM Member WHERE memno IN (100, 200, 300, 400);
SELECT Name FROM Member WHERE memno NOT IN (100, 200, 300, 400);

74 SQL Operators
SELECT Name FROM Member WHERE address NOT LIKE '%Aberdeen%';
SELECT Name FROM Member WHERE Name LIKE '_ES%';
Note: In MS Access, use * and # instead of % and _
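
The two LIKE patterns can be exercised on a few sample rows. This sketch uses SQLite through Python's `sqlite3` module; the Member rows are invented for illustration. In SQLite, LIKE ignores case for ASCII letters by default, which is why '_ES%' matches lower-case names here; other systems may be case-sensitive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Member (memno INTEGER, Name TEXT, address TEXT)")
# Hypothetical members.
conn.executemany(
    "INSERT INTO Member VALUES (?, ?, ?)",
    [
        (100, "Wesley", "12 High St, Aberdeen"),
        (200, "Jessica", "Dundee"),
        (300, "Esther", "Old Aberdeen"),
    ],
)

# % matches any run of characters, so this keeps non-Aberdeen addresses.
not_aberdeen = conn.execute(
    "SELECT Name FROM Member WHERE address NOT LIKE '%Aberdeen%' "
    "ORDER BY memno"
).fetchall()

# _ matches exactly one character: names whose 2nd and 3rd letters are ES.
second_third_es = conn.execute(
    "SELECT Name FROM Member WHERE Name LIKE '_ES%' ORDER BY memno"
).fetchall()
```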

75 Selecting Distinct Values
Student
stud#  name  address
100    Fred  Aberdeen
200    Dave  Dundee
300    Bob   Aberdeen
SELECT Distinct address FROM Student;
address
Aberdeen
Dundee

