SQL: Structured Query Language


1 SQL: Structured Query Language
Chapter 5: Part 2. Group-By & Joins. The slides for this text are organized into chapters. This lecture covers Chapter 5. This is one of the most important chapters in any discussion of database systems. Students must acquire a solid grasp of SQL. In particular, learning how to write queries in SQL is important, and comes only with practice. The slides present the concepts through examples. The chapter contains several additional examples with in-depth explanations; assign these as additional readings. The exercises contain numerous further examples, and come with supporting online material. If you need additional time to cover this material, consider abbreviating the earlier discussion of algebra and calculus, and reinforcing the same concepts in the context of SQL. Note that some new SQL:1999 features for the HAVING clause are covered in these slides. (This material is not covered in the 2nd edition.) Also, material on cursors and other programmatic aspects of SQL has been moved to Chapter 6, following the revisions in the 3rd edition.

2 Running Example
Instances of the Sailors and Reserves relations used in our examples: R1, S1, S2 (tables shown on the slide).

3 Aggregation and Having Clauses

4 Aggregate Operators
Significant extension of relational algebra.
COUNT (*)
COUNT ( [DISTINCT] A)
SUM ( [DISTINCT] A)
AVG ( [DISTINCT] A)
MAX (A)
MIN (A)
Why no DISTINCT for MAX and MIN?

5 Aggregate Operators
SELECT COUNT (*) FROM Sailors S
SELECT AVG (S.age) FROM Sailors S WHERE S.rating = 10
SELECT COUNT (DISTINCT S.rating) FROM Sailors S WHERE S.sname = 'Bob'
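The three aggregate queries on this slide can be tried out with a short script. This is a minimal sketch that assumes SQLite (through Python's built-in sqlite3 module) as the engine and a small made-up Sailors instance; the rows are hypothetical, not the instance from the slides.

```python
import sqlite3

# Hypothetical Sailors instance, made up for illustration only.
rows = [
    (22, "Dustin", 7, 45.0),
    (31, "Lubber", 8, 55.5),
    (58, "Rusty", 10, 35.0),
    (71, "Zorba", 10, 16.0),
    (95, "Bob", 3, 63.5),
    (96, "Bob", 3, 25.5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)"
)
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", rows)

# Number of sailors.
total = conn.execute("SELECT COUNT(*) FROM Sailors S").fetchone()[0]

# Average age of the sailors with rating 10.
avg_age_10 = conn.execute(
    "SELECT AVG(S.age) FROM Sailors S WHERE S.rating = 10"
).fetchone()[0]

# Number of different ratings held by sailors named Bob.
bob_ratings = conn.execute(
    "SELECT COUNT(DISTINCT S.rating) FROM Sailors S WHERE S.sname = 'Bob'"
).fetchone()[0]

print(total, avg_age_10, bob_ratings)
```

Both hypothetical Bobs share rating 3, so COUNT (DISTINCT S.rating) counts that rating once, while a plain COUNT (S.rating) would count it twice.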

6 Find name and age of the oldest sailor(s)
SELECT S.sname, MAX (S.age) FROM Sailors S
What does this query do? Is this query legal? No! Why not? (S.sname is neither aggregated nor a grouping attribute, so it has no single value for the one answer row.)
SELECT S.sname, S.age FROM Sailors S WHERE S.age = (SELECT MAX (S2.age) FROM Sailors S2)
What does this query do? Is this query legal?

7 Find name and age of the oldest sailor(s)
SELECT S.sname, S.age FROM Sailors S WHERE S.age = (SELECT MAX (S2.age) FROM Sailors S2)
SELECT S.sname, S.age FROM Sailors S WHERE (SELECT MAX (S2.age) FROM Sailors S2) = S.age
The two queries are equivalent in SQL/92, but the second one, with the subquery on the left of the comparison, is not accepted by some systems.
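A quick way to see why the subquery form returns the oldest sailor(s), plural, is to run it on an instance with a tie. The sketch below assumes SQLite via Python's sqlite3 module and a hypothetical four-row instance in which two sailors share the maximum age.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)"
)
# Hypothetical rows; Art and Bob are tied for the maximum age.
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [
        (22, "Dustin", 7, 45.0),
        (85, "Art", 3, 63.5),
        (95, "Bob", 3, 63.5),
        (96, "Frodo", 3, 25.5),
    ],
)

# The subquery computes the single maximum age; the outer query keeps
# every sailor whose age equals it.
oldest = conn.execute(
    """
    SELECT S.sname, S.age
    FROM Sailors S
    WHERE S.age = (SELECT MAX(S2.age) FROM Sailors S2)
    ORDER BY S.sname
    """
).fetchall()

print(oldest)
```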

8 Motivation for Grouping
Find age of youngest sailor for each rating level.
For level = 1, 2, ... , 10: SELECT MIN (S.age) FROM Sailors S WHERE S.rating = level
What are the problems with the above?
We may not know how many rating levels exist, nor what the rating values for these levels are.
Serious performance overhead due to the connections between the program and the DB: one query is issued per level.

9 Add Group By Clause to SQL

10 Queries With GROUP BY
SELECT [DISTINCT] target-list FROM relation-list WHERE qualification GROUP BY grouping-list
GROUP BY: A group is a set of tuples that each have the same value for all attributes in grouping-list.

11 Are GROUP BY queries valid or not ?
SELECT AVG (S.salary) FROM Sailors S GROUP BY S.rating
SELECT S.name FROM Sailors S GROUP BY S.rating

12 Guidelines on Attributes: GROUP BY
SELECT [DISTINCT] target-list FROM relation-list WHERE qualification GROUP BY grouping-list
target-list contains: (i) attribute names from grouping-list, or (ii) aggregate-op (column-name).
REQUIREMENT: Each group produces one answer tuple, so every entry in target-list must have a single value per group.

13 Find age of youngest sailor for each rating
SELECT S.rating, MIN (S.age) AS min_age FROM Sailors S GROUP BY S.rating
Is the target list valid?

14 Find age of the youngest sailor with age >= 18, for each rating
SELECT S.rating, MIN (S.age) AS min_age FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating
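To make the contrast with the one-query-per-level approach concrete, here is a minimal sketch that runs both on the same data and checks that they agree. It assumes SQLite via Python's sqlite3 module; the Sailors rows are an illustrative instance chosen for this example, and the alias is written min_age because a hyphen is not valid in an unquoted SQL identifier.

```python
import sqlite3

# Illustrative Sailors instance (an assumption, not taken from the slides).
SAILORS = [
    (22, "Dustin", 7, 45.0), (29, "Brutus", 1, 33.0), (31, "Lubber", 8, 55.5),
    (32, "Andy", 8, 25.5), (58, "Rusty", 10, 35.0), (64, "Horatio", 7, 35.0),
    (71, "Zorba", 10, 16.0), (74, "Horatio", 9, 35.0), (85, "Art", 3, 25.5),
    (95, "Bob", 3, 63.5), (96, "Frodo", 3, 25.5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)"
)
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", SAILORS)

# Without GROUP BY: one query per rating level, and the program has to
# guess which levels exist (here 1..10).
per_level = []
for level in range(1, 11):
    youngest = conn.execute(
        "SELECT MIN(S.age) FROM Sailors S WHERE S.rating = ? AND S.age >= 18",
        (level,),
    ).fetchone()[0]
    if youngest is not None:  # MIN over an empty set yields NULL
        per_level.append((level, youngest))

# With GROUP BY: a single query, and the engine discovers the levels itself.
grouped = conn.execute(
    """
    SELECT S.rating, MIN(S.age) AS min_age
    FROM Sailors S
    WHERE S.age >= 18
    GROUP BY S.rating
    ORDER BY S.rating
    """
).fetchall()

print(grouped)
```

The loop issues ten statements where the grouped query issues one, which is the overhead the motivation slide points at.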

15 Queries With GROUP BY and HAVING
SELECT [DISTINCT] target-list FROM relation-list WHERE qualification GROUP BY grouping-list HAVING group-qualification
HAVING: A restriction on each group.

16 Query With Having Clause
SELECT S.rating, MIN (S.age) AS min_age FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating HAVING COUNT (*) > 1
What does this query mean? Find the age of the youngest sailor with age >= 18, for each rating with at least 2 such sailors in the group.

17 GroupBy --- Conceptual Evaluation
Compute the cross-product of relation-list (FROM).
Discard tuples that fail qualification (WHERE).
Delete `unnecessary' fields.
Partition the remaining tuples into groups by the value of the attributes in grouping-list (GROUP BY).
Eliminate groups that fail the group-qualification (HAVING).
Evaluate the target-list on each remaining group to produce an output tuple (SELECT).
Reminder: one answer tuple is generated per qualifying group.
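The conceptual evaluation steps can be traced by hand in a few lines of ordinary Python and compared with what an engine returns. This is only a sketch of the conceptual strategy for one specific query (youngest sailor with age >= 18 per rating, for ratings with at least 2 such sailors), not how a real optimizer executes it; the data is an illustrative instance and SQLite is assumed as the reference engine.

```python
import sqlite3
from collections import defaultdict

# Illustrative Sailors instance (an assumption, not taken from the slides).
SAILORS = [
    (22, "Dustin", 7, 45.0), (29, "Brutus", 1, 33.0), (31, "Lubber", 8, 55.5),
    (32, "Andy", 8, 25.5), (58, "Rusty", 10, 35.0), (64, "Horatio", 7, 35.0),
    (71, "Zorba", 10, 16.0), (74, "Horatio", 9, 35.0), (85, "Art", 3, 25.5),
    (95, "Bob", 3, 63.5), (96, "Frodo", 3, 25.5),
]

# FROM: only one relation, so the cross-product is the relation itself.
# WHERE: discard tuples with age < 18.
# Delete unnecessary fields: only rating and age are needed afterwards.
kept = [(rating, age) for (_sid, _sname, rating, age) in SAILORS if age >= 18]

# GROUP BY: partition the remaining tuples by rating.
groups = defaultdict(list)
for rating, age in kept:
    groups[rating].append(age)

# HAVING: eliminate groups that do not satisfy COUNT(*) > 1.
qualifying = {rating: ages for rating, ages in groups.items() if len(ages) > 1}

# SELECT: one output tuple (rating, MIN(age)) per qualifying group.
by_hand = sorted((rating, min(ages)) for rating, ages in qualifying.items())

# Reference result from the engine.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)"
)
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", SAILORS)
by_engine = conn.execute(
    """
    SELECT S.rating, MIN(S.age) AS min_age
    FROM Sailors S
    WHERE S.age >= 18
    GROUP BY S.rating
    HAVING COUNT(*) > 1
    ORDER BY S.rating
    """
).fetchall()

print(by_hand)
```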

18 GroupBy --- Conceptual Evaluation
Step by step example.

19 Find age of the youngest sailor with age >= 18, for each rating with at least 2 such sailors
Sailors instance: (table shown on the slide)
SELECT S.rating, MIN (S.age) AS min_age FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating HAVING COUNT (*) > 1

20 Find age of the youngest sailor with age >= 18, for each rating with at least 2 such sailors.

21 Now: Find age of the youngest sailor with age >= 18,
Again: Find age of the youngest sailor with age >= 18, for each rating with at least 2 such sailors.
SELECT S.rating, MIN (S.age) AS min_age FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating HAVING COUNT (*) > 1
Now: Find age of the youngest sailor with age >= 18, for each rating with at least 2 such sailors and with every sailor aged 60 or under.
Options: Put the age <= 60 condition into the WHERE clause? Put the age <= 60 condition into the HAVING clause?

22 HAVING COUNT (*) > 1 AND EVERY (S.age <= 60)
Find age of the youngest sailor with age >= 18, for each rating with at least 2 such sailors and with every sailor aged 60 or under.
HAVING COUNT (*) > 1 AND EVERY (S.age <= 60)
EVERY: Must hold for all tuples in the group.
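EVERY is an SQL:1999 feature that not every engine provides. The sketch below assumes SQLite, which does not offer an EVERY aggregate, and therefore emulates EVERY (S.age <= 60) by checking that the number of tuples in the group equals the number of tuples that satisfy the condition. The emulation and the data are assumptions made for illustration; on an engine that supports EVERY, the HAVING clause from the slide can be used directly.

```python
import sqlite3

# Illustrative Sailors instance (an assumption, not taken from the slides).
SAILORS = [
    (22, "Dustin", 7, 45.0), (29, "Brutus", 1, 33.0), (31, "Lubber", 8, 55.5),
    (32, "Andy", 8, 25.5), (58, "Rusty", 10, 35.0), (64, "Horatio", 7, 35.0),
    (71, "Zorba", 10, 16.0), (74, "Horatio", 9, 35.0), (85, "Art", 3, 25.5),
    (95, "Bob", 3, 63.5), (96, "Frodo", 3, 25.5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)"
)
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", SAILORS)

# EVERY (S.age <= 60) is emulated: the condition holds for all tuples in
# the group exactly when the count of satisfying tuples equals COUNT(*).
every_in_having = conn.execute(
    """
    SELECT S.rating, MIN(S.age) AS min_age
    FROM Sailors S
    WHERE S.age >= 18
    GROUP BY S.rating
    HAVING COUNT(*) > 1
       AND COUNT(*) = SUM(CASE WHEN S.age <= 60 THEN 1 ELSE 0 END)
    ORDER BY S.rating
    """
).fetchall()

print(every_in_having)
```

In this instance the rating 3 group contains a sailor aged 63.5, so the whole group is eliminated even though it has two other qualifying sailors.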

23 Now can check age <= 60 before making groups!
Find age of the youngest sailor with age >= 18, for each rating with at least 2 sailors between 18 and 60.
Sailors instance: (table shown on the slide)
SELECT S.rating, MIN (S.age) AS min_age FROM Sailors S WHERE S.age >= 18 ??? GROUP BY S.rating HAVING COUNT (*) > 1 ???

24 Find age of the youngest sailor with age >= 18, for each rating with at least 2 sailors between 18 and 60.
Sailors instance: (table shown on the slide)
SELECT S.rating, MIN (S.age) AS min_age FROM Sailors S WHERE S.age >= 18 AND S.age <= 60 GROUP BY S.rating HAVING COUNT (*) > 1
Answer relation: (table shown on the slide)
Check age <= 60 before making groups.
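Putting the age <= 60 test in the WHERE clause removes individual sailors before the groups are formed, rather than removing whole groups. A minimal sketch, assuming SQLite via Python's sqlite3 module and an illustrative Sailors instance:

```python
import sqlite3

# Illustrative Sailors instance (an assumption, not taken from the slides).
SAILORS = [
    (22, "Dustin", 7, 45.0), (29, "Brutus", 1, 33.0), (31, "Lubber", 8, 55.5),
    (32, "Andy", 8, 25.5), (58, "Rusty", 10, 35.0), (64, "Horatio", 7, 35.0),
    (71, "Zorba", 10, 16.0), (74, "Horatio", 9, 35.0), (85, "Art", 3, 25.5),
    (95, "Bob", 3, 63.5), (96, "Frodo", 3, 25.5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)"
)
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", SAILORS)

# Both age bounds are checked per tuple, before grouping.
between_18_and_60 = conn.execute(
    """
    SELECT S.rating, MIN(S.age) AS min_age
    FROM Sailors S
    WHERE S.age >= 18 AND S.age <= 60
    GROUP BY S.rating
    HAVING COUNT(*) > 1
    ORDER BY S.rating
    """
).fetchall()

print(between_18_and_60)
```

Here rating 3 stays in the answer: the sailor aged 63.5 is dropped by WHERE, and two sailors between 18 and 60 remain in the group.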

25 Join, GroupBy and Nesting.

26 For each red boat, find the number of reservations for this boat
Sailors : sid, name, … Boats : bid, color, … Reserves: sid, bid, day

27 For each red boat, find the number of reservations for this boat
SELECT B.bid, COUNT (*) AS s_count FROM Sailors S, Boats B, Reserves R WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red' GROUP BY B.bid
Grouping over a join of three relations.

28 For each red boat, find the number of reservations for this boat
Q: What if we move B.color = 'red' from WHERE to HAVING?
SELECT B.bid, COUNT (*) AS s_count FROM Sailors S, Boats B, Reserves R WHERE S.sid = R.sid AND R.bid = B.bid GROUP BY B.bid HAVING (B.color = 'red')
Illegal! Only columns in the GROUP BY clause can appear in the HAVING clause, unless they appear inside an aggregate operator of the HAVING clause.
E.g., HAVING COUNT (B.color = 'red') > 1
E.g., HAVING EVERY (B.color = 'red')
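One portable way to keep the color test in HAVING is to add B.color to the grouping list, so that it becomes a grouping column. The sketch below runs the WHERE version and that rewrite side by side. It assumes SQLite via Python's sqlite3 module; the table contents, the bname column, and the dates are hypothetical, and the rewrite is offered as one possible workaround rather than the approach the slides prescribe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT);
    CREATE TABLE Boats (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
    """
)
# Hypothetical data for illustration.
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?)",
    [(22, "Dustin"), (31, "Lubber"), (64, "Horatio"), (74, "Horatio")],
)
conn.executemany(
    "INSERT INTO Boats VALUES (?, ?, ?)",
    [(101, "Interlake", "blue"), (102, "Interlake", "red"),
     (103, "Clipper", "green"), (104, "Marine", "red")],
)
conn.executemany(
    "INSERT INTO Reserves VALUES (?, ?, ?)",
    [(22, 101, "1998-10-10"), (22, 102, "1998-10-10"), (22, 103, "1998-10-08"),
     (22, 104, "1998-10-07"), (31, 102, "1998-11-10"), (31, 103, "1998-11-06"),
     (31, 104, "1998-11-12"), (64, 101, "1998-09-05"), (64, 102, "1998-09-08"),
     (74, 103, "1998-09-08")],
)

# Color test in WHERE: non-red boats are discarded before grouping.
color_in_where = conn.execute(
    """
    SELECT B.bid, COUNT(*) AS s_count
    FROM Sailors S, Boats B, Reserves R
    WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
    GROUP BY B.bid
    ORDER BY B.bid
    """
).fetchall()

# Color test in HAVING: legal once B.color is itself a grouping column.
color_in_having = conn.execute(
    """
    SELECT B.bid, COUNT(*) AS s_count
    FROM Sailors S, Boats B, Reserves R
    WHERE S.sid = R.sid AND R.bid = B.bid
    GROUP BY B.bid, B.color
    HAVING B.color = 'red'
    ORDER BY B.bid
    """
).fetchall()

print(color_in_where)
```

Because bid determines color, adding B.color to the grouping list does not change the groups. The WHERE version is usually preferable anyway, since it discards non-red boats before any grouping work is done.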

29 Find age of the youngest sailor with age > 18, for each rating with at least 2 sailors (of any age)
Hint: The HAVING clause can also contain a subquery.
SELECT S.rating, MIN (S.age) FROM Sailors S WHERE S.age > 18 GROUP BY S.rating HAVING 1 < (SELECT COUNT (*) FROM Sailors S2 WHERE S2.rating = S.rating)

30 Find age of the youngest sailor, for each rating with at least 2 sailors with age over 18;
SELECT S.rating, MIN (S.age) FROM Sailors S GROUP BY S.rating HAVING 1 < (SELECT COUNT (*) FROM Sailors S2 WHERE S2.age > 18 AND S.rating = S2.rating)
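The two HAVING-with-subquery variants differ only in where the age test sits, and running them on the same data shows the effect. A minimal sketch, assuming SQLite via Python's sqlite3 module and an illustrative Sailors instance:

```python
import sqlite3

# Illustrative Sailors instance (an assumption, not taken from the slides).
SAILORS = [
    (22, "Dustin", 7, 45.0), (29, "Brutus", 1, 33.0), (31, "Lubber", 8, 55.5),
    (32, "Andy", 8, 25.5), (58, "Rusty", 10, 35.0), (64, "Horatio", 7, 35.0),
    (71, "Zorba", 10, 16.0), (74, "Horatio", 9, 35.0), (85, "Art", 3, 25.5),
    (95, "Bob", 3, 63.5), (96, "Frodo", 3, 25.5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)"
)
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", SAILORS)

# Youngest sailor over 18, for ratings with at least 2 sailors of any age:
# the age test filters tuples, the subquery counts all sailors of the rating.
any_age_count = conn.execute(
    """
    SELECT S.rating, MIN(S.age)
    FROM Sailors S
    WHERE S.age > 18
    GROUP BY S.rating
    HAVING 1 < (SELECT COUNT(*) FROM Sailors S2 WHERE S2.rating = S.rating)
    ORDER BY S.rating
    """
).fetchall()

# Youngest sailor of any age, for ratings with at least 2 sailors over 18:
# the age test now lives only inside the counting subquery.
over_18_count = conn.execute(
    """
    SELECT S.rating, MIN(S.age)
    FROM Sailors S
    GROUP BY S.rating
    HAVING 1 < (SELECT COUNT(*) FROM Sailors S2
                WHERE S2.age > 18 AND S.rating = S2.rating)
    ORDER BY S.rating
    """
).fetchall()

print(any_age_count)
print(over_18_count)
```

In this instance rating 10 has two sailors but only one of them is over 18, so it appears in the first answer and not in the second.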

31 Aggregation with nesting

32 Find those ratings for which the average age is the minimum over all ratings
SELECT S.rating FROM Sailors S WHERE S.age = (SELECT MIN (AVG (S2.age)) FROM Sailors S2)
The above query has a problem! What? Aggregate operations cannot be nested!

33 Find those ratings for which the average age is the minimum over all ratings
Correct solution (in SQL/92):
SELECT Temp.rating, Temp.avg_age FROM (SELECT S.rating, AVG (S.age) AS avg_age FROM Sailors S GROUP BY S.rating) AS Temp WHERE Temp.avg_age = (SELECT MIN (Temp.avg_age) FROM Temp)
Note: Not all SQL engines accept the "AS" keyword when naming a temporary relation; on those, just drop "AS". Many engines also reject the reference to Temp inside the nested subquery; on those, define Temp once as a view or in a WITH clause.
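Many engines do not let the nested subquery refer to the derived-table alias Temp, so a common workaround is to name the intermediate result once in a WITH clause (a common table expression). The sketch below is that variant, not the SQL/92 formulation from the slide; it assumes SQLite via Python's sqlite3 module, an illustrative Sailors instance, and the hypothetical alias T2 for the second reference to Temp.

```python
import sqlite3

# Illustrative Sailors instance (an assumption, not taken from the slides).
SAILORS = [
    (22, "Dustin", 7, 45.0), (29, "Brutus", 1, 33.0), (31, "Lubber", 8, 55.5),
    (32, "Andy", 8, 25.5), (58, "Rusty", 10, 35.0), (64, "Horatio", 7, 35.0),
    (71, "Zorba", 10, 16.0), (74, "Horatio", 9, 35.0), (85, "Art", 3, 25.5),
    (95, "Bob", 3, 63.5), (96, "Frodo", 3, 25.5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)"
)
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", SAILORS)

# Step 1 (Temp): average age per rating.
# Step 2: keep the rating(s) whose average equals the minimum average.
# The two aggregates are applied in separate query blocks, never nested.
min_avg_ratings = conn.execute(
    """
    WITH Temp AS (
        SELECT S.rating, AVG(S.age) AS avg_age
        FROM Sailors S
        GROUP BY S.rating
    )
    SELECT Temp.rating, Temp.avg_age
    FROM Temp
    WHERE Temp.avg_age = (SELECT MIN(T2.avg_age) FROM Temp T2)
    """
).fetchall()

print(min_avg_ratings)
```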

34 Special Joins

35 Outer Joins : Special Operators
Left Outer Join, Right Outer Join, and Full Outer Join.
SELECT S.sid, R.bid FROM Sailors S LEFT OUTER JOIN Reserves R ON S.sid = R.sid
SELECT S.sid, R.bid FROM Sailors S NATURAL LEFT OUTER JOIN Reserves R
In a left outer join, Sailors rows (left) without a matching Reserves row (right) appear in the result, but not vice versa.
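The effect of a left outer join is easiest to see with a sailor who has no reservations. A minimal sketch, assuming SQLite via Python's sqlite3 module and hypothetical rows; the join condition is written with ON, and the NATURAL form joins on sid because it is the only column name the two tables share.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
    """
)
# Hypothetical data: sailor 31 has made no reservation.
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.0)],
)
conn.executemany(
    "INSERT INTO Reserves VALUES (?, ?, ?)",
    [(22, 101, "1998-10-10"), (22, 102, "1998-10-10"), (58, 103, "1998-11-12")],
)

# Explicit join condition.
with_on = conn.execute(
    """
    SELECT S.sid, R.bid
    FROM Sailors S LEFT OUTER JOIN Reserves R ON S.sid = R.sid
    ORDER BY S.sid, R.bid
    """
).fetchall()

# Natural join: the condition is implied by the shared column name sid.
natural = conn.execute(
    """
    SELECT S.sid, R.bid
    FROM Sailors S NATURAL LEFT OUTER JOIN Reserves R
    ORDER BY S.sid, R.bid
    """
).fetchall()

print(with_on)
```

Sailor 31 appears once, paired with a null bid (None in Python); an inner join would have dropped that row.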

36 Null Values

37 Null Values
Field values in a tuple are sometimes:
Unknown (e.g., a rating has not been assigned), or
Inapplicable (e.g., no maiden-name when male).
SQL provides a special value null for such situations. The presence of null complicates many issues.

38 Allowing Null Values
SQL provides the special operators IS NULL and IS NOT NULL to check whether a value is or is not null.
Disallow null values: rating INTEGER NOT NULL
We need a 3-valued logic: a condition can be true, false, or unknown.

39 Working with NULL values
Question: Predicate (S.rating = 8) can be TRUE or FALSE. What if the S.rating value is null?
Comparison operators on NULL return UNKNOWN.
Recall: A row is returned only if the WHERE clause evaluates to TRUE (not FALSE and not UNKNOWN).

40 Working with NULL values
Question : Arithmetic expression (S.rating + 8) usually is an INT. What if S.rating value is a null value? Arithmetic operations on NULL return NULL.
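Both behaviours, comparisons with null yielding UNKNOWN and arithmetic on null yielding null, can be observed directly. A minimal sketch, assuming SQLite via Python's sqlite3 module and three hypothetical sailors, one of whom has no rating assigned:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER)")
# Hypothetical rows; sailor 3 has a null rating (None in Python).
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?)",
    [(1, "Ann", 8), (2, "Ben", 5), (3, "Cy", None)],
)

def sids(where):
    """Return the sids of the rows for which the WHERE condition is TRUE."""
    query = f"SELECT S.sid FROM Sailors S WHERE {where} ORDER BY S.sid"
    return [row[0] for row in conn.execute(query)]

equal_8 = sids("S.rating = 8")       # UNKNOWN for sailor 3, so not returned
not_equal_8 = sids("S.rating <> 8")  # also UNKNOWN for sailor 3
is_null = sids("S.rating IS NULL")   # the only test that finds sailor 3

# Arithmetic on a null value yields null.
sums = conn.execute("SELECT S.sid, S.rating + 8 FROM Sailors S ORDER BY S.sid").fetchall()

print(equal_8, not_equal_8, is_null, sums)
```

Sailor 3 is returned by neither the = 8 nor the <> 8 query, which is why a condition and its negation do not always cover every row once nulls are present.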

41 Truth table with UNKNOWN
WHERE clause is satisfied only when it evaluates to TRUE.
UNKNOWN AND TRUE = UNKNOWN
UNKNOWN OR TRUE = TRUE
UNKNOWN AND FALSE = FALSE
UNKNOWN OR FALSE = UNKNOWN
UNKNOWN AND UNKNOWN = UNKNOWN
UNKNOWN OR UNKNOWN = UNKNOWN
NOT UNKNOWN = UNKNOWN
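The truth table can be checked against an engine. The sketch below assumes SQLite via Python's sqlite3 module, where TRUE is represented as 1, FALSE as 0, and UNKNOWN as NULL (None in Python); other engines may display the three values differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each expression mirrors one row of the truth table, with NULL standing
# in for UNKNOWN, 1 for TRUE, and 0 for FALSE.
expressions = [
    "NULL AND 1",
    "NULL OR 1",
    "NULL AND 0",
    "NULL OR 0",
    "NULL AND NULL",
    "NULL OR NULL",
    "NOT (NULL)",
]

truth_table = {
    expr: conn.execute(f"SELECT {expr}").fetchone()[0] for expr in expressions
}

for expr, value in truth_table.items():
    print(f"{expr:15} -> {value}")
```

The two rows that do not come out UNKNOWN are the ones where the other operand alone decides the result: anything OR TRUE is TRUE, and anything AND FALSE is FALSE.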

42 Summary: To Recap
SQL is an easy-to-understand language, yet very powerful for expressing complex requests.
SQL clauses: nested subqueries; aggregation, GROUP BY, and HAVING; special joins; handling NULLs.
There are many alternative ways to write the same query; an optimizer is required to find an efficient evaluation plan.
In practice, users should be aware of how queries are optimized and evaluated for best results.

