Copyright © 2003-2008 Curt Hill Queries in SQL More options.

1 Queries in SQL: More options

2 Duplicates
A select usually joins several tables, creating large unique tuples.
The temporary table has an unspecified key.
If the select removes portions of the key, then duplicates can occur.
Consider the query that links faculty to the students taking any of their classes.

3 The query
SELECT f_name, s_name
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid
  AND ct_dept = g_dept
  AND ct_number = g_course
  AND s_id = g_naid
This produces 238 rows.
What is the key?

4 The Key
The key does not need to be specified.
In this case it is the linking fields:
– f_naid (or ct_naid)
– ct_dept
– ct_number
– s_id
Since some of these fields are removed by the Select, duplicates occur.

5 Removing duplicates
In this query duplicates occur when a student takes multiple classes from the same teacher.
The result is not a set (which has no duplicates) but a multiset (which allows duplicates).
Placing the reserved word DISTINCT immediately after the Select removes these.
The new query follows:

6 Revised query
SELECT DISTINCT f_name, s_name
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid
  AND ct_dept = g_dept
  AND ct_number = g_course
  AND s_id = g_naid
This produces 213 rows.
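
The effect of DISTINCT can be reproduced on a toy database. The sketch below is only an illustration: it uses Python's built-in sqlite3 module, keeps the table and column names from the slides, and invents a handful of rows (the real course data is not part of this deck), so the row counts differ from the 238 and 213 quoted above.

```python
import sqlite3

# Hypothetical toy data: Ann takes two classes from Hill, so the pair
# (Hill, Ann) appears twice once the class columns are projected away.
SETUP = """
CREATE TABLE faculty  (f_naid INTEGER, f_name TEXT);
CREATE TABLE c_teach  (ct_naid INTEGER, ct_dept TEXT, ct_number INTEGER);
CREATE TABLE students (s_id INTEGER, s_name TEXT);
CREATE TABLE grades   (g_naid INTEGER, g_dept TEXT, g_course INTEGER);
INSERT INTO faculty  VALUES (1, 'Hill'), (2, 'Smith');
INSERT INTO c_teach  VALUES (1, 'CS', 101), (1, 'CS', 201), (2, 'MA', 110);
INSERT INTO students VALUES (10, 'Ann'), (11, 'Bob'), (12, 'Cy');
INSERT INTO grades   VALUES (10, 'CS', 101), (10, 'CS', 201),
                            (11, 'CS', 101), (11, 'MA', 110), (12, 'MA', 110);
"""

# The FROM/WHERE part shared by both queries, as on the slides.
JOIN = """
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid AND ct_dept = g_dept
  AND ct_number = g_course AND s_id = g_naid
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SETUP)

plain = conn.execute("SELECT f_name, s_name" + JOIN).fetchall()
distinct = conn.execute("SELECT DISTINCT f_name, s_name" + JOIN).fetchall()

print(len(plain), "rows without DISTINCT")
print(len(distinct), "rows with DISTINCT")
```

With this toy data the only duplicated pair is (Hill, Ann), which is exactly the case of one student taking several classes from the same teacher.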

7 How does this work?
Removing duplicates is not trivial.
There are several ways, but all of them take work.
One possibility is to sort the tuples.
– Duplicates must then be adjacent
Another is to hash them.
– Duplicates have the same hash key
Small queries can be done in memory; larger ones cannot.
We will consider sorting and hashing later.
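
The two strategies can be sketched in a few lines of Python. This is an in-memory illustration of the idea only, not how any particular DBMS implements DISTINCT; the function names and the sample rows are made up for the example.

```python
def dedup_by_sort(rows):
    """Sort the tuples; duplicates are then adjacent and easy to skip."""
    result = []
    for row in sorted(rows):
        if not result or result[-1] != row:
            result.append(row)
    return result


def dedup_by_hash(rows):
    """Hash each tuple; a duplicate lands on a key that is already present."""
    seen = set()
    result = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            result.append(row)
    return result


rows = [("Hill", "Ann"), ("Hill", "Bob"), ("Hill", "Ann"), ("Smith", "Bob")]
print(dedup_by_sort(rows))
print(dedup_by_hash(rows))
```

The sort-based version returns its result in sorted order as a side effect, while the hash-based version keeps the order of first appearance. Both assume the rows fit in memory, which is the small-query case mentioned above.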

8 Deception
The difference between the two queries is just one keyword.
That keyword forces the DBMS to do substantial extra work.
It looks like no big deal, but it actually is.
Hence the query is deceptively different.
Even so, make the database do its job.

9 All
The opposite of Distinct is All.
It specifies that duplicates should not be eliminated.
Since elimination is expensive, it is not done unless requested, so All is the default.
– Thus a query gives the same result whether All is present or absent
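
A quick check that ALL changes nothing, sketched with Python's sqlite3 module. The grades table here is a cut-down, hypothetical version with two columns and three invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grades (g_naid INTEGER, g_dept TEXT)")
# Hypothetical rows; the first two are exact duplicates.
conn.executemany(
    "INSERT INTO grades VALUES (?, ?)",
    [(10, "CS"), (10, "CS"), (11, "MA")],
)

default_rows = conn.execute("SELECT g_naid, g_dept FROM grades").fetchall()
all_rows = conn.execute("SELECT ALL g_naid, g_dept FROM grades").fetchall()
distinct_rows = conn.execute("SELECT DISTINCT g_naid, g_dept FROM grades").fetchall()

# ALL is the default, so the first two results hold the same rows.
print(sorted(default_rows) == sorted(all_rows))
print(len(all_rows), len(distinct_rows))
```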

10 Order
The order of the output table depends on many unpredictable things.
Different DBMSs may give different orderings, even with the same data.
– The order depends on how they process the data
The order of the above queries is different on Oracle and MySQL.
Worse yet, neither will put all the students of one faculty member together.

11 Order by clause
Order by follows the Where (and the Group By and Having, when present).
It specifies a sort order for the output.
It may specify one or more fields.
Fields do not have to be displayed.

12 Sorted query 1
SELECT DISTINCT f_name, s_name
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid
  AND ct_dept = g_dept
  AND ct_number = g_course
  AND s_id = g_naid
ORDER BY f_name, s_name

13 Sorting
The default behavior is to sort:
– In a case-sensitive way (in practice this depends on the collation in use)
– In ascending order (lowest to highest)
Usually we sort on the displayed values.
– Oracle and SQL Server require this when the query uses DISTINCT
– MySQL has traditionally allowed sorts on other fields even then

14 Sorted query 2
SELECT DISTINCT f_name, s_name
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid
  AND ct_dept = g_dept
  AND ct_number = g_course
  AND s_id = g_naid
ORDER BY f_naid, s_id

15 Sort Order
The default is to sort in ascending order for all sort keys.
Each key may be followed by ASC or DESC.
– ASC gives ascending order
– DESC gives descending order
These keywords may not be spelled out in full.
If neither is given, ASC is the default.

16 Sorted query 3
SELECT DISTINCT f_name, s_name
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid
  AND ct_dept = g_dept
  AND ct_number = g_course
  AND s_id = g_naid
ORDER BY f_name DESC, s_name ASC
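
Sorted query 3 can be tried on a small database. The sketch below assumes a toy data set (all rows are invented) and uses SQLite through Python's sqlite3 module, so it shows the mixed DESC and ASC keys at work rather than the real 213-row result.

```python
import sqlite3

# Hypothetical toy data, using the table and column names from the slides.
SETUP = """
CREATE TABLE faculty  (f_naid INTEGER, f_name TEXT);
CREATE TABLE c_teach  (ct_naid INTEGER, ct_dept TEXT, ct_number INTEGER);
CREATE TABLE students (s_id INTEGER, s_name TEXT);
CREATE TABLE grades   (g_naid INTEGER, g_dept TEXT, g_course INTEGER);
INSERT INTO faculty  VALUES (1, 'Hill'), (2, 'Smith');
INSERT INTO c_teach  VALUES (1, 'CS', 101), (1, 'CS', 201), (2, 'MA', 110);
INSERT INTO students VALUES (10, 'Ann'), (11, 'Bob'), (12, 'Cy');
INSERT INTO grades   VALUES (10, 'CS', 101), (10, 'CS', 201),
                            (11, 'CS', 101), (11, 'MA', 110), (12, 'MA', 110);
"""

QUERY = """
SELECT DISTINCT f_name, s_name
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid AND ct_dept = g_dept
  AND ct_number = g_course AND s_id = g_naid
ORDER BY f_name DESC, s_name ASC
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SETUP)

# Faculty names run from high to low; within one faculty member the
# student names run from low to high.
sorted_rows = conn.execute(QUERY).fetchall()
for f_name, s_name in sorted_rows:
    print(f_name, s_name)
```

Because the sort keys are listed explicitly, all the students of one faculty member now come out together.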

17 Aggregate operations
We can collapse several rows into one.
This produces a summary report.
Several rows of the table become one row of output.
This uses the Group By clause with aggregate functions.
The Group By follows the Where.
Aggregate functions appear in the Select.

18 Group By and Aggregate functions
Each of these aggregate functions takes a field:
– Count
– Avg
– Sum
– Max
– Min
They are usually, but not always, used with Group By.
Group By follows the Where.
It specifies the groups as changes in the listed fields.

19 Grouped Query 1
SELECT f_name, count(s_name)
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid
  AND ct_dept = g_dept
  AND ct_number = g_course
  AND s_id = g_naid
GROUP BY f_name
This produces 16 rows.

20 Commentary
Group By typically forces a sort.
This is a simple means to ensure that the rows of each group are together; hashing is another.
The DISTINCT keyword may be used within aggregate functions:
– Count
– Avg
– Sum

21 Grouped Query 2
SELECT f_name, count(DISTINCT s_name)
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid
  AND ct_dept = g_dept
  AND ct_number = g_course
  AND s_id = g_naid
GROUP BY f_name
This produces 16 rows, but with different counts.
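
The difference between the two grouped queries shows up even on a tiny data set. The sketch below assumes invented rows in which Ann takes two classes from Hill, and runs both counts through Python's sqlite3 module. An ORDER BY is added only to make the printed output predictable.

```python
import sqlite3

# Hypothetical toy data, using the table and column names from the slides.
SETUP = """
CREATE TABLE faculty  (f_naid INTEGER, f_name TEXT);
CREATE TABLE c_teach  (ct_naid INTEGER, ct_dept TEXT, ct_number INTEGER);
CREATE TABLE students (s_id INTEGER, s_name TEXT);
CREATE TABLE grades   (g_naid INTEGER, g_dept TEXT, g_course INTEGER);
INSERT INTO faculty  VALUES (1, 'Hill'), (2, 'Smith');
INSERT INTO c_teach  VALUES (1, 'CS', 101), (1, 'CS', 201), (2, 'MA', 110);
INSERT INTO students VALUES (10, 'Ann'), (11, 'Bob'), (12, 'Cy');
INSERT INTO grades   VALUES (10, 'CS', 101), (10, 'CS', 201),
                            (11, 'CS', 101), (11, 'MA', 110), (12, 'MA', 110);
"""

JOIN = """
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid AND ct_dept = g_dept
  AND ct_number = g_course AND s_id = g_naid
GROUP BY f_name
ORDER BY f_name
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SETUP)

# count(s_name) counts one per enrollment row in the group.
enrollments = conn.execute("SELECT f_name, count(s_name)" + JOIN).fetchall()
# count(DISTINCT s_name) counts each student name once per group.
head_count = conn.execute("SELECT f_name, count(DISTINCT s_name)" + JOIN).fetchall()

print(enrollments)
print(head_count)
```

Both queries return one row per faculty member, but Hill's count drops from 3 to 2 once Ann is counted only once.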

22 Secondary Selection
The Where does an initial selection.
– It eliminates numerous combinations of tuples that are of no interest
We may also wish to remove aggregated rows.
This must occur after the Where and the grouping, but before the final table.
This is done with the HAVING clause of the GROUP BY.

23 Having
The Having clause follows the Group By fields.
It gives a selection criterion for the grouped rows.
It is usually based upon the aggregate functions.
Form: HAVING comparison
See the following query.

24 Grouped Query 3
SELECT f_name, count(DISTINCT s_name)
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid
  AND ct_dept = g_dept
  AND ct_number = g_course
  AND s_id = g_naid
GROUP BY f_name
HAVING count(*) > 10

25 Commentary
This produces 9 rows.
Notice that * is the parameter of count.
Other aggregate functions could be used as well.
A Having without a Group By treats the whole result as a single group; some DBMSs, such as MySQL, let it act like a Where.
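
A Having filter can be sketched on toy data as follows. The rows are invented and the threshold is lowered from 10 to 2 so that the small data set still has a group on each side of the cut; everything else follows Grouped Query 3.

```python
import sqlite3

# Hypothetical toy data, using the table and column names from the slides.
SETUP = """
CREATE TABLE faculty  (f_naid INTEGER, f_name TEXT);
CREATE TABLE c_teach  (ct_naid INTEGER, ct_dept TEXT, ct_number INTEGER);
CREATE TABLE students (s_id INTEGER, s_name TEXT);
CREATE TABLE grades   (g_naid INTEGER, g_dept TEXT, g_course INTEGER);
INSERT INTO faculty  VALUES (1, 'Hill'), (2, 'Smith');
INSERT INTO c_teach  VALUES (1, 'CS', 101), (1, 'CS', 201), (2, 'MA', 110);
INSERT INTO students VALUES (10, 'Ann'), (11, 'Bob'), (12, 'Cy');
INSERT INTO grades   VALUES (10, 'CS', 101), (10, 'CS', 201),
                            (11, 'CS', 101), (11, 'MA', 110), (12, 'MA', 110);
"""

# Threshold of 2 is an assumption chosen to suit the toy data.
QUERY = """
SELECT f_name, count(DISTINCT s_name)
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid AND ct_dept = g_dept
  AND ct_number = g_course AND s_id = g_naid
GROUP BY f_name
HAVING count(*) > 2
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SETUP)

# Hill's group has 3 joined rows and survives; Smith's has 2 and is dropped.
busy_faculty = conn.execute(QUERY).fetchall()
print(busy_faculty)
```

Note that the Having tests count(*), the number of joined rows in the group, while the Select reports count(DISTINCT s_name). The two aggregates do not have to match.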

26 Ungrouped Query
Suppose we just want a count or a sum.
Then we can use an aggregate function without Group By.
This will generally collapse the entire table into a single row.
Consider the next screen.

27 Aggregates
Counting rows:
SELECT count(*) FROM faculty
– Results in one row with a count of 19
Sum of student balances:
SELECT sum(s_balance) FROM students
– Results in one row with the sum: 93240.34
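
Both ungrouped aggregates can be checked with a few invented rows. The sketch below assumes two-column versions of the faculty and students tables and uses Python's sqlite3 module, so the totals are those of the toy data, not the 19 and 93240.34 above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical cut-down tables and rows for illustration only.
conn.executescript("""
CREATE TABLE faculty  (f_naid INTEGER, f_name TEXT);
CREATE TABLE students (s_id INTEGER, s_name TEXT, s_balance REAL);
INSERT INTO faculty  VALUES (1, 'Hill'), (2, 'Smith');
INSERT INTO students VALUES (10, 'Ann', 100.0), (11, 'Bob', 50.5), (12, 'Cy', 0.0);
""")

# Without GROUP BY the whole table is one group, so each query
# returns exactly one row.
count_rows = conn.execute("SELECT count(*) FROM faculty").fetchall()
sum_rows = conn.execute("SELECT sum(s_balance) FROM students").fetchall()

print(count_rows)
print(sum_rows)
```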

28 Variations
Recall this query:
SELECT f_name, count(DISTINCT s_name) … GROUP BY f_name
Suppose f_naid were included in the Select:
SELECT f_name, f_naid, …
In Oracle and SQL Server it would also have to be part of the Group By.
– But not in MySQL, unless ONLY_FULL_GROUP_BY is enabled (the default since MySQL 5.7.5)

29 Bad Oracle Query
SELECT f_name, f_naid, count(DISTINCT s_name)
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid
  AND ct_dept = g_dept
  AND ct_number = g_course
  AND s_id = g_naid
GROUP BY f_name
– Receives an error: ORA-00979: not a GROUP BY expression
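
The portable repair is to list every non-aggregated Select field in the Group By. The sketch below shows that form on invented toy data. It runs on SQLite through Python's sqlite3 module, chosen only because it is easy to run; SQLite, like older MySQL, would also accept the bad query, so it cannot reproduce the ORA-00979 error itself.

```python
import sqlite3

# Hypothetical toy data, using the table and column names from the slides.
SETUP = """
CREATE TABLE faculty  (f_naid INTEGER, f_name TEXT);
CREATE TABLE c_teach  (ct_naid INTEGER, ct_dept TEXT, ct_number INTEGER);
CREATE TABLE students (s_id INTEGER, s_name TEXT);
CREATE TABLE grades   (g_naid INTEGER, g_dept TEXT, g_course INTEGER);
INSERT INTO faculty  VALUES (1, 'Hill'), (2, 'Smith');
INSERT INTO c_teach  VALUES (1, 'CS', 101), (1, 'CS', 201), (2, 'MA', 110);
INSERT INTO students VALUES (10, 'Ann'), (11, 'Bob'), (12, 'Cy');
INSERT INTO grades   VALUES (10, 'CS', 101), (10, 'CS', 201),
                            (11, 'CS', 101), (11, 'MA', 110), (12, 'MA', 110);
"""

# f_naid appears in both the SELECT list and the GROUP BY list.
QUERY = """
SELECT f_name, f_naid, count(DISTINCT s_name)
FROM faculty, c_teach, students, grades
WHERE f_naid = ct_naid AND ct_dept = g_dept
  AND ct_number = g_course AND s_id = g_naid
GROUP BY f_name, f_naid
ORDER BY f_name
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SETUP)

per_faculty = conn.execute(QUERY).fetchall()
print(per_faculty)
```

Grouping by both fields also changes the meaning slightly for the better: two faculty members who happen to share a name are no longer merged into one group.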

