Using Relational Databases and SQL Steven Emory Department of Computer Science California State University, Los Angeles Lecture 7: Subqueries and Set Operations.

1 Using Relational Databases and SQL Steven Emory Department of Computer Science California State University, Los Angeles Lecture 7: Subqueries and Set Operations

2 Topics for Today Set Operations UNION [ALL/DISTINCT] Subqueries WHERE clause HAVING clause FROM clause SELECT clause Correlated and Nested Subqueries

3 Set Operations UNION (supported by MySQL); INTERSECT (not supported by MySQL before version 8.0.31); EXCEPT/MINUS (EXCEPT not supported by MySQL before version 8.0.31; MINUS not supported by MySQL)

4 UNION Combines the results from multiple SELECT statements into a single result set

5 UNION Syntax Two tables: SELECT... UNION [DISTINCT | ALL] SELECT... Three or more tables: SELECT... UNION [DISTINCT | ALL] SELECT... UNION [DISTINCT | ALL] SELECT......

6 UNION Restrictions Each SELECT clause must contain the same number of columns -- This is an error!!!! SELECT FirstName, LastName FROM People UNION DISTINCT SELECT BirthFirstName FROM People;
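
The column-count restriction can be checked with a minimal sketch. It assumes Python's built-in sqlite3 module and an in-memory database instead of MySQL, and the People row is hypothetical sample data. SQLite has no UNION DISTINCT keyword, so plain UNION (which also removes duplicates) stands in for it.

```python
import sqlite3

# Hypothetical sample data in an in-memory SQLite database (not MySQL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (FirstName TEXT, LastName TEXT, BirthFirstName TEXT)")
conn.execute("INSERT INTO People VALUES ('Ann', 'Lee', 'Anna')")

# Two columns on the left, one on the right: the statement is rejected.
try:
    conn.execute(
        "SELECT FirstName, LastName FROM People "
        "UNION SELECT BirthFirstName FROM People"
    )
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

# Same number of columns on both sides: the statement is accepted.
rows = conn.execute(
    "SELECT FirstName, LastName FROM People "
    "UNION SELECT BirthFirstName, LastName FROM People"
).fetchall()

print(error_message)
print(sorted(rows))
```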

7 UNION Example UNION can also be used to append aggregate information. For each gender, list the number of accounts held by members of that gender and append a total member count to the bottom of the list.

8 UNION Example Solution: SELECT Gender, COUNT(Gender) FROM Members GROUP BY Gender UNION ALL SELECT 'Total', COUNT(Gender) FROM Members;

9 UNION and Duplicates By default, the UNION keyword alone removes duplicates (UNION DISTINCT is the default) To remove duplicates explicitly, use: UNION DISTINCT To keep duplicates use: UNION ALL Example: SELECT FirstName FROM People UNION ALL SELECT BirthFirstName FROM People;
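
A small sketch of the difference between removing and keeping duplicates, assuming Python's built-in sqlite3 module rather than MySQL and two hypothetical People rows. Plain UNION is used for the duplicate-removing form because SQLite does not accept the DISTINCT keyword after UNION.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (FirstName TEXT, BirthFirstName TEXT)")
# Hypothetical rows: 'Ann' appears as both a first name and a birth first name.
conn.executemany(
    "INSERT INTO People VALUES (?, ?)",
    [("Ann", "Ann"), ("Bob", "Robert")],
)

# UNION removes duplicate rows from the combined result.
distinct_rows = conn.execute(
    "SELECT FirstName FROM People UNION SELECT BirthFirstName FROM People"
).fetchall()

# UNION ALL keeps every row from both SELECT statements.
all_rows = conn.execute(
    "SELECT FirstName FROM People UNION ALL SELECT BirthFirstName FROM People"
).fetchall()

print(sorted(distinct_rows))  # 'Ann' listed once
print(len(all_rows))          # every row from both SELECTs is kept
```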

10 UNION and Sorting To order all results in a UNION query, use a single ORDER BY clause that orders on one or more column names or aliases from the first SELECT. Note: Referring to a column by its table-qualified name (table.column) in a UNION's ORDER BY clause will generate an error. To avoid confusion, it is common to surround the individual SELECT queries in a UNION with parentheses.

11 UNION and Sorting Examples: (SELECT FirstName FROM People) UNION DISTINCT (SELECT BirthFirstName FROM People) ORDER BY FirstName; (SELECT FirstName AS Names FROM People) UNION DISTINCT (SELECT BirthFirstName FROM People) ORDER BY Names;

12 UNION and Sorting How about sorting individual tables? Given A UNION B, is it possible to have A sorted and B sorted, with A’s rows always on top of B’s rows? Given a UNION of two tables A and B, SQL does not guarantee anything about the order of the results (it will mix up the rows in A and B) To make sure table A’s rows always come before table B’s rows, you can add an extra sort column

13 UNION and Sorting Does this work? NO! (SELECT FirstName FROM People ORDER BY FirstName) UNION DISTINCT (SELECT BirthFirstName FROM People ORDER BY BirthFirstName);

14 UNION and Sorting Does this work? YES! (SELECT 1 AS sort_col, FirstName AS first_name FROM People) UNION DISTINCT (SELECT 2, BirthFirstName FROM People) ORDER BY sort_col, first_name;

15 UNION and Sorting Example as on slides 7 – 8 should also be rewritten so that the order is displayed correctly (SELECT 1 AS S, Gender AS G, COUNT(Gender) FROM Members GROUP BY Gender) UNION ALL (SELECT 2, 'Total', COUNT(Gender) FROM Members) ORDER BY S, G DESC;
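
The extra sort column can be tried out with the following sketch. It assumes Python's built-in sqlite3 module and three hypothetical Members rows. SQLite does not allow parentheses around the individual SELECT statements of a UNION, so they are omitted here, and the count column alias N is added only for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Members (Gender TEXT)")
# Hypothetical members: two male, one female.
conn.executemany("INSERT INTO Members VALUES (?)", [("M",), ("M",), ("F",)])

# S is the extra sort column: 1 for the per-gender rows, 2 for the total row,
# so the total always sorts to the bottom.
rows = conn.execute(
    """
    SELECT 1 AS S, Gender AS G, COUNT(Gender) AS N FROM Members GROUP BY Gender
    UNION ALL
    SELECT 2, 'Total', COUNT(Gender) FROM Members
    ORDER BY S, G DESC
    """
).fetchall()

for row in rows:
    print(row)
```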

16 Subqueries Subqueries are queries within queries. Also called inner queries. A query that contains a subquery is called an outer query. A subquery must be surrounded by parentheses.

17 Subquery Example Example (the subquery is the parenthesized SELECT): -- List all the names of all people who are not actors. SELECT FirstName, LastName FROM People WHERE PersonID NOT IN (SELECT ActorID FROM XRefActorsMovies);

18 When to Use Subqueries Use a subquery when: it is impossible to solve the problem using a single query; or a subquery solution to the problem runs faster than an equivalent non-subquery solution (rare with the current version of MySQL).

19 Subqueries are UGLY Example: SELECT Z.Type1, Z.Type2, CONCAT('$', TRUNCATE(Z.AvgPriceDifference, 2)) AS MaxAvgPriceDifference FROM (SELECT X.type AS Type1, Y.type AS Type2, ABS(X.AveragePrice - Y.AveragePrice) AS AvgPriceDifference FROM (SELECT type, AVG(price) AS AveragePrice FROM titles GROUP BY type) X JOIN (SELECT type, AVG(price) AS AveragePrice FROM titles GROUP BY type) Y WHERE X.type <> Y.type AND STRCMP(X.type, Y.type) < 0 [...] X.type <> Y.type AND STRCMP(X.type, Y.type) < 0) U)) Z;

20 Types of Subqueries Single Value Subqueries: the subquery returns a single value (one column, one row). List Subqueries: the subquery returns a list (one column, multiple rows). Table Subqueries: the subquery returns a table (multiple columns and rows).

21 How to Solve Subquery Problems To solve subquery problems, always think substitution: Analyze the question, looking for subqueries within the question. Replace subqueries in the original question with substitution variables such as X, Y, and Z. Write queries for your substitution variables. Write a query that solves the original question using your substitution variables. Replace substitution variables with your subqueries.

22 WHERE Clause Subqueries Use a subquery in the WHERE clause when you want to filter records from the outer query using a single value or list of values returned from one or more subqueries. Single value subqueries are OK. List subqueries are OK. Table subqueries are NOT OK. Do not use a table subquery directly in a WHERE clause.

23 WHERE Clause Subquery Example Example #1: -- List all movie titles produced by Paramount Pictures or Twentieth Century-Fox. Do not use a join and do not hard-code company IDs.

24 WHERE Clause Subquery Example Outer and Inner Queries: The outer query... SELECT Title FROM Movies WHERE CompanyID = (X) OR CompanyID = (Y); Inner query X... SELECT CompanyID FROM Companies WHERE Name = 'Paramount Pictures'; Inner query Y... SELECT CompanyID FROM Companies WHERE Name = 'Twentieth Century-Fox';

25 WHERE Clause Subquery Example Solution: SELECT Title FROM Movies WHERE CompanyID = (SELECT CompanyID FROM Companies WHERE Name = 'Paramount Pictures') OR CompanyID = (SELECT CompanyID FROM Companies WHERE Name = 'Twentieth Century-Fox');

26 WHERE Clause Subquery Example Example #2: -- List all movie titles with a runtime greater than the average runtime of all movies.

27 WHERE Clause Subquery Example Solution: SELECT Title FROM Movies WHERE Runtime > (SELECT AVG(Runtime) FROM Movies);
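
A minimal sketch of a single-value subquery in the WHERE clause, assuming Python's built-in sqlite3 module and three hypothetical Movies rows. The inner query is run on its own first, which only works because it does not depend on the outer query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (Title TEXT, Runtime INTEGER)")
# Hypothetical runtimes; their average is 120.
conn.executemany(
    "INSERT INTO Movies VALUES (?, ?)",
    [("Movie A", 90), ("Movie B", 120), ("Movie C", 150)],
)

# The inner query returns one column and one row, so it can be used as a value.
average = conn.execute("SELECT AVG(Runtime) FROM Movies").fetchone()[0]

# Substituting the inner query into the outer query gives the solution.
titles = [
    row[0]
    for row in conn.execute(
        "SELECT Title FROM Movies "
        "WHERE Runtime > (SELECT AVG(Runtime) FROM Movies)"
    )
]

print(average, titles)
```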

28 IN and NOT IN Use the IN keyword to test if an expression matches any items in a list (typically returned by a subquery). Syntax: expression IN (list subquery); expression NOT IN (list subquery)

29 IN Example Example: -- List the names of all actors (do not use a join).

30 IN Example Solution: The outer query... SELECT FirstName, LastName FROM People WHERE PersonID IN (X); The inner query... SELECT ActorID FROM XRefActorsMovies; Substitute to get the solution... SELECT FirstName, LastName FROM People WHERE PersonID IN (SELECT ActorID FROM XRefActorsMovies);
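
IN and NOT IN against a list subquery can be sketched as follows, assuming Python's built-in sqlite3 module. The table and column names follow the slides; the rows and the MovieID column of XRefActorsMovies are hypothetical. An ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (PersonID INTEGER, FirstName TEXT, LastName TEXT)")
conn.execute("CREATE TABLE XRefActorsMovies (ActorID INTEGER, MovieID INTEGER)")
# Hypothetical data: persons 1 and 2 act in movies, person 3 does not.
conn.executemany(
    "INSERT INTO People VALUES (?, ?, ?)",
    [(1, "Ann", "Lee"), (2, "Bob", "Ray"), (3, "Cy", "Fox")],
)
conn.executemany(
    "INSERT INTO XRefActorsMovies VALUES (?, ?)",
    [(1, 10), (1, 11), (2, 10)],
)

# The list subquery returns one column with several rows (duplicates are fine).
actors = conn.execute(
    "SELECT FirstName, LastName FROM People "
    "WHERE PersonID IN (SELECT ActorID FROM XRefActorsMovies) "
    "ORDER BY PersonID"
).fetchall()

non_actors = conn.execute(
    "SELECT FirstName, LastName FROM People "
    "WHERE PersonID NOT IN (SELECT ActorID FROM XRefActorsMovies) "
    "ORDER BY PersonID"
).fetchall()

print(actors)
print(non_actors)
```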

31 ALL and ANY ALL: The condition must hold true for all elements in the list. Syntax: expression operator ALL (list subquery). ANY: The condition must hold true for at least one element in the list. Syntax: expression operator ANY (list subquery).

32 ALL and ANY Examples Example: -- List the usernames of all members whose join dates are earlier than all of the members from Germany and Australia.

33 ALL and ANY Examples Outer and inner queries: -- Outer query... SELECT Username FROM Accounts WHERE JoinDate < ALL (X) AND JoinDate < ALL (Y); -- Inner query X... SELECT JoinDate FROM Accounts WHERE Country = 'DEU'; -- Inner query Y... SELECT JoinDate FROM Accounts WHERE Country = 'AUS';

34 ALL and ANY Examples Substitute to get final solution: SELECT Username FROM Accounts WHERE JoinDate < ALL (SELECT JoinDate FROM Accounts WHERE Country = 'DEU') AND JoinDate < ALL (SELECT JoinDate FROM Accounts WHERE Country = 'AUS');
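
SQLite, used here through Python's built-in sqlite3 module so the example is self-contained, does not implement ALL or ANY. The sketch below therefore swaps in a different technique: "less than ALL of a list" is rewritten as "less than the MIN of that list". The two forms agree only when the list is non-empty and contains no NULLs, which is an assumption of this example. The Accounts rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Accounts (Username TEXT, JoinDate TEXT, Country TEXT)")
# Hypothetical accounts; ISO dates compare correctly as text.
conn.executemany(
    "INSERT INTO Accounts VALUES (?, ?, ?)",
    [
        ("early", "2000-01-01", "USA"),
        ("mid", "2004-01-01", "USA"),
        ("late", "2010-01-01", "USA"),
        ("de1", "2005-01-01", "DEU"),
        ("au1", "2003-06-01", "AUS"),
    ],
)

# MIN-based stand-in for: JoinDate < ALL (X) AND JoinDate < ALL (Y)
usernames = [
    row[0]
    for row in conn.execute(
        """
        SELECT Username FROM Accounts
        WHERE JoinDate < (SELECT MIN(JoinDate) FROM Accounts WHERE Country = 'DEU')
          AND JoinDate < (SELECT MIN(JoinDate) FROM Accounts WHERE Country = 'AUS')
        ORDER BY Username
        """
    )
]

print(usernames)
```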

35 HAVING Clause Subqueries Like the WHERE clause, the HAVING clause can contain subqueries. Think substitution here as well.

36 HAVING Clause Subqueries Example: -- List only those countries whose number of accounts exceeds the total number of accounts from Australia.

37 HAVING Clause Subqueries Outer and Inner Queries: Outer Query: SELECT Country FROM Accounts GROUP BY Country HAVING COUNT(*) > (X); Inner Query: SELECT COUNT(*) FROM Accounts WHERE Country = 'AUS';

38 HAVING Clause Subqueries Substitute to get final solution: SELECT Country FROM Accounts GROUP BY Country HAVING COUNT(*) > (SELECT COUNT(*) FROM Accounts WHERE Country = 'AUS');
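
The HAVING solution can be run against a small hypothetical Accounts table. This is a sketch assuming Python's built-in sqlite3 module; the country codes other than DEU and AUS are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Accounts (Username TEXT, Country TEXT)")
# Hypothetical accounts: 3 from USA, 2 from AUS, 2 from CAN, 1 from DEU.
conn.executemany(
    "INSERT INTO Accounts VALUES (?, ?)",
    [
        ("u1", "USA"), ("u2", "USA"), ("u3", "USA"),
        ("a1", "AUS"), ("a2", "AUS"),
        ("c1", "CAN"), ("c2", "CAN"),
        ("d1", "DEU"),
    ],
)

# The single-value subquery supplies the threshold for each group's count.
countries = [
    row[0]
    for row in conn.execute(
        "SELECT Country FROM Accounts GROUP BY Country "
        "HAVING COUNT(*) > (SELECT COUNT(*) FROM Accounts WHERE Country = 'AUS') "
        "ORDER BY Country"
    )
]

print(countries)
```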

39 FROM Clause Subqueries FROM clause arguments are tables. You can have subqueries in the FROM clause. Always wrap your subqueries in parentheses. Always define a table alias for any table returned by a subquery in the FROM clause: FROM (subquery) alias

40 FROM Clause Subquery Usage Use a subquery in the FROM clause when you need a complex table in the FROM clause that can only be computed using a separate query (e.g. joining tables involving aggregate calculations and unions).

41 FROM Clause Subquery Example Example: -- For each movie, list the movie title and the difference between the number of males who rated the movie better than 7 and the number of females who rated the movie better than 7. For example, if 5 males rated Star Trek: Generations better than 7 and only 2 females rated Star Trek: Generations better than 7, the displayed difference should be 3.

42 FROM Clause Subquery Solution The Outer Query: SELECT X.Title, X.MaleCount - Y.FemaleCount FROM (X) X INNER JOIN (Y) Y USING(MovieID);

43 FROM Clause Subquery Solution The Inner Queries: Inner query X... SELECT MovieID, Title, SUM(Rating > 7) AS MaleCount FROM Movies LEFT JOIN Ratings USING(MovieID) LEFT JOIN Accounts USING(AccountID) WHERE Gender = 'M' GROUP BY MovieID; Inner query Y... SELECT MovieID, Title, SUM(Rating > 7) AS FemaleCount FROM Movies LEFT JOIN Ratings USING(MovieID) LEFT JOIN Accounts USING(AccountID) WHERE Gender = 'F' GROUP BY MovieID;

44 FROM Clause Subquery Solution Now substitute to get the final solution: SELECT X.Title, X.MaleCount - Y.FemaleCount FROM (SELECT MovieID, Title, SUM(Rating > 7) AS MaleCount FROM Movies LEFT JOIN Ratings USING(MovieID) LEFT JOIN Accounts USING(AccountID) WHERE Gender = 'M' GROUP BY MovieID) X INNER JOIN (SELECT MovieID, Title, SUM(Rating > 7) AS FemaleCount FROM Movies LEFT JOIN Ratings USING(MovieID) LEFT JOIN Accounts USING(AccountID) WHERE Gender = 'F' GROUP BY MovieID) Y USING(MovieID);
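
The substitution steps for this problem can be mirrored literally in a short sketch: write X and Y as strings, then substitute them into the outer query. It assumes Python's built-in sqlite3 module; the rows are hypothetical, and the Difference alias and the ORDER BY are additions for readable, predictable output. Because the inner queries filter on Gender in the WHERE clause, a movie with no ratings from one gender drops out of the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (MovieID INTEGER, Title TEXT)")
conn.execute("CREATE TABLE Accounts (AccountID INTEGER, Gender TEXT)")
conn.execute("CREATE TABLE Ratings (MovieID INTEGER, AccountID INTEGER, Rating INTEGER)")
# Hypothetical data: accounts 1-3 are male, accounts 4-5 are female.
conn.executemany("INSERT INTO Movies VALUES (?, ?)", [(1, "Movie A"), (2, "Movie B")])
conn.executemany(
    "INSERT INTO Accounts VALUES (?, ?)",
    [(1, "M"), (2, "M"), (3, "M"), (4, "F"), (5, "F")],
)
conn.executemany(
    "INSERT INTO Ratings VALUES (?, ?, ?)",
    [(1, 1, 9), (1, 2, 8), (1, 3, 6), (1, 4, 10), (1, 5, 5),
     (2, 1, 8), (2, 4, 9), (2, 5, 8)],
)

# Inner queries X and Y, written first as in the substitution method.
inner_x = (
    "SELECT MovieID, Title, SUM(Rating > 7) AS MaleCount FROM Movies "
    "LEFT JOIN Ratings USING(MovieID) LEFT JOIN Accounts USING(AccountID) "
    "WHERE Gender = 'M' GROUP BY MovieID"
)
inner_y = (
    "SELECT MovieID, Title, SUM(Rating > 7) AS FemaleCount FROM Movies "
    "LEFT JOIN Ratings USING(MovieID) LEFT JOIN Accounts USING(AccountID) "
    "WHERE Gender = 'F' GROUP BY MovieID"
)

# Outer query with X and Y substituted in; each subquery gets a table alias.
outer = (
    "SELECT X.Title, X.MaleCount - Y.FemaleCount AS Difference "
    f"FROM ({inner_x}) X INNER JOIN ({inner_y}) Y USING(MovieID) "
    "ORDER BY X.Title"
)
rows = conn.execute(outer).fetchall()
print(rows)
```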

45 Sample Problems Problems: -- List the names of all actors in the movie archive database that are older than all of the actors from the movie ‘The X Files.’

46 SELECT Clause Subqueries A SELECT clause subquery must return a single value (not a list or table). Examples: SELECT (SELECT 1) + (SELECT 2); -- 3 SELECT (SELECT COUNT(*) FROM Movies); -- 6 SELECT (SELECT * FROM Movies); -- ERROR!!!
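
These three cases can be reproduced with a sketch that assumes Python's built-in sqlite3 module and a hypothetical two-row, two-column Movies table (so the count here is 2, not the 6 of the lecture database). The exact error text is SQLite's and will differ from MySQL's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (MovieID INTEGER, Title TEXT)")
# Hypothetical rows.
conn.executemany("INSERT INTO Movies VALUES (?, ?)", [(1, "Movie A"), (2, "Movie B")])

# Each subquery returns a single value, so both statements are valid.
three = conn.execute("SELECT (SELECT 1) + (SELECT 2)").fetchone()[0]
movie_count = conn.execute("SELECT (SELECT COUNT(*) FROM Movies)").fetchone()[0]

# SELECT * returns two columns here, which is not a single value.
try:
    conn.execute("SELECT (SELECT * FROM Movies)")
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

print(three, movie_count)
print(error_message)
```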

47 SELECT Clause Subqueries SELECT clause subqueries are good for single-value calculations, such as percentages. Example: -- What percent of members are male?

48 SELECT Clause Subqueries Example -- OUTER QUERY SELECT 100*(X)/(Y); -- INNER QUERY X = number of male accounts SELECT COUNT(*) FROM Members WHERE Gender = 'M'; -- INNER QUERY Y = number of total accounts SELECT COUNT(*) FROM Members; -- SOLUTION SELECT 100*(SELECT COUNT(*) FROM Members WHERE Gender = 'M')/(SELECT COUNT(*) FROM Members);
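
A runnable sketch of the percentage calculation, assuming Python's built-in sqlite3 module and four hypothetical Members rows. The literal 100.0 is used instead of 100 because SQLite performs integer division when both operands are integers; that adjustment is specific to this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Members (Gender TEXT)")
# Hypothetical members: three male, one female.
conn.executemany("INSERT INTO Members VALUES (?)", [("M",), ("M",), ("M",), ("F",)])

# X = number of male members, Y = total number of members.
percent_male = conn.execute(
    "SELECT 100.0 * (SELECT COUNT(*) FROM Members WHERE Gender = 'M') "
    "/ (SELECT COUNT(*) FROM Members)"
).fetchone()[0]

print(percent_male)
```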

49 SELECT Clause Subqueries A SELECT clause subquery is even more useful when the outer query and inner query are correlated (the inner query is dependent on data from the outer query).

50 Correlated Subqueries Previous subqueries have been non-correlated. Non-correlated means 'no dependencies', which means you can run the inner query separately. Correlated subqueries are inner queries that are 'dependent' on data from outer queries. Correlated means 'with dependencies', which means you can't run the inner query separately: the result of the inner query 'depends on' data given to it from the outer query.

51 Correlated Subqueries Some FROM clause subquery problems can be rewritten using correlated subqueries in the SELECT clause. Let's try an example: List each movie title along with the number of ratings and the number of genres for that movie.
+--------------------------------+---------+--------+
| Title                          | Ratings | Genres |
+--------------------------------+---------+--------+
| Star Trek: Generations         |      10 |      4 |
| X-Men                          |      12 |      4 |
| X-Men: The Last Stand          |      12 |      4 |
| Things We Lost in the Fire     |       9 |      1 |
| The X Files                    |      12 |      5 |
| The X Files: I Want to Believe |      11 |      3 |
+--------------------------------+---------+--------+

52 Correlated Subqueries Non-correlated solution: SELECT Title, X.Ratings, Y.Genres FROM Movies M LEFT JOIN (SELECT MovieID, COUNT(Rating) AS Ratings FROM Ratings GROUP BY MovieID) X ON M.MovieID = X.MovieID LEFT JOIN (SELECT MovieID, COUNT(Genre) AS Genres FROM XRefGenresMovies GROUP BY MovieID) Y ON M.MovieID = Y.MovieID; Take out the inner queries and try running them. Both run because they are independent of the outer query!

53 Correlated Subqueries Correlated setup: -- OUTER QUERY SELECT Title, (X) AS Ratings, (Y) AS Genres FROM Movies M; -- INNER QUERY X SELECT COUNT(Rating) FROM Ratings WHERE MovieID = M.MovieID; -- INNER QUERY Y SELECT COUNT(Genre) FROM XRefGenresMovies WHERE MovieID = M.MovieID;

54 Correlated Subqueries Correlated solution (the dependencies are the references to M.MovieID): SELECT Title, (SELECT COUNT(Rating) FROM Ratings WHERE MovieID = M.MovieID) AS Ratings, (SELECT COUNT(Genre) FROM XRefGenresMovies WHERE MovieID = M.MovieID) AS Genres FROM Movies M; Take out the inner queries and try running them. They won't run because they are dependent on the MovieID attribute from the outer query.
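
The correlated solution can be exercised with a sketch that assumes Python's built-in sqlite3 module and hypothetical rows. It also runs one inner query on its own to show that the reference to M.MovieID cannot be resolved without the outer query. An ORDER BY is added only to fix the output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (MovieID INTEGER, Title TEXT)")
conn.execute("CREATE TABLE Ratings (MovieID INTEGER, Rating INTEGER)")
conn.execute("CREATE TABLE XRefGenresMovies (MovieID INTEGER, Genre TEXT)")
# Hypothetical data: Movie A has 2 ratings and 1 genre, Movie B has 0 ratings and 2 genres.
conn.executemany("INSERT INTO Movies VALUES (?, ?)", [(1, "Movie A"), (2, "Movie B")])
conn.executemany("INSERT INTO Ratings VALUES (?, ?)", [(1, 8), (1, 9)])
conn.executemany(
    "INSERT INTO XRefGenresMovies VALUES (?, ?)",
    [(1, "Genre 1"), (2, "Genre 1"), (2, "Genre 2")],
)

# Each inner query is re-evaluated for the current row M of the outer query.
rows = conn.execute(
    """
    SELECT Title,
           (SELECT COUNT(Rating) FROM Ratings WHERE MovieID = M.MovieID) AS Ratings,
           (SELECT COUNT(Genre) FROM XRefGenresMovies WHERE MovieID = M.MovieID) AS Genres
    FROM Movies M
    ORDER BY M.MovieID
    """
).fetchall()

# Run alone, the inner query fails: there is no outer row to supply M.MovieID.
try:
    conn.execute("SELECT COUNT(Rating) FROM Ratings WHERE MovieID = M.MovieID")
    inner_error = None
except sqlite3.OperationalError as exc:
    inner_error = str(exc)

print(rows)
print(inner_error)
```

In this sketch the movie without ratings gets a count of 0, because COUNT over an empty set of rows returns 0.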

55 Correlated Subqueries So which one do you choose? Subqueries in the FROM clause or correlated subqueries in the SELECT clause? Whichever one runs faster! Our database is too small to do any real testing. Notice the correlated version is shorter and looks nicer.

