Presentation is loading. Please wait.

Presentation is loading. Please wait.

Subqueries Example Find the name of the producer of ‘Star Wars’.

Similar presentations


Presentation on theme: "Subqueries Example Find the name of the producer of ‘Star Wars’."— Presentation transcript:

1 Subqueries Example Find the name of the producer of ‘Star Wars’.
Movie(title, year, length, inColor, studioName, producerC) MovieExec(name, address, cert, netWorth) We can do: SELECT name FROM Movie, MovieExec WHERE title ='Star Wars' AND producerC =cert; Or we can have a subquery: FROM MovieExec WHERE cert =(SELECT producerC FROM Movie WHERE title = 'Star Wars'); If we can deduce that there will be only a single value produced by the subquery, then we can use this expression, surrounded by parentheses, as if it were a constant.

2 Conditions Involving Relations
There are a number of SQL operators that we can apply to a relation R and produce a Boolean result. Typically used in the WHERE clause. 1. EXISTS R is a condition that is true if R is not empty. 2. s IN R is true if s is equal to one of the tuples in R. Likewise, s NOT IN R is true if and only if s is equal to no tuple in R.

3 Example Give all the producers of movies in which Julia Roberts stars. SELECT name FROM MovieExec WHERE cert IN (SELECT producerC FROM Movie WHERE (title, year) IN (SELECT movieTitle, movieYear FROM StarsIn WHERE starName = 'Julia Roberts'));

4 Remark The previous nested query can, like many nested queries, be written as a single SELECT-FROM-WHERE expression. SELECT name FROM MovieExec, Movie, StarsIn WHERE cert = producerC AND title = movieTitle AND year = movie Year AND starName = 'Julia Roberts';

5 (Continued) 3. s > ALL R is true if s is greater than every value in the unary (one column) relation R. Similarly, the > operator could be replaced by any other comparison operators with the analogous meaning. For instance, s <> ALL R is the same as s NOT IN R. 4. s > ANY R is true if s is greater than at least one value in unary relation R. Similarly we can use any other comparison operators in place of >. For instance, s =ANY R is the same as s IN R. EXISTS, ALL, and ANY operators can be negated by putting NOT in front of the entire expression.

6 Bag Semantics and Union, Intersection and Difference
Although the SELECT-FROM-WHERE statement uses bag semantics, the default for union, intersection, and difference is set semantics. Motivation? When doing projection in relational algebra, it is easier to avoid eliminating duplicates. Just work tuple-at-a-time. When doing intersection or difference, it is most efficient to sort the relations first. At that point you may as well eliminate the duplicates anyway.

7 Controlling Duplicate Elimination
Force the result to be a set by SELECT DISTINCT . . . Force the result to be a bag (i.e., don’t eliminate duplicates) by ALL, as in UNION ALL . . . Only UNION ALL supported in ORACLE. Example Find all the different studios producing movies: Movie(title, year, length, inColor, studioName, producerC), SELECT DISTINCT studioname FROM Movie; Notice that without DISTINCT, a studioname would be listed as many times as there were movies from that studio.

8 Aggregations SUM, AVG, COUNT, MIN, and MAX can be applied to a column in a SELECT clause to produce that aggregation on the column. Example Find the average length of movies from Disney. SELECT AVG(length) FROM Movie WHERE studioName = 'Disney'; Remark We can also use COUNT(*) which counts the number of tuples in the relation constructed from the FROM and WHERE clauses of the query.

9 Eliminating Duplicates in an Aggregation
DISTINCT inside an aggregation causes duplicates to be eliminated before the aggregation. Example Find the number of different producers for Disney movies. SELECT COUNT(DISTINCT producerc) FROM Movie WHERE studioname = 'Disney'; This is not the same as: SELECT DISTINCT COUNT(producerc) DISTINCT here is useless! Why?

10 Not only in COUNT… SELECT AVG(DISTINCT length) FROM Movie WHERE studioname = 'Disney'; This will produce the average of only the distinct values for length.

11 NULL’s Ignored in Aggregation
NULL never contributes to a sum, average, or count, and can never be the minimum or maximum of a column. SELECT SUM(networth) FROM moviestar NATURAL FULL OUTER JOIN movieexec; But if there are no non-NULL values in a column, then the result of the aggregation is NULL.

12 Example: Effect of NULL’s
SELECT count(*) FROM Movie WHERE studioName = 'Disney'; SELECT count(length) The number of movies from Disney. The number of movies from Disney with a known length.

13 Grouping We may follow a SELECT-FROM-WHERE expression by GROUP BY and a list of attributes. The relation that results from the SELECT-FROM-WHERE is grouped according to the values of all the listed attributes in GROUP BY, and any aggregation is applied only within each group. Example From the Movie relation, find the average length for each studio. SELECT studioName, AVG(length) FROM Movie GROUP BY studioName;

14 Another Example From Movie and MovieExec, find the producer’s total length of film produced: SELECT name, SUM(length) FROM Movie, MovieExec WHERE producerc = cert GROUP BY name; Compute those tuples first, then group by name.

15 Restriction on SELECT Lists With Aggregation
If any aggregation is used, then each element of the SELECT list must be either: Aggregated, or An attribute on the GROUP BY list.

16 Illegal Query Example SELECT title, length
We might think we could find the shortest movie of Disney as: SELECT title, MIN(length) FROM Movie WHERE studioName = 'Disney'; But this query is illegal in SQL. Because title is neither aggregated nor on the GROUP BY list. We should do instead: SELECT title, length WHERE studioName = 'Disney' AND length = (SELECT MIN(length) WHERE studioName = 'Disney');

17 Or… …resembling Relational Algebra. SELECT title, length
FROM Movie NATURAL JOIN (SELECT MIN(length) AS length FROM Movie WHERE studioName = 'Disney') WHERE studioName = 'Disney'; …resembling Relational Algebra.

18 HAVING Clauses HAVING <condition> may follow a GROUP BY clause.
If so, the condition applies to each group, and groups not satisfying the condition are eliminated. Example Consider again the query SELECT name, SUM(length) FROM Movie, MovieExec WHERE producerc = cert GROUP BY name; Suppose we didn’t wish to include all the producers in our table of aggregated lengths. We want those producers with networth less than 1,000,000, and that have at least one movie before 1973. Solution FROM MovieExec, Movie WHERE producerc = cert AND networth< GROUP BY name HAVING MIN(year) < 1973;

19 Requirements on HAVING Conditions
These conditions may refer to any relation in the FROM clause. They may refer to attributes of those relations, as long as the attribute makes sense within a group; i.e., it is either: A grouping attribute, or Aggregated attribute.

20 “Having” is a special kind of 
The previous query can also be written as: SELECT name, sumLength FROM ( SELECT name, MIN(year) AS minYear, SUM(length) AS sumLength FROM MovieExec, Movie WHERE producerc = cert AND networth < GROUP BY name) WHERE minYear < 1973;

21 Correlated Subqueries
Suppose StarsIn relation has an additional attribute “salary” StarsIn(movieTitle, movieYear, starName, salary) Now, find the stars who were paid for some movie more than the average salary for that movie. SELECT starName, movieTitle, movieYear FROM StarsIn X WHERE salary > (SELECT AVG(salary) FROM StarsIn WHERE movieTitle = X.movieTitle AND movieYear=X.movieYear); Remarks Outer query cannot reference any columns in the subquery. Subquery references the tuple in the outer query. Value of the tuple changes by row of the outer query, so the database must rerun the subquery for each row comparison.

22 Another Solution (Nesting in FROM)
SELECT X.starName, X.movieTitle, X.movieYear FROM StarsIn X, (SELECT movieTitle, movieYear, AVG(salary) AS avgSalary FROM StarsIn GROUP BY movieTitle, movieYear) Y WHERE X.salary>Y.avgSalary AND X.movieTitle=Y.movieTitle AND X.movieYear=Y.movieYear;

23 Exercise Product(maker, model, type)
PC(model, speed, ram, hd, rd, price) Laptop(model, speed, ram, hd, screen, price) Printer(model, color, type, price) Find those manufacturers that sell Laptops, but not PC's. Find those hard-disk sizes that occur in two or more PC's. Find those manufacturers of at least two different computers (PC or Laptops) with speed of at least 700. Find the manufacturers who sell exactly three different models of PC.


Download ppt "Subqueries Example Find the name of the producer of ‘Star Wars’."

Similar presentations


Ads by Google