Aggregates, Ordering, Grouping, Subqueries and Data Definition

Lecture 6: More SQL - Aggregates, Ordering, Grouping, Subqueries and Data Definition

Aggregates
The Loan table:
Loan#  catno  Memno  LoanDate  DueDate   Fine
L0002  B0001  M      /10/97    04/12/97  £62.10
L0003  B0002  M      /12/97    05/03/98  £53.00
L0004  B0003  M      /12/97    05/03/98  £53.00
L0006  B0004  M      /12/97    13/03/98  £52.20
L0008  B0000  M      /01/98    16/04/98  £48.80
L0009  B0005  M      /08/99    18/11/99  £75.00
L0010  B0006  M      /08/99    20/11/99  NULL
SELECT MAX(Fine) AS maxfine FROM Loan; returns maxfine = £75.00
SELECT COUNT(*) FROM Loan; returns 7 (every row is counted)
SELECT COUNT(Fine) FROM Loan; returns 6 (the NULL fine is not counted)
SELECT SUM(Fine) FROM Loan; returns £344.10
SELECT AVG(Fine) FROM Loan; returns £57.35 (the sum divided by the 6 non-NULL fines)
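The way aggregates treat NULL can be checked with a small script. This is a minimal sketch, assuming Python's built-in sqlite3 module as the database (the lecture does not name one) and storing fines as whole pence, which is my own choice to avoid floating-point rounding. Only the loan number and fine columns are loaded.

```python
import sqlite3

# In-memory database; fines are stored in pence (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Loan (loanno TEXT PRIMARY KEY, fine INTEGER)")
conn.executemany(
    "INSERT INTO Loan VALUES (?, ?)",
    [("L0002", 6210), ("L0003", 5300), ("L0004", 5300), ("L0006", 5220),
     ("L0008", 4880), ("L0009", 7500), ("L0010", None)],  # None is stored as NULL
)

# COUNT(*) counts rows; COUNT(fine), SUM, MAX and AVG all skip NULL values.
row = conn.execute(
    "SELECT COUNT(*), COUNT(fine), SUM(fine), MAX(fine), AVG(fine) FROM Loan"
).fetchone()
count_all, count_fine, sum_fine, max_fine, avg_fine = row
print(count_all, count_fine, sum_fine, max_fine, avg_fine)
```

Note that AVG divides by the number of non-NULL fines, not by the number of rows.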

Ordering
SELECT Memno, Fine FROM Loan ORDER BY Memno, Fine;
SELECT Memno, Fine FROM Loan ORDER BY Memno, Fine DESC;
In the second query DESC applies only to Fine; Memno is still sorted in ascending order.
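The two ORDER BY clauses can be compared side by side. This is a small sketch using Python's sqlite3 module; the member numbers and fines (in pence) are made-up sample values, not the lecture's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Loan (memno TEXT, fine INTEGER)")
# Hypothetical sample rows, inserted in no particular order.
conn.executemany(
    "INSERT INTO Loan VALUES (?, ?)",
    [("M0002", 5220), ("M0001", 5300), ("M0001", 6210), ("M0002", 4880)],
)

# Both sort keys ascending (the default).
ascending = conn.execute(
    "SELECT memno, fine FROM Loan ORDER BY memno, fine"
).fetchall()

# DESC binds only to the key it follows: memno ascending, fine descending.
mixed = conn.execute(
    "SELECT memno, fine FROM Loan ORDER BY memno, fine DESC"
).fetchall()

print(ascending)
print(mixed)
```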

Grouping
Loan#  Book#  Memno
L0002  B0001  M0001
L0003  B0002  M0001
L0004  B0003  M0001
L0006  B0004  M0002
L0008  B0000  M0002
How many loans does each member have?
SELECT Memno, COUNT(*) AS num_loans FROM Loan GROUP BY Memno;
Memno  num_loans
M0001  3
M0002  2

Grouping
memno  catno  fine
M0001  B0002  £53.00
M0001  B0003  £53.00
M0002  B0004  £52.20
M0003  B0005  £75.00
What is the total fine paid by each member?
SELECT memno, SUM(fine) AS total_fine FROM Loan GROUP BY memno;
memno  total_fine
M0001  £106.00
M0002  £52.20
M0003  £75.00

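Grouping
memno  catno  fine
M0001  B0002  £53.00
M0001  B0003  £53.00
M0002  B0004  £52.20
M0003  B0005  £75.00
The SELECT list can only contain the attributes grouped on plus aggregate functions. The following query is therefore not valid:
SELECT memno, SUM(fine) AS memfine, catno FROM Loans GROUP BY memno;
memno  memfine  catno
M0001  £106.00  B0002, B0003
M0002  £52.20   B0004
M0003  £75.00   B0005
Each group produces a single row, but the M0001 group has two catno values, so there is no single catno to display.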

Condition on the group - Having
memno  catno  fine
M0001  B0002  £53.00
M0001  B0003  £53.00
M0002  B0004  £52.20
M0003  B0005  £75.00
What is the total fine paid by each member? Only display members with total fine > £100.
SELECT memno, SUM(fine) AS memfine FROM Loans GROUP BY memno HAVING SUM(fine) > 100;
memno  memfine
M0001  £106.00
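GROUP BY and HAVING can be tried on the four rows above. This is a minimal sketch using Python's sqlite3 module; fines are stored in pence (my assumption), so the £100 threshold becomes 10000, and ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Loans (memno TEXT, catno TEXT, fine INTEGER)")
conn.executemany(
    "INSERT INTO Loans VALUES (?, ?, ?)",
    [("M0001", "B0002", 5300), ("M0001", "B0003", 5300),
     ("M0002", "B0004", 5220), ("M0003", "B0005", 7500)],
)

# One output row per member: the grouping attribute plus an aggregate.
totals = conn.execute(
    "SELECT memno, SUM(fine) AS memfine FROM Loans "
    "GROUP BY memno ORDER BY memno"
).fetchall()

# HAVING filters whole groups after aggregation (WHERE filters rows before it).
over_100 = conn.execute(
    "SELECT memno, SUM(fine) AS memfine FROM Loans "
    "GROUP BY memno HAVING SUM(fine) > 10000"
).fetchall()

print(totals)
print(over_100)
```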

Subqueries

Subqueries
Consider the following tables, which the subquery examples use:
Books
catno  title             author    publisher  category
C100   Physics Handbook  Jones     Wiley      Physics
C200   Simply the Best   Advacaat  Rangers    Football
C300   Database Design   Wilson    McCall     Computing
C400   Business Society  Neal      Wiley      Business
C500   The Metro         Abbey     Wiley      Leisure
C600   Graphics          Sedge     Maxwell    Computing
C700   Cell Biology      Norton    West       Biology
Loans
catno  memno  borrowed  date_ret  fine
C100   M100   12/09/01  20/09/01  NULL
C300   M100   01/09/01  NULL      NULL
C400   M200   04/06/01  16/09/01  £16.30
C500   M200   04/08/01  16/09/01  £16.30
C600   M250   02/10/01  24/10/01  £30.00
C700   M300   10/09/01  19/10/01  NULL
Members
memno  name   address   age
M100   Fred   Aberdeen  22
M150   Colin  Stirling  31
M200   Dave   Dundee    21
M250   Betty  Aberdeen  67
M300   Jean   Dundee    17

Subqueries
Let's say you wanted the names of all members who have borrowed a business or a computing book. A possible query is as follows:
SELECT name FROM Books, Members, Loans WHERE Books.catno = Loans.catno AND Members.memno = Loans.memno AND category IN ('Business', 'Computing');
The problem here is that the join in the query (i.e. Books.catno = Loans.catno AND Members.memno = Loans.memno) creates the intermediate table shown below:
catno  title             author  publisher  category   catno  memno  borrowed  date_ret  fine    memno  name   address   age
C100   Physics Handbook  Jones   Wiley      Physics    C100   M100   12/09/01  20/09/01  NULL    M100   Fred   Aberdeen  22
C300   Database Design   Wilson  McCall     Computing  C300   M100   01/09/01  NULL      NULL    M100   Fred   Aberdeen  22
C400   Business Society  Neal    Wiley      Business   C400   M200   04/06/01  16/09/01  £16.30  M200   Dave   Dundee    21
C500   The Metro         Abbey   Wiley      Leisure    C500   M200   04/08/01  16/09/01  £16.30  M200   Dave   Dundee    21
C600   Graphics          Sedge   Maxwell    Computing  C600   M250   02/10/01  24/10/01  £30.00  M250   Betty  Aberdeen  67
C700   Cell Biology      Norton  West       Biology    C700   M300   10/09/01  19/10/01  NULL    M300   Jean   Dundee    17
With more loans the above table can become huge. This is inefficient, so it is better to use subqueries.

Subqueries
Subqueries are SELECT statements embedded within another SELECT statement.
The results of the inner SELECT statement (or subselect) are used in the outer statement to help determine the contents of the final result.
Evaluation runs from inner to outer, which means reading the statements from right to left.
A subselect can be used in the WHERE and HAVING clauses of an outer SELECT statement.

Subqueries
Subqueries can be used with a number of operators:
Relational operators (=, <, >, <=, >=, <>)
IN, NOT IN
ALL
SOME, ANY
EXISTS, NOT EXISTS

Relational and Aggregate Operators
Relational operators (=, <, >, <=, >=, <>) can only be used if the subquery returns a single value, i.e. the subquery must be a scalar subquery.
In general, relational operators are used in conjunction with aggregate operators, i.e. SUM, AVG, COUNT, MAX, MIN.
EXAMPLE: What is the name of the oldest member?
SELECT name FROM Members WHERE age = (SELECT MAX(age) FROM Members);
The scalar subquery SELECT MAX(age) FROM Members; returns 67, so the query equates to:
SELECT name FROM Members WHERE age = 67;
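The scalar subquery can be run on its own and then inside the outer query. This is a minimal sketch using Python's sqlite3 module with the name and age columns of the Members table from the earlier slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Members (memno TEXT PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO Members VALUES (?, ?, ?)",
    [("M100", "Fred", 22), ("M150", "Colin", 31), ("M200", "Dave", 21),
     ("M250", "Betty", 67), ("M300", "Jean", 17)],
)

# The inner query alone: one row, one column, i.e. a scalar.
max_age = conn.execute("SELECT MAX(age) FROM Members").fetchone()[0]

# The same query embedded as a scalar subquery after '='.
oldest = conn.execute(
    "SELECT name FROM Members WHERE age = (SELECT MAX(age) FROM Members)"
).fetchall()

print(max_age, oldest)
```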

IN and NOT IN Operators
Let us again say you wanted the names of all members who have borrowed a business or a computing book. A possible solution using subqueries and the IN operator:
SELECT name FROM Members WHERE memno IN
  (SELECT memno FROM Loans WHERE catno IN
    (SELECT catno FROM Books WHERE category IN ('Business', 'Computing')));
This works backwards, i.e. from the inner to the outer statements.

IN and NOT IN Operators
The previous query works as follows, starting from the innermost statement:
SELECT catno FROM Books WHERE category IN ('Business', 'Computing');
This table subquery returns {C300, C400, C600}.
SELECT memno FROM Loans WHERE catno IN {C300, C400, C600};
This table subquery returns {M100, M200, M250}.
SELECT name FROM Members WHERE memno IN {M100, M200, M250};
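The nested IN query and the three-table join can be run against the same data to confirm they give the same names. This is a minimal sketch using Python's sqlite3 module and only the columns the queries need. DISTINCT and ORDER BY are my additions: a member who borrowed two qualifying books would otherwise appear twice in the join, and the ordering makes the two results directly comparable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Books   (catno TEXT PRIMARY KEY, category TEXT);
    CREATE TABLE Members (memno TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE Loans   (catno TEXT, memno TEXT);
""")
conn.executemany("INSERT INTO Books VALUES (?, ?)", [
    ("C100", "Physics"), ("C200", "Football"), ("C300", "Computing"),
    ("C400", "Business"), ("C500", "Leisure"), ("C600", "Computing"),
    ("C700", "Biology")])
conn.executemany("INSERT INTO Members VALUES (?, ?)", [
    ("M100", "Fred"), ("M150", "Colin"), ("M200", "Dave"),
    ("M250", "Betty"), ("M300", "Jean")])
conn.executemany("INSERT INTO Loans VALUES (?, ?)", [
    ("C100", "M100"), ("C300", "M100"), ("C400", "M200"),
    ("C500", "M200"), ("C600", "M250"), ("C700", "M300")])

# Nested subqueries, read from the innermost SELECT outwards.
nested = conn.execute("""
    SELECT name FROM Members WHERE memno IN
      (SELECT memno FROM Loans WHERE catno IN
        (SELECT catno FROM Books WHERE category IN ('Business', 'Computing')))
    ORDER BY name
""").fetchall()

# The equivalent join formulation.
joined = conn.execute("""
    SELECT DISTINCT name FROM Books, Members, Loans
    WHERE Books.catno = Loans.catno AND Members.memno = Loans.memno
      AND category IN ('Business', 'Computing')
    ORDER BY name
""").fetchall()

print(nested, joined)
```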

ALL Operator
The ALL operator may be used with subqueries that produce a single column of numbers.
If the subquery is preceded by the keyword ALL, the condition will only be TRUE if it is satisfied by all the values produced by the subquery.
EXAMPLE: What is the name of the oldest member?
SELECT name FROM Members WHERE age >= ALL (SELECT age FROM Members);
The subquery SELECT age FROM Members; returns {22, 31, 21, 67, 17}, so the query equates to:
SELECT name FROM Members WHERE age >= ALL {22, 31, 21, 67, 17};
That is, look for the rows in Members whose age is greater than or equal to all the values in the list.
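Not every database accepts the ALL keyword; SQLite, which ships with Python, is one that does not. The sketch below therefore swaps in an equivalent formulation rather than the lecture's query: "age >= ALL (...)" holds exactly when no row has a larger age, which can be written with NOT EXISTS. The alias names m and other are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Members (memno TEXT PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO Members VALUES (?, ?, ?)",
    [("M100", "Fred", 22), ("M150", "Colin", 31), ("M200", "Dave", 21),
     ("M250", "Betty", 67), ("M300", "Jean", 17)],
)

# Rewrite of "age >= ALL (SELECT age FROM Members)":
# keep a member only if no other member is strictly older.
oldest_not_exists = conn.execute("""
    SELECT name FROM Members m
    WHERE NOT EXISTS (SELECT 1 FROM Members other WHERE other.age > m.age)
""").fetchall()

# The MAX-based scalar subquery gives the same answer.
oldest_max = conn.execute(
    "SELECT name FROM Members WHERE age = (SELECT MAX(age) FROM Members)"
).fetchall()

print(oldest_not_exists, oldest_max)
```

On a database that supports quantified comparisons, the ALL form from the slide can be used directly.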

Example
Staff (staffNo, staffName, salary, branchNo*)
Branch (branchNo, branchAddress)
What does this query do?
SELECT staffName, salary FROM Staff WHERE salary > ALL (SELECT salary FROM Staff WHERE branchNo = 'B003');
List all staff whose salary is larger than the salary of every member of staff at branch B003.

SOME/ANY Operator
The SOME operator may be used with subqueries that produce a single column of numbers. SOME and ANY can be used interchangeably.
If the subquery is preceded by the keyword SOME, the condition will only be TRUE if it is satisfied by any (one or more) of the values produced by the subquery.
EXAMPLE: List the names of members who have borrowed books (i.e. members who appear in the Loans table).
SELECT name FROM Members WHERE memno = ANY (SELECT DISTINCT memno FROM Loans);
The subquery SELECT DISTINCT memno FROM Loans; returns {M100, M200, M250, M300}, so the query equates to:
SELECT name FROM Members WHERE memno = ANY ('M100', 'M200', 'M250', 'M300');

Example
Staff (staffNo, staffName, salary, branchNo*)
Branch (branchNo, branchAddress)
What does this query do?
SELECT staffName, salary FROM Staff WHERE salary > ANY (SELECT salary FROM Staff WHERE branchNo = 'B003');
List all staff whose salary is larger than the salary of at least one member of staff at branch B003.

EXISTS and NOT EXISTS Operators - Correlated Queries
EXISTS and NOT EXISTS produce a simple TRUE/FALSE result.
EXISTS is TRUE if and only if there exists at least one row in the result table returned by the subquery; it is FALSE if the subquery returns an empty result table.
NOT EXISTS is the opposite of EXISTS.
EXAMPLE: List the titles that have been borrowed by members.
SELECT title FROM Books B WHERE EXISTS (SELECT * FROM Loans L WHERE L.catno = B.catno);
The outer query iterates through all the books, testing whether each book appears in the Loans table.

Features of correlated queries
A table in the outer query is referenced in the WHERE clause of the inner query.
The query runs by iterating through the records of the outer FROM table.
The inner query is evaluated once for each record, and passes the result back to the outer query.
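A correlated EXISTS query, and its NOT EXISTS counterpart, can be tried on the Books and Loans data from the earlier slide. This is a minimal sketch using Python's sqlite3 module with only the catno and title columns; the ORDER BY is added to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Books (catno TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE Loans (catno TEXT, memno TEXT);
""")
conn.executemany("INSERT INTO Books VALUES (?, ?)", [
    ("C100", "Physics Handbook"), ("C200", "Simply the Best"),
    ("C300", "Database Design"), ("C400", "Business Society"),
    ("C500", "The Metro"), ("C600", "Graphics"), ("C700", "Cell Biology")])
conn.executemany("INSERT INTO Loans VALUES (?, ?)", [
    ("C100", "M100"), ("C300", "M100"), ("C400", "M200"),
    ("C500", "M200"), ("C600", "M250"), ("C700", "M300")])

# The inner query refers to B.catno from the outer query, so it is
# conceptually re-evaluated for every row of Books.
borrowed = conn.execute("""
    SELECT title FROM Books B
    WHERE EXISTS (SELECT * FROM Loans L WHERE L.catno = B.catno)
    ORDER BY title
""").fetchall()

# NOT EXISTS keeps the books with no matching loan at all.
never_borrowed = conn.execute("""
    SELECT title FROM Books B
    WHERE NOT EXISTS (SELECT * FROM Loans L WHERE L.catno = B.catno)
""").fetchall()

print(borrowed)
print(never_borrowed)
```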

Some questions can be answered using joins or subqueries
Tables: Member and Loan are linked on memno; Loan and Book are linked on catno.
List the names of members who have borrowed books on History or Computing.
Using a join:
SELECT Member.name FROM Book, Member, Loan WHERE Book.catno = Loan.catno AND Member.memno = Loan.memno AND Book.category IN ('History', 'Computing');
Using subqueries:
SELECT name FROM Member WHERE memno IN
  (SELECT memno FROM Loan WHERE catno IN
    (SELECT catno FROM Book WHERE category IN ('History', 'Computing')));

Equivalent ways of using Subqueries
memno  catno  fine
M0001  B0002  £53.00
M0002  B0004  £52.00
M0003  B0005  £75.00
M0004  B0007  £26.00
Which member paid the smallest fine?
SELECT memno, fine FROM Loan WHERE fine <= ALL (SELECT fine FROM Loan);
SELECT memno, fine FROM Loan WHERE fine = (SELECT MIN(fine) FROM Loan);
Both queries return M0004 with a fine of £26.00.

Exercise (May 2004 Exam)
Employee(empid, name)
Project(projid, name)
WorkLoad(empid*, projid*, duration)
1. List the number of projects that each employee (empid) is working on.
2. List, in descending order, employee names that are working on the project called "Databases".
3. List the employee (empid) who spent the longest duration working on a project.
4. List the employees (name) who have not worked on any project.

Solution
1. SELECT empid, COUNT(projid) FROM WorkLoad GROUP BY empid;
2. SELECT E.name FROM Employee E, Project P, WorkLoad L WHERE E.empid = L.empid AND L.projid = P.projid AND P.name = 'Databases' ORDER BY E.name DESC;
3. SELECT empid FROM WorkLoad WHERE duration = (SELECT MAX(duration) FROM WorkLoad);
4. SELECT name FROM Employee E WHERE NOT EXISTS (SELECT * FROM WorkLoad L WHERE E.empid = L.empid);
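The four solution queries can be checked against a small data set. This is a sketch using Python's sqlite3 module; the employees, projects and durations are invented sample rows (the exam question supplies none), and the ORDER BY in the first query is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (empid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Project  (projid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE WorkLoad (empid INTEGER, projid INTEGER, duration INTEGER);
""")
# Hypothetical sample data for illustration only.
conn.executemany("INSERT INTO Employee VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob"), (3, "Cat")])
conn.executemany("INSERT INTO Project VALUES (?, ?)",
                 [(10, "Databases"), (20, "Networks")])
conn.executemany("INSERT INTO WorkLoad VALUES (?, ?, ?)",
                 [(1, 10, 5), (1, 20, 3), (2, 10, 8)])

# 1. Number of projects per employee.
q1 = conn.execute(
    "SELECT empid, COUNT(projid) FROM WorkLoad GROUP BY empid ORDER BY empid"
).fetchall()

# 2. Names of employees on 'Databases', in descending order.
q2 = conn.execute("""
    SELECT E.name FROM Employee E, Project P, WorkLoad L
    WHERE E.empid = L.empid AND L.projid = P.projid AND P.name = 'Databases'
    ORDER BY E.name DESC
""").fetchall()

# 3. Employee with the longest single duration.
q3 = conn.execute(
    "SELECT empid FROM WorkLoad "
    "WHERE duration = (SELECT MAX(duration) FROM WorkLoad)"
).fetchall()

# 4. Employees with no WorkLoad row at all.
q4 = conn.execute("""
    SELECT name FROM Employee E
    WHERE NOT EXISTS (SELECT * FROM WorkLoad L WHERE E.empid = L.empid)
""").fetchall()

print(q1, q2, q3, q4)
```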

SQL (Data Definition)

SQL Data Types
Data Type        Declarations
Boolean          BOOLEAN
Character        CHAR, VARCHAR
Exact Numeric    NUMERIC, DECIMAL, INTEGER, ...
Approx. Numeric  FLOAT, REAL, …
Date/Time        DATE, TIME, …

Creating Domains and Tables
CREATE DOMAIN name AS CHAR(12);
CREATE TABLE Dept ( deptcode CHAR(3), deptname CHAR(12) );
CREATE TABLE Driver ( first_name CHAR(12), second_name CHAR(12), age INTEGER(2) );
Once the domain name has been created, it can be used in place of CHAR(12) in the column definitions, for example deptname name or first_name name.
Notes: OK, we have covered data manipulation. Now we will do a bit of data definition. Look at the first table example. Explain the example. Note the difference between ANSI and Microsoft SQL. This is an example of a built-in data type, called character. You can also define your own data type, called a domain. Demonstrate the use of a domain:
CREATE DOMAIN foo AS CHARACTER(20)
CREATE TABLE Staff ( name foo )
foo can be used in lots of places. Now look at the final example using user-defined domains. This table doesn't have a primary key. It is put in by a third line: PRIMARY KEY (identity)

Creating Domains and Tables
CREATE DOMAIN student_id AS CHAR(5) CHECK (VALUE LIKE 'S%');
S followed by any number of characters.
CREATE DOMAIN student_id AS CHAR(5) CHECK (VALUE LIKE 'S____');
S followed by exactly four characters.
CREATE TABLE Student ( identity student_id, extension INTEGER(4) UNIQUE, student_name name NOT NULL );
Each column definition consists of the attribute name, the attribute domain and an optional constraint.

Constraints in tables
CREATE TABLE Dept ( deptcode CHAR(3) CONSTRAINT dep_con1 PRIMARY KEY, deptname CHAR(12) );
CREATE TABLE Dept ( deptcode CHAR(3), deptname CHAR(12), CONSTRAINT dep_con1 PRIMARY KEY (deptcode) );
CREATE TABLE Loan ( bookcode CHAR(5), memname CHAR(15), CONSTRAINT dep_con1 PRIMARY KEY (bookcode, memname) );
In each statement, dep_con1 is the name of the constraint.
Constraints can be written either in the same line as the attribute they are constraining, or at the end of the table.
A composite primary key must be entered as a table constraint, i.e. separately from the attributes.
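A composite primary key declared as a table constraint can be seen in action with a short script. This is a sketch using Python's sqlite3 module; the constraint name loan_pk and the sample rows are my own, and the key is declared on the two columns that the table actually defines. Only the full (bookcode, memname) pair has to be unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Composite key written as a named table constraint (loan_pk is a made-up name).
conn.execute("""
    CREATE TABLE Loan (
        bookcode CHAR(5),
        memname  CHAR(15),
        CONSTRAINT loan_pk PRIMARY KEY (bookcode, memname)
    )
""")

# Same book for two different members: the pairs differ, so both are accepted.
conn.execute("INSERT INTO Loan VALUES ('B0001', 'Fred')")
conn.execute("INSERT INTO Loan VALUES ('B0001', 'Bill')")

# Repeating an existing (bookcode, memname) pair violates the primary key.
try:
    conn.execute("INSERT INTO Loan VALUES ('B0001', 'Fred')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Loan").fetchone()[0]
print(duplicate_rejected, row_count)
```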

More Constraints in tables
As a column constraint:
CREATE TABLE Staff ( Staffcode CHAR(4), StaffTitle CHAR(3) CONSTRAINT s1_con CHECK (StaffTitle IN ('Mr', 'Ms', 'Dr')) );
As a table constraint:
CREATE TABLE Staff ( Staffcode CHAR(4), StaffTitle CHAR(3), CONSTRAINT s1_con CHECK (StaffTitle IN ('Mr', 'Ms', 'Dr')) );
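The effect of the CHECK constraint can be observed by attempting a valid and an invalid insert. This is a minimal sketch using Python's sqlite3 module with the table-constraint form of the Staff definition; the staff codes and the title 'Sir' are made-up test values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Staff (
        Staffcode  CHAR(4),
        StaffTitle CHAR(3),
        CONSTRAINT s1_con CHECK (StaffTitle IN ('Mr', 'Ms', 'Dr'))
    )
""")

# A title from the permitted list is accepted.
conn.execute("INSERT INTO Staff VALUES ('S001', 'Dr')")

# A title outside the list fails the CHECK and the row is not stored.
try:
    conn.execute("INSERT INTO Staff VALUES ('S002', 'Sir')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored = conn.execute("SELECT Staffcode, StaffTitle FROM Staff").fetchall()
print(rejected, stored)
```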

Foreign Keys
CREATE TABLE Department ( Deptcode CHAR(4), Deptname CHAR(3), CONSTRAINT dep_con1 PRIMARY KEY (Deptcode) );
As a column constraint:
CREATE TABLE Staff ( Staffcode CHAR(4), StaffTitle CHAR(3), Dept CHAR(4) REFERENCES Department(Deptcode) );
The referenced column (Deptcode) is optional if it is the primary key.
As a table constraint:
CREATE TABLE Staff ( Staffcode CHAR(4), StaffTitle CHAR(3), Dept CHAR(4), FOREIGN KEY (Dept) REFERENCES Department );
The foreign key column list can be multiple valued, to match a composite primary key.

Link Properties: On Delete, On Update
Diagram: Staff (SID, Name, DID*) with staff Fred, Bill and Jim; Dept (DID, Name) with departments Art and Computing. Staff.DID is a foreign key that references Dept.DID and may be NULL.
Link properties:
On delete: Cascade
On delete: Set Null
On delete: Set Default
Notes: When you define a link between tables, you have to define cascade and on delete properties, as you'll already have discovered from the Access labs. In a moment I'll show you the SQL, but I'll first explain what it does.

Link Properties: On Delete, On Update
Diagram: Staff (SID, Name, DID*) with staff Fred, Bill and Jim; Dept (DID, Name) with departments Art and Computing. Example DID values shown in the diagram: D42, D79, NULL.
Link properties:
On update: Cascade
On update: Set Null
On delete: Set Default
Notes: Each Junior is in exactly one team, so TID can be included in the Junior table. Can you include JuniorID as a foreign key in the Teams table? No, because it would be multiple valued. Next consider the situation if participation at the Junior end is optional. Then a Junior may not have a team, so you would get nulls. Neither primary key can be included as a foreign key in the other table, so a new table needs to be created.
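The link properties correspond to ON DELETE and ON UPDATE clauses on the foreign key. This is a sketch using Python's sqlite3 module, which needs PRAGMA foreign_keys = ON before it enforces foreign keys. The SID and DID values are invented, and the choice of ON UPDATE CASCADE together with ON DELETE SET NULL is just one possible combination.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE Dept (DID TEXT PRIMARY KEY, Name TEXT);
    CREATE TABLE Staff (
        SID  TEXT PRIMARY KEY,
        Name TEXT,
        DID  TEXT REFERENCES Dept(DID)
                  ON UPDATE CASCADE
                  ON DELETE SET NULL
    );
""")
# Hypothetical identifiers for illustration.
conn.executemany("INSERT INTO Dept VALUES (?, ?)",
                 [("D1", "Art"), ("D2", "Computing")])
conn.executemany("INSERT INTO Staff VALUES (?, ?, ?)",
                 [("S1", "Fred", "D1"), ("S2", "Bill", "D1"), ("S3", "Jim", "D2")])

# On update: Cascade. Renumbering a department updates the staff rows that reference it.
conn.execute("UPDATE Dept SET DID = 'D42' WHERE DID = 'D1'")

# On delete: Set Null. Deleting a department leaves its staff with a NULL DID.
conn.execute("DELETE FROM Dept WHERE DID = 'D2'")

staff_after = conn.execute("SELECT SID, DID FROM Staff ORDER BY SID").fetchall()
print(staff_after)
```

With ON DELETE CASCADE instead of SET NULL, the delete would remove the referencing Staff row rather than clearing its DID.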