C20.0046: Database Management Systems, Lecture #12
M.P. Johnson, Stern School of Business, NYU, Spring 2005


Confession
Relations aren’t really sets! They’re bags!

Bag theory
SELECT/WHERE: no duplicate elimination
Cross, join: no duplicate elimination
  - |R1 x R2| = |R1| * |R2|
Can convert to sets when necessary
  - DISTINCT
Allowing duplicates by default is cheaper
  - Union
  - Projection
How hard is removing duplicates?

Bag theory
Bags: like sets, but elements may repeat
  - “multisets”
Set ops change somewhat when applied to bags
  - intuition: pretend identical elements are distinct
{a,b,b,c} ∪ {a,b,b,b,e,f,f} = {a,a,b,b,b,b,b,c,e,f,f}
{a,b,b,b,c,c} – {b,c,c,c,d} = {a,b,b}
{a,b,b,b,c,c} ∩ {b,c,c,c,d} = {b,c,c}
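The three bag identities on this slide can be checked mechanically. The sketch below is not from the lecture: it assumes Python's `collections.Counter` as a stand-in for a bag, where `+` adds multiplicities (bag union), `-` subtracts them without going below zero (bag difference), and `&` takes the minimum (bag intersection). The variable names are illustrative.

```python
from collections import Counter

# A Counter maps each element to its multiplicity, i.e. it models a bag.
r = Counter("abbc")        # {a,b,b,c}
s = Counter("abbbeff")     # {a,b,b,b,e,f,f}
bag_union = r + s          # multiplicities add

x = Counter("abbbcc")      # {a,b,b,b,c,c}
y = Counter("bcccd")       # {b,c,c,c,d}
bag_difference = x - y     # counts subtract, never below zero
bag_intersection = x & y   # minimum count per element

print(sorted(bag_union.elements()))
print(sorted(bag_difference.elements()))
print(sorted(bag_intersection.elements()))
```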

Some surprises in bag theory
Be careful about your set theory laws – not all hold in bag theory
(R ∪ S) – T = (R – T) ∪ (S – T)
  - always true in set theory
  - But true in bag theory?
  - suppose x is in R, S and T
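One way to work through the slide's hint is to put a single copy of x in each of R, S and T and evaluate both sides. This is a sketch under the assumption that `collections.Counter` models a bag (`+` for bag union, `-` for bag difference); it is an illustration, not part of the lecture.

```python
from collections import Counter

# One copy of x in each of R, S and T, as the slide suggests.
R, S, T = Counter({"x": 1}), Counter({"x": 1}), Counter({"x": 1})

lhs = (R + S) - T          # two copies of x, minus one: one copy remains
rhs = (R - T) + (S - T)    # each difference is empty, so the union is empty

# The same law evaluated on ordinary sets.
set_lhs = ({"x"} | {"x"}) - {"x"}
set_rhs = ({"x"} - {"x"}) | ({"x"} - {"x"})

print("bags:", dict(lhs), "vs", dict(rhs))
print("sets:", set_lhs, "vs", set_rhs)
```

Under these assumptions the two sides agree as sets but differ as bags, so the law does not carry over.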

Set/bag ops in SQL
Orthodox SQL has set operators:
  - UNION, INTERSECT, EXCEPT
And bag operators:
  - UNION ALL, INTERSECT ALL, EXCEPT ALL
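A small experiment makes the set/bag distinction visible. The sketch below assumes an in-memory SQLite database driven from Python's `sqlite3` module, with two hypothetical one-column tables R and S. SQLite implements UNION, UNION ALL, INTERSECT and EXCEPT, but not INTERSECT ALL or EXCEPT ALL, so only the first four are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R(a TEXT);
    CREATE TABLE S(a TEXT);
    INSERT INTO R VALUES ('a'), ('b'), ('b');
    INSERT INTO S VALUES ('b'), ('c');
""")

def run(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

union_set = run("SELECT a FROM R UNION SELECT a FROM S ORDER BY a")
union_bag = run("SELECT a FROM R UNION ALL SELECT a FROM S ORDER BY a")
intersect_set = run("SELECT a FROM R INTERSECT SELECT a FROM S")
except_set = run("SELECT a FROM R EXCEPT SELECT a FROM S")

print(union_set)   # duplicates removed
print(union_bag)   # every copy kept
```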

New topic: Subqueries
Powerful feature of SQL: one clause can contain other SQL queries
  - Anywhere a value or relation is allowed
Several ways:
  - Selection → single constant (scalar) in SELECT
  - Selection → single constant (scalar) in WHERE
  - Selection → relation in WHERE
  - Selection → relation in FROM

Subquery motivation
Consider standard multi-table example:
  - Purchase(prodname, buyerssn, etc.)
  - Person(name, ssn, etc.)
  - What did Christo buy?
As usual, need to AND on equality identifying ssn’s row and buyerssn’s row:
  SELECT Purchase.prodname
  FROM Purchase, Person
  WHERE buyerssn = ssn AND name = 'Christo'

Subquery motivation
Purchase(prodname, buyerssn, etc.)
Person(name, ssn, etc.)
What did Christo buy?
Natural intuition:
  1. Go find Christo’s ssn:
       SELECT ssn FROM Person WHERE name = 'Christo'
  2. Then find purchases:
       SELECT Purchase.prodname FROM Purchase WHERE buyerssn = Christo’s-ssn

Subqueries
Subquery: copy in Christo’s selection for his ssn:
  SELECT Purchase.prodname
  FROM Purchase
  WHERE buyerssn = (SELECT ssn FROM Person WHERE name = 'Christo')
The subquery returns one value, so the = is valid
If it returns more than one row, we get a run-time error
If it returns no rows, standard SQL treats the result as NULL, so no purchase matches
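The join form and the scalar-subquery form can be compared side by side. This is a sketch assuming an in-memory SQLite database via Python's `sqlite3`; the rows and ssn values are invented for illustration. One caveat: SQLite does not raise the run-time error for a multi-row scalar subquery (it uses the first row), so only the one-row and zero-row cases are exercised here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person(name TEXT, ssn TEXT);
    CREATE TABLE Purchase(prodname TEXT, buyerssn TEXT);
    -- Hypothetical sample data
    INSERT INTO Person VALUES ('Christo', '111'), ('Martha', '222');
    INSERT INTO Purchase VALUES ('gates', '111'), ('fabric', '111'), ('cake', '222');
""")

with_join = conn.execute("""
    SELECT Purchase.prodname
    FROM Purchase, Person
    WHERE buyerssn = ssn AND name = 'Christo'
    ORDER BY Purchase.prodname
""").fetchall()

subquery_sql = """
    SELECT Purchase.prodname
    FROM Purchase
    WHERE buyerssn = (SELECT ssn FROM Person WHERE name = ?)
    ORDER BY Purchase.prodname
"""
with_subquery = conn.execute(subquery_sql, ("Christo",)).fetchall()

# No such person: the scalar subquery yields NULL, so nothing matches.
nobody = conn.execute(subquery_sql, ("Nobody",)).fetchall()

print(with_join, with_subquery, nobody)
```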

Operators on subqueries
Several new operators applied to (unary) selections:
  1. IN R
  2. EXISTS R
  3. UNIQUE R
  4. s > ALL R
  5. s > ANY R
  6. x IN R
> is just an example op
Each expression can be negated with NOT

Subqueries with IN
Product(name, maker), Person(name, ssn), Purchase(buyerssn, product)
Q: Find companies Martha bought products from
Strategy:
  1. Find Martha’s ssn
  2. Find products listed with that ssn as buyer
  3. Find company names of those products
  SELECT DISTINCT Product.maker
  FROM Product
  WHERE Product.name IN
    (SELECT Purchase.product
     FROM Purchase
     WHERE Purchase.buyerssn =
       (SELECT ssn FROM Person WHERE name = 'Martha'))

Subqueries returning relations
Equivalent to:
  SELECT DISTINCT Product.maker
  FROM Product, Purchase, Person
  WHERE Product.name = Purchase.product
    AND Purchase.buyerssn = Person.ssn
    AND Person.name = 'Martha'
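To see that the nested IN query and the three-table join return the same makers, both can be run on the same data. A minimal sketch, assuming an in-memory SQLite database via Python's `sqlite3`; the products, makers and ssn values are hypothetical. Note that the join has to qualify `name`, since both Product and Person have a column with that name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product(name TEXT, maker TEXT);
    CREATE TABLE Person(name TEXT, ssn TEXT);
    CREATE TABLE Purchase(buyerssn TEXT, product TEXT);
    -- Hypothetical sample data
    INSERT INTO Product VALUES ('gizmo', 'GizmoWorks'), ('gadget', 'GizmoWorks'),
                               ('widget', 'WidgetCo');
    INSERT INTO Person VALUES ('Martha', '222'), ('Christo', '111');
    INSERT INTO Purchase VALUES ('222', 'gizmo'), ('222', 'gadget'), ('111', 'widget');
""")

with_in = conn.execute("""
    SELECT DISTINCT Product.maker
    FROM Product
    WHERE Product.name IN
        (SELECT Purchase.product
         FROM Purchase
         WHERE Purchase.buyerssn =
             (SELECT ssn FROM Person WHERE name = 'Martha'))
""").fetchall()

with_join = conn.execute("""
    SELECT DISTINCT Product.maker
    FROM Product, Purchase, Person
    WHERE Product.name = Purchase.product
      AND Purchase.buyerssn = Person.ssn
      AND Person.name = 'Martha'
""").fetchall()

print(with_in, with_join)
```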

FROM subqueries
Motivation for another way:
  - suppose we’re given Martha’s purchases
  - Then could just cross with Product to get product makers
  - Substitute (named) subquery for Martha’s purchases
  SELECT Product.maker
  FROM Product,
    (SELECT Purchase.product
     FROM Purchase
     WHERE Purchase.buyerssn =
       (SELECT ssn FROM Person WHERE name = 'Martha')) Marthas
  WHERE Product.name = Marthas.product
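The FROM-clause version can be tried the same way. The sketch below assumes an in-memory SQLite database via Python's `sqlite3` and invented sample rows. Because the slide's query has no DISTINCT, a maker appears once per matching purchase, which is bag semantics at work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product(name TEXT, maker TEXT);
    CREATE TABLE Person(name TEXT, ssn TEXT);
    CREATE TABLE Purchase(buyerssn TEXT, product TEXT);
    -- Hypothetical sample data
    INSERT INTO Product VALUES ('gizmo', 'GizmoWorks'), ('gadget', 'GizmoWorks'),
                               ('widget', 'WidgetCo');
    INSERT INTO Person VALUES ('Martha', '222'), ('Christo', '111');
    INSERT INTO Purchase VALUES ('222', 'gizmo'), ('222', 'gadget'), ('111', 'widget');
""")

# Marthas is the name given to the derived table in the FROM clause.
makers = conn.execute("""
    SELECT Product.maker
    FROM Product,
         (SELECT Purchase.product
          FROM Purchase
          WHERE Purchase.buyerssn =
              (SELECT ssn FROM Person WHERE name = 'Martha')) Marthas
    WHERE Product.name = Marthas.product
""").fetchall()

print(makers)  # one row per purchase, so the maker repeats
```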

ALL op
Employees(name, job, divid, salary)
Find which employees are paid more than all the programmers
  SELECT name
  FROM Employees
  WHERE salary > ALL (SELECT salary FROM Employees WHERE job = 'programmer')

ANY/SOME op
Employees(name, job, divid, salary)
Find which employees are paid more than at least one vice president
  SELECT name
  FROM Employees
  WHERE salary > ANY (SELECT salary FROM Employees WHERE job = 'VP')

ANY/SOME op
Employees(name, job, divid, salary)
Find which employees are paid more than at least one vice president
  SELECT name
  FROM Employees
  WHERE salary > SOME (SELECT salary FROM Employees WHERE job = 'VP')
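Not every system implements the quantified comparisons. SQLite, for one, has no ALL, ANY or SOME, so the sketch below swaps in an equivalent formulation: `> ALL` becomes NOT EXISTS over a counterexample and `> ANY` becomes EXISTS over a witness. It assumes Python's `sqlite3`, invented employee rows, and no NULL salaries (with NULLs the rewrites and the original operators can differ).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees(name TEXT, job TEXT, divid INTEGER, salary INTEGER);
    -- Hypothetical sample data
    INSERT INTO Employees VALUES
        ('Ann', 'programmer', 1, 90), ('Bob', 'programmer', 1, 100),
        ('Cy', 'VP', 2, 150), ('Di', 'VP', 2, 200), ('Eve', 'analyst', 1, 120);
""")

# salary > ALL (programmer salaries): no programmer earns at least as much.
more_than_all_programmers = conn.execute("""
    SELECT name FROM Employees e
    WHERE NOT EXISTS (SELECT 1 FROM Employees p
                      WHERE p.job = 'programmer' AND NOT (e.salary > p.salary))
    ORDER BY name
""").fetchall()

# salary > ANY (VP salaries): at least one VP earns less.
more_than_some_vp = conn.execute("""
    SELECT name FROM Employees e
    WHERE EXISTS (SELECT 1 FROM Employees v
                  WHERE v.job = 'VP' AND e.salary > v.salary)
    ORDER BY name
""").fetchall()

print(more_than_all_programmers, more_than_some_vp)
```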

Existential/Universal Conditions
Employees(name, job, divid, salary)
Division(name, id, head)
Find all divisions with an employee whose salary is >
Existential: easy!
  SELECT DISTINCT Division.name
  FROM Employees, Division
  WHERE salary > AND divid=id

Existential/Universal Conditions
Employees(name, job, divid, salary)
Division(name, id, head)
Find all divisions in which everyone makes >
Universal: hard!

Existential/universal with IN
1. Find the other divisions: in which someone makes <=
  SELECT name
  FROM Division
  WHERE id IN (SELECT divid FROM Employees WHERE salary <=
2. Select the divisions we didn’t find
  SELECT name
  FROM Division
  WHERE id NOT IN (SELECT divid FROM Employees WHERE salary <=
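The two-step recipe can be traced on a small example. The slides do not preserve the salary cutoff, so the sketch below uses a hypothetical `THRESHOLD = 100` purely for illustration, along with invented rows, in an in-memory SQLite database via Python's `sqlite3`.

```python
import sqlite3

THRESHOLD = 100  # hypothetical cutoff; the value on the slide is not preserved

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees(name TEXT, job TEXT, divid INTEGER, salary INTEGER);
    CREATE TABLE Division(name TEXT, id INTEGER, head TEXT);
    -- Hypothetical sample data
    INSERT INTO Division VALUES ('Eng', 1, 'Bob'), ('Exec', 2, 'Di');
    INSERT INTO Employees VALUES
        ('Ann', 'programmer', 1, 90), ('Bob', 'programmer', 1, 100),
        ('Eve', 'analyst', 1, 120), ('Cy', 'VP', 2, 150), ('Di', 'VP', 2, 200);
""")

# Existential: some employee in the division makes more than the threshold.
some_above = conn.execute("""
    SELECT DISTINCT Division.name FROM Employees, Division
    WHERE salary > ? AND divid = id ORDER BY Division.name
""", (THRESHOLD,)).fetchall()

# Step 1: divisions in which someone makes <= the threshold.
someone_low = conn.execute("""
    SELECT name FROM Division
    WHERE id IN (SELECT divid FROM Employees WHERE salary <= ?)
""", (THRESHOLD,)).fetchall()

# Step 2 (universal): the divisions step 1 did not find.
everyone_above = conn.execute("""
    SELECT name FROM Division
    WHERE id NOT IN (SELECT divid FROM Employees WHERE salary <= ?)
""", (THRESHOLD,)).fetchall()

print(some_above, someone_low, everyone_above)
```

One design point worth noticing: a division with no employees at all would also pass the NOT IN test, since the universal condition holds vacuously.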

Acc(name,bal,type…)
Q: Who has the largest balance?
Can we do this with subqueries?

Correlated Queries
Last time: Acc(name,bal,type,…)
Q: Find holder of largest account
  SELECT name
  FROM Acc
  WHERE bal >= ALL (SELECT bal FROM Acc)

Correlated Queries
So far, subquery executed once;
  - result used for higher query
More complicated: correlated queries
“[T]he subquery… [is] evaluated many times, once for each assignment of a value to some term in the subquery that comes from a tuple variable outside the subquery” (Ullman, p286).
Q: What does this mean?
A: That subqueries refer to vars from outer queries

Correlated Queries
Last time: Acc(name,bal,type,…)
Q2: Find holder of largest account of each type
  SELECT name, type
  FROM Acc
  WHERE bal >= ALL (SELECT bal FROM Acc WHERE type=type)
correlation: type=type

Correlated Queries
Last time: Acc(name,bal,type,…)
Q2: Find holder of largest account of each type
  SELECT name, type
  FROM Acc a1
  WHERE bal >= ALL (SELECT bal FROM Acc WHERE type=a1.type)
correlation: type=a1.type
Note:
  1. scope of variables
  2. this can still be expressed as single SFW
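The scoping point is easy to demonstrate: without the alias, `type=type` compares the inner table's column with itself. The sketch below assumes Python's `sqlite3` with invented accounts. Since SQLite has no ALL operator, `bal >= ALL (...)` is replaced here by `bal = (SELECT MAX(bal) ...)`, which picks the same rows as long as balances are not NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Acc(name TEXT, bal INTEGER, type TEXT);
    -- Hypothetical sample data
    INSERT INTO Acc VALUES ('Al', 100, 'checking'), ('Bea', 300, 'checking'),
                           ('Cal', 250, 'savings'), ('Dee', 50, 'savings');
""")

# Uncorrelated by accident: both sides of type = type refer to the inner Acc,
# so the subquery returns the overall maximum.
uncorrelated = conn.execute("""
    SELECT name, type FROM Acc
    WHERE bal = (SELECT MAX(bal) FROM Acc WHERE type = type)
""").fetchall()

# Correlated: a1.type comes from the outer query's current row, so the
# subquery is conceptually re-evaluated for each outer row.
correlated = conn.execute("""
    SELECT name, type FROM Acc a1
    WHERE bal = (SELECT MAX(bal) FROM Acc WHERE type = a1.type)
    ORDER BY name
""").fetchall()

print(uncorrelated)
print(correlated)
```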

EXCEPT and INTERSECT
INTERSECT:
  (SELECT R.A, R.B FROM R)
  INTERSECT
  (SELECT S.A, S.B FROM S)
corresponds to:
  SELECT R.A, R.B
  FROM R
  WHERE EXISTS (SELECT * FROM S WHERE R.A=S.A AND R.B=S.B)
EXCEPT:
  (SELECT R.A, R.B FROM R)
  EXCEPT
  (SELECT S.A, S.B FROM S)
corresponds to:
  SELECT R.A, R.B
  FROM R
  WHERE NOT EXISTS (SELECT * FROM S WHERE R.A=S.A AND R.B=S.B)
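Running both forms on a relation with a duplicate row shows where the correspondence is exact and where it is not. A sketch assuming Python's `sqlite3` and hypothetical tables R(A, B) and S(A, B). SQLite does not accept parentheses around the operands of a compound SELECT, so they are dropped here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R(A INTEGER, B INTEGER);
    CREATE TABLE S(A INTEGER, B INTEGER);
    -- Hypothetical sample data; R holds (1, 2) twice
    INSERT INTO R VALUES (1, 2), (1, 2), (3, 4);
    INSERT INTO S VALUES (1, 2), (5, 6);
""")

def run(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

intersect_rows = run("SELECT R.A, R.B FROM R INTERSECT SELECT S.A, S.B FROM S")
exists_rows = run("""SELECT R.A, R.B FROM R
                     WHERE EXISTS (SELECT * FROM S WHERE R.A = S.A AND R.B = S.B)""")

except_rows = run("SELECT R.A, R.B FROM R EXCEPT SELECT S.A, S.B FROM S")
not_exists_rows = run("""SELECT R.A, R.B FROM R
                         WHERE NOT EXISTS (SELECT * FROM S WHERE R.A = S.A AND R.B = S.B)""")

print(intersect_rows, exists_rows)
print(except_rows, not_exists_rows)
```

INTERSECT and EXCEPT eliminate duplicates, while the EXISTS forms keep every qualifying row of R, so the two agree as sets but may differ as bags unless DISTINCT is added.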

Grouping & Aggregation ops
In SQL:
  - aggregation operators in SELECT,
  - Grouping in GROUP BY clause
Recall aggregation operators:
  - sum, avg, min, max, count
  - strings, numbers, dates
  - Each applies to scalars
  - Count also applies to row: count(*)
  - Can DISTINCT inside aggregation op: count(DISTINCT x)
Grouping: group rows that agree on single value
  - Each group becomes one row in result

Aggregation functions
Numerical: SUM, AVG, MIN, MAX
Char: MIN, MAX
  - In lexicographic/alphabetic order
Any attribute: COUNT
  - Number of values
Example on a table with columns A, B:
  SUM(B) = 10
  AVG(A) = 1.5
  MIN(A) = 1
  MAX(A) = 3
  COUNT(A) = 4

Acc(name,bal,type)
Q: Who has the largest balance?
Can we do this with aggregation functions?
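One possible answer to the slide's question is to compute MAX(bal) in a scalar subquery and compare each balance against it. This is a sketch, not necessarily the formulation intended in the lecture; it assumes Python's `sqlite3` and invented account rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Acc(name TEXT, bal INTEGER, type TEXT);
    -- Hypothetical sample data
    INSERT INTO Acc VALUES ('Al', 100, 'checking'), ('Bea', 300, 'checking'),
                           ('Cal', 250, 'savings'), ('Dee', 50, 'savings');
""")

# The aggregate yields a single value, so the subquery is a valid scalar.
largest = conn.execute("""
    SELECT name FROM Acc
    WHERE bal = (SELECT MAX(bal) FROM Acc)
""").fetchall()

print(largest)
```

If several accounts tie for the largest balance, this form returns all of their holders.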

Straight aggregation
In R.A.: γ_{sum(x)→total}(R)
In SQL: Just put the aggregation op in SELECT
  SELECT SUM(x) total
  FROM R
NB: aggreg. ops applied to each non-null val
  - count(x) counts the number of non-null vals in field x
  - Use count(*) to count the number of rows
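The NULL-handling rule is easiest to see with one NULL in the column. A minimal sketch, assuming Python's `sqlite3` and a hypothetical one-column table R(x).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R(x INTEGER);
    -- Hypothetical sample data with one NULL
    INSERT INTO R VALUES (1), (2), (NULL), (3);
""")

# Aggregates skip NULLs; COUNT(*) counts rows regardless of their contents.
total, non_null_count, row_count = conn.execute(
    "SELECT SUM(x) total, COUNT(x), COUNT(*) FROM R"
).fetchone()

print(total, non_null_count, row_count)
```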

Straight aggregation example
COUNT applies to duplicates, unless otherwise stated:
  SELECT Count(category)
  FROM Product
  WHERE year > 1995
  - same as Count(*), except excludes nulls
Better:
  SELECT COUNT(DISTINCT category)
  FROM Product
  WHERE year > 1995
Can we say:
  SELECT category, COUNT(category)
  FROM Product
  WHERE year > 1995
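The difference between the counts shows up once the data contains a repeated category and a NULL. The sketch below assumes Python's `sqlite3` and invented Product rows. For the last query on the slide, standard SQL needs a GROUP BY before a plain column can sit next to an aggregate, so the grouped form is shown instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product(name TEXT, category TEXT, year INTEGER);
    -- Hypothetical sample data
    INSERT INTO Product VALUES
        ('a', 'toy', 1996), ('b', 'toy', 1997), ('c', 'food', 1998),
        ('d', NULL, 1999), ('e', 'toy', 1990);
""")

count_all, count_distinct, count_rows = conn.execute("""
    SELECT COUNT(category), COUNT(DISTINCT category), COUNT(*)
    FROM Product WHERE year > 1995
""").fetchone()

# One result row per category: each group collapses to a single row.
per_category = conn.execute("""
    SELECT category, COUNT(category)
    FROM Product
    WHERE year > 1995 AND category IS NOT NULL
    GROUP BY category
    ORDER BY category
""").fetchall()

print(count_all, count_distinct, count_rows, per_category)
```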

Straight aggregation example
Purchase(product, date, price, quantity)
Q: Find total sales for the entire database:
  SELECT SUM(price * quantity)
  FROM Purchase
Q: Find total sales of bagels:
  SELECT SUM(price * quantity)
  FROM Purchase
  WHERE product = 'bagel'
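Both totals can be computed on a few sample rows. A sketch assuming Python's `sqlite3`; the purchases, dates and prices are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Purchase(product TEXT, date TEXT, price REAL, quantity INTEGER);
    -- Hypothetical sample data
    INSERT INTO Purchase VALUES
        ('bagel', '2005-03-01', 1.5, 4),
        ('bagel', '2005-03-02', 1.5, 2),
        ('banana', '2005-03-01', 0.5, 10);
""")

# The expression price * quantity is evaluated per row, then summed.
(total_sales,) = conn.execute(
    "SELECT SUM(price * quantity) FROM Purchase"
).fetchone()

(bagel_sales,) = conn.execute(
    "SELECT SUM(price * quantity) FROM Purchase WHERE product = 'bagel'"
).fetchone()

print(total_sales, bagel_sales)
```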