Using Relational Databases and SQL Steven Emory Department of Computer Science California State University, Los Angeles Lecture 7: Aggregates.



Miscellany Midterm Questions? Too easy? Too hard?

Topics for Today
Aggregate (Set) Functions (Pages 49 – 54)
GROUP BY Clause (Pages 54 – 55)
HAVING Clause (Pages 55 – 58)
WITH ROLLUP (not in book)

Aggregate Functions
The SQL standard calls these Set Functions.
Aggregate/Non-aggregate similarities:
  Both take some kind of input
  Both perform operations on the input
  Both have a single output
Aggregate/Non-aggregate differences:
  Input to an aggregate function is a set of data
  Input to a non-aggregate function is a single item

Examples
Function Example:
    SELECT LEFT(ArtistName, 1) AS 'First Letter of Artist Name' FROM Artists;
Aggregate Example:
    SELECT COUNT(ArtistName) AS 'Artist Count' FROM Artists;

Aggregate Functions
COUNT(*), COUNT(fieldname)
AVG(fieldname)
MIN(fieldname), MAX(fieldname)
SUM(fieldname)

COUNT
COUNT(*)
  Counts the number of rows in a table, including rows that contain NULLs
    -- This query returns 11.
    SELECT COUNT(*) AS 'Number of Artists' FROM Artists;
COUNT(fieldname)
  Counts the rows in which fieldname is not NULL
  Excludes NULLs (doesn't count them)
    -- This query also returns 11, since no ArtistID is NULL.
    SELECT COUNT(ArtistID) AS 'Number of Artists' FROM Artists;
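The difference between COUNT(*) and COUNT(fieldname) only shows up when a column contains NULLs, so here is a small sketch that makes it visible. It is an illustration only: it uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for MySQL, and the WebAddress column and the four sample rows are made up for this example rather than taken from the course database.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the table contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Artists (ArtistID INTEGER PRIMARY KEY, ArtistName TEXT, WebAddress TEXT)"
)
conn.executemany(
    "INSERT INTO Artists (ArtistID, ArtistName, WebAddress) VALUES (?, ?, ?)",
    [
        (1, "Artist A", "www.a.example"),
        (2, "Artist B", None),  # NULL web address
        (3, "Artist C", "www.c.example"),
        (4, "Artist D", None),  # NULL web address
    ],
)

# COUNT(*) counts rows; COUNT(column) counts only the non-NULL values in that column.
all_rows, ids, web = conn.execute(
    "SELECT COUNT(*), COUNT(ArtistID), COUNT(WebAddress) FROM Artists"
).fetchone()
print(all_rows, ids, web)
conn.close()
```

With this sample data the three counts are 4, 4, and 2: every row is counted by COUNT(*), every row has an ArtistID, and only two rows have a non-NULL WebAddress.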

AVG
AVG(fieldname)
  Averages all the data under fieldname
  Excludes NULLs (doesn't count NULL as 0)
    -- Averages all track lengths.
    SELECT AVG(LengthSeconds) AS 'AvgLength' FROM Tracks;

MIN and MAX
MIN(fieldname)
  Returns the minimum value under fieldname
    -- Returns the minimum track length.
    SELECT MIN(LengthSeconds) AS 'Shortest Track' FROM Tracks;
MAX(fieldname)
  Returns the maximum value under fieldname
    -- Returns the maximum track length.
    SELECT MAX(LengthSeconds) AS 'Longest Track' FROM Tracks;

SUM
SUM(fieldname)
  Sums all the data under fieldname
  Excludes NULLs (doesn't count NULL as 0)
    -- Sums all of the track lengths.
    SELECT SUM(LengthSeconds) AS 'Total Length' FROM Tracks;
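The "excludes NULLs" rule matters most for AVG, because skipping a row is not the same as treating it as 0. The following sketch shows this under some assumptions: SQLite (through Python's sqlite3 module) stands in for MySQL, and the four Tracks rows are invented for the example.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tracks (TrackID INTEGER PRIMARY KEY, LengthSeconds INTEGER)")
conn.executemany(
    "INSERT INTO Tracks (TrackID, LengthSeconds) VALUES (?, ?)",
    [(1, 100), (2, 200), (3, None), (4, 300)],  # track 3 has a NULL length
)

avg_len, min_len, max_len, total_len = conn.execute(
    "SELECT AVG(LengthSeconds), MIN(LengthSeconds), "
    "MAX(LengthSeconds), SUM(LengthSeconds) FROM Tracks"
).fetchone()

# The NULL row is skipped: AVG is 600 / 3 = 200, not 600 / 4 = 150.
print(avg_len, min_len, max_len, total_len)
conn.close()
```

If the NULL had been counted as 0, the average would have come out as 150 instead of 200.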

More Aggregate Functions
The SQL99 standard only requires the five aggregate functions we have talked about so far.
MySQL adds several MySQL-specific ones; see the aggregate functions section of the MySQL reference manual.

Filtering Aggregate Calculations
To exclude items from being aggregated, you may use the WHERE clause.
    -- Example: Count the number of male members.
    SELECT COUNT(*) FROM Members WHERE Gender = 'M';
    -- Example: Count the number of female members.
    SELECT COUNT(*) FROM Members WHERE Gender = 'F';

Mixing Field Types
Can we calculate both with a single query?
Well, we would need to mix non-aggregated fieldnames with aggregated ones.
    -- Example: What does this do? Does it work? No!
    SELECT Gender, COUNT(*) FROM Members;

Grouping Tables
You can mix non-aggregated and aggregated fieldnames, and get aggregates to return multiple values per table, by grouping the table.
    -- Groups the members table by Gender.
    SELECT * FROM Members GROUP BY Gender;
    -- Groups and counts the members table by Gender.
    SELECT Gender, COUNT(*) FROM Members GROUP BY Gender;
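To compare the two approaches side by side, the sketch below runs the two filtered counts and then the single grouped query against the same data. The assumptions: SQLite via Python's sqlite3 module stands in for MySQL, the Members table is reduced to two columns, and the five rows are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the member rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Members (MemberID INTEGER PRIMARY KEY, Gender TEXT)")
conn.executemany(
    "INSERT INTO Members (MemberID, Gender) VALUES (?, ?)",
    [(1, "M"), (2, "F"), (3, "M"), (4, "F"), (5, "M")],
)

# Two separate filtered counts, one query per gender.
males = conn.execute("SELECT COUNT(*) FROM Members WHERE Gender = 'M'").fetchone()[0]
females = conn.execute("SELECT COUNT(*) FROM Members WHERE Gender = 'F'").fetchone()[0]

# One grouped query returns a count for every gender at once.
# ORDER BY is added only to make the row order predictable.
by_gender = conn.execute(
    "SELECT Gender, COUNT(*) FROM Members GROUP BY Gender ORDER BY Gender"
).fetchall()
print(males, females, by_gender)
conn.close()
```

The grouped query gives the same numbers as the two WHERE queries (3 males, 2 females), but in one result set with one row per gender.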

How GROUP BY Works
GROUP BY begins by sorting the table based on the grouping attribute (in our case, Gender).
If any aggregates are present, GROUP BY causes each aggregate to be applied per-group rather than per-table.
GROUP BY then condenses the table so that each group only appears once in the table (if listed) and displays any aggregated values along with it.
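The three steps above (sort, aggregate per group, condense) can be imitated in a few lines of plain Python. This is only a conceptual model of the description on this slide, not how any particular database engine is implemented (an engine may, for example, hash rows instead of sorting them). The member rows are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical (MemberID, Gender) rows standing in for the Members table.
members = [(1, "M"), (2, "F"), (3, "M"), (4, "F"), (5, "M")]

# Step 1: sort the rows on the grouping attribute (Gender).
sorted_rows = sorted(members, key=itemgetter(1))

# Steps 2 and 3: apply the aggregate (here COUNT(*)) to each group and
# condense each group down to a single output row.
result = [
    (gender, sum(1 for _ in rows))
    for gender, rows in groupby(sorted_rows, key=itemgetter(1))
]
print(result)
```

The result has one row per gender, which mirrors what SELECT Gender, COUNT(*) FROM Members GROUP BY Gender returns.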

GROUP BY Example

Grouping on Multiple Fields
GROUP BY can use multiple fieldnames (similar to how you can sort using multiple fieldnames).
    -- Example: Report the number of members by region and gender.
    SELECT Region, Gender, COUNT(*) FROM Members GROUP BY Region, Gender;
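A small run of the region-and-gender query shows what a "group" means when there are two grouping fields: each distinct (Region, Gender) pair becomes one output row. As before, this is a sketch with SQLite standing in for MySQL, and the five member rows are invented.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the member rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Members (MemberID INTEGER PRIMARY KEY, Region TEXT, Gender TEXT)"
)
conn.executemany(
    "INSERT INTO Members (MemberID, Region, Gender) VALUES (?, ?, ?)",
    [(1, "IN", "M"), (2, "IN", "F"), (3, "IN", "M"), (4, "TX", "F"), (5, "TX", "F")],
)

# One output row per distinct (Region, Gender) combination.
# ORDER BY is added only to make the row order predictable.
rows = conn.execute(
    "SELECT Region, Gender, COUNT(*) FROM Members "
    "GROUP BY Region, Gender ORDER BY Region, Gender"
).fetchall()
print(rows)
conn.close()
```

Note that there is no ('TX', 'M') row in the output: a combination with no members forms no group, so it is simply absent rather than reported with a count of 0.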

Filtering Based on Aggregates
Can we use aggregate functions in the WHERE clause?
    -- List all titles (names of titles, not title ids) that have an average track length of over 5 minutes.
    SELECT Title, AVG(LengthSeconds) FROM Titles JOIN Tracks USING(TitleID) WHERE AVG(LengthSeconds) > 5*60 GROUP BY TitleID;
The answer is no, because a WHERE clause condition is evaluated once per row; an aggregate isn't finished calculating until after all of the rows have been processed!

The HAVING Clause
Solution is to use the HAVING clause.
Example:
    -- List all titles (names of titles, not title ids) that have an average track length of over 5 minutes.
    SELECT Title, AVG(LengthSeconds) FROM Titles JOIN Tracks USING(TitleID) GROUP BY TitleID HAVING AVG(LengthSeconds) > 5*60;

How HAVING Works
In previous example:
This is calculated first...
    SELECT Title, AVG(LengthSeconds) FROM Titles JOIN Tracks USING(TitleID) GROUP BY TitleID;
Then those results are filtered by the HAVING clause...
    SELECT Title, AVG(LengthSeconds) FROM Titles JOIN Tracks USING(TitleID) GROUP BY TitleID HAVING AVG(LengthSeconds) > 5*60;
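Both halves of this story (an aggregate in WHERE is rejected, the same condition in HAVING works) can be tried out in a small sketch. The assumptions: SQLite via Python's sqlite3 module stands in for MySQL, so the error message differs from the MySQL one, and the two titles and four tracks are made-up sample data.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; titles and tracks are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Titles (TitleID INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE Tracks (TitleID INTEGER, TrackNum INTEGER, LengthSeconds INTEGER);
    INSERT INTO Titles VALUES (1, 'Short Songs'), (2, 'Long Songs');
    INSERT INTO Tracks VALUES (1, 1, 120), (1, 2, 180), (2, 1, 400), (2, 2, 320);
""")

# Attempt 1: an aggregate inside WHERE. The database rejects the query.
try:
    conn.execute(
        "SELECT Title, AVG(LengthSeconds) FROM Titles JOIN Tracks USING(TitleID) "
        "WHERE AVG(LengthSeconds) > 5*60 GROUP BY TitleID"
    )
    where_failed = False
except sqlite3.OperationalError as exc:
    where_failed = True
    print("WHERE with an aggregate is rejected:", exc)

# Attempt 2: the same condition in HAVING, applied after the groups are built.
long_titles = conn.execute(
    "SELECT Title, AVG(LengthSeconds) FROM Titles JOIN Tracks USING(TitleID) "
    "GROUP BY TitleID HAVING AVG(LengthSeconds) > 5*60"
).fetchall()
print(long_titles)
conn.close()
```

Here 'Short Songs' averages 150 seconds and 'Long Songs' averages 360 seconds, so only 'Long Songs' passes the 5*60 threshold.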

How HAVING Works
So in other words:
  WHERE filters per row (filters before aggregation)
  HAVING filters per aggregated group (filters after aggregation)
Since HAVING filters on groups:
  You cannot use just any fieldname you want in a HAVING clause; only the ones you choose to display and group by
Example on next slide...

Having Examples
Works:
    SELECT Title, AVG(LengthSeconds) FROM Titles JOIN Tracks USING(TitleID) GROUP BY TitleID HAVING AVG(LengthSeconds) > 5*60;
Doesn't work:
    SELECT Title, AVG(LengthSeconds) FROM Titles JOIN Tracks USING(TitleID) GROUP BY TitleID HAVING LengthSeconds < AVG(LengthSeconds);

Having Examples
Why doesn't it work? Because LengthSeconds is a property of a track, and not a property of a group. You can only use group properties in a HAVING clause.
In other words, since TitleID is a property of the aggregated group (since we are grouping by TitleID), we can use it in the HAVING clause.
    SELECT Title, AVG(LengthSeconds) FROM Titles JOIN Tracks USING(TitleID) GROUP BY TitleID HAVING AVG(LengthSeconds) > 5*60 AND TitleID > 6;
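The working case, a HAVING clause that combines an aggregate with the group property TitleID, can be checked with a short sketch. It covers only the query that works; it does not try to reproduce the MySQL error for the failing query. SQLite stands in for MySQL here, and the four titles and their track lengths are invented.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; titles and tracks are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Titles (TitleID INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE Tracks (TitleID INTEGER, TrackNum INTEGER, LengthSeconds INTEGER);
    INSERT INTO Titles VALUES (5, 'Title Five'), (6, 'Title Six'),
                              (7, 'Title Seven'), (8, 'Title Eight');
    INSERT INTO Tracks VALUES (5, 1, 400), (6, 1, 350),
                              (7, 1, 310), (7, 2, 350), (8, 1, 100);
""")

# Aggregate condition only: keeps every group averaging over 5 minutes.
over_five_min = conn.execute(
    "SELECT Title, AVG(LengthSeconds) FROM Titles JOIN Tracks USING(TitleID) "
    "GROUP BY TitleID HAVING AVG(LengthSeconds) > 5*60 ORDER BY TitleID"
).fetchall()

# Aggregate condition plus a group property (TitleID is what we group by).
over_five_min_after_6 = conn.execute(
    "SELECT Title, AVG(LengthSeconds) FROM Titles JOIN Tracks USING(TitleID) "
    "GROUP BY TitleID HAVING AVG(LengthSeconds) > 5*60 AND TitleID > 6 "
    "ORDER BY TitleID"
).fetchall()
print(over_five_min)
print(over_five_min_after_6)
conn.close()
```

Titles 5, 6, and 7 pass the average-length test, and adding TitleID > 6 narrows that down to title 7 alone.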

HAVING Summary
So in a HAVING clause:
  You can use aggregate functions
  You can use constant values
  You can use group properties
Anything else and... Happy error time! Usually “ERROR 1111 (HY000): Invalid use of group function”

An Advanced HAVING Problem
List the region, country, and average member age of all members located within that region and country, for only those regions and countries that have an average member age greater than 40.
Remember that nobody ever says “I'm years old!”

Solution
    SELECT Region, Country,
           TRUNCATE(AVG(TRUNCATE(DATEDIFF(CurDate(), Birthday)/365, 0)), 0) AS 'Average Age'
    FROM Members
    GROUP BY Region, Country
    HAVING TRUNCATE(AVG(TRUNCATE(DATEDIFF(CurDate(), Birthday)/365, 0)), 0) > 40;

WITH ROLLUP
Used to perform extra data analysis.
For example, let's say you also wanted to display the average age of all members from any region and country:
    SELECT Region, Country,
           TRUNCATE(AVG(TRUNCATE(DATEDIFF(CurDate(), Birthday)/365, 0)), 0) AS 'Average Age'
    FROM Members
    GROUP BY Region, Country WITH ROLLUP;
To get this extra data, you would normally have to run another query or use a union.
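To see which extra rows WITH ROLLUP contributes, it helps to write out the "use a union" alternative by hand. The sketch below does that under several assumptions: SQLite (via Python's sqlite3 module) stands in for MySQL and has no WITH ROLLUP, so the summary rows are built with UNION ALL; a plain integer Age column replaces the Birthday arithmetic; and the four member rows are made up.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; Age and the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Members (Region TEXT, Country TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO Members (Region, Country, Age) VALUES (?, ?, ?)",
    [("IN", "USA", 30), ("IN", "USA", 50), ("TX", "USA", 40), ("ON", "Canada", 20)],
)

# Hand-written equivalent of GROUP BY Region, Country WITH ROLLUP:
#   1. the ordinary per-(Region, Country) averages,
#   2. a summary row per Region (Country shown as NULL),
#   3. a grand-total row (both shown as NULL).
rows = conn.execute("""
    SELECT Region, Country, AVG(Age) FROM Members GROUP BY Region, Country
    UNION ALL
    SELECT Region, NULL, AVG(Age) FROM Members GROUP BY Region
    UNION ALL
    SELECT NULL, NULL, AVG(Age) FROM Members
""").fetchall()
for row in rows:
    print(row)
conn.close()
```

The last kind of row, with NULL in both Region and Country, is the "average age of all members from any region and country" that the slide is after; WITH ROLLUP produces it (and the per-region summaries) without the extra queries.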

Pre-Lab Bonus
Do problems from book, chapter 3, page 65, problems 1 – 9. Due before lab, R 11:30 am.
For #4, you should get 'Alvarez.'
For #6, use a join instead of a subquery.
For #7, use a join and aggregates only. No subqueries. This is a tricky problem.
For #8, better get IN and TX.
For #9, use a LEFT JOIN instead of a subquery.
+3 points to midterm grade for 1 – 6 and 8 – points to midterm grade for 7.