04 | Grouping and Aggregating Data Brian Alderman | MCT, CEO / Founder of MicroTechPoint Tobias Ternstrom | Microsoft SQL Server Program Manager.


1 04 | Grouping and Aggregating Data Brian Alderman | MCT, CEO / Founder of MicroTechPoint Tobias Ternstrom | Microsoft SQL Server Program Manager

2 Querying Microsoft SQL Server 2012 Jump Start
- 01 | Introducing SQL Server 2012: SQL Server types of statements; other SQL statement elements; basic SELECT statements
- 02 | Advanced SELECT Statements: DISTINCT, aliases, scalar functions and CASE, using JOIN and MERGE; filtering and sorting data, NULL values
- 03 | SQL Server Data Types: Introduce data types, data type usage, converting data types, understanding SQL Server function types
- 04 | Grouping and Aggregating Data: Aggregate functions, GROUP BY and HAVING clauses, subqueries (self-contained, correlated, and EXISTS); views, inline table-valued functions, and derived tables
- Lunch Break: Eat, drink, and recharge for the afternoon session

3 Module Overview
- Aggregate functions
- GROUP BY and HAVING clauses
- Subqueries (self-contained, correlated, and EXISTS)
- Working with table functions

4 Aggregate Functions

5 Common built-in aggregate functions
- Common: SUM, MIN, MAX, AVG, COUNT, COUNT_BIG
- Statistical: STDEV, STDEVP, VAR, VARP
- Other: CHECKSUM_AGG, GROUPING, GROUPING_ID

6 Working with aggregate functions
Aggregate functions:
- Return a scalar value (with no column name)
- Ignore NULLs, except in COUNT(*)
- Can be used in SELECT, HAVING, and ORDER BY clauses
- Are frequently used with the GROUP BY clause

    SELECT COUNT(DISTINCT SalesOrderID) AS UniqueOrders,
           AVG(UnitPrice) AS Avg_UnitPrice,
           MIN(OrderQty) AS Min_OrderQty,
           MAX(LineTotal) AS Max_LineTotal
    FROM Sales.SalesOrderDetail;

    UniqueOrders  Avg_UnitPrice  Min_OrderQty  Max_LineTotal
    ------------  -------------  ------------  -------------
    31465         465.0934       1             27893.619000
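The NULL-handling rule above can be checked on a toy table. This is a minimal sketch that assumes SQLite (through Python's standard sqlite3 module) as a stand-in for SQL Server; the table `order_detail` and its columns are hypothetical and are not part of AdventureWorks.

```python
import sqlite3

# Hypothetical toy table; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_detail (order_id INTEGER, unit_price REAL)")
conn.executemany(
    "INSERT INTO order_detail VALUES (?, ?)",
    [(1, 10.0), (1, 20.0), (2, None), (3, 30.0)],  # one NULL price
)

row = conn.execute(
    """
    SELECT COUNT(*)                 AS all_rows,      -- counts every row
           COUNT(unit_price)        AS priced_rows,   -- skips the NULL
           COUNT(DISTINCT order_id) AS unique_orders,
           AVG(unit_price)          AS avg_price      -- averages 3 values, not 4
    FROM order_detail
    """
).fetchone()
print(row)  # (4, 3, 3, 20.0)
```

Only COUNT(*) sees the row with the NULL price; the average is taken over the three non-NULL values.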

7 Using DISTINCT with aggregate functions
- Use DISTINCT with aggregate functions to summarize only unique values
- DISTINCT aggregates eliminate duplicate values, not rows (unlike SELECT DISTINCT)
- Compare (with partial results):

    SELECT SalesPersonID,
           YEAR(OrderDate) AS OrderYear,
           COUNT(CustomerID) AS All_Custs,
           COUNT(DISTINCT CustomerID) AS Unique_Custs
    FROM Sales.SalesOrderHeader
    GROUP BY SalesPersonID, YEAR(OrderDate);

    SalesPersonID  OrderYear  All_Custs  Unique_Custs
    -------------  ---------  ---------  ------------
    289            2006       84         48
    281            2008       52         27
    285            2007       9          8
    277            2006       140        57
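A small sketch of the same comparison, assuming SQLite via Python's sqlite3 in place of SQL Server and a hypothetical `orders` table with made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per order.
conn.execute("CREATE TABLE orders (salesperson_id INTEGER, customer_id INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(277, 1), (277, 1), (277, 2), (289, 3), (289, 3)],
)

rows = conn.execute(
    """
    SELECT salesperson_id,
           COUNT(customer_id)          AS all_custs,     -- one per order row
           COUNT(DISTINCT customer_id) AS unique_custs   -- duplicates collapsed
    FROM orders
    GROUP BY salesperson_id
    ORDER BY salesperson_id
    """
).fetchall()
print(rows)  # [(277, 3, 2), (289, 2, 1)]
```

The number of output rows is the same either way; DISTINCT inside the aggregate changes only the value that is counted.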

8 Using the GROUP BY clause
- GROUP BY creates groups for output rows, according to the unique combination of values specified in the GROUP BY clause
- GROUP BY calculates a summary value for aggregate functions in subsequent phases
- Detail rows are "lost" after the GROUP BY clause is processed

    SELECT <select_list>
    FROM <table_source>
    WHERE <search_condition>
    GROUP BY <group_by_list>;

    SELECT SalesPersonID, COUNT(*) AS Cnt
    FROM Sales.SalesOrderHeader
    GROUP BY SalesPersonID;
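To see how detail rows collapse into one row per unique combination of grouping values, here is a minimal sketch. It assumes SQLite (Python's sqlite3) rather than SQL Server, and the `orders` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with four detail rows.
conn.execute("CREATE TABLE orders (salesperson_id INTEGER, order_year INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(277, 2006), (277, 2006), (277, 2007), (289, 2006)],
)

groups = conn.execute(
    """
    SELECT salesperson_id, order_year, COUNT(*) AS cnt
    FROM orders
    GROUP BY salesperson_id, order_year   -- one group per unique pair
    ORDER BY salesperson_id, order_year
    """
).fetchall()
print(groups)  # [(277, 2006, 2), (277, 2007, 1), (289, 2006, 1)]
```

Four detail rows become three groups; after grouping, the individual orders are no longer visible to the SELECT list.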

9 Using Aggregate functions Demo

10 GROUP BY and HAVING

11 GROUP BY and logical order of operations
- HAVING, SELECT, and ORDER BY must return a single value per group
- All columns in SELECT, HAVING, and ORDER BY must appear in the GROUP BY clause or be inputs to aggregate expressions
- If a query uses GROUP BY, all subsequent phases operate on the groups, not the source rows

12 Using GROUP BY with aggregate functions
- Aggregate functions are commonly used in the SELECT clause to summarize per group:

    SELECT CustomerID, COUNT(*) AS cnt
    FROM Sales.SalesOrderHeader
    GROUP BY CustomerID;

- Aggregate functions may refer to any columns, not just those in the GROUP BY clause:

    SELECT productid, MAX(OrderQty) AS largest_order
    FROM Sales.SalesOrderDetail
    GROUP BY productid;
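As a quick illustration of an aggregate reading a column that is not in the GROUP BY list, the following sketch assumes SQLite via Python's sqlite3 and a hypothetical `order_detail` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: order_qty is aggregated but never grouped on.
conn.execute("CREATE TABLE order_detail (product_id INTEGER, order_qty INTEGER)")
conn.executemany(
    "INSERT INTO order_detail VALUES (?, ?)",
    [(707, 1), (707, 4), (708, 2)],
)

largest = conn.execute(
    """
    SELECT product_id, MAX(order_qty) AS largest_order
    FROM order_detail
    GROUP BY product_id
    ORDER BY product_id
    """
).fetchall()
print(largest)  # [(707, 4), (708, 2)]
```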

13 Filtering grouped data using the HAVING clause
- The HAVING clause provides a search condition that each group must satisfy
- The HAVING clause is processed after GROUP BY

    SELECT CustomerID, COUNT(*) AS Count_Orders
    FROM Sales.SalesOrderHeader
    GROUP BY CustomerID
    HAVING COUNT(*) > 10;
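A minimal sketch of a group-level filter, assuming SQLite through Python's sqlite3 and a hypothetical `orders` table (the threshold of 2 is chosen only to suit the toy data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER)")  # hypothetical table
conn.executemany(
    "INSERT INTO orders VALUES (?)",
    [(1,), (1,), (1,), (2,), (3,), (3,)],
)

frequent = conn.execute(
    """
    SELECT customer_id, COUNT(*) AS count_orders
    FROM orders
    GROUP BY customer_id
    HAVING COUNT(*) > 2        -- evaluated once per group, after grouping
    """
).fetchall()
print(frequent)  # [(1, 3)]
```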

14 Compare HAVING to WHERE clauses
- WHERE filters rows before groups are created
  - Controls which rows are placed into groups
- HAVING filters groups
  - Controls which groups are passed to the next logical phase
- Using a COUNT(*) expression in the HAVING clause is useful for solving common business problems

Show only customers that have placed more than one order:

    SELECT Cust.CustomerID, COUNT(*) AS cnt
    FROM Sales.Customer AS Cust
    JOIN Sales.SalesOrderHeader AS Ord
        ON Cust.CustomerID = Ord.CustomerID
    GROUP BY Cust.CustomerID
    HAVING COUNT(*) > 1;

Show only products that appear on 10 or more orders:

    SELECT Prod.ProductID, COUNT(*) AS cnt
    FROM Production.Product AS Prod
    JOIN Sales.SalesOrderDetail AS Ord
        ON Prod.ProductID = Ord.ProductID
    GROUP BY Prod.ProductID
    HAVING COUNT(*) >= 10;
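The difference between the two filters shows up when both run against the same rows. The sketch below assumes SQLite via Python's sqlite3 as a stand-in for SQL Server; the `orders` table and the amounts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 50), (1, 150), (2, 200), (2, 300), (3, 20)],
)

# WHERE removes rows first, so the counts shrink.
where_only = conn.execute(
    """
    SELECT customer_id, COUNT(*) FROM orders
    WHERE amount >= 100
    GROUP BY customer_id ORDER BY customer_id
    """
).fetchall()

# HAVING keeps or drops whole groups; counts inside a group are untouched.
having_only = conn.execute(
    """
    SELECT customer_id, COUNT(*) FROM orders
    GROUP BY customer_id
    HAVING COUNT(*) > 1 ORDER BY customer_id
    """
).fetchall()

# Both together: filter rows, group, then filter groups.
both = conn.execute(
    """
    SELECT customer_id, COUNT(*) FROM orders
    WHERE amount >= 100
    GROUP BY customer_id
    HAVING COUNT(*) > 1
    """
).fetchall()

print(where_only)   # [(1, 1), (2, 2)]
print(having_only)  # [(1, 2), (2, 2)]
print(both)         # [(2, 2)]
```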

15 Using GROUP BY and HAVING Demo

16 Subqueries

17 Working with subqueries
- Subqueries are nested queries, or queries within queries
- Results from the inner query are passed to the outer query
  - The inner query acts like an expression from the perspective of the outer query
- Subqueries can be self-contained or correlated
  - Self-contained subqueries have no dependency on the outer query
  - Correlated subqueries depend on values from the outer query
- Subqueries can be scalar, multi-valued, or table-valued
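The self-contained versus correlated distinction can be sketched on toy data. This assumes SQLite (Python's sqlite3) in place of SQL Server, and the `orders` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 50), (2, 1, 150), (3, 2, 200), (4, 2, 300)],
)

# Self-contained: the inner query can run on its own (overall average = 175).
above_avg = conn.execute(
    """
    SELECT order_id FROM orders
    WHERE amount > (SELECT AVG(amount) FROM orders)
    ORDER BY order_id
    """
).fetchall()

# Correlated: the inner query refers to o1 from the outer query,
# so it is evaluated per outer row (largest order of each customer).
largest_per_customer = conn.execute(
    """
    SELECT o1.order_id FROM orders AS o1
    WHERE o1.amount = (SELECT MAX(o2.amount)
                       FROM orders AS o2
                       WHERE o2.customer_id = o1.customer_id)
    ORDER BY o1.order_id
    """
).fetchall()

print(above_avg)             # [(3,), (4,)]
print(largest_per_customer)  # [(2,), (4,)]
```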

18 Writing scalar subqueries
- A scalar subquery returns a single value to the outer query
- Can be used anywhere a single-valued expression can be used: SELECT, WHERE, etc.
- If the inner query returns an empty set, the result is converted to NULL
- Construction of the outer query determines whether the inner query must return a single value

    SELECT SalesOrderID, ProductID, UnitPrice, OrderQty
    FROM Sales.SalesOrderDetail
    WHERE SalesOrderID =
        (SELECT MAX(SalesOrderID) AS LastOrder
         FROM Sales.SalesOrderHeader);
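Both points, the single value handed to the outer query and the empty set becoming NULL, can be tried out with a short sketch. It assumes SQLite via Python's sqlite3 (where NULL surfaces as Python's None); the two tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_header (order_id INTEGER)")
conn.execute("CREATE TABLE order_detail (order_id INTEGER, product TEXT)")
conn.executemany("INSERT INTO order_header VALUES (?)", [(1,), (2,), (3,)])
conn.executemany(
    "INSERT INTO order_detail VALUES (?, ?)",
    [(1, "A"), (3, "B"), (3, "C")],
)

# The scalar subquery yields one value (3), compared with = in the outer query.
last_order_lines = conn.execute(
    """
    SELECT order_id, product FROM order_detail
    WHERE order_id = (SELECT MAX(order_id) FROM order_header)
    ORDER BY product
    """
).fetchall()

# An inner query that matches no rows is treated as NULL.
missing = conn.execute(
    "SELECT (SELECT order_id FROM order_header WHERE order_id > 999)"
).fetchone()

print(last_order_lines)  # [(3, 'B'), (3, 'C')]
print(missing)           # (None,)
```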

19 Writing multi-valued subqueries
- A multi-valued subquery returns multiple values as a single-column set to the outer query
- Used with the IN predicate
- If any value in the subquery result matches the IN predicate expression, the predicate returns TRUE
- May also be expressed as a JOIN (test both for performance)

    SELECT CustomerID, SalesOrderID, TerritoryID
    FROM Sales.SalesOrderHeader
    WHERE CustomerID IN (
        SELECT CustomerID
        FROM Sales.Customer
        WHERE TerritoryID = 10);
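The IN form and the JOIN form can be compared side by side. This sketch assumes SQLite through Python's sqlite3 and hypothetical `customer` and `orders` tables; the two forms return the same rows here because `customer_id` is unique in `customer`, and the sketch says nothing about relative performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, territory_id INTEGER)")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER)")
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, 10), (2, 10), (3, 5)])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(100, 1), (101, 1), (102, 2), (103, 3)],
)

# Multi-valued subquery: the inner query returns a single-column set.
with_in = conn.execute(
    """
    SELECT order_id FROM orders
    WHERE customer_id IN (SELECT customer_id FROM customer
                          WHERE territory_id = 10)
    ORDER BY order_id
    """
).fetchall()

# Equivalent JOIN formulation.
with_join = conn.execute(
    """
    SELECT o.order_id
    FROM orders AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    WHERE c.territory_id = 10
    ORDER BY o.order_id
    """
).fetchall()

print(with_in)  # [(100,), (101,), (102,)]
```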

20 Writing queries using EXISTS with subqueries
- The keyword EXISTS does not follow a column name or other expression
- The SELECT list of a subquery introduced by EXISTS typically uses only an asterisk (*)

    SELECT CustomerID, PersonID
    FROM Sales.Customer AS Cust
    WHERE EXISTS (
        SELECT *
        FROM Sales.SalesOrderHeader AS Ord
        WHERE Cust.CustomerID = Ord.CustomerID);

    SELECT CustomerID, PersonID
    FROM Sales.Customer AS Cust
    WHERE NOT EXISTS (
        SELECT *
        FROM Sales.SalesOrderHeader AS Ord
        WHERE Cust.CustomerID = Ord.CustomerID);
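A minimal sketch of the EXISTS and NOT EXISTS pair, assuming SQLite via Python's sqlite3 and hypothetical `customer` and `orders` tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER)")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER)")
conn.executemany("INSERT INTO customer VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(100, 1), (101, 1), (102, 2)])

# Customers with at least one order: EXISTS is true as soon as one row matches.
with_orders = conn.execute(
    """
    SELECT customer_id FROM customer AS c
    WHERE EXISTS (SELECT * FROM orders AS o
                  WHERE o.customer_id = c.customer_id)
    ORDER BY customer_id
    """
).fetchall()

# Customers with no orders at all.
without_orders = conn.execute(
    """
    SELECT customer_id FROM customer AS c
    WHERE NOT EXISTS (SELECT * FROM orders AS o
                      WHERE o.customer_id = c.customer_id)
    """
).fetchall()

print(with_orders)     # [(1,), (2,)]
print(without_orders)  # [(3,)]
```

Customer 1 has two orders but appears once: EXISTS tests for the presence of matching rows and does not multiply outer rows the way a join would.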

21 Using subqueries Demo

22 Table Functions

23 Creating simple views
- Views are saved queries created in a database by administrators and developers
- Views are defined with a single SELECT statement
- ORDER BY is not permitted in a view definition without the use of TOP, OFFSET/FETCH, or FOR XML
- To sort the output, use ORDER BY in the outer query
- View creation supports additional options beyond the scope of this class

    CREATE VIEW HumanResources.EmployeeList
    AS
    SELECT BusinessEntityID, JobTitle, HireDate, VacationHours
    FROM HumanResources.Employee;
    GO  -- CREATE VIEW must be the only statement in its batch

    SELECT * FROM HumanResources.EmployeeList;
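The advice to sort in the outer query rather than in the view can be sketched as follows. This assumes SQLite via Python's sqlite3 (SQLite's own rules for ORDER BY inside views differ from SQL Server's); the `employee` table and `employee_list` view are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER, job_title TEXT, vacation_hours INTEGER, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Engineer", 40, 100), (2, "Designer", 80, 90)],
)

# The view is a saved SELECT that exposes only some of the columns.
conn.execute(
    """
    CREATE VIEW employee_list AS
    SELECT id, job_title, vacation_hours
    FROM employee
    """
)

# Sorting belongs to the query that reads the view.
listing = conn.execute(
    "SELECT * FROM employee_list ORDER BY vacation_hours DESC"
).fetchall()
print(listing)  # [(2, 'Designer', 80), (1, 'Engineer', 40)]
```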

24 Creating simple inline table-valued functions
- Table-valued functions are created by administrators and developers
- Create and name the function and its optional parameters with CREATE FUNCTION
- Declare the return type as TABLE
- Define the inline SELECT statement following RETURN

    CREATE FUNCTION Sales.fn_LineTotal (@SalesOrderID INT)
    RETURNS TABLE
    AS
    RETURN
        -- The discount factor uses UnitPriceDiscount; the slide's original
        -- expression subtracted SpecialOfferID, which is an ID, not a rate.
        SELECT SalesOrderID,
               CAST((OrderQty * UnitPrice * (1 - UnitPriceDiscount))
                    AS DECIMAL(8, 2)) AS LineTotal
        FROM Sales.SalesOrderDetail
        WHERE SalesOrderID = @SalesOrderID;

25 Writing queries with derived tables
- Derived tables are named query expressions created within an outer SELECT statement
- Not stored in the database; a derived table represents a virtual relational table
- When processed, unpacked into a query against the underlying referenced objects
- Allow you to write more modular queries
- The scope of a derived table is the query in which it is defined

    SELECT <column_list>
    FROM (<derived_table_definition>) AS <derived_table_alias>;
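A small sketch of a derived table whose column alias is reused by the outer query. It assumes SQLite via Python's sqlite3, so the year is extracted with SQLite's `strftime('%Y', ...)` (which returns text) instead of T-SQL's YEAR(); the `orders` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, customer_id INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("2006-01-05", 1), ("2006-03-09", 1), ("2006-07-01", 2), ("2007-02-02", 1)],
)

per_year = conn.execute(
    """
    SELECT order_year, COUNT(DISTINCT customer_id) AS cust_count
    FROM (
        -- Derived table: exists only for the duration of this query.
        SELECT strftime('%Y', order_date) AS order_year, customer_id
        FROM orders
    ) AS derived_year
    GROUP BY order_year      -- the alias from the inner query is visible here
    ORDER BY order_year
    """
).fetchall()
print(per_year)  # [('2006', 2), ('2007', 1)]
```

Computing `order_year` once in the inner query avoids repeating the expression in both the SELECT list and the GROUP BY clause.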

26 Guidelines for derived tables
Derived tables must:
- Have an alias
- Have names for all columns
- Have unique names for all columns
- Not use an ORDER BY clause (without TOP or OFFSET/FETCH)
- Not be referred to multiple times in the same query

Derived tables may:
- Use internal or external aliases for columns
- Refer to parameters and/or variables
- Be nested within other derived tables

27 Passing arguments to derived tables
- Derived tables may refer to arguments
- Arguments may be:
  - Variables declared in the same batch as the SELECT statement
  - Parameters passed into a table-valued function or stored procedure

    DECLARE @emp_id INT = 9;

    SELECT orderyear, COUNT(DISTINCT custid) AS cust_count
    FROM (
        SELECT YEAR(orderdate) AS orderyear, custid
        FROM Sales.Orders
        WHERE empid = @emp_id
    ) AS derived_year
    GROUP BY orderyear;
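As a rough analog of passing a T-SQL variable into a derived table, the sketch below binds a named parameter from the calling program. It assumes SQLite via Python's sqlite3, where `:emp_id` plays the role of `@emp_id`; the `orders` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (emp_id INTEGER, order_year INTEGER, cust_id INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(9, 2006, 1), (9, 2006, 1), (9, 2006, 2), (9, 2007, 3), (4, 2006, 5)],
)

query = """
    SELECT order_year, COUNT(DISTINCT cust_id) AS cust_count
    FROM (
        SELECT order_year, cust_id
        FROM orders
        WHERE emp_id = :emp_id      -- argument used inside the derived table
    ) AS derived_year
    GROUP BY order_year
    ORDER BY order_year
"""
for_emp_9 = conn.execute(query, {"emp_id": 9}).fetchall()
print(for_emp_9)  # [(2006, 2), (2007, 1)]
```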

28 Creating queries with common table expressions
Use the WITH clause to create a CTE:
- Define the table expression in the WITH clause
- Reference the CTE in the outer query
- Assign column aliases (inline or external)
- Pass arguments if desired

    WITH CTE_year AS
    (
        SELECT YEAR(OrderDate) AS OrderYear, CustomerID
        FROM Sales.SalesOrderHeader
    )
    SELECT OrderYear, COUNT(DISTINCT CustomerID) AS CustCount
    FROM CTE_year
    GROUP BY OrderYear;
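One thing a CTE allows that a derived table does not is being referenced more than once in the same query. The sketch below joins a CTE to itself to compare each year with the previous one; it assumes SQLite via Python's sqlite3, and the `orders` table and the `yearly` CTE name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_year INTEGER, cust_id INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(2006, 1), (2006, 2), (2007, 1), (2007, 1)],
)

year_over_year = conn.execute(
    """
    WITH yearly AS (
        SELECT order_year, COUNT(DISTINCT cust_id) AS custs
        FROM orders
        GROUP BY order_year
    )
    SELECT cur.order_year, cur.custs, prev.custs AS prev_custs
    FROM yearly AS cur                       -- first reference
    LEFT JOIN yearly AS prev                 -- second reference to the same CTE
        ON prev.order_year = cur.order_year - 1
    ORDER BY cur.order_year
    """
).fetchall()
print(year_over_year)  # [(2006, 2, None), (2007, 1, 2)]
```

The first year has no predecessor, so the LEFT JOIN leaves its `prev_custs` as NULL (None in Python).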

29 Table functions Demo

30 Summary
- Aggregate functions return a scalar value. They can be used in SELECT, HAVING, and ORDER BY clauses, and are most frequently used with the GROUP BY clause
- Common built-in aggregate functions include:
  - Common: SUM, MIN, MAX, AVG, COUNT, COUNT_BIG
  - Statistical: STDEV, STDEVP, VAR, VARP
  - Other: CHECKSUM_AGG, GROUPING, GROUPING_ID

31 Summary
- Use DISTINCT with aggregate functions to summarize only the unique values; it eliminates duplicate values, not rows
- GROUP BY creates groups for output rows, according to the unique combination of values specified in the GROUP BY clause. GROUP BY also calculates a summary value for aggregate functions in subsequent phases
- The HAVING clause provides a search condition that each group must satisfy and is processed after the GROUP BY clause

32 Summary
- Subqueries are nested queries, or queries within queries, where the results from the inner query are passed to the outer query
- Types of subqueries include:
  - Scalar subqueries
  - Multi-valued subqueries
  - Subqueries with the EXISTS keyword

33 Summary
- Views are named table expressions with definitions stored in a database; they can be referenced in a SELECT statement just like a table
- Views are defined with a single SELECT statement and then saved in the database as queries
- Table-valued functions are created with the CREATE FUNCTION statement and declare a return type of TABLE
- Derived tables allow you to write more modular queries as named query expressions created within an outer SELECT statement. They represent a virtual relational table, so they are not stored in the database
- CTEs are similar to derived tables in scope and naming requirements, but unlike derived tables, CTEs support multiple definitions, multiple references, and recursion

34 Course Topics
Querying Microsoft SQL Server 2012 Jump Start
- 01 | Introducing SQL Server 2012: SQL Server types of statements; other SQL statement elements; basic SELECT statements
- 02 | Advanced SELECT Statements: DISTINCT, aliases, scalar functions and CASE, using JOIN and MERGE; filtering and sorting data, NULL values
- 03 | SQL Server Data Types: Introduce data types, data type usage, converting data types, understanding SQL Server function types
- 04 | Grouping and Aggregating Data: Aggregate functions, GROUP BY and HAVING clauses, subqueries (self-contained, correlated, and EXISTS); views, inline table-valued functions, and derived tables
- Lunch Break: Eat, drink, and recharge for the afternoon session

35 ©2013 Microsoft Corporation. All rights reserved. Microsoft, Windows, Office, Azure, System Center, Dynamics and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

