Group functions using SQL
Additional information in speaker notes!
Group functions

SQL> UPDATE first_pay
  2  SET bonus = NULL
  3  WHERE name = 'Donald Brown';

SQL> SELECT * FROM first_pay;

PAY_ NAME            JO STARTDATE SALARY BONUS
     Linda Costa     CI 15-JAN
     John Davidson   IN 25-SEP
     Susan Ash       AP 05-FEB
     Stephen York    CM 03-JUL
     Richard Jones   CI 30-OCT
     Joanne Brown    IN 18-AUG
     Donald Brown    CI 05-NOV
     Paula Adams     IN 12-DEC

For these exercises, I wanted a null value in one of the fields. The update above put a null value in the bonus field for Donald Brown.
Group functions

SQL> SELECT COUNT(*)
  2  FROM first_pay;

COUNT(*)
       8

SQL> SELECT COUNT(name)
  2  FROM first_pay;

COUNT(NAME)
          8

SQL> SELECT COUNT(bonus)
  2  FROM first_pay;

COUNT(BONUS)
           7

COUNT(*) counts every row/record in the table, including rows that contain null values. COUNT(name) counts the names that are not null. In this case, each row/record has a name so the result is 8. As shown on the previous slide, bonus now has one record where the bonus is null. Therefore COUNT(bonus) returns 7.
Group functions

SQL> SELECT COUNT(NVL(bonus,0))
  2  FROM first_pay;

COUNT(NVL(BONUS,0))
                  8

SQL> SELECT COUNT(NVL(bonus,1000))
  2  FROM first_pay;

COUNT(NVL(BONUS,1000))
                     8

In these examples, I am replacing null values in bonus with a value. In the first example, I replaced it with 0 and in the second example I replaced it with 1000. It doesn't matter what the replacement is; what matters is that it is no longer a null value and therefore it shows in the count.
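The counting rules above can be tried on a small stand-in table. This is only a sketch: demo_pay and its four rows are made up for illustration (they are not the first_pay data), and the table is built inline with a WITH clause so nothing has to be created first. It assumes Oracle, since it relies on dual and NVL.

```sql
-- demo_pay is a hypothetical four-row table; Bob's bonus is null.
WITH demo_pay AS (
  SELECT 'Ann' AS name, 1000 AS bonus FROM dual UNION ALL
  SELECT 'Bob', CAST(NULL AS NUMBER)  FROM dual UNION ALL
  SELECT 'Cal', 2000                  FROM dual UNION ALL
  SELECT 'Dee', 3000                  FROM dual
)
SELECT COUNT(*)             AS all_rows,        -- 4: every row is counted
       COUNT(name)          AS names,           -- 4: no name is null
       COUNT(bonus)         AS bonuses,         -- 3: the null bonus is skipped
       COUNT(NVL(bonus, 0)) AS bonuses_filled   -- 4: the null became 0 first
FROM demo_pay;
```

Swapping the 0 in NVL for any other number leaves the last count at 4, because COUNT only cares whether the value is null.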
Group functions

SQL> SELECT SUM(bonus)
  2  FROM first_pay;

SUM(BONUS)

In this SUM, the record with a null value is ignored. The total is the sum of the seven bonuses that are not null.

SQL> SELECT SUM(NVL(bonus,0))
  2  FROM first_pay;

SUM(NVL(BONUS,0))

In this SUM, the record with the null value is set to 0. It is now included in the sum but has no impact because it is 0.

SQL> SELECT SUM(NVL(bonus, 1000))
  2  FROM first_pay;

SUM(NVL(BONUS,1000))

In this SUM, the record with the null value is set to 1000. It is included in the sum and clearly impacts the total, which is now 1000 bigger.
Group functions

SQL> SELECT SUM(bonus), AVG(bonus)
  2  FROM first_pay;

SUM(BONUS) AVG(BONUS)

In this example, the sum is taken of the 7 rows that do not contain null values, and the sum is divided by the count of those 7 rows to yield the average.

SQL> SELECT SUM(NVL(bonus,0)), AVG(NVL(bonus,0))
  2  FROM first_pay;

SUM(NVL(BONUS,0)) AVG(NVL(BONUS,0))

In this example, the average is taken using all 8 rows because the NVL put a 0 in the row where bonus was null. The sum divided by 8 is shown as the average.

SQL> SELECT SUM(NVL(bonus,1000)), AVG(NVL(bonus,1000))
  2  FROM first_pay;

SUM(NVL(BONUS,1000)) AVG(NVL(BONUS,1000))

This time I am including the 1000 in the average, so the sum for the average is 1000 higher and the division is still by 8, giving me the answer of 1500.
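The arithmetic behind these three averages is easier to follow with small numbers. The sketch below uses a made-up inline table, demo_pay, with four hypothetical bonuses (1000, null, 2000, 3000); it is not the first_pay data and assumes Oracle for dual and NVL.

```sql
-- demo_pay is hypothetical: three real bonuses and one null.
WITH demo_pay AS (
  SELECT 'Ann' AS name, 1000 AS bonus FROM dual UNION ALL
  SELECT 'Bob', CAST(NULL AS NUMBER)  FROM dual UNION ALL
  SELECT 'Cal', 2000                  FROM dual UNION ALL
  SELECT 'Dee', 3000                  FROM dual
)
SELECT SUM(bonus)            AS sum_skip_null,  -- 6000 = 1000 + 2000 + 3000
       AVG(bonus)            AS avg_skip_null,  -- 2000 = 6000 / 3 non-null rows
       SUM(NVL(bonus, 0))    AS sum_zero,       -- 6000: the extra 0 adds nothing
       AVG(NVL(bonus, 0))    AS avg_zero,       -- 1500 = 6000 / 4 rows
       SUM(NVL(bonus, 1000)) AS sum_1000,       -- 7000 = 6000 + 1000
       AVG(NVL(bonus, 1000)) AS avg_1000        -- 1750 = 7000 / 4 rows
FROM demo_pay;
```

Note that replacing the null with 0 leaves the sum alone but still lowers the average, because the divisor grows from 3 to 4.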
Group functions

SQL> SELECT MIN(salary), MAX(salary)
  2  FROM first_pay;

MIN(SALARY) MAX(SALARY)

This statement extracts the minimum salary and the maximum salary from the first_pay table.

SQL> SELECT MIN(bonus), MAX(bonus)
  2  FROM first_pay;

MIN(BONUS) MAX(BONUS)

This extracts the minimum bonus and the maximum bonus from the first_pay table. Note that there is a null value in this column that is not dealt with, so MIN and MAX simply ignore it.

SQL> SELECT MIN(NVL(bonus,0)), MAX(NVL(bonus,0))
  2  FROM first_pay;

MIN(NVL(BONUS,0)) MAX(NVL(BONUS,0))

In this example, the null value is replaced by 0 in both the MIN and MAX functions. This means that MIN now sees the replacement 0 as the minimum.
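The effect of NVL on MIN and MAX can be checked the same way. This sketch again uses a made-up inline table (demo_pay, hypothetical values, Oracle syntax assumed) rather than the first_pay data.

```sql
-- demo_pay is hypothetical: bonuses of 1000, null, 2000 and 3000.
WITH demo_pay AS (
  SELECT 'Ann' AS name, 1000 AS bonus FROM dual UNION ALL
  SELECT 'Bob', CAST(NULL AS NUMBER)  FROM dual UNION ALL
  SELECT 'Cal', 2000                  FROM dual UNION ALL
  SELECT 'Dee', 3000                  FROM dual
)
SELECT MIN(bonus)         AS min_skip_null,  -- 1000: the null is ignored
       MAX(bonus)         AS max_skip_null,  -- 3000
       MIN(NVL(bonus, 0)) AS min_zero,       -- 0: the replaced null is now lowest
       MAX(NVL(bonus, 0)) AS max_zero        -- 3000: unchanged by the 0
FROM demo_pay;
```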
Group functions

SQL> SELECT jobcode, count(name)
  2  FROM first_pay
  3  GROUP BY jobcode;

JO COUNT(NAME)
AP           1
CI           3
CM           1
IN           3

In this example, I want to get a count of how many people there are with each jobcode. This means I need to GROUP BY jobcode. Because I am grouping on jobcode and therefore looking for a total by jobcode, I am allowed to SELECT the jobcode field. Since I want a count of the number of people with a specific jobcode, I need to do a count. I put name in count because I was thinking of counting the people. Note that I could have used COUNT(*) as shown below.

SQL> SELECT jobcode, count(*)
  2  FROM first_pay
  3  GROUP BY jobcode;

JO   COUNT(*)
AP          1
CI          3
CM          1
IN          3
Group functions

SQL> SELECT jobcode, COUNT(name)
  2  FROM first_pay
  3  WHERE salary <=
  4  GROUP BY jobcode;

JO COUNT(NAME)
AP           1
CI           2
CM           1
IN           2

In this example, I want to include people in the groups only when their salary is at or below the limit in the WHERE clause. As you can see, this excludes one record from the CI group and one record from the IN group.
Group functions

SQL> SELECT jobcode, COUNT(name)
  2  FROM first_pay
  3  WHERE salary <=
  4  GROUP BY jobcode
  5  ORDER BY jobcode desc;

JO COUNT(NAME)
IN           2
CM           1
CI           2
AP           1

In this example I want the output to be ordered by jobcode in descending order. The ORDER BY clause can be used to achieve this goal. Note that on the previous slide the results came back in ascending order by the GROUP BY column/field; that ordering is not guaranteed unless ORDER BY is specified.

SQL> SELECT jobcode, COUNT(name)
  2  FROM first_pay
  3  WHERE salary <=
  4  GROUP BY jobcode
  5  ORDER BY COUNT(name);

JO COUNT(NAME)
AP           1
CM           1
CI           2
IN           2

In this example, I want to order by the count instead of by the GROUP BY field/column. Again, the ORDER BY clause can be used to achieve this goal. Because I did not specify ascending or descending, the default of ascending is used.
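The order in which WHERE, GROUP BY and ORDER BY take effect can be traced on a small example. The sketch below uses a made-up inline table, demo_pay, and a made-up salary limit of 45000; neither comes from the first_pay data. Oracle syntax (dual) is assumed.

```sql
-- demo_pay and the 45000 limit are hypothetical, for illustration only.
WITH demo_pay AS (
  SELECT 'Ann' AS name, 'CI' AS jobcode, 40000 AS salary FROM dual UNION ALL
  SELECT 'Bob', 'CI', 50000 FROM dual UNION ALL
  SELECT 'Cal', 'IN', 30000 FROM dual UNION ALL
  SELECT 'Dee', 'IN', 60000 FROM dual UNION ALL
  SELECT 'Eve', 'AP', 20000 FROM dual UNION ALL
  SELECT 'Fay', 'CI', 35000 FROM dual
)
SELECT jobcode, COUNT(name) AS people
FROM demo_pay
WHERE salary <= 45000                  -- step 1: drops Bob (50000) and Dee (60000)
GROUP BY jobcode                       -- step 2: groups the four remaining rows
ORDER BY COUNT(name) DESC, jobcode;    -- step 3: largest group first, ties by jobcode
-- Result rows, in order: CI 2, AP 1, IN 1
```

The second sort key is there because AP and IN tie on the count; without it, the order of the tied rows would be left to the database.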
Group functions

SQL> SELECT jobcode, COUNT(name)
  2  FROM first_pay
  3  WHERE jobcode != 'IN'
  4  GROUP BY jobcode;

JO COUNT(NAME)
AP           1
CI           3
CM           1

In this example I want to group by jobcode, except that in doing the grouping I want to exclude all records where the jobcode = 'IN'. As you can see, the results are correct.
Group functions

SQL> SELECT jobcode, bonus, SUM(salary)
  2  FROM first_pay
  3  GROUP BY jobcode, bonus;

JO BONUS SUM(SALARY)
AP
CI
CI
CI
CM
IN
IN

This example groups by jobcode and then bonus within jobcode. In fact there are only two records with the same jobcode and the same bonus, record 6666 and one other record. They are shown at the bottom. For all of the other groupings there happens to be only one record.
Group functions

SQL> SELECT * FROM donor;

IDNO NAME            STADR         CITY       ST ZIP DATEFST YRGOAL CONTACT
     Stephen Daniels 123 Elm St    Seekonk    MA     JUL            John Smith
     Jennifer Ames   24 Benefit St Providence RI     MAY            Susan Jones
     Carl Hersey     24 Benefit St Providence RI     JAN-98         Susan Jones
     Susan Ash       21 Main St    Fall River MA     MAR            Amy Costa
     Nancy Taylor    26 Oak St     Fall River MA     MAR            John Adams
     Robert Brooks   36 Pine St    Fall River MA     APR            Amy Costa

6 rows selected.

SQL> SELECT state, contact, SUM(yrgoal)
  2  FROM donor
  3  GROUP BY state, contact;

ST CONTACT      SUM(YRGOAL)
MA Amy Costa            150
MA John Adams            50
MA John Smith           500
RI Susan Jones          400

This shows grouping by state and then contact within state. Two records go into MA Amy Costa and two records go into RI Susan Jones. The other two totals are made up from one record each.
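Grouping on two columns can be reproduced with a few made-up rows. In the sketch below, demo_donor and its goal amounts are hypothetical (the individual yrgoal values of the donor table are not used), and Oracle syntax is assumed for dual.

```sql
-- demo_donor is a hypothetical five-row table for illustration only.
WITH demo_donor AS (
  SELECT 'MA' AS state, 'Amy' AS contact, 100 AS yrgoal FROM dual UNION ALL
  SELECT 'MA', 'Amy',   50 FROM dual UNION ALL
  SELECT 'MA', 'John', 500 FROM dual UNION ALL
  SELECT 'RI', 'Sue',  300 FROM dual UNION ALL
  SELECT 'RI', 'Sue',  100 FROM dual
)
SELECT state, contact, SUM(yrgoal) AS total_goal
FROM demo_donor
GROUP BY state, contact     -- one output row per distinct (state, contact) pair
ORDER BY state, contact;
-- Result rows: MA Amy 150, MA John 500, RI Sue 400
```

Rows fall into the same group only when both columns match, which is why the two MA contacts get separate totals.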
Group functions

SQL> SELECT jobcode, MIN(salary), MAX(salary)
  2  FROM first_pay
  3  GROUP BY jobcode;

JO MIN(SALARY) MAX(SALARY)
AP
CI
CM
IN

This shows the minimum and maximum salary for each jobcode.

SQL> SELECT jobcode, AVG(salary)
  2  FROM first_pay
  3  GROUP BY jobcode;

JO AVG(SALARY)
AP
CI
CM
IN

This shows the average salary of each jobcode group.
Group functions

SQL> SELECT jobcode, AVG(salary)
  2  FROM first_pay
  3  GROUP BY jobcode;

JO AVG(SALARY)
AP
CI
CM
IN

From previous slide.

SQL> SELECT jobcode, MIN(AVG(salary)), MAX(AVG(salary))
  2  FROM first_pay
  3  GROUP BY jobcode;
SELECT jobcode, MIN(AVG(salary)), MAX(AVG(salary))
       *
ERROR at line 1:
ORA-00937: not a single-group group function

Jobcode cannot be used in this context: the nested functions return a single row for the whole table, so there is no one jobcode to show beside it.

SQL> SELECT MIN(AVG(salary)), MAX(AVG(salary))
  2  FROM first_pay
  3  GROUP BY jobcode;

MIN(AVG(SALARY)) MAX(AVG(SALARY))

This returns the minimum group average and the maximum group average.
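To see how the nested functions collapse the groups, it helps to work through numbers small enough to average by hand. The sketch below uses a made-up inline table (demo_pay, hypothetical salaries). It assumes Oracle, which accepts one level of nesting when a GROUP BY is present; many other databases need a subquery for the same result.

```sql
-- demo_pay is hypothetical. Per-jobcode averages:
--   CI = (40000 + 50000) / 2 = 45000
--   IN = (30000 + 40000) / 2 = 35000
--   AP = 20000
WITH demo_pay AS (
  SELECT 'Ann' AS name, 'CI' AS jobcode, 40000 AS salary FROM dual UNION ALL
  SELECT 'Bob', 'CI', 50000 FROM dual UNION ALL
  SELECT 'Cal', 'IN', 30000 FROM dual UNION ALL
  SELECT 'Dee', 'IN', 40000 FROM dual UNION ALL
  SELECT 'Eve', 'AP', 20000 FROM dual
)
SELECT MIN(AVG(salary)) AS lowest_group_avg,   -- 20000 (the AP average)
       MAX(AVG(salary)) AS highest_group_avg   -- 45000 (the CI average)
FROM demo_pay
GROUP BY jobcode;   -- inner AVG runs per jobcode, outer MIN/MAX over those averages
```

The query returns exactly one row, which is why adding jobcode to the SELECT list raises ORA-00937.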
Group functions

SQL> SELECT jobcode, SUM(salary), SUM(bonus)
  2  FROM first_pay
  3  GROUP BY jobcode;

JO SUM(SALARY) SUM(BONUS)
AP
CI
CM
IN

This example shows the sum of salary and sum of bonus for all jobcodes.

SQL> SELECT jobcode, SUM(salary), SUM(bonus)
  2  FROM first_pay
  3  GROUP BY jobcode
  4  HAVING SUM(salary) > OR SUM(bonus) > 3000;

JO SUM(SALARY) SUM(BONUS)
CI
IN

Now I decided I only wanted to see those groups where either the sum of the salary exceeded the salary limit in the HAVING clause or the sum of the bonus was greater than 3000. This excludes AP because it meets neither criterion, and it excludes CM because it also meets neither criterion. Because I am testing the groups after they have been formed, I have to use the HAVING clause.
Group functions

SQL> SELECT jobcode, SUM(salary), SUM(bonus)
  2  FROM first_pay
  3  WHERE SUM(salary) > OR SUM(bonus) > 3000
  4  GROUP BY jobcode;
WHERE SUM(salary) > OR SUM(bonus) > 3000
      *
ERROR at line 3:
ORA-00934: group function is not allowed here

This is the error that results from using the WHERE clause inappropriately. WHERE filters individual rows before the groups are formed, so it cannot test a group total. The HAVING clause should have been used here, as shown on the previous slide.

SQL> SELECT jobcode, SUM(salary), SUM(bonus)
  2  FROM first_pay
  3  GROUP BY jobcode
  4  HAVING SUM(salary) > OR SUM(bonus) > 3000;

JO SUM(SALARY) SUM(BONUS)
CI
IN

Correct code using the HAVING clause (copied from previous slide).
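A complete HAVING test can be run on a small stand-in table. In this sketch, demo_pay, its values and the salary limit of 80000 are all made up for illustration; only the bonus limit of 3000 mirrors the slide. Oracle syntax is assumed for dual.

```sql
-- demo_pay and the 80000 limit are hypothetical. Group totals before HAVING:
--   CI: salary 90000, bonus 1000 (Bob's null bonus is ignored by SUM)
--   IN: salary 70000, bonus 4000
--   AP: salary 20000, bonus  500
WITH demo_pay AS (
  SELECT 'Ann' AS name, 'CI' AS jobcode, 40000 AS salary, 1000 AS bonus FROM dual UNION ALL
  SELECT 'Bob', 'CI', 50000, CAST(NULL AS NUMBER) FROM dual UNION ALL
  SELECT 'Cal', 'IN', 30000, 2000 FROM dual UNION ALL
  SELECT 'Dee', 'IN', 40000, 2000 FROM dual UNION ALL
  SELECT 'Eve', 'AP', 20000,  500 FROM dual
)
SELECT jobcode, SUM(salary) AS total_salary, SUM(bonus) AS total_bonus
FROM demo_pay
GROUP BY jobcode
HAVING SUM(salary) > 80000 OR SUM(bonus) > 3000   -- tested once per group
ORDER BY jobcode;
-- Result rows: CI 90000 1000 (passes on salary), IN 70000 4000 (passes on bonus)
```

AP is dropped because neither of its totals passes. Moving the same condition into a WHERE clause would fail with ORA-00934, since the totals do not exist yet when WHERE is evaluated.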