18/03/2016 szabo.zsolt@nik.uni-obuda.hu
DATABASES
http://nik.uni-obuda.hu/szabozs/

GROUPING SETS

SELECT
Order of clauses:
1. FROM
2. WHERE
3. GROUP BY
4. HAVING
5. UNION/MINUS
6. INTERSECT
7. ORDER BY
8. INTO

GROUP BY
GROUP BY, HAVING: one-field use is "trivial", e.g. average salary per job or per department
Multiple fields: complex grouping, e.g. average salary per job AND department
Still: only the grouped fields and the grouping (aggregate) functions are allowed in the select list!

    SELECT job, deptno, avg(sal)
    FROM emp
    GROUP BY job, deptno;

    JOB           DEPTNO   AVG(SAL)
    --------- ---------- ----------
    CLERK             10       1300
    MANAGER           10       2450
    PRESIDENT         10       5000
    ANALYST           20       3000
    CLERK             20        950
    MANAGER           20       2975
    CLERK             30        950
    MANAGER           30       2850
    SALESMAN          30       1400

    SELECT mgr, job, deptno, avg(sal)
    FROM emp
    GROUP BY job, deptno, mgr;

           MGR JOB           DEPTNO   AVG(SAL)
    ---------- --------- ---------- ----------
          7839 MANAGER           30       2850
          7839 MANAGER           10       2450
          7782 CLERK             10       1300
          7698 SALESMAN          30       1400
          7839 MANAGER           20       2975
          7902 CLERK             20        800
          7698 CLERK             30        950
               PRESIDENT         10       5000
          7566 ANALYST           20       3000
          7788 CLERK             20       1100

DISADVANTAGES OF A SINGLE GROUP BY
Not flexible enough
One grouping per query, so multiple queries are needed even if the groupings are similar
Running several queries is slower
Aim: one query, multiple groupings: GROUPING SETS

    SELECT job, deptno, avg(sal)
    FROM emp
    GROUP BY GROUPING SETS ( (job, deptno) );

NVL: type matching!

    -- Fails: 'Nope' cannot be converted to the numeric type of mgr
    SELECT nvl(mgr, 'Nope'), deptno, avg(sal)
    FROM emp
    GROUP BY GROUPING SETS ( (mgr, deptno) );

    -- Works: mgr is converted to text first
    SELECT nvl(to_char(mgr), 'Nope'), deptno, avg(sal)
    FROM emp
    GROUP BY GROUPING SETS ( (mgr, deptno) );

    -- Works: the replacement value is numeric
    SELECT nvl(mgr, 0), deptno, avg(sal)
    FROM emp
    GROUP BY GROUPING SETS ( (mgr, deptno) );

GROUP BY GROUPING SETS
We can define multiple groupings inside one query; sub-results can be cached
E.g. performing an MGR, DEPTNO and a JOB, DEPTNO grouping in ONE query:

    SELECT nvl(mgr, 0), deptno, nvl(job, 'Nope'), avg(sal)
    FROM emp
    GROUP BY GROUPING SETS ( (mgr, deptno), (deptno, job) );
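
A grouping-sets query returns the same rows as one GROUP BY per set, glued together, with NULL in every column that a set does not use. The sketch below illustrates that equivalence. It is only an approximation of the slide's query: it assumes SQLite through Python's sqlite3 module (SQLite has no GROUPING SETS, so UNION ALL stands in for it), leaves out the NVL calls, and uses a small five-row sample in the style of EMP rather than the full table.

```python
import sqlite3

# Small sample in the style of EMP (an assumption, not the full table).
rows = [
    ("KING", "PRESIDENT", None, 10, 5000),
    ("CLARK", "MANAGER", 7839, 10, 2450),
    ("MILLER", "CLERK", 7782, 10, 1300),
    ("JONES", "MANAGER", 7839, 20, 2975),
    ("SMITH", "CLERK", 7902, 20, 800),
]
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (ename TEXT, job TEXT, mgr INTEGER, deptno INTEGER, sal REAL)"
)
conn.executemany("INSERT INTO emp VALUES (?, ?, ?, ?, ?)", rows)

# One SELECT per grouping set; the column a set leaves out is returned as NULL.
query = """
SELECT mgr, deptno, NULL AS job, avg(sal) FROM emp GROUP BY mgr, deptno
UNION ALL
SELECT NULL, deptno, job, avg(sal) FROM emp GROUP BY deptno, job
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

One row of the (mgr, deptno) part has NULL in mgr because of the data itself (the president has no manager), not because the column was left out. A blanket NVL replacement cannot tell these two kinds of NULL apart.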

GROUP BY GROUPING SETS

    SELECT nvl(mgr, 0), nvl(deptno, 0), nvl(job, 'NO'), avg(sal)
    FROM emp
    GROUP BY GROUPING SETS ( (mgr, deptno), (deptno, job), (mgr) );

    SELECT nvl(mgr, 0), nvl(deptno, 0), nvl(job, 'NO'), avg(sal)
    FROM emp
    GROUP BY GROUPING SETS ( (mgr, deptno), (deptno, job), (mgr), () );

Why do we have 0 as the mgr?

GROUPING
With the special "grouping function" GROUPING we can determine whether a given field is used for the grouping in a given result record
Grouping function: allowed in the select list
Special: it only works with a field that appears in the GROUP BY clause!

GROUPING
0 = TRUE?
With a simple single-field or multi-field GROUP BY it always returns 0:

    SELECT job, avg(sal), grouping(job)
    FROM emp
    GROUP BY job;

    SELECT deptno, job, avg(sal), grouping(job)
    FROM emp
    GROUP BY job, deptno;

With grouping sets: GROUPING = 0 means that the field is used in the grouping for that record; GROUPING = 1 means that it is not

GROUPING

    SELECT mgr, deptno, job, avg(sal),
           GROUPING(mgr) as GMGR,
           GROUPING(deptno) as GDEPTNO,
           GROUPING(job) as GJOB
    FROM emp
    GROUP BY GROUPING SETS ( (mgr, deptno), (deptno, job), (mgr), () );
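
To see which flag values to expect from the query above, the rule can be written down in a few lines of Python. This is only a model of the rule (0 when the column belongs to the grouping set that produced the row, 1 otherwise); the helper name grouping_flags is made up for illustration.

```python
# Columns passed to GROUPING and the grouping sets used in the query.
columns = ["mgr", "deptno", "job"]
grouping_sets = [("mgr", "deptno"), ("deptno", "job"), ("mgr",), ()]


def grouping_flags(grouping_set):
    """Return 0 for a column that is part of the set, 1 for one that is not."""
    return {col: 0 if col in grouping_set else 1 for col in columns}


for gs in grouping_sets:
    print(gs, grouping_flags(gs))
```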

GROUPING

    SELECT CASE WHEN GROUPING(mgr)=0 THEN mgr ELSE 0 END as MGR,
           CASE WHEN GROUPING(deptno)=0 THEN deptno ELSE 0 END as DEPTNO,
           CASE WHEN GROUPING(job)=0 THEN job ELSE 'NO' END as JOB,
           avg(sal)
    FROM emp
    GROUP BY GROUPING SETS ( (mgr, deptno), (deptno, job), (mgr), () );

GROUPING_ID
Unique identifier for each possible grouping column configuration

    SELECT mgr, deptno, job, avg(sal),
           GROUPING_ID(mgr, deptno, job) as GID
    FROM emp
    GROUP BY GROUPING SETS ( (mgr, deptno), (deptno, job), (mgr), () );
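
GROUPING_ID reads the individual GROUPING flags of its arguments as one binary number, with the first argument as the most significant bit. The Python sketch below models that calculation for the four grouping sets of the query; the function name grouping_id is only an illustrative stand-in for the SQL function.

```python
def grouping_id(args, grouping_set):
    """Combine the GROUPING flags of args into one number (first arg = highest bit)."""
    value = 0
    for col in args:
        bit = 0 if col in grouping_set else 1  # 1 = column not used in this set
        value = value * 2 + bit
    return value


args = ("mgr", "deptno", "job")
grouping_sets = [("mgr", "deptno"), ("deptno", "job"), ("mgr",), ()]
ids = {gs: grouping_id(args, gs) for gs in grouping_sets}
print(ids)
```

Under this model the sets (mgr, deptno), (deptno, job), (mgr) and () get the identifiers 1, 4, 3 and 7.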

GROUP BY GROUPING SETS DRAWBACKS
Too complicated, too long
When do we need a query with three totally different grouping sets? What kind of caching can we do here?
Usually there are hierarchical relations between the grouping fields: more meaning, more caching
Solution: ROLLUP and CUBE
GROUPING and GROUPING_ID can be used the same way

CUBE
GROUP BY CUBE (a, b, c) = GROUP BY GROUPING SETS ( (a, b, c), (a, b), (b, c), (a, c), (a), (b), (c), ( ) )
CUBE(field1, field2): the two fields have the same rank, all combinations are shown
CUBE(job, deptno): in addition to the simple two-field grouping, we get the job averages, the department averages, and the total average
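
The expansion rule can be checked mechanically: CUBE over n fields yields every subset of those fields, so 2^n grouping sets in total. A minimal Python sketch, where the helper name cube is made up for illustration and is not part of any SQL API:

```python
from itertools import combinations


def cube(*fields):
    """All subsets of the fields, largest first, keeping the original field order."""
    sets = []
    for size in range(len(fields), -1, -1):
        sets.extend(combinations(fields, size))
    return sets


print(cube("a", "b", "c"))
print(cube("job", "deptno"))
```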

    SELECT job, deptno, avg(sal)
    FROM emp
    GROUP BY CUBE(job, deptno);

ROLLUP
GROUP BY ROLLUP (a, b, c) = GROUP BY GROUPING SETS ( (a, b, c), (a, b), (a), ( ) )
ROLLUP(field1, field2): the first field is hierarchically more important, so besides the grand total we only take the combinations where it is used
ROLLUP(job, deptno): in addition to the simple two-field grouping, we get the job averages and the total average
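
ROLLUP keeps only the prefixes of the field list, so n fields give n + 1 grouping sets. A minimal Python sketch of that expansion; the helper name rollup is hypothetical:

```python
def rollup(*fields):
    """Every prefix of the field list, from the full list down to the empty set."""
    return [fields[:n] for n in range(len(fields), -1, -1)]


print(rollup("a", "b", "c"))
print(rollup("job", "deptno"))
```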

    SELECT job, deptno, avg(sal)
    FROM emp
    GROUP BY ROLLUP(job, deptno);

    JOB           DEPTNO   AVG(SAL)
    --------- ---------- ----------
    CLERK             10       1300
    MANAGER           10       2450
    PRESIDENT         10       5000
    ANALYST           20       3000
    CLERK             20        950
    MANAGER           20       2975
    CLERK             30        950
    MANAGER           30       2850
    SALESMAN          30       1400
    ANALYST                    3000
    CLERK                    1037,5
    MANAGER              2758,33333
    PRESIDENT                  5000
    SALESMAN                   1400
                         2073,21429

MIXTURE OF GROUPINGS
GROUP BY a, CUBE (b, c) = GROUP BY GROUPING SETS ( (a, b, c), (a, b), (a, c), (a) )
GROUP BY a, ROLLUP (b, c) = GROUP BY GROUPING SETS ( (a, b, c), (a, b), (a) )
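
When a GROUP BY list mixes plain fields with CUBE or ROLLUP, the resulting grouping sets are the cross product of the parts: one set is taken from each part and the picks are concatenated. The sketch below models this in Python; cube, rollup and combine are illustrative helper names, not SQL functions.

```python
from itertools import combinations, product


def cube(*fields):
    """All subsets of the fields, largest first."""
    return [c for size in range(len(fields), -1, -1) for c in combinations(fields, size)]


def rollup(*fields):
    """Every prefix of the field list, longest first."""
    return [fields[:n] for n in range(len(fields), -1, -1)]


def combine(*parts):
    """Pick one grouping set from each part and concatenate the picks."""
    return [sum(choice, ()) for choice in product(*parts)]


plain_a = [("a",)]  # a plain field contributes exactly one grouping set
print(combine(plain_a, cube("b", "c")))
print(combine(plain_a, rollup("b", "c")))
```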

ANALYTICAL FUNCTIONS

SELECT
Order of clauses:
1. FROM
2. WHERE
3. GROUP BY
4. HAVING
5. UNION/MINUS
6. INTERSECT
7. ORDER BY
8. INTO

RANK FUNCTIONS

BASIC PROBLEMS
Functions: in the select list
ORDER BY, GROUP BY: always executed after functions, so we might need sub-queries
ROWNUM s*cks
Solution: special functions that can work together with the ordering / grouping of records

RANK FUNCTIONS

    SELECT ROW_NUMBER() OVER (ORDER BY ENAME ASC) AS RNUM, ENAME
    FROM EMP;

Simple rank functions:
    RANK()          1, 2, 2, 4
    DENSE_RANK()    1, 2, 2, 3
    PERCENT_RANK()  percentage, [0..1]
NO PARAMETERS!

LET'S TRY THOSE…

    SELECT ename, sal, RANK() over (ORDER BY sal desc)
    FROM emp;

+ DENSE_RANK(), PERCENT_RANK()
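
The difference between the three functions only shows up when there are ties. The sketch below runs them side by side on four sample rows with one tie. It assumes SQLite (via Python's sqlite3 module, SQLite 3.25 or newer) rather than Oracle, and the four rows are a small sample, not the full EMP table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, sal INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("KING", 5000), ("FORD", 3000), ("SCOTT", 3000), ("JONES", 2975)],
)

query = """
SELECT ename, sal,
       RANK()         OVER (ORDER BY sal DESC) AS rnk,
       DENSE_RANK()   OVER (ORDER BY sal DESC) AS dense_rnk,
       PERCENT_RANK() OVER (ORDER BY sal DESC) AS pct_rnk
FROM emp
ORDER BY sal DESC, ename
"""
ranking = conn.execute(query).fetchall()
for row in ranking:
    print(row)
```

RANK leaves a gap after the tie (1, 2, 2, 4), DENSE_RANK does not (1, 2, 2, 3), and PERCENT_RANK scales the rank into the range from 0 to 1.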

RANK WITHIN A GROUP

    SELECT deptno, ename, sal,
           RANK() OVER ( PARTITION BY deptno ORDER BY sal ) as RANG
    FROM emp;

RANK WITHIN A GROUP

    SELECT deptno, job, ename, sal,
           RANK() OVER ( PARTITION BY deptno, job ORDER BY sal ) as RANG
    FROM emp;

+ ORDER BY …
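
PARTITION BY restarts the ranking inside every partition. A small sketch of that behaviour, again assuming SQLite through Python's sqlite3 module instead of Oracle and a six-row sample in place of EMP:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (deptno INTEGER, ename TEXT, sal INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [
        (10, "KING", 5000), (10, "CLARK", 2450), (10, "MILLER", 1300),
        (20, "FORD", 3000), (20, "SCOTT", 3000), (20, "SMITH", 800),
    ],
)

query = """
SELECT deptno, ename, sal,
       RANK() OVER (PARTITION BY deptno ORDER BY sal) AS rang
FROM emp
ORDER BY deptno, sal, ename
"""
ranked = conn.execute(query).fetchall()
for row in ranked:
    print(row)
```

The final ORDER BY only sorts the output; the rank values themselves come from the ORDER BY inside the OVER clause.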

GROUPING FUNCTIONS WITH ANALYTIC CLAUSES

GROUPING FUNCTIONS WITH ANALYTIC CLAUSES

    SELECT ename, sal, SUM(SAL) OVER (order by sal) as MySAL
    FROM emp;

Ordered list!

    SELECT ename, sal, AVG(SAL) OVER (order by sal) as MySAL
    FROM emp;
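
With an ORDER BY inside OVER, the aggregate becomes a running value: each row sees everything from the start of the ordering up to itself. With the default window, rows that tie on the ordering value are treated as peers and receive the same result. The sketch below shows this on four sample rows; it assumes SQLite through Python's sqlite3 module, whose default window matches this behaviour, and the rows are a small sample rather than the full EMP table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, sal INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("SMITH", 800), ("WARD", 1250), ("MARTIN", 1250), ("TURNER", 1500)],
)

query = """
SELECT ename, sal,
       SUM(sal) OVER (ORDER BY sal) AS running_sum,
       AVG(sal) OVER (ORDER BY sal) AS running_avg
FROM emp
ORDER BY sal, ename
"""
running = conn.execute(query).fetchall()
for row in running:
    print(row)
```

The two rows with salary 1250 both show the sum 3300, because each of them includes its peer.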

GROUPING FUNCTIONS WITH ANALYTIC CLAUSES

    SELECT deptno, ename, sal,
           SUM(SAL) OVER ( partition by deptno order by ename ) as MySum
    FROM emp
    ORDER BY deptno, ename;

GROUPING FUNCTIONS WITH ANALYTIC CLAUSES

    alter session set nls_date_format='YYYY-MM-DD';

    select ename, hiredate, sal
    from emp
    order by hiredate;

    select ename, hiredate, sal,
           sum(sal) over (order by hiredate) as OSSZ
    from emp
    order by hiredate;

    select ename, hiredate, sal,
           sum(sal) over (partition by to_char(hiredate, 'YYYY') order by hiredate) as OSSZ
    from emp
    order by hiredate;

SUBSET (Sliding window)

    SELECT ename, sal,
           avg(SAL) OVER ( order by sal rows between 1 preceding and 2 following ) as MyAvg
    FROM emp;
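
A ROWS window counts physical rows around the current one, and it simply shrinks at the edges of the ordering. The sketch below traces the window of the query above on five sample salaries. It assumes SQLite through Python's sqlite3 module instead of Oracle, and the data is a small sample, not the full EMP table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, sal INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("SMITH", 800), ("JAMES", 950), ("ADAMS", 1100), ("MILLER", 1300), ("TURNER", 1500)],
)

query = """
SELECT ename, sal,
       AVG(sal) OVER (ORDER BY sal ROWS BETWEEN 1 PRECEDING AND 2 FOLLOWING) AS my_avg
FROM emp
ORDER BY sal
"""
window_avg = conn.execute(query).fetchall()
for row in window_avg:
    print(row)
```

For the first row the window holds only three salaries (800, 950, 1100), for the last row only two (1300, 1500).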

SUBSET (Sliding window)

    SELECT deptno, ename, sal,
           sum(SAL) OVER ( partition by deptno order by sal rows between 0 preceding and 1 following ) as MySum
    FROM emp;

SUBSET (Sliding window)
We can use the RANGE keyword

    SELECT deptno, ename, sal,
           sum(SAL) OVER ( order by sal range between current row and unbounded following ) as MySum
    FROM emp;
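
RANGE defines the window by the ordering value instead of by row position, so "current row" also covers every row that ties with it. The sketch below puts RANGE and ROWS next to each other for the same bounds. It assumes SQLite through Python's sqlite3 module rather than Oracle, and four sample rows with one tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, sal INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("SMITH", 800), ("WARD", 1250), ("MARTIN", 1250), ("TURNER", 1500)],
)

query = """
SELECT ename, sal,
       SUM(sal) OVER (ORDER BY sal
                      RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS range_sum,
       SUM(sal) OVER (ORDER BY sal
                      ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS rows_sum
FROM emp
ORDER BY sal, ename
"""
sums = conn.execute(query).fetchall()
for row in sums:
    print(row)
```

With RANGE both 1250 rows get the same sum, since each includes the other. With ROWS one of them gets 4000 and the other 2750; which one gets which is not defined, because the ordering does not break the tie.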

OTHER ANALYTICAL FUNCTIONS
FIRST_VALUE(), LAST_VALUE()
RATIO_TO_REPORT(): ratio compared to the sum value

    SELECT ename, sal, RATIO_TO_REPORT(sal) OVER ()
    FROM emp
    ORDER BY sal desc;

+ PARTITION BY
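
RATIO_TO_REPORT(sal) OVER () is the row's value divided by the sum over the whole window. SQLite has no RATIO_TO_REPORT, so the sketch below spells out that division by hand, which also makes the definition visible. It assumes Python's sqlite3 module and three made-up salary rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, sal INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("A", 5000), ("B", 3000), ("C", 2000)],  # hypothetical rows, total 10000
)

# sal * 1.0 forces floating-point division; an empty OVER () spans all rows.
query = """
SELECT ename, sal, sal * 1.0 / SUM(sal) OVER () AS ratio
FROM emp
ORDER BY sal DESC
"""
ratios = conn.execute(query).fetchall()
for row in ratios:
    print(row)
```

Adding PARTITION BY inside the OVER clause would make each ratio relative to the sum of its own partition instead of the whole table.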