If you've never heard of or used Oracle analytic functions please read this article first. It's by far the best article I've read at explaining what analytic functions are and how to use them. The Oracle analytic function documentation can be found here.
RATIO_TO_REPORT
RATIO_TO_REPORT compares the current value against the sum of the other set of values. The following example shows the percentage of each employees salary when compared to the sum of their department's salary. You'll notice that you can calculate the same value using a different method (sal_dept_ratio2) which may be why I haven't seen RATIO_TO_REPORT used that often.
SELECT d.dname, e.ename, e.sal, round(ratio_to_report(sal) OVER (PARTITION BY e.deptno),2) sal_dept_ratio, round(sal / sum(e.sal) OVER (PARTITION BY e.deptno),2) sal_dept_ratio2 FROM emp e, dept d WHERE e.deptno = d.deptno; -- Result DNAME ENAME SAL SAL_DEPT_RATIO SAL_DEPT_RATIO2 -------------- ---------- ---------- -------------- --------------- ACCOUNTING CLARK 2450 .28 .28 ACCOUNTING MILLER 1300 .15 .15 ACCOUNTING KING 5000 .57 .57 RESEARCH FORD 3000 .28 .28 RESEARCH SCOTT 3000 .28 .28 RESEARCH JONES 2975 .27 .27 RESEARCH SMITH 800 .07 .07 RESEARCH ADAMS 1100 .1 .1 SALES WARD 1250 .13 .13 SALES MARTIN 1250 .13 .13 SALES TURNER 1500 .16 .16 SALES JAMES 950 .1 .1 SALES ALLEN 1600 .17 .17 SALES BLAKE 2850 .3 .3
NTH_VALUE
NTH_VALUE returns the nth row in the window clause. It's extremely important that you explicitly define the window or your values may not make sense. I'll write another post explaining this later. Update: NTH_VALUE Windowing Clause
The following example shows the 2nd highest salary in each department.
SELECT d.dname, e.ename, e.sal, nth_value(e.sal, 2) OVER ( PARTITION BY e.deptno ORDER BY e.sal DESC -- Don't forget to define the window ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) sec_high_sal_dept FROM emp e, dept d WHERE e.deptno = d.deptno; -- Result DNAME ENAME SAL SEC_HIGH_SAL_DEPT -------------- ---------- ---------- ----------------- ACCOUNTING KING 5000 2450 ACCOUNTING CLARK 2450 2450 ACCOUNTING MILLER 1300 2450 RESEARCH FORD 3000 3000 RESEARCH SCOTT 3000 3000 RESEARCH JONES 2975 3000 RESEARCH ADAMS 1100 3000 RESEARCH SMITH 800 3000 SALES BLAKE 2850 1600 SALES ALLEN 1600 1600 SALES TURNER 1500 1600 SALES WARD 1250 1600 SALES MARTIN 1250 1600 SALES JAMES 950 1600
LISTAGG
LISTAGG was a highly requested feature that was implemented in 11gR2. Overall LISTAGG allows you to group values in multiple rows and put them in a comma delimited column on one row.
The following example (which is not an analytic function) shows all the employees for each department:
SELECT d.dname, listagg(e.ename, ',') WITHIN GROUP (ORDER BY e.ename) dept_emp_list FROM emp e, dept d WHERE e.deptno = d.deptno GROUP BY d.dname; -- Result DNAME DEPT_EMP_LIST -------------- ---------------------------------------- ACCOUNTING CLARK,KING,MILLER RESEARCH ADAMS,FORD,JONES,SCOTT,SMITH SALES ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARDUsing the LISTAGG as an analytic function you can see each employee's colleagues in their department, including themselves.
SELECT d.dname, e.ename, listagg(e.ename, ',') WITHIN GROUP (ORDER BY e.ename) OVER (PARTITION BY e.deptno) dept_colleagues FROM emp e, dept d WHERE e.deptno = d.deptno; -- Result DNAME ENAME DEPT_COLLEAGUES -------------- ---------- -------------------------------------- ACCOUNTING CLARK CLARK,KING,MILLER ACCOUNTING KING CLARK,KING,MILLER ACCOUNTING MILLER CLARK,KING,MILLER RESEARCH ADAMS ADAMS,FORD,JONES,SCOTT,SMITH RESEARCH FORD ADAMS,FORD,JONES,SCOTT,SMITH RESEARCH JONES ADAMS,FORD,JONES,SCOTT,SMITH RESEARCH SCOTT ADAMS,FORD,JONES,SCOTT,SMITH RESEARCH SMITH ADAMS,FORD,JONES,SCOTT,SMITH SALES ALLEN ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD SALES BLAKE ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD SALES JAMES ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD SALES MARTIN ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD SALES TURNER ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD SALES WARD ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
NTILE
NTILE allows you to divide your values into buckets and see which bucket the current row belongs to. For educational institutions this is a very good function to let students know that they're in the top x% of the class.
Update (12-Apr-2014): NTILE vs WIDTH_BOX describes the different ways to bucket data in Oracle. It also explains why SAL 1250 exists in both buckets 2 and 3.
The following example shows the 3 tiers (or buckets) of salaries across the entire company. It allows you to easily see who's in the top 33% of salaries.
SELECT d.dname, e.ename, e.sal, ntile (3) over (order by sal desc) three_tier_sal FROM emp e, dept d WHERE e.deptno = d.deptno; -- Result DNAME ENAME SAL THREE_TIER_SAL -------------- ---------- ---------- -------------- ACCOUNTING KING 5000 1 RESEARCH SCOTT 3000 1 RESEARCH FORD 3000 1 RESEARCH JONES 2975 1 SALES BLAKE 2850 1 ACCOUNTING CLARK 2450 2 SALES ALLEN 1600 2 SALES TURNER 1500 2 ACCOUNTING MILLER 1300 2 SALES WARD 1250 2 SALES MARTIN 1250 3 RESEARCH ADAMS 1100 3 SALES JAMES 950 3 RESEARCH SMITH 800 3
Really cool post Martin! Discovered new things today :) I've used in the past before discovering LISTAGG the undocumented wm_concat() which basically groups&concatenates comma separated rows but LISTAGG is the way to go here!
ReplyDeleteBrilliant and useful; thanks so much.
ReplyDeleteGreat post. I tested and deployed listagg() in an interactive report within ten minutes of reading your article.
ReplyDeleteIt was exactly what I'd been looking for.
Many thanks. See you at KScope12!
Realy interesting analytic functions for me, too. I using and very like ROW_NUMBER(). NTH_VALUE() function new in (oracle database 11g).
ReplyDeleteThanks for this interesting post.
Regards
Mahir M. Quluzade
Good Stuff but I do not understand how the same salary can be in two different buckets.
ReplyDeleteHi Rick,
DeleteI'm not sure what your question is regarding the two different buckets. Can you please be more specific.
Martin
Hi Martin
ReplyDeleteIt is probably my misunderstanding. In the ex. Pertaining to salaries in the 3 buckets I was curious how the salary 1250 is in both buckets 2 and 3. It is probably just a function of sorted 15 rows / 3
Hi Rick,
DeleteExcellent question. Long story short, the same salary can exist in two buckets because of how Oracle "buckets data". I've written a follow up article detailing this: http://www.talkapex.com/2014/04/ntile-vs-widthbucket.html I've also updated the NTILE notes referencing that article in case anyone else has the same question.
Martin