If you've never heard of or used Oracle analytic functions please read this article first. It's by far the best article I've read at explaining what analytic functions are and how to use them. The Oracle analytic function documentation can be found here.
RATIO_TO_REPORT
RATIO_TO_REPORT compares the current value against the sum of the other set of values. The following example shows the percentage of each employees salary when compared to the sum of their department's salary. You'll notice that you can calculate the same value using a different method (sal_dept_ratio2) which may be why I haven't seen RATIO_TO_REPORT used that often.
SELECT d.dname, e.ename, e.sal, round(ratio_to_report(sal) OVER (PARTITION BY e.deptno),2) sal_dept_ratio, round(sal / sum(e.sal) OVER (PARTITION BY e.deptno),2) sal_dept_ratio2 FROM emp e, dept d WHERE e.deptno = d.deptno; -- Result DNAME ENAME SAL SAL_DEPT_RATIO SAL_DEPT_RATIO2 -------------- ---------- ---------- -------------- --------------- ACCOUNTING CLARK 2450 .28 .28 ACCOUNTING MILLER 1300 .15 .15 ACCOUNTING KING 5000 .57 .57 RESEARCH FORD 3000 .28 .28 RESEARCH SCOTT 3000 .28 .28 RESEARCH JONES 2975 .27 .27 RESEARCH SMITH 800 .07 .07 RESEARCH ADAMS 1100 .1 .1 SALES WARD 1250 .13 .13 SALES MARTIN 1250 .13 .13 SALES TURNER 1500 .16 .16 SALES JAMES 950 .1 .1 SALES ALLEN 1600 .17 .17 SALES BLAKE 2850 .3 .3
NTH_VALUE
NTH_VALUE returns the nth row in the window clause. It's extremely important that you explicitly define the window or your values may not make sense. I'll write another post explaining this later. Update: NTH_VALUE Windowing Clause
The following example shows the 2nd highest salary in each department.
SELECT d.dname, e.ename, e.sal,
nth_value(e.sal, 2) OVER (
PARTITION BY e.deptno ORDER BY e.sal DESC
-- Don't forget to define the window
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) sec_high_sal_dept
FROM emp e, dept d
WHERE e.deptno = d.deptno;
-- Result
DNAME ENAME SAL SEC_HIGH_SAL_DEPT
-------------- ---------- ---------- -----------------
ACCOUNTING KING 5000 2450
ACCOUNTING CLARK 2450 2450
ACCOUNTING MILLER 1300 2450
RESEARCH FORD 3000 3000
RESEARCH SCOTT 3000 3000
RESEARCH JONES 2975 3000
RESEARCH ADAMS 1100 3000
RESEARCH SMITH 800 3000
SALES BLAKE 2850 1600
SALES ALLEN 1600 1600
SALES TURNER 1500 1600
SALES WARD 1250 1600
SALES MARTIN 1250 1600
SALES JAMES 950 1600
LISTAGG
LISTAGG was a highly requested feature that was implemented in 11gR2. Overall LISTAGG allows you to group values in multiple rows and put them in a comma delimited column on one row.
The following example (which is not an analytic function) shows all the employees for each department:
SELECT d.dname, listagg(e.ename, ',') WITHIN GROUP (ORDER BY e.ename) dept_emp_list FROM emp e, dept d WHERE e.deptno = d.deptno GROUP BY d.dname; -- Result DNAME DEPT_EMP_LIST -------------- ---------------------------------------- ACCOUNTING CLARK,KING,MILLER RESEARCH ADAMS,FORD,JONES,SCOTT,SMITH SALES ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARDUsing the LISTAGG as an analytic function you can see each employee's colleagues in their department, including themselves.
SELECT d.dname, e.ename,
listagg(e.ename, ',') WITHIN GROUP (ORDER BY e.ename)
OVER (PARTITION BY e.deptno) dept_colleagues
FROM emp e, dept d
WHERE e.deptno = d.deptno;
-- Result
DNAME ENAME DEPT_COLLEAGUES
-------------- ---------- --------------------------------------
ACCOUNTING CLARK CLARK,KING,MILLER
ACCOUNTING KING CLARK,KING,MILLER
ACCOUNTING MILLER CLARK,KING,MILLER
RESEARCH ADAMS ADAMS,FORD,JONES,SCOTT,SMITH
RESEARCH FORD ADAMS,FORD,JONES,SCOTT,SMITH
RESEARCH JONES ADAMS,FORD,JONES,SCOTT,SMITH
RESEARCH SCOTT ADAMS,FORD,JONES,SCOTT,SMITH
RESEARCH SMITH ADAMS,FORD,JONES,SCOTT,SMITH
SALES ALLEN ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
SALES BLAKE ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
SALES JAMES ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
SALES MARTIN ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
SALES TURNER ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
SALES WARD ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
NTILE
NTILE allows you to divide your values into buckets and see which bucket the current row belongs to. For educational institutions this is a very good function to let students know that they're in the top x% of the class.
Update (12-Apr-2014): NTILE vs WIDTH_BOX describes the different ways to bucket data in Oracle. It also explains why SAL 1250 exists in both buckets 2 and 3.
The following example shows the 3 tiers (or buckets) of salaries across the entire company. It allows you to easily see who's in the top 33% of salaries.
SELECT d.dname, e.ename, e.sal, ntile (3) over (order by sal desc) three_tier_sal FROM emp e, dept d WHERE e.deptno = d.deptno; -- Result DNAME ENAME SAL THREE_TIER_SAL -------------- ---------- ---------- -------------- ACCOUNTING KING 5000 1 RESEARCH SCOTT 3000 1 RESEARCH FORD 3000 1 RESEARCH JONES 2975 1 SALES BLAKE 2850 1 ACCOUNTING CLARK 2450 2 SALES ALLEN 1600 2 SALES TURNER 1500 2 ACCOUNTING MILLER 1300 2 SALES WARD 1250 2 SALES MARTIN 1250 3 RESEARCH ADAMS 1100 3 SALES JAMES 950 3 RESEARCH SMITH 800 3
Really cool post Martin! Discovered new things today :) I've used in the past before discovering LISTAGG the undocumented wm_concat() which basically groups&concatenates comma separated rows but LISTAGG is the way to go here!
ReplyDeleteBrilliant and useful; thanks so much.
ReplyDeleteGreat post. I tested and deployed listagg() in an interactive report within ten minutes of reading your article.
ReplyDeleteIt was exactly what I'd been looking for.
Many thanks. See you at KScope12!
Realy interesting analytic functions for me, too. I using and very like ROW_NUMBER(). NTH_VALUE() function new in (oracle database 11g).
ReplyDeleteThanks for this interesting post.
Regards
Mahir M. Quluzade
Good Stuff but I do not understand how the same salary can be in two different buckets.
ReplyDeleteHi Rick,
DeleteI'm not sure what your question is regarding the two different buckets. Can you please be more specific.
Martin
Hi Martin
ReplyDeleteIt is probably my misunderstanding. In the ex. Pertaining to salaries in the 3 buckets I was curious how the salary 1250 is in both buckets 2 and 3. It is probably just a function of sorted 15 rows / 3
Hi Rick,
DeleteExcellent question. Long story short, the same salary can exist in two buckets because of how Oracle "buckets data". I've written a follow up article detailing this: http://www.talkapex.com/2014/04/ntile-vs-widthbucket.html I've also updated the NTILE notes referencing that article in case anyone else has the same question.
Martin