Saturday 25 February 2017

Analytic Functions in Oracle - III

In this post, we will cover RANK, DENSE_RANK and Top N Queries. Oracle 12c is used for all queries.

RANK as the name suggests ranks any measure in a query using sequential numbers starting from 1. If in the ranking process, a tie is encountered, then, the same rank is repeated for all values that tie and the next rank in sequence is used after skipping as many rank values as was repeated in the tie . Let us see that by means of an example. For starters, let us use the same hr.employees table used in the previous posts. We are interested only in the results of the below query:

select * from hr.employees where employee_id <= 109 order by salary desc;

The results returned are shown below:  








The query returns ten records. Let us now run the first query using RANK shown below:

select employee_id, last_name, department_id, salary, rank() over (order by salary desc) "SALARY RANK" from   hr.employees where employee_id <= 109; 

The results returned are shown below:  












The results are as written above: a tie is seen at 17000, so there is a repeat of ranks for 17000 and at the next value of 12008, rank value 3 is skipped for one repeat of 17000 and a rank of 4 is returned. Same is true for the salary value 9000 where rank of 6 is skipped.

DENSE_RANK is similar to RANK except that in case a of tie, the rank values are not skipped. Let us replace RANK with DENSE_RANK in above query to see this contrast.

select employee_id, last_name, department_id, salary, dense_rank() over (order by salary desc) "SALARY DENSE RANK" from  hr.employees where employee_id <= 109;

The results returned are shown below: 











Results show that DENSE_RANK does not skip any values at all and ranks are repeated at ties. 

Let us add the partition keyword and see the results. The query using partition is show below:

select department_id, last_name, salary, rank() over (partition by department_id order by salary desc) "SALARY RANK" from hr.employees where employee_id <= 109;

The results returned are shown below:














The results show that the RANK function is applied department wise. The same can be tried using DENSE_RANK as shown below:

select department_id, last_name, salary, dense_rank() over (partition by department_id order by salary desc) "SALARY DENSE RANK" from hr.employees where employee_id <= 109;

The results returned are shown below:












The results are clearly in line with our understanding of DENSE_RANK as is seen in the ranking of Lorentz.

Next, we will pass values to RANK and DENSE_RANK function as shown below. The results returned are the same as the results shown above.

select rank(24000) within group (order by salary desc) "RANK OF 24000" from hr.employees where employee_id <= 109;






select dense_rank(24000) within group (order by salary desc) "DENSE RANK OF 24000" from hr.employees where employee_id <= 109;





select rank(17000) within group (order by salary desc) "RANK OF 17000" from hr.employees where employee_id <= 109;






select dense_rank(17000) within group (order by salary desc) "DENSE RANK OF 17000" from hr.employees where employee_id <= 109;






select rank(9000) within group (order by salary desc) "RANK OF 9000" from hr.employees where employee_id <= 109; 



select dense_rank(9000) within group (order by salary desc) "DENSE RANK OF 9000" from hr.employees where employee_id <= 109;






If the value passed to RANK or DENSE_RANK function does not exist in the table, then it will return a result such that it fits into the ranking of the existing values in the table. Let us see the below two queries, one for RANK and another for DENSE_RANK:

select rank(16000) within group (order by salary desc) "RANK OF 16000" from hr.employees where employee_id <= 109;







select dense_rank(16000) within group (order by salary desc) "DENSE RANK OF 16000" from hr.employees where employee_id <= 109;



16000 does not exist in the table. So, it is neatly ranked as though it is in the table. The important think to note is that, no errors or nulls are returned.

Last part of this post is the Top N Queries. Such type of queries answer questions like what are top 5 salaries for employees with employee_id less than 109. The query for that question is shown below:

select employee_id, last_name, department_id, salary from hr.employees where employee_id <= 109 order by salary desc fetch first 5 rows only;

The results returned are shown below:









In the same manner, we can use percentage to indicate the number of records returned as shown below:

select employee_id, last_name, department_id, salary from hr.employees where employee_id <= 109 order by salary desc fetch first 50 percent rows only;

The results returned are the same as shown below:









We can ignore the top 2 salary values and then query for the next 5 salaries using OFFSET  as shown below:

select employee_id, last_name, department_id, salary from hr.employees where employee_id <= 109 order by salary desc offset 2 rows fetch first 5 rows only; 

The results returned are the same as shown below: