Tuesday 28 March 2017

Analytic Functions in Oracle - VII

In this post, we will check out Hierarchical queries in Oracle. We use Oracle 12c as in other parts of the series.

Let us first take a look at the results of the simple query below. We will expand on this query later adding the Hierarchical components.

select employee_id, first_name, last_name, job_id, manager_id from hr.employees where employee_id = 206;

The results returned are shown below: 






Now, let us add some new keywords as shown in the query below:

select employee_id, first_name, last_name, job_id, manager_id from hr.employees start with employee_id = 206 connect by prior manager_id = employee_id;

The results returned are shown below: 








Note that the result set starts with employee_id 206 specified in the START WITH clause. The CONNECT BY PRIOR clause specifies the relationship between the parent and the child records. Here, manager_id on each record refers to a employee_id on another record. PRIOR keyword determines the direction of the hierarchy traversal. In the above query, the employee's manager record is returned as the next record and so on till the top record is reached. This can seen from the result set. The manager_id on the first record is employee_id on the second record and this process continues till that record is returned who has no manager_id.

Next, we see a query where we traverse in the opposite direction. We start with an employee with employee_id, 101, and fetch all the records that correspond to the subordinates as shown below:

select employee_id, first_name, last_name, job_id, manager_id from hr.employees start with employee_id = 101 connect by prior employee_id = manager_id;

The results returned are shown below: 














In the above query, the PRIOR keyword is followed by employee_id. So, in this case, all the subordinate records are returned starting with employee_id, 101.

In the next query shown below, we introduce a pseudocolumn, LEVEL that indicates the level from the root record with the root record having 1 as LEVEL. The next immediate children have a LEVEL of 2 and so on.

select employee_id, first_name, last_name, job_id, manager_id, level from hr.employees start with employee_id = 206 connect by prior manager_id = employee_id;

The results returned are shown below:








In the same manner, LEVEL has been added to the earlier query that returns all child records starting from a parent. Note that the level starts with 1 in the child record and and a value of 4 for the final parent.

select employee_id, first_name, last_name, job_id, manager_id, level from hr.employees start with employee_id = 101 connect by prior employee_id = manager_id;

The results returned are shown below:  













From the result, we can see that the parent has a level of 1 and all child records are numbered at their particular level. The levels are also confirmed by the common records in the above two result sets.

The CONNECT_BY_ICYCLE and NOCYCLE keywords go together in a query. When CONNECT_BY_ICYCLE is used and  NOCYCLE is not used, we get below error:

ORA-30930: NOCYCLE keyword is required with CONNECT_BY_ISCYCLE pseudocolumn
30930. 00000 -  "NOCYCLE keyword is required with CONNECT_BY_ISCYCLE pseudocolumn"
*Cause:    CONNECT_BY_ISCYCLE was specifed in a query which does not
           have the NOCYCLE keyword.
*Action:   Remove CONNECT_BY_ISCYCLE or add NOCYCLE.


So, let us add both the keywords in the manner shown below:

select employee_id, first_name, last_name, job_id, manager_id, connect_by_iscycle from hr.employees start with employee_id = 206 connect by nocycle prior manager_id = employee_id;

The results returned are shown below: 







The CONNECT_BY_ICYCLE pseudocolumn returns either a 1 or 0 depending on whether the record has a child who is also parent of the same record in question or not. In the example shown above, a value of 0 is indicative that there is no cyclic hierarchy involved in this set of records.

In the next query, we look at the SYS_CONNECT_BY_PATH. This function returns a delimited breadcrumb from the root to the current row.

select employee_id, first_name, last_name, level, sys_connect_by_path(last_name, '-') "Path" from hr.employees where department_id = 80 start with last_name = 'King'
connect by prior employee_id = manager_id and level <= 2;


The results returned are shown below: 










The result of SYS_CONNECT_BY_PATH is seen under the column, Path. From the root, King, the children along different paths are concatenated using delimiter, -, specified in the
SYS_CONNECT_BY_PATH function.

If we wish to order the above result by any specific criteria, we can use, ORDER SIBLINGS BY as shown in below query:

select employee_id, first_name, last_name, level, sys_connect_by_path(last_name, '~') "Path" from hr.employees where department_id = 80 start with last_name = 'King'
connect by prior employee_id = manager_id and level <= 2 order siblings by employee_id;


The results returned are shown below: 










Compared to the result returned earlier, the records are ordered by employee_id.

Next, we look at CONNECT_BY_ROOT. It returns the root node of the hierarchy with the records of the result set.

select employee_id, first_name, last_name, connect_by_root last_name, job_id, manager_id from hr.employees start with employee_id = 206 connect by prior manager_id = employee_id;

The results returned are shown below:







We can see that the last name, Gietz is returned in all records. In the same manner, Kochhar is returned in all records in below query:

select employee_id, first_name, last_name, connect_by_root last_name, job_id, manager_id from hr.employees start with employee_id = 101 connect by prior employee_id = manager_id;

The results returned are shown below:












CONNECT_BY_ISLEAF is a pseudocolumn that returns a 1 or 0 depending on whether the current record is a leaf or not in the hierarchy.

select employee_id, first_name, last_name, connect_by_isleaf, job_id, manager_id from hr.employees start with employee_id = 101 connect by prior employee_id = manager_id;

The results returned are shown below:












We will conclude this post with talking how to eliminate elements in the hierarchy. First, let us revisit the below query:

select employee_id, first_name, last_name, job_id, manager_id from hr.employees start with employee_id = 206 connect by prior manager_id = employee_id;

The results returned are shown below:







To remove employee_id with value 101 in the above result set, use the below query:

select employee_id, first_name, last_name, job_id, manager_id from hr.employees where employee_id != 101 start with employee_id = 206 connect by prior manager_id = employee_id;  

The results returned are shown below:







We see that record with value 101 is eliminated but the next record in the hierarchy with employee_id as 100 is retained in the result set.

Lastly, to remove the entire branch starting with record employee_id 100, we use the below query:

select employee_id, first_name, last_name, job_id, manager_id from hr.employees start with employee_id = 206 connect by prior manager_id = employee_id and employee_id != 101;

The results returned are shown below:






This concludes the discussion on Hierarchical queries in Oracle.

Monday 20 March 2017

Analytic Functions in Oracle - VI

In this post, we will focus on pivoting operation in Oracle. Pivot operation in Oracle was introduced in 11g version

AS in earlier posts, we will use Pivoting operation in Oracle lets one to aggregate rows followed by rotating the rows into columns.  To get an idea, let us understand run below query:

select department_id, count(*) from hr.employees where department_id <= 40 group by department_id;

The results returned are shown below: 










We can rotate the values in department_id column to be column headers using PIVOT keyword. Let us run below query for starters that will be used in the PIVOT query.

select department_id from hr.employees  where department_id <= 40;

The results returned are shown below: 
















Now, let us use the PIVOT query as shown below:

select * from (select department_id from hr.employees where department_id <= 40) pivot (count(*) for department_id in (10,20,30,40));

The results returned are shown below: 






The results show that an aggregate, count, of the departments is calculated on the fly and then the values in the department column is rotated to be column headers.  The expression following PIVOT keyword can be understood in the following way:

(count(*) for department_id in (10,20,30,40))

COUNT(*) is the aggregated entity, and will appear as a row. department_id following the FOR keyword is the column header. Expression (10,20,30,40) following the IN clause filters the column appearing after FOR keyword, department_id in this case.

The filtering of the columns in the final result can be done in the values in the parenthesis after IN keyword as shown in below query:

 select * from (select department_id from hr.employees) pivot (count(*) for department_id in (10,20,30,40));

The results returned are shown below: 







The results are the same as before.

Let us now add one more column called salary instead of count aggregate. The original query for reference is shown below:

select department_id, sum(salary) from hr.employees where department_id <= 40 group by department_id order by department_id;

The results returned are shown below:








 
The query with PIVOT keyword is shown below:

select * from (select department_id, salary from hr.employees) pivot (sum(salary)  for department_id in (10,20,30,40));

The results returned are shown below: 






The results are as expected.

Next, we take a look at aliasing of the resulting column headers. As before, let us check the reference query first.

select department_id, sum(salary) "Sum of Salary" from hr.employees where department_id <= 40 group by department_id order by department_id;

The results returned are shown below:










We see the salary column is aliased as expected.

The first query we see with PIVOT keyword is shown below:

select * from (select department_id, salary from hr.employees) pivot (sum(salary) "Sum of Salary" for department_id in (10,20,30,40));

The results returned are shown below: 





We see column header is a concatenation of the aggregate alias and the elements in IN clause. The next query has aliases for the elements in the IN clause as shown below:

select * from (select department_id, salary from hr.employees) pivot (sum(salary) "Sum of Salary" for department_id in (10 "A" ,20 "B",30 "C",40 "D") );

The results returned are shown below: 






Now, we see that the column header is a concatenation of the aggregate alias and the aliases in IN clause. Before we invoke the next query, let us run the reference that shows two aggregates:

select department_id, count(*), sum(salary) "Sum of Salary" from hr.employees where department_id <= 40 group by department_id order by department_id;

The results returned are shown below:









Now, we will replicate the above result but with the pivot operation with these two aggregates and also having aliases.

select * from (select department_id,salary from hr.employees ) pivot (count(*) as cnt,sum(salary) as sal for department_id in (10,20,30,40));

The results returned are shown below:





 The results are in line with our expectation that the aggregate alternate each other starting with COUNT aggregate.

The way to filter the results based on input in IN clause is shown in the next few examples. To get an idea, let us understand run below query:

select department_id, sum(salary) "Sum of Salary" from hr.employees where department_id <= 40 group by department_id order by department_id;

The results returned are shown below:










Let us run below query that filters department_id as shown below:

select * from (select department_id, salary from hr.employees) pivot (sum(salary) "Sum of Salary" for department_id in (select department_id from hr.employees where department_id <= 40));

We get below error:

ORA-00936: missing expression
00936. 00000 -  "missing expression"
*Cause:   
*Action:
Error at Line: 42 Column: 121


But, adding XML keyword after PIVOT keyword solves the issue as shown below:

select * from (select department_id, salary from hr.employees) pivot  xml (sum(salary) "Sum of Salary" for department_id in (select department_id from hr.employees where department_id <= 40));

The results returned are shown below: 








We get a result with xml tags that shows the filtered result. Opening the result saved as a .xml file, we see the below result filtered by department_id:



















Lastly, we see the UNPIVOT operation. It is the reverse of the PIVOT operation as shown in below query:

select value_type department_id,value cnt
from
(
    (
      select * from (select department_id from hr.employees)
pivot (count(*) for department_id in (10 "10",20 "20",30  "30",40  "40"))
    )
    unpivot
    (
        value
        for value_type in
            ("10","20","30","40")
    )
);

  
The results returned are shown below: 









The results are the same as returned by the first query in this blog.This concludes the post on pivoting in Oracle.