Let's Refresh: January 2017

Wednesday 25 January 2017

Addition of Graph to Backtesting in Excel

Further to the post on backtesting using Excel, we will add a nice graph that shows the plot of SMA and LMA. Like in the last post, we will work with Excel 2016 for this post as well.

There are at least two reasons why we should be looking at graphs:

1) They are visually appealing. Graphs communicate better than just data.

2) They can be used as a first check to see if something is amiss.

We will continue with the results of this post. Adding the graph is quite simple with the following steps:

1) Add a new worksheet called Graph to the excel file by clicking + button as show below:

2) Then, click on Graph worksheet and invoke the graphing feature as shown below:

3) Right click the empty area in the graph and click on Select Data as shown below:

4) Then, click on appl worksheet and select data in Date, SMA and LMA columns as shown below:

5) Then, click on OK button to confirm the data source

6) This will take one to the Graph worksheet as shown below:

7) Then, extend it as shown below:

8) Click on y-axis and Right Click to select Format Axis as shown below:

9) Enter 80 in Minimum as shown below:

This will alter the graph and it will be more informative on when the SMA line intersects the LMA line.

10) Then, click the + sign and check the box next to Axis Titles and Legend as shown below:

11) Change x-axis title to Date and y-axis title to SMA and LMA. Change the Chart Title to SMA & LMA versus Date as shown below:

12) Lastly, two boxes with text as "BUY" and "SELL" have been added to show the time line around which the transactions happened

The graph looks okay. This concludes the addition of graph to backtesting of Excel post.

Sunday 22 January 2017

Backtesting Trading Strategy in Excel

In this post, we will take a brief look at backtesting a trading strategy in Excel.

Excel is the tool of choice for most Financial Analysts. So, it only natural that we will use Excel for backtesting. Besides, there are many features in Excel that can be used for data analysis. In later posts, we will explore some of these features. Excel also has a programming element called Visual Basic for Applications (VBA). It is assumed that you already are familiar with Excel and its basic features. For all the work in the post, we will use Excel 2016.

For this post, we will consider one of the simplest strategies, Moving Average Crossover strategy using Simple Moving Average. The strategy involves calculation of two simple averages of the closing price and generating trade signals when the Short Term Moving Average(SMA) crosses the Long Term Moving Average(LMA). If the SMA is lower than LMA and crosses LMA to a higher value than LMA, then, a buy signal is generated. Similarly, if the SMA is higher than LMA and crosses LMA to a lower value than LMA, then, a sell signal is generated.

Following are the steps for back testing using this strategy:

1) Getting the data that will be used for back testing

Two common sources of getting historical data are Google Finance and Yahoo Finance. For our example, we have downloaded Apple Inc stock prices on NASDAQ for 2016 from Google Finance. After downloading, you can open the data files in Excel as shown below:

2) Calculating the Technical Indicators as SMA and LMA

We will consider SMA of 5 days and LMA of 20 days

The formula for 5 day SMA is using AVERAGE function on the closing price. Since the first four values will not be available, we will calculate it separately and enter the data manually as we will lose out on data points otherwise.

The formula entered in G6 will be AVERAGE(E2:E6). This can then be extended to the rest of the SMA column. Screenshot is shown below for reference.

Similarly, LMA is calculated but will use the formula AVERAGE(E2:E19) and entered in H21. This formula again will be extended for the rest of the LMA column. Screenshot is shown below for reference.

3) Once the Technical Indicators are calculated, Trade Signals can be generated

A Buy signal is generated if the SMA is lower than LMA and crosses LMA to a higher value than LMA. A Sell signal is generated if the SMA is higher than LMA and crosses LMA to a lower value than LMA. The formula is used is:

=IF(AND(G2<H2,G3>H3),"BUY",IF(AND(G2>H2,G3<H3),"SELL",""))

and is to be entered in cell I3. This formula can then be extended to the rest of the column.

If you are on Excel 2016, then, according to the Microsoft documentation, you should be able to use IFS on the lines described below:

=IFS(AND(G2<H2,G3>H3),"BUY", AND(G2>H2,G3<H3),"SELL",TRUE,"")

Per the documentation, Microsoft claims that IFS function is applicable to Excel 2016. However, I do not see it in the Excel 2016 that I bought, so, let us park this discussion here.

A screen shot with the Trade Signal formula is shown below:

4) Trade Price is next calculation

The logic is quite simple: If we have a trade signal of BUY or SELL, then, we will use the Open Price of the next day. If not, then, we carry over the previous value of the Trade Price.

Below formula is entered in cell I3:

=IF(I2<>"",B3,J2)

This formula is then extended to the rest of the column. Screenshot is shown below:

5) Returns calculation follows the Trade Price

For the purpose of calculating returns, we will assume that we deal with 1 unit at the start of the transaction. In our case, it will be a buy on 22nd February. Then, after that, we deal with 2 units, that is, we have a position on 1 unit at any point. Also, we discount any brokerage or transaction charges that may accompany the trades.

The below formula for the returns is entered in J3:

=IF(AND(I2<>"",J2<>0),IF(I2="BUY",1-J3/J2,J3/J2-1),"")

The column formatting can be converted to percentage. Formula is shown below:

6) After the Returns are calculated, calculate Total Returns

As we see many blanks in the excel sheet where no trades have occurred, we copy the Returns to another sheet and remove the blank rows as described in the steps below:

a) Select the Returns columns as shown below:

b) Click F5

c) In the Dialog Box titled Go To, click Special button

d) In the Dialog Box titled Go To Special, click on Blank Radio Button and click on OK button

This will select all blank rows in the worksheet

e) Right click on selected column and click Delete as shown below

f) Select Entire Row Radio Button in the Dialog Box titled Delete

This will delete all blank rows

Then, as a last step we calculate the Trade metrics as shown below:

Total Positive Trades = COUNTIF(A2:A11,">0")

Total Negative Trades = COUNTIF(A2:A12,"<0")

Total Trades = COUNTIF(A2:A11,"<>""")

Total Returns = =SUM(A2:A11)

The final result is shown below:

It is heartening to note the positive returns.

In future posts, we will see if we can improve on the trading strategy described.

Tuesday 17 January 2017

Ultimate Tips To Avoid Career Stagnation

The target audience for this post are Developers, Programmers, Software Engineers/Analysts, Package Implementation Specialists, Technical Leads, and IT Architects.

During the course of one's career, one would have definitely faced the problem of career stagnation at least once. Some face it early on in their careers while others may face it a few decades later. If not addressed in a timely manner, then, one runs the risk of being handed a pink slip in the long run. The best thing would be to avoid getting into such an unpleasant situation by following the tips listed below:

1) Be aware of the latest trends in technology or where the IT Industry is headed

In today's world, we are seeing a lot of disruption in all spheres of life. It would be good to see how these might impact one's career. One may also look up the hot skills for that year to see if one can align one's career to one of that.

2) Be innovative at work

Be on the lookout on how you can optimize IT processes, programs, etc and apply these at work

3) Pick up new skills

If you can learn new technologies related to your line of work, it will be really a career boost. Though this may take time depending on your learning capability and the time you can afford towards this activity, it will be worth the effort.

Even if you work on a single language/package only, there are new versions with new features that get released on a regular basis. It would be good to check that out and see if it can address any issue that you currently face or add value to your project.

4) Pick up Industry Domain Knowledge

While most IT Projects are of short durations, if you are lucky to be working on the same project for a long time, then, you would automatically pick up domain knowledge related to business on which your project is based. If not, definitely, make an effort to pick up at least one like, Retail, Finance, Manufacturing, etc.

5) Be flexible to take new related roles/positions

Sundar Pichai, Google CEO, says "If you don't fail sometimes, you are not being ambitious enough"

Be mentally prepared to reinvent yourself, if necessary.

6) Work on areas that will add value to your career

Try to work on areas that will add value to your career and help in creating valuable bullet points in your resume. That way you will also improve your marketability within your organization as well as outside your organization

7) Align to company culture

Different companies have different work cultures. Be open to adapt to company work culture.

8) Network, network, network

Networking plays a huge role in one's career. So, devote some time for networking with peers and others.

Who knows? You may land your next job via networking

Saturday 14 January 2017

Using MD5

In this era of information, we are constantly downloading softwares, updates, new releases/versions, etc, and the sizes of these packages also have increased dramatically of late. To check if the downloaded software is complete, most sites now use MD5 algorithm that produces a 128-bit hash value. It can be used to verify the data intergrity of any downloaded software

The modus operandi is quite simple. After the software is downloaded, you run the MD5 program on your computer to generate the 128-bit hash value. If that compares with the one mentioned in the site from where the software is downloaded, then, the downloaded software is complete.

As an example, I recently downloaded R software, R-3.3.2-win.exe, from here. The true fingerprint of this package is given on the site as b2a206741bec6e837513c9929ea0c5d9. To compute the MD5 for the downloaded R software, we can use CertUtility as mentioned here.

The command is shown below:

This matches the fingerprint on the R download site. So, the data integrity of the downloaded software has been maintained

Tuesday 10 January 2017

Surrogate Keys

In this post, we see the advantages of using a surrogate key in a datawarehouse. But, first, a brief look at natural keys in OLTP applications. Natural keys are created by using data that constitute that record and uniquely identify records in a relational database. A surrogate key, by contrast, is just an integer and nothing can be inferred from it. The reasons why surrogate keys are preferred in a datawarehouse are:

1) Surrogate keys occupy less space than natural keys

Natural keys are usually alphanumeric strings like '978-3-16-148410-0' (a 13 digit ISBN number that uniquely identifies a book) and occupy more space compared to a surrogate integer key like 1010102 (a random integer). This can be seen from the below query:

select vsize('978-3-16-148410-0') "Natural Key", vsize(1010102) "Surrogate Key" from dual;

returns result shown below:

VSIZE function in Oracle returns the number of bytes in the internal representation. We can can see that the number of bytes is more in case of natural key

A direct result of this is that related indexes also will occupy less space in case of surrogate keys resulting in better performance

2) Surrogate keys help in maintaining historical context

Datawarehouses maintain history for records in OLTP applications. So, in case a record in OLTP application is modified or deleted, then, while the natural key of that record may be lost in OLTP applications, the corresponding record is preserved in a datawarehouse for analysis in future using slowly changing dimensions. While a natural key may be updated because of any new business requirement within an organization or because of any acquisition or merger of that organization with another organization, surrogate keys on the other hand are buffered from these changes

Friday 6 January 2017

Lambda in Java 1.8

This post is not a detailed treatise on Lambda expressions, but rather shows an example of how a program can be implemented differently. On page 444 of Java, The Complete Reference, Ninth Edition by Herbert Schildt, a short program has been written that uses a predefined functional interface called Function to calculate factorial of a number.

Below program calculates the factorial of a number, but uses the UnaryOperator interface instead. UnaryOperator represents an operation on a single operand (an integer in our case) that produces a result of the same type as its operand (factorial of the operand integer). The reason for the choice of UnaryOperator is not difficult to see in that we have an output for a single input. Following is the code:

package com.lambda;

// This program uses the UnaryOperator built-in functional interface

import java.util.function.UnaryOperator;

class UnaryOperatorInterface {
public static void main(String args[])
{
// This block lambda computes the factorial of an integer
// UnaryOperator is the functional interface.
UnaryOperator<Integer> factorial = (n) -> {
    int output = 1;
    for(int i=1; i <= n; i++)
        output = i * output;
    return output;
};
System.out.println("The factorial of 6 is " + factorial.apply(6));
}
}

The program output is shown below:

The factorial of 6 is 720

Wednesday 4 January 2017

Primary Key in Oracle

In the first post, we will revisit Primary Keys. All examples shown are validated against Oracle Database 12c Enterprise Edition Release 12.1.0.1.0.

What is a Primary Key?

A Primary Key is a column in a table or a view whose values uniquely identify rows in that table or view. If a combination of columns is used to uniquely identify rows in a table or a view, then, it is called a Composite Primary Key. It is implemented by means of a Primary Key Constraint. A Primary Key Constraint combines a NOT NULL constraint and a unique constraint at the same time. A Primary Key Constraint prevents the Primary Key column or Composite Primary Key columns from containing NUll values and also prevents multiple rows from having the same value in Primary Key column and having the same combination of values in the case of Composite Primary Key columns

Restrictions

There are some restrictions around Primary Key Constraint:

1) Column or columns constituting the Primary Key cannot contain NULL values. If we try to insert NULL values into such columns, we get the following error:

ORA-01400: cannot insert NULL into ("SCHEMA"."TABLE"."COLUMN")

2) A Composite Primary Key can consist of a maximum of 32 columns only. If it exceeds 32 columns, you will get below error:

SQL Error: ORA-02257: maximum number of columns exceeded
02257. 00000 - "maximum number of columns exceeded"
*Cause:    The number of columns in the key list exceeds the maximum number.
*Action:   Reduce the number columns in the list.

3) A table or a view cannot have more than one primary key. You will get the below error in case creation of a table with two primary keys is attempted:

SQL Error: ORA-02260: table can have only one primary key
02260. 00000 - "table can have only one primary key"
*Cause:    Self-evident.
*Action:   Remove the extra primary key.

4) Columns based on data types as BLOB, CLOB, NCLOB, LONG, LONG RAW, VARRAY, NESTED TABLE, BFILE, REF, TIMESTAMP WITH TIME ZONE, or user-defined type cannot be part of Primary Key

The following error is thrown in case we try to create a table with BLOB data type as a primary key:

SQL Error: ORA-02329: column of datatype LOB cannot be unique or a primary key
02329. 00000 - "column of datatype %s cannot be unique or a primary key"
*Cause:    An attempt was made to place a UNIQUE or a PRIMARY KEY constraint
           on a column of datatype VARRAY, nested table, object, LOB, FILE
           or REF.
*Action:   Change the column datatype or remove the constraint. Then retry
           the operation.

Likewise, you will get the following error in case a LONG column is used:

SQL Error: ORA-02269: key column cannot be of LONG datatype
02269. 00000 - "key column cannot be of LONG datatype"
*Cause:    Self-evident.
*Action:   Change the datatype of the column,
           or remove the column from the key.

5) The same column or combination of columns cannot be used for both a primary key and a unique key. This is only to be expected as a primary key already covers for the unique key as well the only difference being that while unique key columns can have NULL values, primary key columns cannot

Define Primary Key Constraint

Like other constraints (barring NOT NULL), a Primary Key Constraint can be defined syntactically in two ways:

1) Inline specification: When the constraint is defined as part of the definition of an individual column, it is called inline specification

2) Out of Line specification: When the constraint is defined as part of the table definition, it is called Out of Line specification

A Primary Key Constraint can be defined in either a CREATE TABLE statement or an ALTER TABLE statement

In the first example, we create a Primary Key Constraint on a table with inline specification using the CREATE TABLE statement:

create table employees
   (employee_number number(6,0) primary key,
    first_name varchar2(20 byte),
    last_name varchar2(25 byte),
    email varchar2(25 byte),
    phone_number varchar2(20 byte),
    hire_date date,
    job_id varchar2(10 byte),
    salary number(8,2),
    commission_pct number(2,2)
   );

This creates a record in the user_constraints table that can be verified by running below query:

select constraint_name, constraint_type, index_name,status from user_constraints where table_name = 'EMPLOYEES';

We can see that Oracle has given a default name for the Primary Key Constraint and also associated an index with this Primary Key Constraint. The details of the index can be seen using below query:

select index_name, index_type, table_name, uniqueness, status from user_indexes where table_name = 'EMPLOYEES';

We can also name the Primary Key Constraint as EMP_NUMBER_PK if we replace the line

employee_number number(6,0) primary key

with

employee_number number(6,0) constraint emp_number_pk primary key

in the above CREATE TABLE statement

In the second example, we create a Composite Primary Key called EMPLOYEE_PK on a view based on the created table called EMPLOYEES with out of line specification using the CREATE VIEW statement:

create table employees
   (employee_number number(6,0),
   first_name varchar2(20 byte),
   last_name varchar2(25 byte),
   email varchar2(25 byte),
   phone_number varchar2(20 byte),
   hire_date date,
   job_id varchar2(10 byte),
   salary number(8,2),
   commission_pct number(2,2),
   constraint employee_pk primary key (employee_number,first_name,last_name)
   );

To get more details about the created constraint, index and columns used in the constraint, the following queries can be used:

select constraint_name, constraint_type, index_name,status from user_constraints where table_name = 'EMPLOYEES';

select index_name, index_type, table_name, uniqueness, status from user_indexes where table_name = 'EMPLOYEES';

select constraint_name, column_name, position from user_cons_columns where constraint_name = 'EMPLOYEE_PK';

The position column in the result of the last query indicates the order of the columns that was used to create the Primary Key Constraint

In the third example, we create a table with no primary key, and then, add a Primary Key Constraint using ALTER command as shown below:

create table employees
   (employee_number number(6,0),
    first_name varchar2(20 byte),
    last_name varchar2(25 byte),
    email varchar2(25 byte),
    phone_number varchar2(20 byte),
    hire_date date,
    job_id varchar2(10 byte),
    salary number(8,2),
    commission_pct number(2,2)
   );

Running below query returns no results:

select constraint_name, constraint_type, index_name, status from user_constraints where table_name = 'EMPLOYEES';

The ALTER command is as follows:

alter table employees add constraint employee_pk primary key (employee_number, first_name, last_name);

Then, the below queries can be used to check for the details of the Primary Key Constraint. They will give the same results as seen in second example:

select constraint_name, constraint_type, index_name,status from user_constraints where table_name = 'EMPLOYEES';

select index_name, index_type, table_name, uniqueness, status from user_indexes where table_name = 'EMPLOYEES';

select constraint_name, column_name, position from user_cons_columns where constraint_name = 'EMPLOYEE_PK';

Disabling & Enabling Primary Key Constraint

It is also possible to create a Primary Key Constraint with the deferrable initially deferred clause, disable it and, then, enable it at a later point in time as shown in fourth example

create table employees
   (employee_number number(6,0),
    first_name varchar2(20 byte),
    last_name varchar2(25 byte),
    email varchar2(25 byte),
    phone_number varchar2(20 byte),
    hire_date date,
    job_id varchar2(10 byte),
    salary number(8,2),
    commission_pct number(2,2),
    constraint employee_pk primary key (employee_number,first_name,last_name) deferrable initially deferred
   );

The DEFERRABLE INITIALLY DEFERRED clause means that the Primary Key Constraint is not enforced till a commit happens

Running below query shows that the status of the constraint is 'ENABLED'

select constraint_name, constraint_type, index_name,status from user_constraints where table_name = 'EMPLOYEES';

The results of below query suggest that an index is created as before:

select index_name, index_type, table_name, uniqueness, status from user_indexes where table_name = 'EMPLOYEES';

It can be seen that unlike in the earlier case, this index is nonunique

Now, let us disable the Primary Key Constraint using below ALTER command:

alter table employees disable primary key;

Running below query shows that the status of the constraint is 'DISABLED'

select constraint_name, constraint_type, index_name,status from user_constraints where table_name = 'EMPLOYEES';

We can see that the INDEX_NAME has a null value and that the status of the constraint is 'DISABLED'

Let us add a few records that violate the PRIMARY KEY CONSTRAINT using below insert commands:

insert into employees (employee_number,first_name,last_name,email,phone_number,hire_date,job_id,salary,commission_pct) values (101,'Michael','Bourne','mikey@mail.com','486534368',to_date('01-JAN-17','DD-MON-RR'),'MANAGER',6567,0.15);
insert into employees (employee_number,first_name,last_name,email,phone_number,hire_date,job_id,salary,commission_pct) values (101,'Michael','Bourne','mikey@mail.com','486534368',to_date('01-JAN-17','DD-MON-RR'),'MANAGER',6567,0.15);
insert into employees (employee_number,first_name,last_name,email,phone_number,hire_date,job_id,salary,commission_pct) values (102,'Sarah','Connor','sara@mail.com','746543727',to_date('05-DEC-16','DD-MON-RR'),'VP',9800,0.18);
insert into employees (employee_number,first_name,last_name,email,phone_number,hire_date,job_id,salary,commission_pct) values (103,'Serena','Tucker','serene@mail.com','723635373',to_date('01-DEC-16','DD-MON-RR'),'CONSULTANT',5645,0.1);

To confirm the records in the table, let us run below select statement:

select employee_number, first_name, last_name, email, hire_date,job_id,salary from employees;

We can see that there are two duplicate records

At a later point in time, we can enable the Primary Key Constraint using the below ALTER command:

alter table employees enable novalidate primary key;

The NOVALIDATE keyword means that existing records will not be validated but new record additions will be subjected to the scrutiny of the Primary Key Constraint

Running below query shows that the status of the constraint is 'ENABLED'

select constraint_name, constraint_type, index_name,status from user_constraints where table_name = 'EMPLOYEES';

Trying to insert a duplicate record will result in below error:

ORA-00001: unique constraint (SCHEMA.PRIMARY_KEY_CONSTRAINT_NAME) violated
ORA-06512: at line 1

Use the script at \dbhome_1\RDBMS\ADMIN\utlexcpt.sql to create an exception table called PRIMARY_KEY_EXCEPTIONS

create table primary_key_exceptions(row_id rowid,
                    owner varchar2(128),
                    table_name varchar2(128),
                constraint varchar2(128));

Then, run below command that will push all the records violating the Primary Key Constraint into the newly created exception table

alter table employees enable primary key exceptions into primary_key_exceptions;

We see the following error:

ALTER TABLE EMPLOYEES ENABLE PRIMARY KEY EXCEPTIONS INTO PRIMARY_KEY_EXCEPTIONS
Error report:
SQL Error: ORA-02437: cannot validate (SCHEMA.EMPLOYEE_PK) - primary key violated
02437. 00000 - "cannot validate (%s.%s) - primary key violated"
*Cause:    attempted to validate a primary key with duplicate values or null
           values.
*Action:   remove the duplicates and null values before enabling a primary
           key.

Querying the exceptions table, PRIMARY_KEY_EXCEPTIONS, we see the erroneous records

select row_id, table_name, constraint from primary_key_exceptions;

Below query can be used to trace the records that violate the Primary Key Constraint in the original table and corrective action can be taken:

select * from employees where rowid in (select row_id from primary_key_exceptions);

Renaming Primary Key Constraint

Existing Primary Key Constraint can be renamed using the following command:

alter table employees rename constraint employee_pk to emp_pk;

Now, the following query will show the renamed Primary Key Constraint:

select constraint_name, constraint_type, index_name,status from user_constraints where table_name = 'EMPLOYEES';

Dropping Primary Key Constraint

Finally, to drop the Primary Key Constraint, use the following command:

alter table employees drop primary key;

The last statement can be verified by using below query that will return no results:

select constraint_name, constraint_type, index_name,status from user_constraints where table_name = 'EMPLOYEES';

However, if the index associated with the Primary Key, needs to be retained, then, we need to use the

alter table EMPLOYEES drop primary key keep index;

The following query shows that the associated index is not dropped

select index_name, index_type, table_name, uniqueness, status from user_indexes WHERE table_name = 'EMPLOYEES';

So, this concludes the post on Primary Keys

Pages