Showing posts with label general. Show all posts

Tuesday, 10 April 2018

More Regular Expressions

In an earlier post we took a look at Regular Expressions. This post extends that discussion to a few more topics: inline modifiers, capturing groups, non-capturing groups and look arounds. Some of these topics may seem a bit involved at first glance. To demonstrate the examples for each topic, we will use the test harness mentioned here. The Java program is saved to a folder called RegularExpressions. Let us use it to try a few simple examples in Regular Expressions before we take up the topics mentioned above. The commands to compile and run the test harness are below:

F:\>cd RegularExpressions

F:\RegularExpressions>javac RegexTestHarness.java

F:\RegularExpressions>java -classpath . RegexTestHarness

The results are shown below:









We first enter the regular expression that we intend to search for. The harness then prompts for the text to be searched and prints every match it finds. This cycle repeats until we exit the program with CTRL+C.

Let us look for vowels in foobar. The results are shown below:

Enter your regex: [aeiou]
Enter input string to search: foobar
I found the text "o" starting at index 1 and ending at index 2.
I found the text "o" starting at index 2 and ending at index 3.
I found the text "a" starting at index 4 and ending at index 5.

Note that three results are returned, one for each vowel. We can restrict the search to two vowels appearing together, as shown below:

Enter your regex: [aeiou]{2}
Enter input string to search: foobar
I found the text "oo" starting at index 1 and ending at index 3.
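The harness loop behind the two searches above can be sketched in Java as follows. This is a minimal sketch rather than the harness source: the class name HarnessSketch and the method findAll are hypothetical, and only the standard java.util.regex classes are assumed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the RegexTestHarness source itself.
final class HarnessSketch {

    // Returns one report line per match, worded like the harness output.
    static List<String> findAll(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<String> lines = new ArrayList<>();
        while (matcher.find()) {
            // start() is inclusive, end() is exclusive
            lines.add(String.format(
                "I found the text \"%s\" starting at index %d and ending at index %d.",
                matcher.group(), matcher.start(), matcher.end()));
        }
        return lines;
    }

    public static void main(String[] args) {
        findAll("[aeiou]", "foobar").forEach(System.out::println);
        findAll("[aeiou]{2}", "foobar").forEach(System.out::println);
    }
}
```

Each call to find() resumes the search where the previous match ended, which is why the two o characters in foobar are reported separately by the first pattern and together by the second.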

Inline modifiers have the syntax (?z), where z is a letter such as i or s. The i modifier makes the match case insensitive. An example of its usage and the result is shown below:

Enter your regex: (?i)ms\.
Enter input string to search: Ms. Jones, MS. Parker and ms. White were at the party.
I found the text "Ms." starting at index 0 and ending at index 3.
I found the text "MS." starting at index 11 and ending at index 14.
I found the text "ms." starting at index 26 and ending at index 29.

If we want case insensitivity to apply to M only, we can scope the modifier to a group:

Enter your regex: (?i:M)s\.
Enter input string to search: Ms. Jones, MS. Parker and ms. White were at the party.
I found the text "Ms." starting at index 0 and ending at index 3.
I found the text "ms." starting at index 26 and ending at index 29.
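In Java code, the (?i) inline modifier has the same effect as passing Pattern.CASE_INSENSITIVE when compiling the pattern. The sketch below compares the two forms and the scoped (?i:M) variant; the class name CaseFlagSketch and the method countMatches are hypothetical names used only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for comparing inline modifiers with compile-time flags.
final class CaseFlagSketch {

    // Counts the matches of regex in input when compiled with the given flags.
    static int countMatches(String regex, int flags, String input) {
        Matcher matcher = Pattern.compile(regex, flags).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "Ms. Jones, MS. Parker and ms. White were at the party.";
        // Inline modifier: the whole pattern is case insensitive.
        System.out.println(countMatches("(?i)ms\\.", 0, input));
        // Same behaviour requested through a compile-time flag.
        System.out.println(countMatches("ms\\.", Pattern.CASE_INSENSITIVE, input));
        // Modifier scoped to M: the s must still be lower case.
        System.out.println(countMatches("(?i:M)s\\.", 0, input));
    }
}
```

The first two calls find all three titles, while the scoped form skips MS. because its S is upper case.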

(?s) makes the metacharacter . match every character, including line terminators, which it does not match by default. Three examples are shown below:

Enter your regex: (?s).*
Enter input string to search: Ms. Jones, MS. Parker and ms. White were at the party.
I found the text "Ms. Jones, MS. Parker and ms. White were at the party." starting at index 0 and ending at index 54.
I found the text "" starting at index 54 and ending at index 54.

Enter your regex: (?s)^.*
Enter input string to search: Ms. Jones, MS. Parker and ms. White were at the party.
I found the text "Ms. Jones, MS. Parker and ms. White were at the party." starting at index 0 and ending at index 54.

Enter your regex: (?s).*$
Enter input string to search: Ms. Jones, MS. Parker and ms. White were at the party.
I found the text "Ms. Jones, MS. Parker and ms. White were at the party." starting at index 0 and ending at index 54.
I found the text "" starting at index 54 and ending at index 54.
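Since the harness reads a single line of input, the examples above cannot show the effect of (?s) on a newline. A minimal sketch with a two-line string follows; the class name DotAllSketch, the method firstMatch and the sample text are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper showing what (?s) changes when the input contains a newline.
final class DotAllSketch {

    // Returns the text of the first match, or null when nothing matches.
    static String firstMatch(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        String input = "first line\nsecond line";
        // Without (?s) the dot stops at the line terminator.
        System.out.println("[" + firstMatch(".*", input) + "]");
        // With (?s) the dot also matches the newline, so the match runs to the end.
        System.out.println("[" + firstMatch("(?s).*", input) + "]");
    }
}
```

The empty match at index 54 in the harness output arises because .* can also match zero characters: after the first match consumes the whole line, the next search starts at the end of the input and succeeds with an empty string. With ^ in front, that second attempt fails because index 54 is not the start of the input.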

Capturing groups are quite interesting because they let us pick out the parts of the text that match a sub-pattern and keep them for later processing. Any sub-pattern enclosed in parentheses forms a capturing group, as shown below:

Enter your regex: (\w{3})
Enter input string to search: a bc def gh ijk
I found the text "def" starting at index 5 and ending at index 8.
I found the text "ijk" starting at index 12 and ending at index 15.

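Every run of three word characters (letters, digits or underscore) is returned. A back reference such as \1 or \2 matches the same text that the correspondingly numbered capturing group matched. In the next example, we use back references in conjunction with two capturing groups: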

Enter your regex: (\w{1})(\w{1})\2\1\2
Enter input string to search: abcde fghij abbab kllkl lmnop
I found the text "abbab" starting at index 12 and ending at index 17.
I found the text "kllkl" starting at index 18 and ending at index 23.
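The harness prints only the whole match. To see what each capturing group picked up, the groups can be read from the Matcher, as in the sketch below. The class name BackReferenceSketch and the method firstMatchGroups are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that exposes the capturing groups of the first match.
final class BackReferenceSketch {

    // Index 0 holds the whole match, index n holds capturing group n.
    // Returns an empty array when the pattern does not match at all.
    static String[] firstMatchGroups(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        if (!matcher.find()) {
            return new String[0];
        }
        String[] groups = new String[matcher.groupCount() + 1];
        for (int i = 0; i <= matcher.groupCount(); i++) {
            groups[i] = matcher.group(i);
        }
        return groups;
    }

    public static void main(String[] args) {
        String input = "abcde fghij abbab kllkl lmnop";
        // \2\1\2 repeats group 2, then group 1, then group 2 again.
        System.out.println(Arrays.toString(
            firstMatchGroups("(\\w{1})(\\w{1})\\2\\1\\2", input)));
    }
}
```

For the first match, abbab, group 1 holds a and group 2 holds b, so the back references \2\1\2 require the text bab to follow.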

Non-capturing groups are very similar to capturing groups, but the text they match is not stored, so they are not numbered and cannot be back-referenced. The syntax too is very similar to that of capturing groups, but we add ?: right after the opening parenthesis, as shown below:

Enter your regex: (?:\w{1})(\w{1})(\w{1})\1\1\2
Enter input string to search: abcbbc defghi klmllm
I found the text "abcbbc" starting at index 0 and ending at index 6.
I found the text "klmllm" starting at index 14 and ending at index 20.

Note that there are three groups: the first one is non-capturing and the next two are capturing groups. Since only the capturing groups are numbered, \1 refers to the second pair of parentheses and \2 to the third, and only these two back references are available.
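The group numbering can be checked with Matcher.groupCount(), which counts capturing groups only. The sketch below is illustrative; the class name NonCapturingSketch and the method groupCount are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that reports how many capturing groups a pattern has.
final class NonCapturingSketch {

    // groupCount() ignores (?:...) groups, so only numbered groups are counted.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        String regex = "(?:\\w{1})(\\w{1})(\\w{1})\\1\\1\\2";
        System.out.println(groupCount(regex));

        Matcher matcher = Pattern.compile(regex).matcher("abcbbc defghi klmllm");
        if (matcher.find()) {
            // Group 1 is the second pair of parentheses, group 2 the third.
            System.out.println(matcher.group(1) + " " + matcher.group(2));
        }
    }
}
```

For the first match, abcbbc, the leading a is matched by the non-capturing group and is therefore not available as a group.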

Next we take a look at look arounds. A look around checks whether a pattern matches just ahead of or just behind the current position, but the text it checks is not included in the match. Look arounds that check in the forward direction are called Lookaheads, and those that check in the backward direction are called Lookbehinds. There are two types of Lookaheads: Positive Lookahead and Negative Lookahead. The syntax for Positive Lookahead is (?=X), where X is the pattern to check. An example of Positive Lookahead is shown below:

Enter your regex: foo(?=bar)
Enter input string to search: One can often see foobar used in software code.
I found the text "foo" starting at index 18 and ending at index 21.

The check is made for foo followed by bar, but only foo is reported as the match, not bar. This is evident in the next example:

Enter your regex: foo(?=bar)
Enter input string to search: foo in foobar is used separately also.
I found the text "foo" starting at index 7 and ending at index 10.

There are two occurrences of foo in the input string, but only the foo that is followed by bar is picked. Negative Lookahead has the syntax (?!X). It works like Positive Lookahead, except that it matches only when the look-ahead pattern is not found, as shown in the example below:

Enter your regex: foo(?!bar)
Enter input string to search: foo in foobar is used separately also.
I found the text "foo" starting at index 0 and ending at index 3.
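The two lookahead searches above can be reproduced in Java by collecting the start and end index of every match. This is a minimal sketch; the class name LookaheadSketch and the method spans are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that lists where each match starts and ends.
final class LookaheadSketch {

    // Each element is {start, end}; end is exclusive, as in the harness output.
    static List<int[]> spans(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<int[]> result = new ArrayList<>();
        while (matcher.find()) {
            result.add(new int[] { matcher.start(), matcher.end() });
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "foo in foobar is used separately also.";
        // Positive lookahead: foo must be followed by bar.
        spans("foo(?=bar)", input).forEach(s -> System.out.println(Arrays.toString(s)));
        // Negative lookahead: foo must not be followed by bar.
        spans("foo(?!bar)", input).forEach(s -> System.out.println(Arrays.toString(s)));
    }
}
```

In both cases the reported span is three characters long: the lookahead only inspects the text after foo and never adds it to the match.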

There are two occurrences of foo in the input string, but only the foo that is not followed by bar is picked by Negative Lookahead. The next look around is Positive Lookbehind, which has the syntax (?<=X). It is similar to Positive Lookahead, except that the pattern is checked before the matched text. An example is shown below:

Enter your regex: (?<=foo)bar
Enter input string to search: foo, bar and foobar are often used in software examples
I found the text "bar" starting at index 16 and ending at index 19.

In the above example, of the standalone bar and the bar that is part of foobar, only the bar that is preceded by foo is picked because only it satisfies the Lookbehind pattern. Lastly, we take a look at Negative Lookbehind, which has the syntax (?<!X). It is similar to Lookbehind, but it matches only when the pattern is not found before the matched text, as seen in the example below:

Enter your regex: (?<!foo)bar
Enter input string to search: foo, bar and foobar are often used in software examples
I found the text "bar" starting at index 5 and ending at index 8.

We have used the same input text as in the Lookbehind example. Note that the bar that is not preceded by foo is picked and the bar that is preceded by foo is skipped.
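One way to see which bar each lookbehind selects is to wrap every match in square brackets with a replacement. The sketch below does this with Matcher.replaceAll, where $0 stands for the whole match; the class name LookbehindSketch and the method mark are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper that marks every match of a pattern in the input.
final class LookbehindSketch {

    // Wraps each match in square brackets; $0 refers to the matched text.
    static String mark(String regex, String input) {
        return Pattern.compile(regex).matcher(input).replaceAll("[$0]");
    }

    public static void main(String[] args) {
        String input = "foo, bar and foobar";
        // Positive lookbehind: only the bar preceded by foo is marked.
        System.out.println(mark("(?<=foo)bar", input));
        // Negative lookbehind: only the bar not preceded by foo is marked.
        System.out.println(mark("(?<!foo)bar", input));
    }
}
```

Because the lookbehind text is not part of the match, the brackets enclose bar alone and the preceding foo is left untouched.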

This concludes the discussion on inline modifiers, capturing groups, non-capturing groups and look arounds in Regular Expressions.

Tuesday, 17 January 2017

Ultimate Tips To Avoid Career Stagnation

The target audience for this post is Developers, Programmers, Software Engineers/Analysts, Package Implementation Specialists, Technical Leads, and IT Architects.

Over the course of a career, most people face the problem of career stagnation at least once. Some face it early on in their careers while others may face it a few decades later. If it is not addressed in a timely manner, one runs the risk of being handed a pink slip in the long run. The best approach is to avoid getting into such an unpleasant situation by following the tips listed below:

1) Be aware of the latest trends in technology or where the IT Industry is headed

In today's world, we are seeing a lot of disruption in all spheres of life. It would be good to see how these might impact one's career. One may also look up the hot skills for that year to see if one can align one's career to one of them.

2) Be innovative at work

Be on the lookout for ways to optimize IT processes, programs, etc., and apply them at work.

3) Pick up new skills

If you can learn new technologies related to your line of work, it will really be a career boost. Though this may take time depending on your learning capability and the time you can afford for this activity, it will be worth the effort.

Even if you work on a single language/package only, new versions with new features get released on a regular basis. It would be good to check them out and see if they can address any issue that you currently face or add value to your project.

4) Pick up Industry Domain Knowledge

While most IT projects are of short duration, if you are lucky enough to work on the same project for a long time, you will automatically pick up domain knowledge related to the business on which your project is based. If not, definitely make an effort to pick up at least one domain, such as Retail, Finance or Manufacturing.

5) Be flexible to take new related roles/positions

Sundar Pichai, Google CEO, says "If you don't fail sometimes, you are not being ambitious enough"

Be mentally prepared to reinvent yourself, if necessary.

6) Work on areas that will add value to your career

Try to work on areas that will add value to your career and help in creating valuable bullet points in your resume. That way you will also improve your marketability both within and outside your organization.

7) Align to company culture

Different companies have different work cultures. Be open to adapting to the company's work culture.

8) Network, network, network

Networking plays a huge role in one's career. So, devote some time to networking with peers and others.

Who knows? You may land your next job via networking.

Saturday, 14 January 2017

Using MD5

In this era of information, we are constantly downloading software, updates, new releases/versions, etc., and the sizes of these packages have also increased dramatically of late. To let us check that a download is complete, many sites publish an MD5 checksum, a 128-bit hash value computed from the file. It can be used to verify the data integrity of any downloaded software.

The modus operandi is quite simple. After the software is downloaded, you run an MD5 program on your computer to generate the 128-bit hash value of the file. If it matches the value published on the site from which the software was downloaded, then the downloaded software is complete.

As an example, I recently downloaded R software, R-3.3.2-win.exe, from here. The true fingerprint of this package is given on the site as b2a206741bec6e837513c9929ea0c5d9. To compute the MD5 for the downloaded R software, we can use the Windows CertUtil tool as mentioned here.

The command is shown below:




The computed value matches the fingerprint on the R download site. So, the data integrity of the downloaded software has been maintained.
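The same check can also be done programmatically. The sketch below uses the standard java.security.MessageDigest class to compute the MD5 of a file and compare it with a published fingerprint. It is a minimal sketch under stated assumptions: the class name Md5Sketch and its methods are hypothetical, and the file path is expected as a command line argument.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for comparing a file's MD5 with a published fingerprint.
final class Md5Sketch {

    private static MessageDigest newMd5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Encodes the 16 digest bytes as 32 lower-case hex characters.
    private static String toHex(byte[] digest) {
        StringBuilder hex = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    static String md5Hex(byte[] data) {
        return toHex(newMd5().digest(data));
    }

    // Streams the file in chunks so large downloads need not fit in memory.
    static String md5Hex(Path file) throws IOException {
        MessageDigest md5 = newMd5();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                md5.update(buffer, 0, read);
            }
        }
        return toHex(md5.digest());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Md5Sketch <downloaded-file>");
            return;
        }
        // Fingerprint quoted in the post for R-3.3.2-win.exe.
        String expected = "b2a206741bec6e837513c9929ea0c5d9";
        String actual = md5Hex(Paths.get(args[0]));
        System.out.println(actual.equalsIgnoreCase(expected) ? "match" : "mismatch: " + actual);
    }
}
```

The comparison ignores case because some tools print the hash in upper case while sites often publish it in lower case.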

Friday, 30 December 2016

Introduction

I had long wanted to write a blog sharing knowledge that I chanced upon during the course of work or in work-related areas. As you may have guessed from the title of the blog, the purpose of this blog is to refresh these areas.


There will be a definite effort to keep the blog content simple and, at the same time, provide valuable insights to the reader, something that the reader can relate to and apply as they see fit.


The inputs for this blog will come from various sources, from product documentation to information in the public domain. Since they will be from a variety of sources, it will be difficult to acknowledge or cite all of them.