NumPy - II

After the first post on NumPy, we explore more aspects of NumPy. The idea is to cover, as far as possible, the most basic of these and thus lay the foundation for future work in areas such as AI, ML or Data Science.

Like in the earlier post, we will be using Jupyter notebook for all the work in this article. The code is in blue font and output is in green font below the code. The version details are given below:

import sys
print("Python version:", sys.version)

import numpy as np
print("NumPy version:", np.__version__)

Python version: 3.8.3 (default, Jul  2 2020, 17:30:36) [MSC v.1916 64 bit (AMD64)]
NumPy version: 1.18.5

Some of the basic statistical functions are shown below:

numpy_array11 = np.array([[ 1, 2, 3, 4,  5,  6,  7,  8],
                          [ 6, 7, 8, 9, 10, 11, 12, 13]])

array([[ 1,  2,  3,  4,  5,  6,  7,  8],
       [ 6,  7,  8,  9, 10, 11, 12, 13]])

numpy_array11.max(axis=0)   ## max column wise

array([ 6,  7,  8,  9, 10, 11, 12, 13])

numpy_array11.max(axis=1)   ## max row wise

array([ 8, 13])

numpy_array11.cumsum(axis=0) ## cumulative sum along column

array([[ 1,  2,  3,  4,  5,  6,  7,  8],
       [ 7,  9, 11, 13, 15, 17, 19, 21]], dtype=int32)

numpy_array11.cumsum(axis=1) ## cumulative sum along row

array([[ 1,  3,  6, 10, 15, 21, 28, 36],
       [ 6, 13, 21, 30, 40, 51, 63, 76]], dtype=int32)
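
A few more of the basic statistical functions can be sketched in the same way. This is only an illustrative sketch: the variable names total, col_min, row_mean, overall_median and row_std are our own and are not part of the original notebook.

```python
import numpy as np

numpy_array11 = np.array([[1, 2, 3, 4, 5, 6, 7, 8],
                          [6, 7, 8, 9, 10, 11, 12, 13]])

total = numpy_array11.sum()                # sum of all 16 elements
col_min = numpy_array11.min(axis=0)        # minimum of each column
row_mean = numpy_array11.mean(axis=1)      # mean of each row
overall_median = np.median(numpy_array11)  # median of the flattened array
row_std = numpy_array11.std(axis=1)        # population standard deviation of each row

print(total, col_min, row_mean, overall_median, row_std)
```

As with max and cumsum, axis=0 works down the columns and axis=1 works across the rows; leaving axis out reduces over the whole array.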

There are two ways arrays can be copied: Shallow copy and Deep copy. Commands for both are shown below:

numpy_array11 = np.array([[ 1, 2, 3, 4,  5,  6,  7,  8],
                          [ 6, 7, 8, 9, 10, 11, 12, 13]])

array([[ 1,  2,  3,  4,  5,  6,  7,  8],
       [ 6,  7,  8,  9, 10, 11, 12, 13]])

numpy_array11_view = numpy_array11.view() 

array([[ 1,  2,  3,  4,  5,  6,  7,  8],
       [ 6,  7,  8,  9, 10, 11, 12, 13]])

numpy_array11_view is a new view of the array that shares the same data. This copying technique is called a shallow copy

numpy_array11_view is numpy_array11

False

Above test shows that numpy_array11_view is not numpy_array11 itself

numpy_array11_view.base is numpy_array11

True

Above test confirms that data in numpy_array11_view is based on numpy_array11

id(numpy_array11)       # identifier of numpy_array11


id(numpy_array11_view)  # identifier of numpy_array11_view


Identifier of numpy_array11_view is different from numpy_array11

numpy_array11_deepcopy = numpy_array11.copy()


array([[ 1,  2,  3,  4,  5,  6,  7,  8],
       [ 6,  7,  8,  9, 10, 11, 12, 13]])

The above manner of copying is called a deep copy: a new array object is created with its own copy of the data, which is not shared with the original

numpy_array11_deepcopy is numpy_array11

False

The test above shows that numpy_array11_deepcopy is not numpy_array11

numpy_array11_deepcopy.base is numpy_array11

False

Above test shows that data in numpy_array11_deepcopy is not based on numpy_array11
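
The practical difference between the two kinds of copy shows up when we write to them. The following is a minimal sketch, assuming a small array of our own (the names original, view and deep are illustrative):

```python
import numpy as np

original = np.array([[1, 2, 3], [4, 5, 6]])
view = original.view()   # shallow copy: shares the data buffer with original
deep = original.copy()   # deep copy: owns a separate data buffer

view[0, 0] = 100         # writing through the view also changes original
deep[0, 1] = -1          # writing to the deep copy leaves original untouched

shares_with_view = np.shares_memory(original, view)
shares_with_copy = np.shares_memory(original, deep)

print(original)
print(shares_with_view, shares_with_copy)
```

Here original[0, 0] becomes 100 because of the write through the view, while original[0, 1] stays 2 even though the deep copy was modified.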

Examples of sorting arrays are shown below:

numpy_array16 = np.random.randint(100, size=(5, 5))

array([[38, 56, 80, 29, 32],
       [26, 55,  0, 47,  5],
       [29, 27, 50, 49, 35],
       [54, 50, 97, 56, 31],
       [63, 27, 38, 98, 84]])

np.sort(numpy_array16)        # sort along the last axis

array([[29, 32, 38, 56, 80],
       [ 0,  5, 26, 47, 55],
       [27, 29, 35, 49, 50],
       [31, 50, 54, 56, 97],
       [27, 38, 63, 84, 98]])

np.sort(numpy_array16, axis=0)   # sort along the first axis

array([[26, 27,  0, 29,  5],
       [29, 27, 38, 47, 31],
       [38, 50, 50, 49, 32],
       [54, 55, 80, 56, 35],
       [63, 56, 97, 98, 84]])

np.sort(numpy_array16, axis=None)  # sort the flattened array

array([ 0,  5, 26, 27, 27, 29, 29, 31, 32, 35, 38, 38, 47, 49, 50, 50, 54,
       55, 56, 56, 63, 80, 84, 97, 98])

dtype = [('name', 'S10'), ('salary', float), ('age', int)]
values = [('Nick', 5500, 41), ('Kyle', 6500, 44),
          ('Ken', 7500, 44)]

structured_array1 = np.array(values, dtype=dtype)       # create a structured array
np.sort(structured_array1, order='salary')              # sort by salary

array([(b'Nick', 5500., 41), (b'Kyle', 6500., 44), (b'Ken', 7500., 44)],
      dtype=[('name', 'S10'), ('salary', '<f8'), ('age', '<i4')])

np.sort(structured_array1, order=['age', 'salary'])     # sort by age, salary

array([(b'Nick', 5500., 41), (b'Kyle', 6500., 44), (b'Ken', 7500., 44)],
      dtype=[('name', 'S10'), ('salary', '<f8'), ('age', '<i4')])
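
It is worth noting that np.sort returns a sorted copy, whereas the sort method of an array sorts in place. A minimal sketch, assuming a small one-dimensional array of our own (the names values, sorted_copy and descending are illustrative):

```python
import numpy as np

values = np.array([38, 56, 80, 29, 32])

sorted_copy = np.sort(values)        # new sorted array; values is unchanged
unchanged = values.tolist()          # snapshot taken before the in-place sort
descending = np.sort(values)[::-1]   # reverse the sorted copy for descending order

values.sort()                        # sorts values itself, returns None

print(sorted_copy, descending, values)
```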

Extracting elements from a NumPy array is one of the most important activities any developer will encounter. There are various techniques such as subsetting, slicing and indexing. Examples of these techniques are described below:

a) Subsetting: In this technique, a subset of the array is extracted and can be a single member or may have more members

numpy_array17 = np.random.randint(100, size=(5, 5))

array([[ 9, 12, 28, 68, 76],
       [74, 58, 37, 39, 46],
       [46, 15, 46, 24, 34],
       [33, 41, 53, 35, 30],
       [49, 78, 86, 57, 38]])

numpy_array17[0]          #Extracts first row

array([ 9, 12, 28, 68, 76])

numpy_array17[:,0]        #Extracts first column

array([ 9, 74, 46, 33, 49])

numpy_array17[2,2]       #Extracts single element

46

b) Slicing: In this technique, a slice consisting of one or more members is extracted

The syntax for slicing is [lower:upper:step], where the lower bound is included but the upper bound is not. step specifies the stride between elements and defaults to 1 if unspecified. In a one-dimensional array, index 0 refers to the first element when counting forward, and index -1 refers to the last element when counting backward from the end. Some examples are shown below:

numpy_array18 = np.array([10,11,12,13,14])


array([11, 12])


array([11, 12])

numpy_array18[:3]   # more like selecting head

array([10, 11, 12])

numpy_array18[-2:]  # more like selecting tail

array([13, 14])


array([10, 12, 14])

numpy_array18[::-1]  # Reversing the array

array([14, 13, 12, 11, 10])
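
Some of the outputs above appear without the command that produced them. The slices below reproduce those outputs; they are our assumption of what such commands could look like and not necessarily the ones used in the original notebook (the names forward, backward and every_other are illustrative).

```python
import numpy as np

numpy_array18 = np.array([10, 11, 12, 13, 14])

forward = numpy_array18[1:3]       # elements at index 1 and 2
backward = numpy_array18[-4:-2]    # the same elements, addressed from the end
every_other = numpy_array18[::2]   # step of 2, starting from the first element

print(forward, backward, every_other)
```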

c) Indexing using boolean indices: In this technique, a boolean array is used to select elements, as shown below:

numpy_array18 = np.array([10,11,12,13,14])

numpy_array18[numpy_array18 >= 13]

array([13, 14])
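
The comparison itself produces the boolean array, and several conditions can be combined with & and |. A minimal sketch (the names mask, in_range and even are our own):

```python
import numpy as np

numpy_array18 = np.array([10, 11, 12, 13, 14])

mask = numpy_array18 >= 13   # boolean array, one entry per element
in_range = numpy_array18[(numpy_array18 > 10) & (numpy_array18 < 14)]  # both conditions
even = numpy_array18[numpy_array18 % 2 == 0]                           # even values only

print(mask, in_range, even)
```

The parentheses around each condition are needed because & binds more tightly than the comparison operators.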

d) Fancy indexing: Lastly, we have fancy indexing, which lets us select complex subsets and also modify them using assignment

rand = np.random.RandomState(1)
numpy_array19 = rand.randint(100, size=10)

[37 12 72  9 75  5 79 64 16  1]

[numpy_array19[3], numpy_array19[7], numpy_array19[2]]

[9, 64, 72]

Alternatively, we can pass a single list or array of indices to obtain a similar result:

ind = [3, 7, 4]
numpy_array19[ind]

array([ 9, 64, 75])

When using fancy indexing, the shape of the result reflects the shape of the index arrays rather than the shape of the array being indexed:

indices = np.array([[3, 7],
                    [4, 5]])
numpy_array19[indices]

array([[ 9, 64],
       [75,  5]])

Fancy indexing also works in multiple dimensions. Consider the following array:

numpy_array20 = np.arange(12).reshape((3, 4))

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

Like with standard indexing, the first index refers to the row, and the second to the column:

row = np.array([0, 1, 2])
col = np.array([2, 1, 3])
numpy_array20[row, col]

array([ 2,  5, 11])

We can modify the array as shown below:

row = np.array([0, 1, 2])
col = np.array([2, 1, 3])
numpy_array20[row, col] = -1

array([[ 0,  1, -1,  3],
       [ 4, -1,  6,  7],
       [ 8,  9, 10, -1]])

A few other operations are described below:

a) Changing array shape:

numpy_array21 = np.arange(24).reshape((2,2,2,3))

array([[[[ 0,  1,  2],
         [ 3,  4,  5]],

        [[ 6,  7,  8],
         [ 9, 10, 11]]],

       [[[12, 13, 14],
         [15, 16, 17]],

        [[18, 19, 20],
         [21, 22, 23]]]])

numpy_array21 = numpy_array21.ravel()   # Flatten the array

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23])

numpy_array21.reshape((2,2,2,3))         # Reshape the array

array([[[[ 0,  1,  2],
         [ 3,  4,  5]],

        [[ 6,  7,  8],
         [ 9, 10, 11]]],

       [[[12, 13, 14],
         [15, 16, 17]],

        [[18, 19, 20],
         [21, 22, 23]]]])
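
When reshaping, one dimension can be given as -1 and NumPy works it out from the total number of elements. A minimal sketch (the names flat, inferred and restored are our own):

```python
import numpy as np

numpy_array21 = np.arange(24).reshape((2, 2, 2, 3))

flat = numpy_array21.ravel()                   # one-dimensional, 24 elements
inferred = flat.reshape((4, -1))               # -1 is inferred as 6
restored = flat.reshape(numpy_array21.shape)   # back to the original shape

print(flat.shape, inferred.shape, restored.shape)
```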

b) Add and remove members:


array([[0, 1],
       [2, 3]])


array([[0, 1, 2],
       [3, 0, 1]])

Above step returns a new array with the specified shape. If the shape of the new array is larger than the original array, then the new array is filled with repeated copies of the original array
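
The behaviour described above matches np.resize. The sketch below assumes that numpy_array22 is the 2 x 2 array shown in the first output; the exact commands used in the original notebook are not shown, so both lines are our assumption (the names resized_larger and resized_smaller are illustrative).

```python
import numpy as np

# Assumed definition, chosen to match the first output above
numpy_array22 = np.array([[0, 1],
                          [2, 3]])

resized_larger = np.resize(numpy_array22, (2, 3))   # filled with repeated copies
resized_smaller = np.resize(numpy_array22, (1, 3))  # truncated to the first 3 elements

print(resized_larger)
print(resized_smaller)
```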

np.append(numpy_array22,numpy_array22)   # Append items to an array

array([0, 1, 2, 3, 0, 1, 2, 3])

np.insert(numpy_array22, 1, 5)   #Insert values along the given axis before the given indices

array([0, 5, 1, 2, 3])

np.insert(numpy_array22, 1, 5, axis=1)

array([[0, 5, 1],
       [2, 5, 3]])

np.delete(numpy_array22, 1, 0)  # Return a new array with sub-arrays along an axis deleted

array([[0, 1]])

np.delete(numpy_array22, 1, 1)  # Return a new array with sub-arrays along an axis deleted

array([[0],
       [2]])

c) Combining arrays:

numpy_array23 = np.array([[1, 1], [2, 2], [3, 3]])

array([[1, 1],
       [2, 2],
       [3, 3]]) 


array([[1, 1],
       [2, 2],
       [3, 3],
       [1, 1],
       [2, 2],
       [3, 3]])

The step above joins a sequence of arrays along an existing axis


array([[1, 1, 1, 1],
       [2, 2, 2, 2],
       [3, 3, 3, 3]])

np.concatenate((numpy_array23, numpy_array23), axis=None)

array([1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3])
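
The first two outputs in this part appear without their commands. A likely way to produce them is np.concatenate with axis=0 and axis=1; this is our assumption and the original commands may have differed (the names joined_rows and joined_cols are illustrative).

```python
import numpy as np

numpy_array23 = np.array([[1, 1], [2, 2], [3, 3]])

# Join along the first axis: rows of the second array follow the first
joined_rows = np.concatenate((numpy_array23, numpy_array23), axis=0)

# Join along the second axis: columns of the second array follow the first
joined_cols = np.concatenate((numpy_array23, numpy_array23), axis=1)

print(joined_rows.shape, joined_cols.shape)
```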

numpy_array24 = np.array([1, 2, 3])
numpy_array25 = np.array([2, 3, 4])

np.vstack((numpy_array24, numpy_array25))     # Stack arrays vertically (row-wise)

array([[1, 2, 3],
       [2, 3, 4]])

np.hstack((numpy_array24, numpy_array25))   # stack arrays in sequence horizontally (column wise)

array([1, 2, 3, 2, 3, 4])

np.column_stack((numpy_array24, numpy_array25))   # Stack 1-D arrays as columns into a 2-D array

array([[1, 2],
       [2, 3],
       [3, 4]])

d) Splitting of arrays: Arrays can be split using hsplit (horizontal split) and vsplit (vertical split) as shown below:

numpy_array26 = np.arange(16.0).reshape(4, 4)

array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.],
       [12., 13., 14., 15.]])

np.hsplit(numpy_array26, 2) # Split an array into multiple sub-arrays horizontally (column-wise)

[array([[ 0.,  1.],
        [ 4.,  5.],
        [ 8.,  9.],
        [12., 13.]]),
 array([[ 2.,  3.],
        [ 6.,  7.],
        [10., 11.],
        [14., 15.]])]

np.hsplit(numpy_array26, 4)

[array([[ 0.],
        [ 4.],
        [ 8.],
        [12.]]),
 array([[ 1.],
        [ 5.],
        [ 9.],
        [13.]]),
 array([[ 2.],
        [ 6.],
        [10.],
        [14.]]),
 array([[ 3.],
        [ 7.],
        [11.],
        [15.]])]

np.vsplit(numpy_array26, 2)   # Split an array into multiple sub-arrays vertically (row-wise)

[array([[0., 1., 2., 3.],
        [4., 5., 6., 7.]]),
 array([[ 8.,  9., 10., 11.],
        [12., 13., 14., 15.]])]

np.vsplit(numpy_array26, 4)

[array([[0., 1., 2., 3.]]),
 array([[4., 5., 6., 7.]]),
 array([[ 8.,  9., 10., 11.]]),
 array([[12., 13., 14., 15.]])]
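
Besides an integer number of equal parts, hsplit and vsplit also accept a list of indices at which to split, which allows unequal parts. A minimal sketch (the names left, right, top, bottom and rejoined are our own):

```python
import numpy as np

numpy_array26 = np.arange(16.0).reshape(4, 4)

left, right = np.hsplit(numpy_array26, [1])   # split before column index 1
top, bottom = np.vsplit(numpy_array26, [3])   # split before row index 3

rejoined = np.hstack((left, right))           # stacking the parts restores the array

print(left.shape, right.shape, top.shape, bottom.shape)
```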

With this, we have covered nearly all the basic aspects of NumPy arrays. This concludes the posts on NumPy.

NumPy - I

Today I received a mail from a friend complaining that there are no new articles. While writing a new post on the blog had been running on my mind for the last few months, for lack of motivation, I was dilly dallying and resorted to the easy way out: procrastinating. Voilà, a shot in the arm and we are back in business

In this post, we will attempt to unveil some important aspects of NumPy. NumPy as we know it today was released as NumPy 1.0 in 2006. At the time of writing this article, the latest version is 1.19.0, and the Python versions supported by this release are 3.6-3.8. NumPy is the fundamental package for scientific computing in Python. At the centre of the NumPy package is the ndarray object, an n-dimensional array of homogeneous elements.

We will be using Jupyter notebook for all the work in this article. The code is in blue font and output is in green font below the code. The version details are given below:

import sys
print("Python version:", sys.version)

import numpy as np
print("NumPy version:", np.__version__)

Python version: 3.8.3 (default, Jul  2 2020, 17:30:36) [MSC v.1916 64 bit (AMD64)]
NumPy version: 1.18.5

Let's start with creating a few NumPy arrays and then calling them immediately:

numpy_array1 = np.array([1, 2, 3, 4, 5])

array([1, 2, 3, 4, 5])

numpy_array2 = np.array([True, False, True, False, True], dtype = bool)

array([ True, False,  True, False,  True])

numpy_array3 = np.array([1.1, 2.2, 3.3, 4.4, 5.5])

array([1.1, 2.2, 3.3, 4.4, 5.5])

numpy_array4 = np.array([1, 2, 3, 4, 5], dtype = np.uint8) ## dtype is Unsigned integer (0 to 255)

array([1, 2, 3, 4, 5], dtype=uint8)

numpy_array5 = np.array(['NumPy',"is","the",'fundamental',"package",'for',"scientific",'computing'])

array(['NumPy', 'is', 'the', 'fundamental', 'package', 'for',
       'scientific', 'computing'], dtype='<U11')

We can also create NumPy arrays from a list:

python_list = [6, 7, 8, 9, 10]
numpy_array6 = np.array(python_list)

array([ 6,  7,  8,  9, 10])

A few more examples of NumPy array creation are shown below:

String = "1.1 2.2 3.3 4.4 5.5"
numpy_array7 = np.fromstring(String, dtype = np.double, sep = " ")

array([1.1, 2.2, 3.3, 4.4, 5.5])

numpy_array8 = np.zeros((5,2), dtype = int) ## initializes with zeros

array([[0, 0],
       [0, 0],
       [0, 0],
       [0, 0],
       [0, 0]])

numpy_array9 = np.eye(3, dtype = float) ## initializes with zeros but with ones along the diagonal

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

numpy_array10 = np.full((5, 4), 10) ## fills an array of the given shape with the given value

array([[10, 10, 10, 10],
       [10, 10, 10, 10],
       [10, 10, 10, 10],
       [10, 10, 10, 10],
       [10, 10, 10, 10]])

To check the properties of arrays, we can use the following commands:

numpy_array11 = np.array([[ 1, 2, 3, 4,  5,  6,  7,  8],
                          [ 6, 7, 8, 9, 10, 11, 12, 13]])

array([[ 1,  2,  3,  4,  5,  6,  7,  8],
       [ 6,  7,  8,  9, 10, 11, 12, 13]])

numpy_array11.shape  ## shape

(2, 8)

numpy_array11.size  ## size (total number of elements)

16

numpy_array11.ndim ## number of dimensions

2

numpy_array11.itemsize ## size of each item in bytes

4

numpy_array11.nbytes ## size in bytes of all elements

64

numpy_array11.dtype ## type of elements

dtype('int32')

len(numpy_array11) ## length of the first axis (number of rows)

2

Operations on NumPy arrays are shown in the next few commands:

numpy_array_sum = numpy_array1 + numpy_array2 # Addition

array([2, 2, 4, 4, 6])

numpy_array_difference = numpy_array1 - numpy_array2 # Subtraction

array([0, 2, 2, 4, 4])

numpy_array_product = numpy_array1 * numpy_array2 # Multiplication

array([1, 0, 3, 0, 5])

numpy_array_quotient = numpy_array1 / numpy_array1 # Division

array([1., 1., 1., 1., 1.])

numpy_array_square_root = numpy_array1 ** 0.5 # square root

array([1.        , 1.41421356, 1.73205081, 2.        , 2.23606798])

numpy_array_raised_power = numpy_array1 ** 3 # Exponentiation

array([  1,   8,  27,  64, 125], dtype=int32)

numpy_array_sin = np.sin(numpy_array1) # array value treated as radians

array([ 0.84147098,  0.90929743,  0.14112001, -0.7568025 , -0.95892427])

numpy_array_log = np.log(numpy_array1) # natural logarithm

array([0.        , 0.69314718, 1.09861229, 1.38629436, 1.60943791])

Some of the constants that we may encounter in our line of work:
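
The list of constants did not survive in this copy of the post, so the sketch below only shows a few that NumPy is known to provide (np.pi, np.e, np.inf and np.nan). Which constants the original post listed is not known; the variable names are our own.

```python
import numpy as np

circle_area = np.pi * 2.0 ** 2   # area of a circle of radius 2, using np.pi
log_of_e = np.log(np.e)          # natural logarithm of Euler's number
bigger = np.inf > 1e308          # np.inf compares greater than any finite float
nan_equals_itself = (np.nan == np.nan)   # NaN never compares equal, even to itself
nan_detected = np.isnan(np.nan)          # use np.isnan to test for NaN instead

print(circle_area, log_of_e, bigger, nan_equals_itself, nan_detected)
```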

To compare two NumPy arrays, we can use the following commands:

np.equal(numpy_array2, numpy_array1)  #Element wise comparison

array([ True, False, False, False, False])

numpy_array1 == numpy_array2  #Element wise comparison

array([ True, False, False, False, False])

np.array_equal(numpy_array1, numpy_array1)  # Array Comparison

True

np.array_equal(numpy_array1, numpy_array2)  # Array Comparison

False

Broadcasting in NumPy: For an operation between two arrays, NumPy first compares their shapes dimension by dimension, starting from the trailing (rightmost) axes. If the shapes are identical, no broadcasting is needed. Otherwise, for the operation to succeed, each pair of trailing dimensions must either be equal or one of them must be 1. If not, a ValueError: operands could not be broadcast together with shapes ... exception is raised, indicating that the arrays have incompatible shapes and broadcasting could not be applied
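
The rule can be checked without doing any arithmetic by asking NumPy for the broadcast shape. The sketch below uses np.broadcast for this; broadcast_shape is a hypothetical helper of our own and not a NumPy function.

```python
import numpy as np

def broadcast_shape(x, y):
    """Hypothetical helper: broadcast shape of x and y, or None if incompatible."""
    try:
        return np.broadcast(x, y).shape
    except ValueError:
        return None

a = np.ones((3, 5))

shape_single = broadcast_shape(a, np.array([1]))               # trailing size 1
shape_row = broadcast_shape(a, np.arange(5))                   # trailing sizes match
shape_column = broadcast_shape(a, np.arange(3).reshape(3, 1))  # (3, 1) stretches to (3, 5)
shape_bad = broadcast_shape(a, np.arange(2))                   # 5 vs 2: incompatible

print(shape_single, shape_row, shape_column, shape_bad)
```

The first three cases give the shape (3, 5), while the last returns None because trailing sizes 5 and 2 are neither equal nor 1.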

numpy_array12 = np.arange(1,16).reshape(3,5)

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15]])

numpy_array12 + numpy_array12 # same shape so, no problem

array([[ 2,  4,  6,  8, 10],
       [12, 14, 16, 18, 20],
       [22, 24, 26, 28, 30]])

numpy_array13 = np.array([1])

array([1])

numpy_array12 + numpy_array13 # broadcasting is valid as size of trailing axis of numpy_array13 is 1

array([[ 2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16]])

numpy_array14 = np.array([1, 2, 3, 4, 5])

array([1, 2, 3, 4, 5])

numpy_array12 + numpy_array14 # broadcasting is valid as size of trailing axes of both arrays is 5 and is same

array([[ 2,  4,  6,  8, 10],
       [ 7,  9, 11, 13, 15],
       [12, 14, 16, 18, 20]])

numpy_array15 = np.array([1, 2])

array([1, 2])

numpy_array12 + numpy_array15  # error will be thrown

ValueError                                Traceback (most recent call last)
<ipython-input-32-ff4233b2f95d> in <module>
----> 1 numpy_array12 + numpy_array15  # error will be thrown

ValueError: operands could not be broadcast together with shapes (3,5) (2,) 

With this concept of Broadcasting in NumPy, we come to the end of the first post on NumPy.