After the first post on NumPy, we explore more aspects of the library. The idea is to cover the most basic of these as far as possible and thus lay the foundation for future work in AI, ML, or Data Science.
As in the earlier post, we use a Jupyter notebook for all the work in this article. The code is in blue font and the output is in green font below the code. The version details are given below:
import sys
print("Python version:", sys.version)
import numpy as np
print("NumPy version:", np.__version__)
NumPy version: 1.18.5
Some of the basic statistical functions are shown below:
numpy_array11 = np.array([[ 1, 2, 3, 4, 5, 6, 7, 8],
[ 6, 7, 8, 9, 10, 11, 12, 13]])
numpy_array11
array([[ 1, 2, 3, 4, 5, 6, 7, 8],
[ 6, 7, 8, 9, 10, 11, 12, 13]])
numpy_array11.sum()
112
numpy_array11.min()
1
numpy_array11.max()
13
numpy_array11.mean()
7.0
np.median(numpy_array11)
7.0
numpy_array11.std()
3.391164991562634
numpy_array11.var()
11.5
numpy_array11.max(axis=0) # maximum of each column
array([ 6, 7, 8, 9, 10, 11, 12, 13])
numpy_array11.max(axis=1) # maximum of each row
array([ 8, 13])
numpy_array11.cumsum(axis=0) # cumulative sum down each column
array([[ 1, 2, 3, 4, 5, 6, 7, 8],
[ 7, 9, 11, 13, 15, 17, 19, 21]], dtype=int32)
numpy_array11.cumsum(axis=1) # cumulative sum along each row
array([[ 1, 3, 6, 10, 15, 21, 28, 36],
[ 6, 13, 21, 30, 40, 51, 63, 76]], dtype=int32)
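To make the axis argument more concrete, here is a minimal sketch that reuses the same 2 x 8 values and checks what each reduction leaves behind. The variable names (arr, col_means, row_means) are illustrative and not part of the original notebook, and the ddof parameter is an addition that the post itself does not cover.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5, 6, 7, 8],
                [6, 7, 8, 9, 10, 11, 12, 13]])

# axis=0 collapses the rows, leaving one value per column
col_means = arr.mean(axis=0)
# axis=1 collapses the columns, leaving one value per row
row_means = arr.mean(axis=1)

print(col_means.shape)  # (8,)
print(row_means)        # [4.5 9.5]

# var() and std() divide by N by default; ddof=1 divides by N - 1 instead
population_var = arr.var()
sample_var = arr.var(ddof=1)
print(population_var)   # 11.5
print(sample_var > population_var)  # True
```

The default (ddof=0) matches the 11.5 shown above; whether ddof=1 is the better choice depends on whether the data is treated as a sample or as the whole population.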
There are two ways to copy an array: a shallow copy and a deep copy. Both are shown below:
numpy_array11 = np.array([[ 1, 2, 3, 4, 5, 6, 7, 8],
[ 6, 7, 8, 9, 10, 11, 12, 13]])
numpy_array11
array([[ 1, 2, 3, 4, 5, 6, 7, 8],
[ 6, 7, 8, 9, 10, 11, 12, 13]])
numpy_array11_view = numpy_array11.view()
numpy_array11_view
array([[ 1, 2, 3, 4, 5, 6, 7, 8],
[ 6, 7, 8, 9, 10, 11, 12, 13]])
numpy_array11_view is a new view of the array that shares the same data. This copying technique is called a shallow copy.
numpy_array11_view is numpy_array11
False
The test above shows that numpy_array11_view is not the same object as numpy_array11.
numpy_array11_view.base is numpy_array11
True
The test above confirms that the data in numpy_array11_view is based on numpy_array11.
id(numpy_array11) # identifier of numpy_array11
2674179372976
id(numpy_array11_view) # identifier of numpy_array11_view
2674179374496
The identifier of numpy_array11_view is different from that of numpy_array11.
numpy_array11_deepcopy = numpy_array11.copy()
numpy_array11_deepcopy
array([[ 1, 2, 3, 4, 5, 6, 7, 8],
[ 6, 7, 8, 9, 10, 11, 12, 13]])
This manner of copying is called a deep copy: a new array object is created with its own copy of the data, which is not shared with the original.
numpy_array11_deepcopy is numpy_array11
False
The test above shows that numpy_array11_deepcopy is not the same object as numpy_array11.
numpy_array11_deepcopy.base is numpy_array11
False
The test above shows that the data in numpy_array11_deepcopy is not based on numpy_array11.
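The practical difference between the two kinds of copy shows up when one of them is modified. The sketch below uses a smaller array with illustrative names (original, shallow, deep) and relies on np.shares_memory, which is not used elsewhere in this post, to confirm which arrays share a data buffer.

```python
import numpy as np

original = np.array([[1, 2, 3], [4, 5, 6]])
shallow = original.view()   # shares the data buffer with original
deep = original.copy()      # owns an independent data buffer

shallow[0, 0] = 100         # write through the view

print(original[0, 0])  # 100, the change is visible in the original
print(deep[0, 0])      # 1, the deep copy is unaffected
print(np.shares_memory(original, shallow))  # True
print(np.shares_memory(original, deep))     # False
```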
Examples of sorting arrays are shown below:
numpy_array16 = np.random.randint(100, size=(5, 5))
numpy_array16
array([[38, 56, 80, 29, 32],
[26, 55, 0, 47, 5],
[29, 27, 50, 49, 35],
[54, 50, 97, 56, 31],
[63, 27, 38, 98, 84]])
np.sort(numpy_array16) # sort along the last axis
array([[29, 32, 38, 56, 80],
[ 0, 5, 26, 47, 55],
[27, 29, 35, 49, 50],
[31, 50, 54, 56, 97],
[27, 38, 63, 84, 98]])
np.sort(numpy_array16, axis=0) # sort along the first axis
array([[26, 27, 0, 29, 5],
[29, 27, 38, 47, 31],
[38, 50, 50, 49, 32],
[54, 55, 80, 56, 35],
[63, 56, 97, 98, 84]])
np.sort(numpy_array16, axis=None) # sort the flattened array
array([ 0, 5, 26, 27, 27, 29, 29, 31, 32, 35, 38, 38, 47, 49, 50, 50, 54,
55, 56, 56, 63, 80, 84, 97, 98])
dtype = [('name', 'S10'), ('salary', float), ('age', int)]
values = [('Nick', 5500, 41), ('Kyle', 6500, 44),
('Ken', 7500, 44)]
structured_array1 = np.array(values, dtype=dtype) # create a structured array
np.sort(structured_array1, order='salary') # sort by salary
array([(b'Nick', 5500., 41), (b'Kyle', 6500., 44), (b'Ken', 7500., 44)],
dtype=[('name', 'S10'), ('salary', '<f8'), ('age', '<i4')])
np.sort(structured_array1, order=['age', 'salary']) # sort by age, salary
array([(b'Nick', 5500., 41), (b'Kyle', 6500., 44), (b'Ken', 7500., 44)],
dtype=[('name', 'S10'), ('salary', '<f8'), ('age', '<i4')])
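In the structured array above, both sort orders happen to produce the same result, so the effect of a multi-field order is not visible. The following sketch uses made-up records (the names and figures are purely illustrative, and 'U10' is used instead of 'S10' so that names print as ordinary strings) in which the two orderings differ.

```python
import numpy as np

dtype = [('name', 'U10'), ('salary', float), ('age', int)]
values = [('Ann', 7000, 30), ('Bob', 5000, 45), ('Cy', 6000, 30)]
staff = np.array(values, dtype=dtype)

# Sort on a single field
by_salary = np.sort(staff, order='salary')
# Sort on age first; salary only breaks ties between equal ages
by_age_salary = np.sort(staff, order=['age', 'salary'])

print(by_salary['name'])      # ['Bob' 'Cy' 'Ann']
print(by_age_salary['name'])  # ['Cy' 'Ann' 'Bob']
```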
Extracting elements from a NumPy array is one of the most important activities any developer will encounter. There are various techniques, such as subsetting, slicing, and indexing. Examples of these techniques are described below:
a) Subsetting: In this technique, a subset of the array is extracted. The subset can be a single element or several elements
numpy_array17 = np.random.randint(100, size=(5, 5))
numpy_array17
array([[ 9, 12, 28, 68, 76],
[74, 58, 37, 39, 46],
[46, 15, 46, 24, 34],
[33, 41, 53, 35, 30],
[49, 78, 86, 57, 38]])
numpy_array17[0] #Extracts first row
array([ 9, 12, 28, 68, 76])
numpy_array17[:,0] #Extracts first column
array([ 9, 74, 46, 33, 49])
numpy_array17[2,2] #Extracts single element
46
b) Slicing: In this technique, a slice consisting of one or more elements is extracted
The syntax for slicing is [lower:upper:step], where the lower bound is included but the upper bound is not. step specifies the stride between elements and defaults to 1 if unspecified. In a one-dimensional array, the first element has index 0 when counting forward, and the last element has index -1 when counting backward. Some examples are shown below:
numpy_array18 = np.array([10,11,12,13,14])
numpy_array18[1:3]
array([11, 12])
numpy_array18[-4:3]
array([11, 12])
numpy_array18[:3] # more like selecting head
array([10, 11, 12])
numpy_array18[-2:] # more like selecting tail
array([13, 14])
numpy_array18[::2]
array([10, 12, 14])
numpy_array18[::-1] # Reversing the array
array([14, 13, 12, 11, 10])
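The same [lower:upper:step] syntax applies to each axis of a multi-dimensional array, separated by commas. The sketch below is an illustration on a made-up 4 x 5 array (grid and block are hypothetical names) and also shows that a basic slice is a view of its parent rather than a copy.

```python
import numpy as np

grid = np.arange(20).reshape(4, 5)

# Rows 1 and 2, every second column
block = grid[1:3, ::2]
print(block)
# [[ 5  7  9]
#  [10 12 14]]

# A basic slice is a view, so writing to it changes the parent array
block[0, 0] = -1
print(grid[1, 0])  # -1
```

If an independent subset is needed, calling .copy() on the slice avoids modifying the parent array.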
c) Indexing using boolean indices: In this technique, a boolean array is used to select elements, as shown below:
numpy_array18 = np.array([10,11,12,13,14])
numpy_array18[numpy_array18 >= 13]
array([13, 14])
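Boolean masks can also be combined and used on the left side of an assignment. This is a small sketch with illustrative names (values, mask); note that conditions are combined with & and |, not with the Python keywords and and or.

```python
import numpy as np

values = np.array([10, 11, 12, 13, 14])

# Each comparison needs its own parentheses because & binds tighter than > or ==
mask = (values > 10) & (values % 2 == 0)
selected = values[mask]
print(selected)  # [12 14]

# A mask on the left-hand side modifies only the matching elements
values[values < 12] = 0
print(values)    # [ 0  0 12 13 14]
```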
d) Fancy indexing: Lastly, we have fancy indexing, which lets us select complex subsets and also modify them using assignment
rand = np.random.RandomState(1)
numpy_array19 = rand.randint(100, size=10)
print(numpy_array19)
[37 12 72 9 75 5 79 64 16 1]
[numpy_array19[3], numpy_array19[7], numpy_array19[2]]
[9, 64, 72]
Alternatively, we can pass a single list or array of indices to obtain the same result:
ind = [3, 7, 2]
numpy_array19[ind]
array([ 9, 64, 72])
When using fancy indexing, the shape of the result reflects the shape of the index arrays rather than the shape of the array being indexed:
indices = np.array([[3, 7],
[4, 5]])
numpy_array19[indices]
array([[ 9, 64],
[75, 5]])
Fancy indexing also works in multiple dimensions. Consider the following array:
numpy_array20 = np.arange(12).reshape((3, 4))
numpy_array20
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
As with standard indexing, the first index refers to the row and the second to the column:
row = np.array([0, 1, 2])
col = np.array([2, 1, 3])
numpy_array20[row, col]
array([ 2, 5, 11])
We can modify the array as shown below:
row = np.array([0, 1, 2])
col = np.array([2, 1, 3])
numpy_array20[row, col] = -1
numpy_array20
array([[ 0, 1, -1, 3],
[ 4, -1, 6, 7],
[ 8, 9, 10, -1]])
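One point worth checking when using fancy indexing is that reading with an index list returns a copy, unlike a basic slice. The sketch below illustrates this on a fresh 3 x 4 array (data and picked are illustrative names) and also combines a slice with an index list.

```python
import numpy as np

data = np.arange(12).reshape((3, 4))

# Fancy indexing with a list of row numbers returns a copy
picked = data[[0, 2]]
picked[0, 0] = 99
print(data[0, 0])  # 0, the original is unchanged

# A slice combined with an index list selects columns 1 and 3 of every row
columns = data[:, [1, 3]]
print(columns)
# [[ 1  3]
#  [ 5  7]
#  [ 9 11]]
```

Assignment through a fancy index, as in the example above, still writes to the original array; it is only the result of reading that is a copy.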
A few other operations are described below:
a) Changing array shape:
numpy_array21 = np.arange(24).reshape((2,2,2,3))
numpy_array21
array([[[[ 0, 1, 2],
[ 3, 4, 5]],
[[ 6, 7, 8],
[ 9, 10, 11]]],
[[[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23]]]])
numpy_array21 = numpy_array21.ravel() # Flatten the array
numpy_array21
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23])
numpy_array21.reshape((2,2,2,3)) # Reshape the array
array([[[[ 0, 1, 2],
[ 3, 4, 5]],
[[ 6, 7, 8],
[ 9, 10, 11]]],
[[[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23]]]])
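Two details of reshaping are easy to miss: a dimension can be given as -1 so that NumPy infers it, and ravel differs from flatten in whether the result can share memory with the original. This is a minimal sketch with illustrative names; flatten and np.shares_memory are not used elsewhere in this post.

```python
import numpy as np

cube = np.arange(24).reshape((2, 3, 4))

# -1 asks NumPy to infer that dimension from the total number of elements
table = cube.reshape((6, -1))
print(table.shape)  # (6, 4)

flat_view = cube.ravel()    # returns a view when the memory layout allows it
flat_copy = cube.flatten()  # always returns a copy
print(np.shares_memory(cube, flat_view))  # True
print(np.shares_memory(cube, flat_copy))  # False
```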
b) Add and remove members:
numpy_array22=np.array([[0,1],[2,3]])
numpy_array22
array([[0, 1],
[2, 3]])
np.resize(numpy_array22,(2,3))
array([[0, 1, 2],
[3, 0, 1]])
The step above returns a new array with the specified shape. If the new array is larger than the original, it is filled with repeated copies of the original array.
np.append(numpy_array22,numpy_array22) # Append items to an array
array([0, 1, 2, 3, 0, 1, 2, 3])
np.insert(numpy_array22, 1, 5) # Insert a value before the given index; without axis, the array is flattened first
array([0, 5, 1, 2, 3])
np.insert(numpy_array22, 1, 5, axis=1)
array([[0, 5, 1],
[2, 5, 3]])
np.delete(numpy_array22, 1, 0) # Return a new array with row 1 deleted (axis=0)
array([[0, 1]])
np.delete(numpy_array22, 1, 1) # Return a new array with column 1 deleted (axis=1)
array([[0],
[2]])
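np.append flattens its inputs unless an axis is given, and none of these functions modify the original array. The sketch below, using illustrative names (base, with_row, with_col), shows append with an explicit axis; the appended values must then match the array in every other dimension.

```python
import numpy as np

base = np.array([[0, 1], [2, 3]])

# axis=0 adds a row; the new row needs 2 columns
with_row = np.append(base, [[4, 5]], axis=0)
# axis=1 adds a column; the new column needs 2 rows
with_col = np.append(base, [[9], [9]], axis=1)

print(with_row.shape)  # (3, 2)
print(with_col)
# [[0 1 9]
#  [2 3 9]]
print(base.shape)      # (2, 2), the original array is not modified
```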
c) Combining arrays:
numpy_array23 = np.array([[1, 1], [2, 2], [3, 3]])
numpy_array23
array([[1, 1],
[2, 2],
[3, 3]])
np.concatenate((numpy_array23,numpy_array23),axis=0)
array([[1, 1],
[2, 2],
[3, 3],
[1, 1],
[2, 2],
[3, 3]])
The step above joins a sequence of arrays along an existing axis.
np.concatenate((numpy_array23,numpy_array23),axis=1)
array([[1, 1, 1, 1],
[2, 2, 2, 2],
[3, 3, 3, 3]])
np.concatenate((numpy_array23, numpy_array23), axis=None)
array([1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3])
numpy_array24 = np.array([1, 2, 3])
numpy_array25 = np.array([2, 3, 4])
np.vstack((numpy_array24, numpy_array25)) # Stack arrays vertically (row-wise)
array([[1, 2, 3],
[2, 3, 4]])
np.hstack((numpy_array24, numpy_array25)) # stack arrays in sequence horizontally (column wise)
array([1, 2, 3, 2, 3, 4])
np.column_stack((numpy_array24, numpy_array25)) # Stack 1-D arrays as columns into a 2-D array
array([[1, 2],
[2, 3],
[3, 4]])
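For 2-D inputs, vstack and hstack behave like concatenate along axis 0 and axis 1 respectively. The following sketch checks the resulting shapes on an illustrative 3 x 2 array; np.stack is included only for contrast and is not covered in this post.

```python
import numpy as np

a = np.array([[1, 1], [2, 2], [3, 3]])  # shape (3, 2)

v = np.vstack((a, a))
h = np.hstack((a, a))
print(v.shape)  # (6, 2), same as concatenate with axis=0
print(h.shape)  # (3, 4), same as concatenate with axis=1

# np.stack joins along a new axis instead of an existing one
s = np.stack((a, a))
print(s.shape)  # (2, 3, 2)
```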
d) Splitting of arrays: Arrays can be split using hsplit (horizontal split) and vsplit (vertical split) as shown below:
numpy_array26 = np.arange(16.0).reshape(4, 4)
numpy_array26
array([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 11.],
[12., 13., 14., 15.]])
np.hsplit(numpy_array26, 2) # Split an array into multiple sub-arrays horizontally (column-wise)
[array([[ 0., 1.],
[ 4., 5.],
[ 8., 9.],
[12., 13.]]),
array([[ 2., 3.],
[ 6., 7.],
[10., 11.],
[14., 15.]])]
np.hsplit(numpy_array26, 4)
[array([[ 0.],
[ 4.],
[ 8.],
[12.]]),
array([[ 1.],
[ 5.],
[ 9.],
[13.]]),
array([[ 2.],
[ 6.],
[10.],
[14.]]),
array([[ 3.],
[ 7.],
[11.],
[15.]])]
np.vsplit(numpy_array26, 2) # Split an array into multiple sub-arrays vertically (row-wise)
[array([[0., 1., 2., 3.],
[4., 5., 6., 7.]]),
array([[ 8., 9., 10., 11.],
[12., 13., 14., 15.]])]
np.vsplit(numpy_array26, 4)
[array([[0., 1., 2., 3.]]),
array([[4., 5., 6., 7.]]),
array([[ 8., 9., 10., 11.]]),
array([[12., 13., 14., 15.]])]
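The examples above split into equal parts by passing a count. As a hedged sketch with illustrative names, the split functions also accept a list of indices, and np.array_split (not used elsewhere in this post) handles a count that does not divide the axis evenly.

```python
import numpy as np

grid = np.arange(16.0).reshape(4, 4)

# A list of indices splits before columns 1 and 3
left, middle, right = np.hsplit(grid, [1, 3])
print(left.shape, middle.shape, right.shape)  # (4, 1) (4, 2) (4, 1)

# array_split tolerates a count that does not divide the axis evenly
parts = np.array_split(grid, 3, axis=0)
part_shapes = [p.shape for p in parts]
print(part_shapes)  # [(2, 4), (1, 4), (1, 4)]
```

By contrast, np.hsplit(grid, 3) would raise an error here, because 4 columns cannot be divided into 3 equal parts.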
With this, we have covered nearly all the basic aspects of NumPy arrays. This concludes the posts on NumPy.