A Beginner's Guide to Numpy Arrays

Why are you still using Python lists when NumPy arrays can get the job done faster? :o

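In this guide, we will cover the difference between Python lists and NumPy arrays, how to create arrays, NumPy data types, typecasting, and the various ways to create arrays in NumPy.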

Difference Between Python Lists and Numpy Arrays

An array, unlike a Python list, is a data structure that stores a collection of elements that all share the same datatype. This makes arrays the natural choice for vectorized computations, where one operation is applied to every element at once.

Everything in Python is stored as an object. As a result, even an ordinary int carries extra machinery beyond its value, such as type information and a reference count. NumPy, on the other hand, stores data as C primitive numeric types in a contiguous block of memory, which makes it fast for numeric computation and efficient with memory.
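
To make the difference concrete, here is a minimal sketch that compares the storage cost of one Python int with one array element, then squares 1000 numbers both ways. The variable names are illustrative, and the exact size reported by sys.getsizeof is specific to your Python build, so treat the printed numbers as indicative only.

```python
import sys
import numpy as np

# A Python list stores references to full int objects,
# each carrying its own type information and reference count.
py_int_size = sys.getsizeof(12345)   # bytes for one int object (build-specific)

# A NumPy array stores raw fixed-size values in one buffer.
arr = np.arange(1000, dtype=np.int64)
bytes_per_element = arr.itemsize     # 8 bytes for int64

print(py_int_size, bytes_per_element)

# Same computation, written as a Python loop and as a vectorized expression
squares_list = [v * v for v in range(1000)]
squares_arr = arr * arr
```

Both versions produce the same values; the vectorized one runs the loop inside NumPy's compiled code instead of the Python interpreter.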

Creating Arrays

The main data structure in NumPy is the ndarray (N-dimensional array). Data stored in an ndarray is simply referred to as an array.

To use NumPy, you first need to install the library. Run one of the following commands:

  • In an IPython input cell: !pip install numpy
  • In a terminal on Windows: py -m pip install numpy
  • In a terminal on Mac/Linux: python3 -m pip install numpy

If you are having issues with setting up your ipython environment, click here.

A simple way to create an array is to use the numpy array() function.

code-block-0

# Creating an array with numpy

# import the numpy library
import numpy as np

# create an array of integers
a = np.array([1, 2, 3])

The NumPy array is homogeneous (all elements in an array have the same datatype). The ndarray class, NumPy's core data structure for representing arrays, carries important metadata about the array: its shape, size, datatype, number of dimensions, and so on. To see the full list of attributes in the ndarray docstring, call help(np.ndarray) in your interpreter.

  • shape returns a tuple with the length of the array along each dimension (for a 2D array, the number of rows and the number of columns).
  • size returns the total number of elements in the array.
  • ndim returns the number of dimensions of the array.
  • nbytes returns the number of bytes used to store the array's elements.
  • dtype returns the datatype of the elements in the array.

Here is an example of how these attributes are accessed and used.

code-block-1

import numpy as np
x = np.array([
    [1, 2, 3],
    [4, 5, 6]])

print(type(x))  # <class 'numpy.ndarray'>

print(x.shape)  # (2, 3) : 2 rows and 3 columns

print(x.ndim)   # 2 : 2D array

print(x.dtype)  # int32 or int64

print(x.size)   # 6  : array contains 6 elements

print(x.nbytes) # 24 with int32 (6 x 4 bytes), 48 with int64 (6 x 8 bytes)
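
The same attributes work for any number of dimensions. The sketch below uses a 3D array with an explicit dtype so the byte counts do not depend on your platform. It also uses itemsize, an ndarray attribute not listed above, which gives the number of bytes per element.

```python
import numpy as np

# A 3D array: 2 blocks, each with 3 rows and 4 columns, stored as 16-bit integers
y = np.zeros((2, 3, 4), dtype=np.int16)

print(y.shape)     # (2, 3, 4)
print(y.ndim)      # 3
print(y.size)      # 24 = 2 * 3 * 4
print(y.itemsize)  # 2 bytes per int16 element
print(y.nbytes)    # 48 = size * itemsize
```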

Numpy Data Types

The dtype attribute was already described in the previous section. An array's elements all share the same data type (homogeneity). The basic numeric data types supported by numpy are listed below.

  • int (Integers) : int8, int16, int32, int64
  • uint (Unsigned, nonnegative Integers) : uint8, uint16, uint32, uint64
  • bool (Boolean) : bool_
  • float (Floating-point numbers) : float16, float32, float64, float128 (platform-dependent)
  • complex (Complex floating-point number) : complex64, complex128, complex256 (platform-dependent)

Let's look at a few examples of how to create integer, float, and complex-valued arrays.

code-block-2

import numpy as np

# create an integer-type array
a = np.array([10, 11, 12], dtype = np.int8)
print(a)
# Output: [10 11 12]

# create a float-type array
b = np.array([10, 11, 12], dtype = np.float16)
print(b)
# Output: [10. 11. 12.]

# create a complex-type array
c = np.array([10, 11, 12], dtype = np.complex64)
print(c)
# Output: [10.+0.j 11.+0.j 12.+0.j]
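
The number in each type name is its width in bits, which determines both the memory used per element and the range of values it can hold. As a rough illustration, the sketch below prints the element size of a few dtypes and uses np.iinfo (a NumPy helper not covered above) to show the limits of two 8-bit integer types.

```python
import numpy as np

# Bytes per element for a few dtypes
for dt in (np.int8, np.int64, np.float16, np.float64, np.complex64):
    print(np.dtype(dt).name, np.dtype(dt).itemsize)

# Representable range of two 8-bit integer types
int8_info = np.iinfo(np.int8)
uint8_info = np.iinfo(np.uint8)
print(int8_info.min, int8_info.max)     # -128 127
print(uint8_info.min, uint8_info.max)   # 0 255
```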

Now that we've learned how to create separate array types, let's talk about typecasting.

Typecasting

Converting one data type into another is known as typecasting. In the C programming language, this is also known as data conversion or type conversion.

To convert an existing NumPy array to a different dtype, you make a new copy with type-casted contents. You can either pass the array to the np.array function with a new dtype or call the array's astype() method, both of which are simple to do:

code-block-3

In [1]: import numpy as np
In [2]: arr = np.array([2, 4, 6, 8], dtype= np.float16)

In [3]: arr
Out[3]: array([2., 4., 6., 8.], dtype=float16)

In [4]: arr.dtype
Out[4]: dtype('float16')

In [5]: # convert arr to integer-type
In [6]: arr = np.array(arr, dtype= np.int32)

In [7]: arr
Out[7]: array([2, 4, 6, 8])

In [8]: arr.dtype
Out[8]: dtype('int32')

In [9]: arr.astype(np.complex64)
Out[9]: array([2.+0.j, 4.+0.j, 6.+0.j, 8.+0.j], dtype=complex64)

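In code-block-3, the data type of arr was changed from float16 to int32 by rebinding arr to a new array. The astype() call in In [9] returned a complex64 copy and left arr itself unchanged.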
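
One thing to watch for when casting floats to integers is that the fractional part is dropped rather than rounded. A small sketch, using made-up values:

```python
import numpy as np

arr = np.array([1.7, -1.7, 2.5], dtype=np.float64)

# astype returns a new array; the original is not modified
as_int = arr.astype(np.int32)

print(as_int)        # [ 1 -1  2] : fractional parts are dropped, not rounded
print(arr.dtype)     # float64 : arr itself is unchanged
print(as_int.dtype)  # int32
```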

Real and Imaginary Parts

The attributes real and imag in NumPy array instances are used to extract the array's real and imaginary components, respectively:

code-block-4

In [1]: import numpy as np

In [2]: arr = np.array([1, 2, 3], dtype= np.complex64)

In [3]: arr
Out[3]: array([1.+0.j, 2.+0.j, 3.+0.j], dtype=complex64)

In [4]: arr.real
Out[4]: array([1., 2., 3.], dtype=float32)

In [5]: arr.imag
Out[5]: array([0., 0., 0.], dtype=float32)
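
The example above has zero imaginary parts because the array was built from real numbers. For comparison, here is a short sketch with values that are genuinely complex:

```python
import numpy as np

z = np.array([1 + 2j, 3 - 4j], dtype=np.complex128)

print(z.real)  # [1. 3.]
print(z.imag)  # [ 2. -4.]
```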

Various ways to create arrays in Numpy

We looked at NumPy's basic data structure for representing arrays, the ndarray class, and the basic properties of this class in the previous section. The functions from the NumPy library that can be used to construct ndarray instances are covered in this section.

Arrays can be created in a variety of ways, depending on the application or use case. One of the drawbacks of using the numpy.array() function is that every element has to be listed explicitly, which is only practical for small arrays. In many cases, we need to construct arrays whose elements follow a rule, such as constant values, increasing integers, regularly spaced numbers, random numbers, and so on. In other situations, we may need to generate arrays from data contained in a file. The needs are numerous and diverse, and the NumPy library offers a comprehensive collection of functions for producing arrays of various sorts.
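
Random arrays are not covered in the sections that follow, so here is one possible sketch using NumPy's random Generator interface (np.random.default_rng). The seed value is arbitrary and is only there to make the run repeatable.

```python
import numpy as np

# Seeded generator so the example is reproducible; the seed value is arbitrary
rng = np.random.default_rng(42)

# 2 x 3 array of floats drawn uniformly from [0, 1)
u = rng.random((2, 3))

# 5 integers drawn from 0 to 9 (the upper bound 10 is excluded)
n = rng.integers(0, 10, size=5)

print(u.shape, n.shape)   # (2, 3) (5,)
```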

Create Arrays from Python Lists and Iterable objects

The numpy.array() function can be used to generate an array by passing a list, a nested list, or another sequence-like object as its argument.

code-block-5

In [1]: import numpy as np

In [2]: # Create a 1D array
In [3]: arr1 = [1, 2, 3, 4]
In [4]: arr1 = np.array(arr1)
In [5]: arr1.ndim  # returns dimension of arr1
Out[5]: 1
In [6]: arr1.shape
Out[6]: (4,)

In [7]: # Create a 2D array
In [8]: arr2 = [[1, 2, 3],[4, 5, 6]]
In [9]: arr2 = np.array(arr2)
In [10]: arr2.ndim
Out[10]: 2
In [11]: arr2.shape
Out[11]: (2, 3)
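
Lists are not the only accepted input. The sketch below passes a tuple and a range object to np.array, and uses np.fromiter for a generator, since a generator has no length and np.array would not expand it into elements.

```python
import numpy as np

# Tuples and range objects are sequences, so np.array accepts them directly
from_tuple = np.array((1, 2, 3))
from_range = np.array(range(5))

# A generator is consumed with np.fromiter, which needs the dtype up front
from_gen = np.fromiter((n * n for n in range(5)), dtype=np.int64)

print(from_tuple)  # [1 2 3]
print(from_range)  # [0 1 2 3 4]
print(from_gen)    # [ 0  1  4  9 16]
```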

Automatically create and fill arrays with constant values

The np.zeros and np.ones functions construct and return arrays of zeros and ones, respectively. They take an integer or a tuple as the first argument that describes the number of entries in each dimension of the array. For example, we may use them to make a 4 x 3 array filled with zeros and a length-5 array filled with ones.

code-block-6

In [1]: import numpy as np

In [2]: # Create a 4 x 3 array filled with zeros
In [3]: np.zeros((4, 3))
Out[3]: 
array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [4]: np.ones(5)  # one-dimensional array filled with ones
Out[4]: array([1., 1., 1., 1., 1.])

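The ndarray.fill() method and the numpy.full() function are used to fill an array with a user-defined value. fill() overwrites an existing array in place (here one allocated with np.empty), while np.full() creates and fills a new array in one step.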

code-block-7

In [5]: x1 = np.empty(5)
In [6]: x1.fill(3.0)
In [7]: x1
Out[7]: array([ 3., 3., 3., 3., 3.])

In [8]: x2 = np.full(5, 3.0)
In [9]: x2
Out[9]: array([ 3., 3., 3., 3., 3.])
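
These functions return float64 arrays by default, as the trailing dots in the outputs show. They also accept an optional dtype argument and multi-dimensional shapes. A brief sketch:

```python
import numpy as np

# Integer zeros instead of the default float64
zi = np.zeros((2, 3), dtype=np.int32)

# 2 x 2 array where every element is 7; dtype is inferred from the fill value
sevens = np.full((2, 2), 7)

print(zi.dtype)   # int32
print(sevens)
# [[7 7]
#  [7 7]]
```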

Filling Arrays with Incremental Sequence

Arrays having regularly spaced values between a starting value and an ending value are frequently required in numerical calculation. np.arange and np.linspace are two NumPy routines that can be used to make such arrays. The start and end values are the first two arguments in both functions. The increment is the third argument in np.arange, whereas the total number of points in the array is the third argument in np.linspace.

Note that np.arange does not include the end value by default, but np.linspace does (though this can be altered using the optional endpoint keyword parameter). It is primarily a question of personal preference whether to use np.arange or np.linspace, although whenever the increment is a non-integer, it is typically suggested to use np.linspace.

Here's an example:

code-block-8

In [1]: np.arange(0.0, 10, 1)
Out[1]: array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])

In [2]: np.linspace(0, 10, 11)
Out[2]: array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
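
With a fractional step, the number of elements np.arange returns can be hard to predict because of floating-point rounding, which is why np.linspace is usually suggested there. The sketch below shows two np.linspace options: retstep=True to get the spacing back, and endpoint=False to exclude the end value the way np.arange does.

```python
import numpy as np

# 11 points from 0 to 1 inclusive; retstep=True also returns the spacing
points, step = np.linspace(0, 1, 11, retstep=True)
print(len(points), step)   # 11 0.1
print(points[-1])          # 1.0 : the end value is included

# endpoint=False drops the end value, giving a half-open interval
half_open = np.linspace(0, 1, 5, endpoint=False)
print(half_open)           # [0.  0.2 0.4 0.6 0.8]
```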

Filling Arrays with Logarithmic Sequences

The function np.logspace is similar to np.linspace, except that the array's elements are logarithmically spaced, and the start and end arguments are exponents applied to the optional base keyword parameter (which defaults to 10).

Here's how to make an array of logarithmically spaced numbers between 1 and 100:

code-block-9

In [1]: # 5 data points between 10**0=1 to 10**2=100
In [2]: np.logspace(0, 2, 5) 
Out[2]: array([ 1. , 3.16227766, 10. , 31.6227766 , 100.])
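
One way to read np.logspace is as the base raised to the values of np.linspace. The sketch below checks that relationship for the default base and then tries base=2:

```python
import numpy as np

# np.logspace(start, stop, num) matches 10 ** np.linspace(start, stop, num)
a = np.logspace(0, 2, 5)
b = 10 ** np.linspace(0, 2, 5)
print(np.allclose(a, b))   # True

# With base=2 the endpoints are 2**0 = 1 and 2**3 = 8
powers_of_two = np.logspace(0, 3, 4, base=2)
print(powers_of_two)       # [1. 2. 4. 8.]
```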

Matrix Arrays

Matrices, or two-dimensional arrays, are a common case in numerical computation. NumPy comes with several functions for creating common matrices. The function np.identity, in particular, generates a square matrix with ones on the diagonal and zeros everywhere else.

code-block-10

In [1]: np.identity(4)
Out[1]: array([[ 1., 0., 0., 0.],
               [ 0., 1., 0., 0.],
               [ 0., 0., 1., 0.],
               [ 0., 0., 0., 1.]])

np.eye is a similar function that builds matrices with ones on a diagonal, optionally offset from the main diagonal with the k argument. This is seen in the example below, which generates matrices with ones on the diagonals just above and just below the main diagonal:

code-block-11

In [1]: np.eye(3, k=1)
Out[1]: array([[ 0., 1., 0.],
               [ 0., 0., 1.],
               [ 0., 0., 0.]])

In [2]: np.eye(3, k=-1)
Out[2]: array([[ 0., 0., 0.],
               [ 1., 0., 0.],
               [ 0., 1., 0.]])
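
A few more np.eye options are sketched below: with no offset it matches np.identity, it can take a separate column count for rectangular results, and it accepts an optional dtype.

```python
import numpy as np

# For a square matrix with no offset, np.eye and np.identity agree
print(np.array_equal(np.eye(4), np.identity(4)))  # True

# np.eye also accepts a column count, giving a rectangular result
rect = np.eye(2, 3)
print(rect)
# [[1. 0. 0.]
#  [0. 1. 0.]]

# dtype is optional; here the entries are integers
int_eye = np.eye(3, k=1, dtype=int)
```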

You can check out more NumPy functions in the official NumPy documentation.

Now that we understand a few things about NumPy arrays, watch out for my next post on Indexing and Slicing of Arrays.

Happy Numpying!!! ;)