`NumPy` Arrays

Python programming

numpy

Arrays

In this lesson, you will learn how to work with NumPy arrays as the core data structure for numerical computing. You’ll create 1‑dimensional and multi‑dimensional arrays, reshape them and initialize arrays with special patterns such as zeros, ones and identity matrices. You’ll practice indexing, slicing and filtering to efficiently select and manipulate subsets of data. Finally, you’ll use common NumPy functions to summarize arrays, compute basic statistics and generate random data for simulations and testing.

Authors

Noor Sohail

Will Gammerdinger

Published

March 15, 2026

Keywords

Matrix, Math, Indexing

Approximate time: 30 minutes

Learning objectives

In this lesson, we will:

- Wrangle NumPy arrays with indexing and slicing
- Create arrays/matrices of varying dimensions and shapes
- Use NumPy functions to perform common operations

Overview of lesson

NumPy is a powerful library for numerical computing in Python. It is widely used to work with arrays and matrices. The implementation of these data structures and their associated methods has been optimized for performance, making NumPy a popular choice for scientific computing and data analysis. In this lesson, we will cover the basics of working with NumPy arrays, including how to create and wrangle them, as well as some common functions that are available for performing operations on arrays.

`NumPy` library

You can find the official documentation for NumPy here (https://numpy.org/doc/stable/). Consult it whenever you need to run a specific task repeatedly, to see whether a function already exists for that task. You can also use the help() function in Python to get more information about a specific NumPy function or data structure.
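As a small illustration of looking up documentation from within Python, the sketch below reads the docstring that help() would display. The choice of np.mean is only an example; any NumPy function could be used in its place.

```python
import numpy as np

# help(np.mean) would print the full documentation page for np.mean.
# The same text is stored on the function as its docstring:
doc = np.mean.__doc__

# Show only the beginning of the docstring
print(doc[:300])
```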

1-dimensional arrays

The most basic data structure in NumPy is the 1-dimensional array, which is similar to a list. You can create a 1-dimensional array using the numpy.array() function:

# Import numpy using the alias, np
import numpy as np

# Create a 1-dimensional numpy array called arr
arr = np.array([1, 2, 3, 4, 5])

# Print the numpy array 
print(arr)

[1 2 3 4 5]

One of the key advantages of NumPy arrays over Python lists is their greater efficiency for numerical operations. For example, you can easily perform element-wise operations on NumPy arrays. So if we were to add two arrays together, it would add each element of the first array to the corresponding element of the second array:

# Create another 1-dimensional numpy array called arr2
arr2 = np.array([10, 20, 30, 40, 50])

# Perform element-wise addition on arr and arr2 and assign the output to result
result = arr + arr2

# Print result array
print(result)

[11 22 33 44 55]

If we were to try to do this with Python lists, we would get a different result:

# Create two Python lists
list1 = [1, 2, 3, 4, 5]
list2 = [10, 20, 30, 40, 50]

# Attempt to perform element-wise addition (this will not work as expected)
result_list = list1 + list2

# Print the resulting list
print(result_list)

[1, 2, 3, 4, 5, 10, 20, 30, 40, 50]

Instead of adding two lists together element-wise, Python concatenates them (in order to concatenate arrays in NumPy, you would use the np.concatenate() function).
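To make the contrast concrete, here is a minimal sketch that concatenates two arrays with np.concatenate() and shows one way to get element-wise behavior from plain lists. The variable names joined, summed_list, doubled and repeated_list are introduced here for illustration only.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
arr2 = np.array([10, 20, 30, 40, 50])

# Join the two arrays end to end (np.concatenate takes a sequence of arrays)
joined = np.concatenate([arr, arr2])
print(joined)  # [ 1  2  3  4  5 10 20 30 40 50]

# With plain lists, element-wise addition needs an explicit loop
list1 = [1, 2, 3, 4, 5]
list2 = [10, 20, 30, 40, 50]
summed_list = [a + b for a, b in zip(list1, list2)]
print(summed_list)  # [11, 22, 33, 44, 55]

# The same difference shows up with multiplication by a number
doubled = arr * 2          # multiplies every element
repeated_list = list1 * 2  # repeats the list
print(doubled)        # [ 2  4  6  8 10]
print(repeated_list)  # [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
```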

Attributes can be thought of as the characteristics or properties of a Python object. One attribute of interest is the data type of a NumPy array. If we investigate the dtype attribute of the NumPy array, we find that it is a specific data type (e.g., int64 or float64). So, unlike lists, which can contain different data types, the elements in a NumPy array all share the same data type.

# Check the data type attribute of the NumPy array
print(arr.dtype)

int64
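Because every element shares one data type, NumPy picks a common type when the inputs are mixed. The sketch below illustrates this, along with the dtype argument and the astype() method for choosing a type explicitly. The variable names are illustrative only.

```python
import numpy as np

# Mixing integers and a float: NumPy stores everything as float64
mixed = np.array([1, 2.5, 3])
print(mixed)        # [1.  2.5 3. ]
print(mixed.dtype)  # float64

# Request a data type explicitly when creating the array
as_float = np.array([1, 2, 3], dtype=np.float64)
print(as_float.dtype)  # float64

# Convert an existing array to another data type (returns a new array)
converted = as_float.astype(np.int32)
print(converted.dtype)  # int32
```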

N-dimensional arrays

Arrays are not just limited to 1 dimension; NumPy can also create and handle multi-dimensional arrays. For example, you can create a 2-dimensional array (also known as a matrix) while using the same numpy.array() function:

# Create a 2-dimensional array (matrix)
matrix = np.array([[1, 2, 3], [4, 5, 6]])

# Print out the matrix
print(matrix)

[[1 2 3]
 [4 5 6]]

This may again seem similar to a list of lists in Python, but it is important to note that this is a NumPy array, which has different properties and capabilities than a list of lists. For example, you can perform matrix operations on this 2-dimensional array, such as matrix multiplication (dot product).

# Create another 2-dimensional array (matrix)
matrix2 = np.array([[10, 20], [30, 40], [50, 60]])

# Perform matrix multiplication
result_matrix = np.dot(matrix, matrix2)

# Print out the result matrix
print(result_matrix)

[[220 280]
 [490 640]]
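Matrix multiplication requires the number of columns in the first matrix to match the number of rows in the second. The following sketch checks the shapes and shows the @ operator, which for 2-dimensional arrays gives the same result as np.dot(). It also contrasts this with element-wise multiplication using *. The variable names are illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])            # shape (2, 3)
matrix2 = np.array([[10, 20], [30, 40], [50, 60]])   # shape (3, 2)

# (2, 3) times (3, 2) gives a (2, 2) result
product = matrix @ matrix2
print(product.shape)  # (2, 2)
print(np.array_equal(product, np.dot(matrix, matrix2)))  # True

# The * operator multiplies element by element, so shapes must match.
# Transposing matrix2 gives it the same (2, 3) shape as matrix.
elementwise = matrix * matrix2.T
print(elementwise)
# [[ 10  60 150]
#  [ 80 200 360]]
```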

Special matrices

We can initialize a matrix of zeros or ones by using the numpy.zeros() and numpy.ones() functions, in as many dimensions as we want.

Rows then columns

When accessing elements in a 2-dimensional array, the first index refers to the row and the second index refers to the column. So matrix[0, 1] would access the element in the first row and second column of the matrix.

Similarly, when initializing a matrix, the first number refers to the number of rows and the second number refers to the number of columns.

# Initialize a 3x3 matrix of zeros
zeros_matrix = np.zeros((3, 3))

# Print the zeros_matrix
print(zeros_matrix)

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]

Similarly, we can initialize a matrix of ones:

# Initialize a 2x4 matrix of ones
ones_matrix = np.ones((2, 4))

# Print the ones_matrix
print(ones_matrix)

[[1. 1. 1. 1.]
 [1. 1. 1. 1.]]
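The examples above are 2-dimensional and hold floating-point values by default. As a brief sketch, the same functions accept a longer shape tuple for more dimensions and a dtype argument for a different data type. The names cube and int_ones are illustrative.

```python
import numpy as np

# A 3-dimensional array of zeros: 2 blocks, each with 3 rows and 4 columns
cube = np.zeros((2, 3, 4))
print(cube.shape)  # (2, 3, 4)

# Ones stored as integers instead of the default floats
int_ones = np.ones((2, 2), dtype=int)
print(int_ones)
# [[1 1]
#  [1 1]]
```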

We can even create an identity matrix with numpy.identity(). An identity matrix is a square matrix (same number of rows and columns) with ones on the main diagonal and zeros elsewhere.

# Initialize a 3x3 identity matrix
identity_matrix = np.identity(3)

# Print the identity matrix
print(identity_matrix)

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

Identity matrix math

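The identity matrix is often denoted as I and has the property that when it is matrix-multiplied by any matrix of compatible dimensions, it returns the original matrix (i.e., IA = A). In NumPy, this product is computed with np.dot(I, A); the * operator would instead multiply element by element.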

# Example of identity matrix multiplication
# Create an identity matrix
I = np.identity(3)

# Create a populated matrix
A = np.array([[1, 2, 3], 
              [4, 5, 6], 
              [7, 8, 9]])

# Perform matrix multiplication
result = np.dot(I, A)

# Print the matrix
print(result)

[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]
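The distinction between matrix multiplication and element-wise multiplication matters here. The sketch below reuses the same I and A and shows that only the matrix product returns A unchanged, while the element-wise product keeps just the diagonal.

```python
import numpy as np

I = np.identity(3)
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Matrix product: returns the values of A
matrix_product = I @ A
print(np.array_equal(matrix_product, A))  # True

# Element-wise product: off-diagonal entries are multiplied by zero
elementwise_product = I * A
print(elementwise_product)
# [[1. 0. 0.]
#  [0. 5. 0.]
#  [0. 0. 9.]]
```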

Reshaping arrays

The numpy.reshape() function is a flexible way to create arrays of varying shapes and sizes. We can use this in conjunction with numpy.arange() to create an array of a specific range of numbers and then reshape it into the desired dimensions.

# Create an array of integers from 0 to 9
matrix_arange = np.arange(10)

# Print the matrix
print(matrix_arange)

[0 1 2 3 4 5 6 7 8 9]

And now we can reshape the values to be the size we want to work with:

# Reshape the array into a 2x5 matrix
reshaped_matrix = matrix_arange.reshape(2, 5)

# Print the reshaped matrix
print(reshaped_matrix)

[[0 1 2 3 4]
 [5 6 7 8 9]]
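Reshaping only works when the new shape holds exactly the same number of elements. The following sketch shows the function form np.reshape(), the -1 placeholder that lets NumPy work out one dimension, and the error raised for an incompatible shape. The variable names are illustrative.

```python
import numpy as np

values = np.arange(10)

# The function form and the method form give the same result
as_function = np.reshape(values, (2, 5))
as_method = values.reshape(2, 5)
print(np.array_equal(as_function, as_method))  # True

# Use -1 to let NumPy infer one dimension: 10 values in 5 rows means 2 columns
five_rows = values.reshape(5, -1)
print(five_rows.shape)  # (5, 2)

# 10 values cannot fill a 3x4 matrix, so NumPy raises a ValueError
try:
    values.reshape(3, 4)
except ValueError as err:
    print(f"Cannot reshape: {err}")
```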

Indexing and slicing arrays

Arrays can be indexed and sliced in a similar way to lists, with indices (again beginning with 0) and slicing syntax. So if we wanted to access the first element of a 1-dimensional array, we would use arr[0]:

# Retrieve the first element of arr
first_element = arr[0]

# Print the retrieved element
print(first_element)

In a 2-dimensional array, the first index refers to the row and the second index refers to the column. So if we wanted to access the first row of a 2-dimensional array, we would use matrix[0]:

# Access the first row of a 2-dimensional array
first_row = matrix[0]

# Print the retrieved row
print(first_row)

[1 2 3]

If we wanted to access the element in the second row and third column of the matrix, we would use matrix[1, 2]:

# Retrieve the element in the second row and third column
element = matrix[1, 2]

# Print the retrieved element
print(element)
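The examples above pick out single elements or a whole row. As a short sketch of the slicing syntax, the start:stop form selects a range of positions (the stop position is excluded), and a bare colon selects everything along that dimension. The arrays below are redefined here so the example runs on its own.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])
matrix = np.array([[1, 2, 3], [4, 5, 6]])

# First three elements (positions 0, 1 and 2)
print(arr[:3])   # [10 20 30]

# Every second element
print(arr[::2])  # [10 30 50]

# Last element, using a negative index
print(arr[-1])   # 50

# All rows, second column
print(matrix[:, 1])  # [2 5]

# Both rows, last two columns
print(matrix[0:2, 1:3])
# [[2 3]
#  [5 6]]
```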

So far, this is quite similar to lists. Arrays differ because they have more advanced indexing and slicing capabilities, allowing you to more easily select specific elements or subsets of the array based on certain conditions. This is particularly useful when working with large datasets as it allows you to efficiently filter and wrangle the data.

# Create a 1-dimensional array
arr = np.array([1, 2, 3, 4, 5])

# Select elements greater than 3
filtered_arr = arr[arr > 3]

# Print the selected elements
print(filtered_arr)

[4 5]
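It can help to look at the intermediate step. The comparison arr > 3 produces an array of True/False values (often called a mask), and indexing with that mask keeps only the positions marked True. The sketch below also combines two conditions with & (and) and | (or); each condition needs its own parentheses. The variable names are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# The comparison on its own returns one boolean per element
mask = arr > 3
print(mask)  # [False False False  True  True]

# Combine conditions: greater than 1 AND less than 5
between = arr[(arr > 1) & (arr < 5)]
print(between)  # [2 3 4]

# Combine conditions: less than 2 OR greater than 4
extremes = arr[(arr < 2) | (arr > 4)]
print(extremes)  # [1 5]
```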

Exercise 1

1. Create a NumPy array holding the values 1, 14, 22, 47, 58 and 67 and assign it to exercise_array.
2. Retrieve the second, third and fourth elements from exercise_array.
3. Retrieve the values less than 25 from exercise_array.
4. Reshape exercise_array into a matrix that has three rows with two columns each and assign it to exercise_matrix.
5. Retrieve the value in the third row, second column of exercise_matrix.

Useful `NumPy` functions

Base Python is somewhat limited when it comes to numerical operations. The NumPy library provides a broader range of mathematical tools. For example, you can quickly calculate the mean, median and standard deviation of an array with the numpy.mean(), numpy.median() and numpy.std() functions:

# Create a 1-dimensional array
arr = np.array([2, 7, 9.5, 5, 7, 3.2])

# Calculate mean of arr
mean = np.mean(arr)

# Calculate median of arr
median = np.median(arr)

# Calculate standard deviation of arr
std_dev = np.std(arr)

# Print an f-string (Python interprets items within {}) with the mean, median and standard deviation
print(f"Mean: {mean}, Median: {median}, Standard Deviation: {std_dev}")

Mean: 5.616666666666667, Median: 6.0, Standard Deviation: 2.523500654954453
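One detail worth knowing: by default, np.std() divides by the number of values (the population standard deviation). Passing ddof=1 divides by one less than the number of values, which gives the sample standard deviation that many statistics packages report. A minimal sketch, with illustrative variable names:

```python
import numpy as np

arr = np.array([2, 7, 9.5, 5, 7, 3.2])

# Default: population standard deviation (divide by n)
population_std = np.std(arr)

# Sample standard deviation (divide by n - 1)
sample_std = np.std(arr, ddof=1)

# Format both values to three decimal places
print(f"Population: {population_std:.3f}, Sample: {sample_std:.3f}")
```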

NumPy has other functions for performing mathematical operations on arrays, such as:

- numpy.sum()
- numpy.prod()
- numpy.cumsum()
- numpy.cumprod()
- numpy.square()
- numpy.sqrt()
- numpy.max()
- numpy.min()

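Functions such as numpy.sum(), numpy.prod(), numpy.max() and numpy.min() can operate across the entire array or along a specific axis, and the cumulative functions accept an axis as well, while numpy.square() and numpy.sqrt() act on each element individually.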
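As a brief sketch of what "along an axis" means for a 2-dimensional array: axis=0 works down the rows and gives one result per column, while axis=1 works across the columns and gives one result per row. The arrays and variable names below are illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])

# Sum of every element in the array
total = np.sum(matrix)
print(total)  # 21

# One sum per column
column_sums = np.sum(matrix, axis=0)
print(column_sums)  # [5 7 9]

# One sum per row
row_sums = np.sum(matrix, axis=1)
print(row_sums)  # [ 6 15]

# Largest value in each row
row_max = np.max(matrix, axis=1)
print(row_max)  # [3 6]

# Running total of a 1-dimensional array
running_total = np.cumsum(np.array([1, 2, 3, 4, 5]))
print(running_total)  # [ 1  3  6 10 15]

# Element-wise square root
roots = np.sqrt(np.array([1, 4, 9]))
print(roots)  # [1. 2. 3.]
```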

Summarizing arrays

Arrays come with built-in attributes and methods that report general information about the array. Recall that methods are functions that are associated with a specific object. Let us test out what some of the attributes of the reshaped_matrix array are:

Exercise 2

Test out the following functions on the reshaped_matrix array that we created earlier and describe what the output is for each function:
- np.shape()
- np.ndim()
- np.size()

Random numbers

Within NumPy, we can also generate random numbers, which can be helpful for random sampling or initializing a non-zero matrix with values. These functions come from the numpy.random module. For example, you can generate a random array of numbers between 0 and 1 using the numpy.random.rand() function:

# Generate a 1-dimensional array of 10 random numbers between 0 and 1
random_array = np.random.rand(10)

# Print the random array
print(random_array)

[0.1927394  0.24797559 0.09715149 0.38882449 0.13438002 0.8552513
 0.77674068 0.41788096 0.71171575 0.73780652]

Similarly, we can also create a random array from a specific distribution. For example, you can generate a random array of numbers from a normal distribution using the numpy.random.normal() function:

# Generate a 1-dimensional array of 10 random numbers from a normal distribution 
# With mean 0 and standard deviation 1
random_array = np.random.normal(loc=0,
                                scale=1,
                                size=10)  

# Print the random array
print(random_array)

[-0.90360729  0.12263979 -1.37360015  0.73812439 -0.16095982 -0.6026135
 -1.69004688 -0.14328387  1.7248296  -1.89170355]
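Random output changes on every run, which can make results hard to reproduce. One common approach, sketched below, is to create a random number generator with a fixed seed using np.random.default_rng(). The seed value 42 is an arbitrary choice for illustration, and the generator's random() and normal() methods play the same roles as the functions shown above.

```python
import numpy as np

# Create a generator with a fixed seed so results can be repeated
rng = np.random.default_rng(seed=42)

# Five random numbers from the interval [0, 1)
uniform_values = rng.random(5)

# Five random numbers from a normal distribution (mean 0, standard deviation 1)
normal_values = rng.normal(loc=0, scale=1, size=5)

# A second generator with the same seed produces the same first draw
rng_again = np.random.default_rng(seed=42)
repeat_values = rng_again.random(5)
print(np.array_equal(uniform_values, repeat_values))  # True
```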


Reuse

CC-BY-4.0
