5. NumPy#
This chapter was coauthored by Jason DeBacker and Richard W. Evans.
NumPy is Python’s fundamental numerical package (the name stands for “numerical Python”), and it underlies most numerical computation in Python.[1] Our discussion of Python’s NumPy package starts with Travis Oliphant, who was the primary creator of the NumPy package, a founding contributor to Python’s SciPy package (covered in the SciPy: Root finding, minimizing, interpolation chapter), founder of Anaconda, Inc., which maintains the most popular distribution of Python, and a co-founder of the NumFOCUS non-profit, which fiscally supports some of the primary package projects in Python.[2]
Oliphant was a mathematics and electrical engineering student who came up through his master’s degree using MATLAB, with a focus primarily on signal processing. While working on a PhD, he needed to write custom code for signal processing operations that had never been done before. These operations required combinations of mathematical operations. Oliphant liked the networking and collaboration of the open source software community, and Python was a language that felt intuitive and comfortable to him. However, Python had no established libraries for numerical matrix operations. Oliphant created the NumPy package to be that numerical engine, based on linear algebra array operations.
The fundamental object of the NumPy package is the NumPy array (of type numpy.ndarray, usually created with the numpy.array function). Python’s native objects, such as lists, tuples, and dictionaries, can hold numbers and perform operations on those numbers. But the NumPy array allows for storing high-dimensional arrays of numbers on which linear algebra and tensor operations can be performed. These linear algebra operations are more efficient than working with lists and tuples, and they form the foundation of modern optimization and machine learning. Learning to use Python’s NumPy package is an essential skill for many numerical computations and other operations.
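As a minimal sketch of the difference between a native list and a NumPy array, the snippet below uses made-up values (the names values, x, and A are hypothetical and chosen only for illustration). It shows that multiplying a list by 2 repeats the list, while multiplying an array by 2 scales every element, and that a two-dimensional array supports a matrix-vector product through the @ operator.

```python
import numpy as np

# Hypothetical data, chosen only for illustration
values = [1.0, 2.0, 3.0]

# With a native list, `*` repeats the list rather than scaling its elements
doubled_list = values * 2

# With a NumPy array, arithmetic is applied to every element
x = np.array(values)
doubled_array = x * 2

# A two-dimensional array (a matrix) and a matrix-vector product
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 3.0]])
y = A @ x

print(type(x).__name__)   # the array type is ndarray
print(doubled_list)       # list of length 6
print(doubled_array)      # array of length 3
print(A.shape, y)         # (2, 3) and a length-2 result
```

The same element-wise arithmetic on a list would need an explicit loop or a list comprehension, which is one reason array code tends to be both shorter and faster.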
The iframe below contains a PDF of the BYU ACME open-access lab titled “Introduction to NumPy”. You can either scroll through the lab on this page using the iframe window, or you can download the PDF for use on your computer. See [BYU ACME, 2023]. Exercise 5.1 below has you work through the problems in this BYU ACME lab. A Python file template (numpy_intro.py) and a matrix data file (grid.npy) used in the lab are stored in the ./code/NumPyIntro/ directory.
The following iframe contains a PDF of the BYU ACME open-access lab titled “Advanced NumPy”, which contains content and exercises that build on the previous BYU ACME NumPy lab. You can either scroll through the lab on this page using the iframe window, or you can download the PDF for use on your computer. See [BYU ACME, 2023]. Exercise 5.2 below has you work through the problems in this BYU ACME lab. A Python file template (advanced_numpy.py) used in the lab is stored in the ./code/AdvancedNumPy/ directory.
5.1. Extensions and future paths#
One drawback of NumPy arrays being so fundamental to Python’s numerical computing is that Python’s most widely used scientific computing and machine learning packages (SciPy, covered in the SciPy: Root finding, minimizing, interpolation chapter, and scikit-learn) require data in that array format. However, advances in hardware, large-data methods, and optimization algorithms now rely much more heavily on parallel computing and on hybrid architectures that combine multiple traditional processors with GPUs. These innovations have been difficult to incorporate into Python’s scientific computing stack because NumPy arrays have been difficult to adapt to these architectures.
Below are three efforts that have been working to make Python better on these dimensions.
Dask arrays
Quansight development and support of array APIs in SciPy and in scikit-learn. See here for the scikit-learn blog post. And see here for the SciPy blog post.
Modular’s development of the Mojo programming language.
5.2. Exercises#
Read the BYU ACME “Introduction to NumPy” lab and complete Problems 1 through 7 in the lab. [BYU ACME, 2023]
Read the BYU ACME “Advanced NumPy” lab and complete Problems 1 through 7 in the lab. [BYU ACME, 2023]
Create a NumPy array b (define this as the savings of 2 agents (the rows) over 5 periods (the columns)):
Use the shape attribute of NumPy arrays to print the shape of this matrix. Use array slicing to print the first row of b, which represents the lifecycle savings decisions of the first agent (i.e., the amount they choose to save in each of their 5 periods of life). Use array slicing to print the second column of b, which is the savings of both agents when they are in their second period of life. Finally, use array slicing to print the first two rows and the last three columns of b (i.e., the savings of both agents from middle age onwards).
Now let’s think about the matrix b as representing not two individual agents, but two types of agents who each live for five periods. In this way, we will interpret the values in b as the total savings of different cohorts of these two types of agents who are all living together at a point in time. Now, define a matrix Omega:
Omega represents the fraction of agents in the economy of each type/cohort (note that the elements of Omega sum to 1). Use matrix multiplication to find B, which is the dot product of b and the transpose of Omega.
Print your matrix B. What is its shape? What does B represent?
Multiply element-wise (Hadamard product) the matrix b from Exercise 5.3 by the matrix Omega from Exercise 5.4. Use the numpy.ndarray.sum() method on the resulting matrix, with the appropriate axis argument in the parentheses, to find the total savings of each cohort.
In one line, create a matrix of zeros that is the same size as b from Exercise 5.3.
Use numpy.where to return the elements of b from Exercise 5.3 that are greater than 2.0 and zero elsewhere.
Now suppose there is a third type of agent. This agent has savings \(b_3 = \left[4.1, 5.1, 7.1, 4.5, 0.9\right]\). Use numpy.vstack to stack b from Exercise 5.3 on top of b_3 to create a new \(3\times 5\) matrix b_new.
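The NumPy operations that the exercises above rely on can be sketched on a small made-up example. The arrays s and w below are hypothetical and are not the exercise data, so the snippet illustrates syntax only and leaves the interpretation and the choice of axis in each exercise to the reader.

```python
import numpy as np

# Hypothetical 2 x 3 arrays, NOT the exercise data
s = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
w = np.array([[0.1, 0.2, 0.2],
              [0.1, 0.2, 0.2]])  # entries sum to 1.0

# shape is an attribute, so it takes no parentheses
shape = s.shape

# Slicing: rows first, then columns
first_row = s[0, :]      # every column of row 0
second_col = s[:, 1]     # every row of column 1
block = s[:, -2:]        # all rows, last two columns

# Matrix product with a transpose: (2, 3) @ (3, 2) gives (2, 2)
prod = s @ w.T

# Element-wise (Hadamard) product, then sums along each axis
hadamard = s * w
col_totals = hadamard.sum(axis=0)  # collapses rows: one total per column
row_totals = hadamard.sum(axis=1)  # collapses columns: one total per row

# Array of zeros with the same shape and dtype as s
zeros = np.zeros_like(s)

# Keep entries above a threshold and put 0.0 everywhere else
masked = np.where(s > 2.5, s, 0.0)

# Stack a new row underneath s
stacked = np.vstack((s, np.array([7.0, 8.0, 9.0])))

print(shape, prod.shape, stacked.shape)
```

Note that numpy.vstack takes a single tuple (or list) of arrays, and that a one-dimensional array passed to it is treated as a single row.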
5.3. Footnotes#
The footnotes from this chapter.