# NumPy's max() and maximum(): Find Extreme Values in Arrays

by Charles de Villiers

The NumPy library supports expressive, efficient numerical programming in Python. Finding extreme values is a very common requirement in data analysis. The NumPy `max()` and `maximum()` functions are two examples of how NumPy lets you combine the coding comfort offered by Python with the runtime efficiency you’d expect from C.

In this tutorial, you’ll learn how to:

• Use the NumPy `max()` function
• Use the NumPy `maximum()` function and understand why it’s different from `max()`
• Solve practical problems with these functions
• Handle missing values in your data
• Apply the same concepts to finding minimum values

This tutorial includes a very short introduction to NumPy, so even if you’ve never used NumPy before, you should be able to jump right in. With the background provided here, you’ll be ready to continue exploring the wealth of functionality to be found in the NumPy library.

## NumPy: Numerical Python

NumPy is short for Numerical Python. It’s an open source Python library that enables a wide range of applications in the fields of science, statistics, and data analytics through its support of fast, vectorized computations on multidimensional arrays of numbers. Many of the most popular numerical packages use NumPy as their base library.

### Introducing NumPy

The NumPy library is built around a class named `np.ndarray` and a set of methods and functions that leverage Python syntax for defining and manipulating arrays of any shape or size.

NumPy’s core code for array manipulation is written in C. You can use functions and methods directly on an `ndarray` as NumPy’s C-based code efficiently loops over all the array elements in the background. NumPy’s high-level syntax means that you can simply and elegantly express complex programs and execute them at high speeds.

You can use a regular Python `list` to represent an array. However, NumPy arrays are far more efficient than lists, and they’re supported by a huge library of methods and functions. These include mathematical and logical operations, sorting, Fourier transforms, linear algebra, array reshaping, and much more.

Today, NumPy is in widespread use in fields as diverse as astronomy, quantum computing, bioinformatics, and all kinds of engineering.

NumPy is used under the hood as the numerical engine for many other libraries, such as pandas and SciPy. It also integrates easily with visualization libraries like Matplotlib and seaborn.

NumPy is easy to install with your package manager, for example `pip` or `conda`. For detailed instructions plus a more extensive introduction to NumPy and its capabilities, take a look at NumPy Tutorial: Your First Steps Into Data Science in Python or the NumPy Absolute Beginner’s Guide.

In this tutorial, you’ll learn how to take your very first steps in using NumPy. You’ll then explore NumPy’s `max()` and `maximum()` functions.

### Creating and Using NumPy Arrays

You’ll start your investigation with a quick overview of NumPy arrays, the flexible data structure that gives NumPy its versatility and power.

The fundamental building block for any NumPy program is the `ndarray`. An `ndarray` is a Python object wrapping an array of numbers. It may, in principle, have any number of dimensions of any size. You can declare an array in several ways. The most straightforward method starts from a regular Python list or tuple:

Python
``````>>> import numpy as np
>>> A = np.array([3, 7, 2, 4, 5])
>>> A
array([3, 7, 2, 4, 5])

>>> B = np.array(((1, 4), (1, 5), (9, 2)))
>>> B
array([[1, 4],
       [1, 5],
       [9, 2]])
``````

You’ve imported `numpy` under the alias `np`. This is a standard, widespread convention, so you’ll see it in most tutorials and programs. In this example, `A` is a one-dimensional array of numbers, while `B` is two-dimensional.

Notice that the `np.array()` factory function expects a Python list or tuple as its first parameter, so the list or tuple must therefore be wrapped in its own set of brackets or parentheses, respectively. Just throwing in an unwrapped bunch of numbers won’t work:

Python
``````>>> np.array(3, 7, 2, 4, 5)
Traceback (most recent call last):
...
TypeError: array() takes from 1 to 2 positional arguments but 5 were given
``````

With this syntax, the interpreter sees five separate positional arguments, but `np.array()` accepts at most two, so the call fails.

In your constructor for array `B`, the nested tuple argument needs an extra pair of parentheses to identify it, in its entirety, as the first parameter of `np.array()`.

Addressing the array elements is straightforward. NumPy’s indices start at zero, like all Python sequences. By convention, a two-dimensional array is displayed so that the first index refers to the row, and the second index refers to the column. So `A[0]` is the first element of the one-dimensional array `A`, and `B[2, 1]` is the second element in the third row of the two-dimensional array `B`:

Python
``````>>> A[0]  # First element of A
3
>>> A[4]  # Fifth and last element of A
5
>>> A[-1]  # Last element of A, same as above
5
>>> A[5]  # This won't work because A doesn't have a sixth element
Traceback (most recent call last):
...
IndexError: index 5 is out of bounds for axis 0 with size 5
>>> B[2, 1]  # Second element in third row of B
2
``````

So far, it seems that you’ve simply done a little extra typing to create arrays that look very similar to Python lists. But looks can be deceptive! Each `ndarray` object has approximately a hundred built-in properties and methods, and you can pass it to hundreds more functions in the NumPy library.

Almost anything that you can imagine doing to an array can be achieved in a few lines of code. In this tutorial, you’ll only be using a few functions, but you can explore the full power of arrays in the NumPy API documentation.

### Creating Arrays in Other Ways

You’ve already created some NumPy arrays from Python sequences. But arrays can be created in many other ways. One of the simplest is `np.arange()`, which behaves rather like a souped-up version of Python’s built-in `range()` function:

Python
``````>>> np.arange(10)
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

>>> np.arange(2, 3, 0.1)
array([ 2., 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9])
``````

In the first example above, you only specified the upper limit of `10`. NumPy follows the standard Python convention for ranges and returns an `ndarray` containing the integers `0` to `9`. The second example specifies a starting value of `2`, an upper limit of `3`, and an increment of `0.1`. Unlike Python’s standard `range()` function, `np.arange()` can handle non-integer increments, and it automatically generates an array with `np.float64` elements in this case.

NumPy’s arrays may also be read from disk, synthesized from data returned by APIs, or constructed from buffers or other arrays.

NumPy arrays can contain various types of integers, floating-point numbers, and complex numbers, but all the elements in an array must be of the same type.
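The single-type rule is easy to see in a short sketch. The values below are made up for illustration. When integers and a float are mixed, NumPy stores every element using one common type:

```python
import numpy as np

# Mixing integers and a float: NumPy picks one common type for every element.
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)  # float64
print(mixed)        # the integers are now stored as floats

# An all-integer input keeps an integer type (the exact width depends on the platform).
ints = np.array([1, 2, 3])
print(np.issubdtype(ints.dtype, np.integer))  # True
```

You can inspect the `.dtype` attribute of any array to see which type NumPy chose.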

You’ll start by using built-in `ndarray` properties to understand the arrays `A` and `B`:

Python
``````>>> A.size
5
>>> A.shape
(5,)

>>> B.size
6
>>> B.shape
(3, 2)
``````

The `.size` attribute counts the elements in the array, and the `.shape` attribute contains an ordered tuple of dimensions, which NumPy calls axes. `A` is a one-dimensional array with one row containing five elements. Because `A` has only one axis, `A.shape` returns a one-element tuple.

By convention, in a two-dimensional matrix, axis `0` corresponds to the rows, and axis `1` corresponds to the columns, so the output of `B.shape` tells you that `B` has three rows and two columns.

Python strings and lists have a very handy feature known as slicing, which allows you to select sections of a string or list by specifying indices or ranges of indices. This idea generalizes very naturally to NumPy arrays. For example, you can extract just the parts you need from `B`, without affecting the original array:

Python
``````>>> B[2, 0]
9
>>> B[1, :]
array([1, 5])
``````

In the first example above, you picked out the single element in row `2` and column `0` using `B[2, 0]`. The second example uses a slice to pick out a sub-array. Here, the index `1` in `B[1, :]` selects row `1` of `B`. The `:` in the second index position selects all the elements in that row. As a result, the expression `B[1, :]` returns a one-dimensional array with two elements, containing all the elements from row `1` of `B`.
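As a small sketch of how the kind of index affects the shape of the result, the following reuses the `B` array from earlier. The names `row` and `row_2d` are only illustrative:

```python
import numpy as np

B = np.array(((1, 4), (1, 5), (9, 2)))

row = B[1, :]        # an integer index drops that axis
print(row.shape)     # (2,)

row_2d = B[1:2, :]   # a slice keeps the axis, giving one row and two columns
print(row_2d.shape)  # (1, 2)

first_column = B[:, 0]  # all rows, column 0
print(first_column)     # [1 1 9]
```

An integer index removes a dimension, while a slice preserves it, even when the slice selects a single row.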

If you need to work with arrays having three or more dimensions, then NumPy has you covered. The syntax is flexible enough to cover any case. In this tutorial, though, you’ll mostly deal with one- and two-dimensional arrays.

If you have any questions as you play with NumPy, the official NumPy docs are thorough and well-written. You’ll find them indispensable if you do serious development using NumPy.

## NumPy’s `max()`: The Maximum Element in an Array

In this section, you’ll become familiar with `np.max()`, a versatile tool for finding maximum values in various circumstances.

`np.max()` is the tool that you need for finding the maximum value or values in a single array. Ready to give it a go?

### Using `max()`

To illustrate the `max()` function, you’re going to create an array named `n_scores` containing the test scores obtained by the students in Professor Newton’s linear algebra class.

Each row represents one student, and each column contains the scores on a particular test. So column `0` contains all the student scores for the first test, column `1` contains the scores for the second test, and so on. Here’s the `n_scores` array:

Python
``````>>> import numpy as np
>>> n_scores = np.array([
...        [63, 72, 75, 51, 83],
...        [44, 53, 57, 56, 48],
...        [71, 77, 82, 91, 76],
...        [67, 56, 82, 33, 74],
...        [64, 76, 72, 63, 76],
...        [47, 56, 49, 53, 42],
...        [91, 93, 90, 88, 96],
...        [61, 56, 77, 74, 74],
... ])
``````

You can copy and paste this code into your Python console if you want to follow along. Once you’ve done that, the `n_scores` array is in memory. You can ask the interpreter for some of its attributes:

Python
``````>>> n_scores.size
40
>>> n_scores.shape
(8, 5)
``````

The `.shape` and `.size` attributes, as above, confirm that you have `8` rows representing students and `5` columns representing tests, for a total of `40` test scores.

Suppose now that you want to find the top score achieved by any student on any test. For Professor Newton’s little linear algebra class, you could find the top score fairly quickly just by examining the data. But there’s a quicker method that’ll show its worth when you’re dealing with much larger datasets, containing perhaps thousands of rows and columns.

Try using the array’s `.max()` method:

Python
``````>>> n_scores.max()
96
``````

The `.max()` method has scanned the whole array and returned the largest element. Using this method is exactly equivalent to calling `np.max(n_scores)`.

But perhaps you want some more detailed information. What was the top score for each test? Here you can use the `axis` parameter:

Python
``````>>> n_scores.max(axis=0)
array([91, 93, 90, 91, 96])
``````

The new parameter `axis=0` tells NumPy to find the largest value along axis `0`, that is, by scanning down the rows. Since `n_scores` has five columns, NumPy does this for each column independently. This produces five numbers, each of which is the maximum value in that column. The `axis` parameter uses the standard convention for indexing dimensions: `axis=0` runs along the rows of an array, `axis=1` runs along the columns, and the axis you name is the one that gets collapsed in the result.

The top score for each student is just as easy to find:

Python
``````>>> n_scores.max(axis=1)
array([83, 57, 91, 82, 76, 56, 96, 77])
``````

This time, NumPy has returned an array with eight elements, one per student. The `n_scores` array contains one row per student. The parameter `axis=1` told NumPy to find the maximum value for each student, across the columns. Therefore, each element of the output contains the highest score attained by the corresponding student.
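If the `axis` convention feels slippery, a tiny made-up array can help. In this sketch, the `scores` array is hypothetical. The axis you name is the one that disappears from the result:

```python
import numpy as np

scores = np.array([[1, 9, 3],
                   [7, 2, 8]])   # hypothetical 2 x 3 array

per_column = scores.max(axis=0)  # collapse axis 0 (rows): one value per column
per_row = scores.max(axis=1)     # collapse axis 1 (columns): one value per row

print(per_column, per_column.shape)  # [7 9 8] (3,)
print(per_row, per_row.shape)        # [9 8] (2,)

# keepdims=True keeps the collapsed axis in the result, with length 1
kept = scores.max(axis=1, keepdims=True)
print(kept.shape)  # (2, 1)
```

The optional `keepdims=True` argument is handy when you want the result to line up with the original array in later calculations.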

Perhaps you want the top scores per student, but you’ve decided to exclude the first and last tests. Slicing does the trick:

Python
``````>>> filtered_scores = n_scores[:, 1:-1]
>>> filtered_scores.shape
(8, 3)

>>> filtered_scores
array([[72, 75, 51],
       [53, 57, 56],
       [77, 82, 91],
       [56, 82, 33],
       [76, 72, 63],
       [56, 49, 53],
       [93, 90, 88],
       [56, 77, 74]])

>>> filtered_scores.max(axis=1)
array([75, 57, 91, 82, 76, 56, 93, 77])
``````

You can understand the slice notation `n_scores[:, 1:-1]` as follows. The first index range, represented by the lone `:`, selects all the rows in the slice. The second index range after the comma, `1:-1`, tells NumPy to take the columns, starting at column `1` and ending `1` column before the last. The result of the slice is stored in a new array named `filtered_scores`.

With a bit of practice, you’ll learn to do array slicing on the fly, so you won’t need to create the intermediate array `filtered_scores` explicitly:

Python
``````>>> n_scores[:, 1:-1].max(axis=1)
array([75, 57, 91, 82, 76, 56, 93, 77])
``````

Here you’ve performed the slice and the method call in a single line, but the result is the same. NumPy returns the per-student maximum scores for the restricted set of tests.

### Handling Missing Values in `np.max()`

So now you know how to find maximum values in any completely filled array. But what happens when a few array values are missing? This is pretty common with real-world data.

To illustrate, you’ll create a small array containing a week’s worth of daily temperature readings, in Celsius, from a digital thermometer, starting on Monday:

Python
``````>>> temperatures_week_1 = np.array([7.1, 7.7, 8.1, 8.0, 9.2, np.nan, 8.4])
>>> temperatures_week_1.size
7
``````

It seems the thermometer had a malfunction on Saturday, and the corresponding temperature value is missing, a situation indicated by the `np.nan` value. This is the special value Not a Number, which is commonly used to mark missing values in real-world data applications.

So far, so good. But a problem arises if you innocently try to apply `.max()` to this array:

Python
``````>>> temperatures_week_1.max()
nan
``````

Since `np.nan` reports a missing value, NumPy’s default behavior is to flag this by reporting that the maximum, too, is unknown. For some applications, this makes perfect sense. But for your application, perhaps you’d find it more useful to ignore the Saturday problem and get a maximum value from the remaining, valid readings. NumPy has provided the `np.nanmax()` function to take care of such situations:

Python
``````>>> np.nanmax(temperatures_week_1)
9.2
``````

This function ignores any `nan` values and returns the largest numerical value, as expected. Notice that `np.nanmax()` is a function in the NumPy library, not a method of the `ndarray` object.
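To see roughly what ignoring `nan` means, you can filter the missing readings out yourself with a Boolean mask. This is only an illustrative sketch, not a description of how `np.nanmax()` is implemented. The names `readings` and `valid` are made up:

```python
import numpy as np

readings = np.array([7.1, 7.7, 8.1, 8.0, 9.2, np.nan, 8.4])

valid = ~np.isnan(readings)       # Boolean mask: True where a reading exists
print(valid.sum())                # 6 valid readings

manual_max = readings[valid].max()   # maximum of the valid readings only
print(manual_max)                    # 9.2
print(np.nanmax(readings))           # 9.2
```

The mask approach is more typing, but it also tells you how many readings were actually used.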

You’ve now seen the most common examples of NumPy’s maximum-finding capabilities for single arrays. But there are a few more NumPy functions related to maximum values that are worth knowing about.

For example, instead of the maximum values in an array, you might want the indices of the maximum values. Let’s say you want to use your `n_scores` array to identify the student who did best on each test. The `.argmax()` method is your friend here:

Python
``````>>> n_scores.argmax(axis=0)
array([6, 6, 6, 2, 6])
``````

It appears that student `6` obtained the top score on every test but one. Student `2` did best on the fourth test.
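Without an `axis` argument, `.argmax()` returns a single index into the flattened array. One way to turn that into a row and column, sketched here with `np.unravel_index()`, is as follows. The variable names are only illustrative:

```python
import numpy as np

n_scores = np.array([
    [63, 72, 75, 51, 83],
    [44, 53, 57, 56, 48],
    [71, 77, 82, 91, 76],
    [67, 56, 82, 33, 74],
    [64, 76, 72, 63, 76],
    [47, 56, 49, 53, 42],
    [91, 93, 90, 88, 96],
    [61, 56, 77, 74, 74],
])

flat_index = n_scores.argmax()   # position of the maximum in the flattened array
row, col = np.unravel_index(flat_index, n_scores.shape)

print(flat_index)          # 34
print(row, col)            # 6 4
print(n_scores[row, col])  # 96
```

So the overall top score belongs to student `6` on the last test, which matches what you found with `.max()` earlier.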

You’ll recall that you can also apply `np.max()` as a function of the NumPy package, rather than as a method of a NumPy array. In this case, the array must be supplied as the first argument of the function. For historical reasons, the package-level function `np.max()` has an alias, `np.amax()`, which is identical in every respect apart from the name:

Python
``````>>> n_scores.max(axis=1)
array([83, 57, 91, 82, 76, 56, 96, 77])

>>> np.max(n_scores, axis=1)
array([83, 57, 91, 82, 76, 56, 96, 77])

>>> np.amax(n_scores, axis=1)
array([83, 57, 91, 82, 76, 56, 96, 77])
``````

In the code above, you’ve called `.max()` as a method of the `n_scores` object, and as a stand-alone library function with `n_scores` as its first parameter. You’ve also called the alias `np.amax()` in the same way. All three calls produce exactly the same results.

Now you’ve seen how to use `np.max()`, `np.amax()`, or `.max()` to find maximum values for an array along various axes. You’ve also used `np.nanmax()` to find the maximum values while ignoring `nan` values, as well as `np.argmax()` or `.argmax()` to find the indices of the maximum values.

You won’t be surprised to learn that NumPy has an equivalent set of minimum functions: `np.min()`, `np.amin()`, `.min()`, `np.nanmin()`, `np.argmin()`, and `.argmin()`. You won’t deal with those here, but they behave exactly like their maximum cousins.
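For completeness, here's a brief sketch of the minimum functions on made-up data. The `scores` and `temps` arrays are hypothetical:

```python
import numpy as np

scores = np.array([[63, 72, 75],
                   [44, 53, 57],
                   [71, 77, 82]])   # hypothetical excerpt of a score table

print(scores.min())           # 44
print(scores.min(axis=0))     # [44 53 57]
print(scores.argmin(axis=1))  # [0 0 0]

temps = np.array([7.1, np.nan, 6.4])
print(np.nanmin(temps))       # 6.4
```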

## NumPy’s `maximum()`: Maximum Elements Across Arrays

Another common task in data science involves comparing two similar arrays. NumPy’s `maximum()` function is the tool of choice for finding maximum values across arrays. Since `maximum()` always involves two input arrays, there’s no corresponding method. The `np.maximum()` function expects the input arrays as its first two parameters.

### Using `np.maximum()`

Continuing with the previous example involving class scores, suppose that Professor Newton’s colleague—and archrival—Professor Leibniz is also running a linear algebra class with eight students. Construct a new array with the values for Leibniz’s class:

Python
``````>>> l_scores = np.array([
...         [87, 73, 71, 59, 67],
...         [60, 53, 82, 80, 58],
...         [92, 85, 60, 79, 77],
...         [67, 79, 71, 69, 87],
...         [86, 91, 92, 73, 61],
...         [70, 66, 60, 79, 57],
...         [83, 51, 64, 63, 58],
...         [89, 51, 72, 56, 49],
... ])

>>> l_scores.shape
(8, 5)
``````

The new array, `l_scores`, has the same shape as `n_scores`.

You’d like to compare the two classes, student by student and test by test, to find the higher score in each case. NumPy has a function, `np.maximum()`, specifically designed for comparing two arrays in an element-by-element manner. Check it out in action:

Python
``````>>> np.maximum(n_scores, l_scores)
array([[87, 73, 75, 59, 83],
       [60, 53, 82, 80, 58],
       [92, 85, 82, 91, 77],
       [67, 79, 82, 69, 87],
       [86, 91, 92, 73, 76],
       [70, 66, 60, 79, 57],
       [91, 93, 90, 88, 96],
       [89, 56, 77, 74, 74]])
``````

If you visually check the arrays `n_scores` and `l_scores`, then you’ll see that `np.maximum()` has indeed picked out the higher of the two scores for each [row, column] pair of indices.

What if you only want to compare the best test results in each class? You can combine `np.max()` and `np.maximum()` to get that effect:

Python
``````>>> best_n = n_scores.max(axis=0)
>>> best_n
array([91, 93, 90, 91, 96])

>>> best_l = l_scores.max(axis=0)
>>> best_l
array([92, 91, 92, 80, 87])

>>> np.maximum(best_n, best_l)
array([92, 93, 92, 91, 96])
``````

As before, each call to `.max()` returns an array of maximum scores for all the students in the relevant class, one element for each test. But this time, you’re feeding those returned arrays into the `maximum()` function, which compares the two arrays and returns the higher score for each test across the arrays.

You can combine those operations into one by dispensing with the intermediate arrays, `best_n` and `best_l`:

Python
``````>>> np.maximum(n_scores.max(axis=0), l_scores.max(axis=0))
array([92, 93, 92, 91, 96])
``````

This gives the same result as before, but with less typing. You can choose whichever method you prefer.

### Handling Missing Values in `np.maximum()`

Remember the `temperatures_week_1` array from an earlier example? If you use a second week’s temperature records with the `maximum()` function, you may spot a familiar problem.

First, you’ll create a new array to hold the new temperatures:

Python
``````>>> temperatures_week_2 = np.array(
...     [7.3, 7.9, np.nan, 8.1, np.nan, np.nan, 10.2]
... )
``````

There are missing values in the `temperatures_week_2` data, too. Now see what happens if you apply the `np.maximum` function to these two temperature arrays:

Python
``````>>> np.maximum(temperatures_week_1, temperatures_week_2)
array([ 7.3,  7.9,  nan,  8.1,  nan,  nan, 10.2])
``````

All the `nan` values in both arrays have popped up as missing values in the output. There’s a good reason for NumPy’s approach to propagating `nan`. Often it’s important for the integrity of your results that you keep track of the missing values, rather than brushing them under the rug. But here, you just want to get the best view of the weekly maximum values. The solution, in this case, is another NumPy package function, `np.fmax()`:

Python
``````>>> np.fmax(temperatures_week_1, temperatures_week_2)
array([ 7.3,  7.9,  8.1,  8.1,  9.2,  nan, 10.2])
``````

Now, two of the missing values have simply been ignored, and the remaining floating-point value at that index has been taken as the maximum. But the Saturday temperature can’t be fixed in that way, because both source values are missing. Since there’s no reasonable value to insert here, `np.fmax()` just leaves it as a `nan`.

Just as `np.max()` and `np.nanmax()` have the parallel minimum functions `np.min()` and `np.nanmin()`, so too do `np.maximum()` and `np.fmax()` have corresponding functions, `np.minimum()` and `np.fmin()`, that mirror their functionality for minimum values.
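As a quick sketch of the minimum counterparts, here are the same two temperature arrays passed to `np.minimum()` and `np.fmin()`. The names `week_1`, `week_2`, `strict`, and `lenient` are shortened or invented for this example:

```python
import numpy as np

week_1 = np.array([7.1, 7.7, 8.1, 8.0, 9.2, np.nan, 8.4])
week_2 = np.array([7.3, 7.9, np.nan, 8.1, np.nan, np.nan, 10.2])

strict = np.minimum(week_1, week_2)  # propagates nan from either input
lenient = np.fmin(week_1, week_2)    # ignores nan when the other value is valid

print(strict)   # nan at indices 2, 4, and 5
print(lenient)  # nan only at index 5, where both inputs are missing
```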

You’ve now seen examples of all the basic use cases for NumPy’s `max()` and `maximum()`, plus a few related functions. Now you’ll investigate some of the more obscure optional parameters to these functions and find out when they can be useful.

### Reusing Memory

When you call a function in Python, a value or object is returned. You can use that result immediately by printing it or writing it to disk, or by feeding it directly into another function as an input parameter. You can also save it to a new variable for future reference.

If you call the function in the Python REPL but don’t use it in one of those ways, then the REPL prints out the return value on the console so that you’re aware that something has been returned. All of this is standard Python stuff, and not specific to NumPy.

NumPy’s array functions are designed to handle huge inputs, and they often produce huge outputs. If you call such a function many hundreds or thousands of times, then you’ll be allocating very large amounts of memory. This can slow your program down and, in an extreme case, might even exhaust the available memory.

This problem can be avoided by using the `out` parameter, which is available for both `np.max()` and `np.maximum()`, as well as for many other NumPy functions. The idea is to pre-allocate a suitable array to hold the function result, and keep reusing that same chunk of memory in subsequent calls.

You can revisit the temperature problem to create an example of using the `out` parameter with the `np.maximum()` function. You’ll also pass the `dtype` parameter to `np.empty()` to control the type of the buffer, and therefore of the returned array:

Python
``````>>> temperature_buffer = np.empty(7, dtype=np.float32)
>>> temperature_buffer.shape
(7,)

>>> np.maximum(temperatures_week_1, temperatures_week_2, out=temperature_buffer)
array([ 7.3,  7.9,  nan,  8.1,  nan,  nan, 10.2], dtype=float32)
``````

The initial values in `temperature_buffer` don’t matter, since they’ll be overwritten. But the array’s shape is important in that it must match the output shape. The displayed result looks like the output that you received from the original `np.maximum()` example. So what’s changed? The difference is that you now have the same data stored in `temperature_buffer`:

Python
``````>>> temperature_buffer
array([ 7.3,  7.9,  nan,  8.1,  nan,  nan, 10.2], dtype=float32)
``````

The `np.maximum()` return value has been stored in the `temperature_buffer` variable, which you previously created with the right shape to accept that return value. Since you also specified `dtype=np.float32` when you declared this buffer, NumPy will do its best to convert the output data to that type.

Remember to use the buffer contents before they’re overwritten by the next call to this function.
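A minimal sketch of this reuse pattern might look like the following. The random inputs stand in for real data, and the names `buffer` and `peaks` are made up for the example:

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded only so the sketch is repeatable
buffer = np.empty(5)                 # allocated once, before the loop
peaks = []

for _ in range(3):
    a = rng.random(5)
    b = rng.random(5)
    result = np.maximum(a, b, out=buffer)  # writes into buffer and returns that same object
    peaks.append(buffer.max())             # use the contents before they're overwritten

print(result is buffer)  # True
print(len(peaks))        # 3
```

Because the function returns the very same object that you passed as `out`, no new result array is allocated inside the loop.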

### Filtering Arrays

Another parameter that’s occasionally useful is `where`. This applies a filter to the input array or arrays, so that only those values for which the `where` condition is `True` will be included in the comparison. The other values will be ignored, and the corresponding elements of the output array will be left unaltered. In most cases, this will leave them holding arbitrary values.

For the sake of the example, suppose you’ve decided, for whatever reason, to ignore all scores less than `60` for calculating the per-student maximum values in Professor Newton’s class. Your first attempt might go like this:

Python
``````>>> n_scores
array([[63, 72, 75, 51, 83],
       [44, 53, 57, 56, 48],
       [71, 77, 82, 91, 76],
       [67, 56, 82, 33, 74],
       [64, 76, 72, 63, 76],
       [47, 56, 49, 53, 42],
       [91, 93, 90, 88, 96],
       [61, 56, 77, 74, 74]])

>>> n_scores.max(axis=1, where=(n_scores >= 60))
Traceback (most recent call last):
...
ValueError: reduction operation 'maximum' does not have an identity, so to use a where mask one has to specify 'initial'
``````

The problem here is that NumPy doesn’t know what to do with the students in rows `1` and `5`, who didn’t achieve a single test score of `60` or better. The solution is to provide an `initial` parameter:

Python
``````>>> n_scores.max(axis=1, where=(n_scores >= 60), initial=60)
array([83, 60, 91, 82, 76, 60, 96, 77])
``````

With the two new parameters, `where` and `initial`, `n_scores.max()` considers only the elements greater than or equal to `60`. For the rows where there is no such element, it returns the `initial` value of `60` instead. So the lucky students at indices `1` and `5` got their best score boosted to `60` by this operation! The original `n_scores` array is untouched.
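The `where` parameter also works with the element-wise `np.maximum()`. In that case, the positions where the mask is `False` keep whatever the output buffer already held. Here's a minimal sketch with made-up values, using `-1` as a sentinel so that the untouched slots are easy to spot:

```python
import numpy as np

a = np.array([1, 8, 3, 6])
b = np.array([5, 2, 7, 4])
mask = np.array([True, False, True, False])

out = np.full(4, -1)                   # sentinel value marks slots that are never written
np.maximum(a, b, out=out, where=mask)  # only positions 0 and 2 are compared and written

print(out.tolist())  # [5, -1, 7, -1]
```

Pre-filling the buffer like this avoids the arbitrary leftover values you'd get from an uninitialized array.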

### Comparing Differently Shaped Arrays With Broadcasting

You’ve learned how to use `np.maximum()` to compare arrays with identical shapes. But it turns out that this function, along with many others in the NumPy library, is much more versatile than that. NumPy has a concept called broadcasting that provides a very useful extension to the behavior of most functions involving two arrays, including `np.maximum()`.

Whenever you call a NumPy function that operates on two arrays, `A` and `B`, it checks their `.shape` properties to see if they’re compatible. If they have exactly the same `.shape`, then NumPy just matches the arrays element by element, pairing up the element at `A[i, j]` with the element at `B[i, j]`. `np.maximum()` works like this too.

Broadcasting enables NumPy to operate on two arrays with different shapes, provided there’s still a sensible way to match up pairs of elements. The simplest example of this is to broadcast a single element over an entire array. You’ll explore broadcasting by continuing the example of Professor Newton and his linear algebra class. Suppose he asks you to ensure that none of his students receives a score below `75`. Here’s how you might do it:

Python
``````>>> np.maximum(n_scores, 75)
array([[75, 75, 75, 75, 83],
       [75, 75, 75, 75, 75],
       [75, 77, 82, 91, 76],
       [75, 75, 82, 75, 75],
       [75, 76, 75, 75, 76],
       [75, 75, 75, 75, 75],
       [91, 93, 90, 88, 96],
       [75, 75, 77, 75, 75]])
``````

You’ve applied the `np.maximum()` function to two arguments: `n_scores`, whose `.shape` is (8, 5), and the single scalar parameter `75`. You can think of this second parameter as a 1 × 1 array that’ll be stretched inside the function to cover eight rows and five columns. The stretched array can then be compared element by element with `n_scores`, and the pairwise maximum can be returned for each element of the result.

The result is the same as if you had compared `n_scores` with an array of its own shape, (8, 5), but with the value `75` in each element. This stretching is just conceptual—NumPy is smart enough to do all this without actually creating the stretched array. So you get the notational convenience of this example without compromising efficiency.
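You can make the conceptual stretching visible with `np.broadcast_to()`, which returns a read-only view rather than a copy. The sketch below uses a small made-up `scores` array and compares the scalar call with an explicitly allocated array of the same shape:

```python
import numpy as np

scores = np.array([[63, 72, 75],
                   [44, 53, 91]])  # hypothetical excerpt of a score table

stretched = np.broadcast_to(75, scores.shape)  # read-only view, no data copied
print(stretched.shape)    # (2, 3)
print(stretched.strides)  # (0, 0): every position refers to the same single value

explicit = np.full(scores.shape, 75)  # a real, fully allocated array of 75s
same = np.array_equal(np.maximum(scores, 75), np.maximum(scores, explicit))
print(same)  # True
```

The zero strides show that the stretched view never stores more than one value in memory.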

You can do much more with broadcasting. Professor Leibniz has noticed Newton’s skulduggery with his boosted scores, and decides to engage in a little data manipulation of her own.

Leibniz’s plan is to artificially boost all her students’ scores to be at least equal to the average score for a particular test. This will have the effect of increasing all the below-average scores—and thus produce some quite misleading results! How can you help the professor achieve her somewhat nefarious ends?

Your first step is to use the array’s `.mean()` method to create a one-dimensional array of means per test. Then you can use `np.maximum()` and broadcast this array over the entire `l_scores` matrix:

Python
``````>>> mean_l_scores = l_scores.mean(axis=0, dtype=int)
>>> mean_l_scores
array([79, 68, 71, 69, 64])

>>> np.maximum(mean_l_scores, l_scores)
array([[87, 73, 71, 69, 67],
       [79, 68, 82, 80, 64],
       [92, 85, 71, 79, 77],
       [79, 79, 71, 69, 87],
       [86, 91, 92, 73, 64],
       [79, 68, 71, 79, 64],
       [83, 68, 71, 69, 64],
       [89, 68, 72, 69, 64]])
``````

The broadcasting happens in the `np.maximum()` call. The one-dimensional `mean_l_scores` array has been conceptually stretched to match the two-dimensional `l_scores` array. The output array has the same `.shape` as the larger of the two input arrays, `l_scores`.

### Following Broadcasting Rules

So, what are the rules for broadcasting? A great many NumPy functions accept two array arguments. `np.maximum()` is just one of these. Arrays that can be used together in such functions are termed compatible, and their compatibility depends on the number and size of their dimensions—that is, on their `.shape`.

The simplest case occurs if the two arrays, say `A` and `B`, have identical shapes. Each element in `A` is matched, for the function’s purposes, to the element at the same index address in `B`.

Broadcasting rules get more interesting when `A` and `B` have different shapes. The elements of compatible arrays must somehow be unambiguously paired together so that each element of the larger array can interact with an element of the smaller array. When the smaller array is stretched to fit the larger one, as in the examples here, the output array has the `.shape` of the larger input. So compatible arrays must follow these rules:

1. If one array has fewer dimensions than the other, only the trailing dimensions are matched for compatibility. The trailing dimensions are those that are present in the `.shape` of both arrays, counting from the right. So if `A.shape` is `(99, 99, 2, 3)` and `B.shape` is `(2, 3)`, then `A` and `B` are compatible because `(2, 3)` are the trailing dimensions of each. You can completely ignore the two leftmost dimensions of `A`.

2. Even if the trailing dimensions aren’t equal, the arrays are still compatible if one of those dimensions is equal to `1` in either array. So if `A.shape` is `(99, 99, 2, 3)` as before and `B.shape` is `(1, 99, 1, 3)` or `(1, 3)` or `(1, 2, 1)` or `(1, 1)`, then `B` is still compatible with `A` in each case.
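You can check the shapes from these two rules without building any arrays. The sketch below assumes NumPy 1.20 or later, where `np.broadcast_shapes()` is available. The shape `(2, 2)` is an extra, made-up example of an incompatible shape:

```python
import numpy as np

a_shape = (99, 99, 2, 3)
results = {}

# Every shape mentioned in the two rules should broadcast against a_shape.
for b_shape in [(2, 3), (1, 99, 1, 3), (1, 3), (1, 2, 1), (1, 1)]:
    results[b_shape] = np.broadcast_shapes(a_shape, b_shape)
    print(b_shape, "->", results[b_shape])

# A trailing dimension of 2 can't be matched with 3, and neither is 1.
try:
    np.broadcast_shapes(a_shape, (2, 2))
    compatible = True
except ValueError as error:
    compatible = False
    print("incompatible:", error)
```

Each compatible pair reports a result shape of `(99, 99, 2, 3)`, while the last call raises a `ValueError`.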

You can get a feel for the broadcasting rules by playing around in the Python REPL. You’ll be creating some toy arrays to illustrate how broadcasting works and how the output array is generated:

Python
``````>>> A = np.arange(24).reshape(2, 3, 4)
>>> A
array([[[ 0,  1,  2,  3], [ 4,  5,  6,  7], [ 8,  9, 10, 11]],
[[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]])

>>> A.shape
(2, 3, 4)

>>> B = np.array(
...     [
...         [[-7, 11, 10,  2], [-6,  7, -2, 14], [ 7,  4,  4, -1]],
...         [[18,  5, 22,  7], [25,  8, 15, 24], [31, 15, 19, 24]],
...     ]
... )

>>> B.shape
(2, 3, 4)

>>> np.maximum(A, B)
array([[[ 0, 11, 10,  3], [ 4,  7,  6, 14], [ 8,  9, 10, 11]],
[[18, 13, 22, 15], [25, 17, 18, 24], [31, 21, 22, 24]]])
``````

There’s nothing really new to see here yet. You’ve created two arrays of identical `.shape` and applied the `np.maximum()` operation to them. Notice that the handy `.reshape()` method lets you build arrays of any shape. You can verify that the result is the element-by-element maximum of the two inputs.

The fun starts when you experiment with comparing two arrays of different shapes. Try slicing `B` to make a new array, `C`:

Python
``````>>> C = B[:, :1, :]
>>> C
array([[[-7, 11, 10,  2]],
[[18,  5, 22,  7]]])

>>> C.shape
(2, 1, 4)

>>> np.maximum(A, C)
array([[[ 0, 11, 10,  3], [ 4, 11, 10,  7], [ 8, 11, 10, 11]],
[[18, 13, 22, 15], [18, 17, 22, 19], [20, 21, 22, 23]]])
``````

The two arrays, `A` and `C`, are compatible because the new array’s second dimension is `1`, and the other dimensions match. Notice that the `.shape` of the result of the `maximum()` operation is the same as `A.shape`. That’s because `C`, the smaller array, is being broadcast over `A`. Whenever every dimension of the smaller array either matches the larger array or is `1`, the result has the `.shape` of the larger array.
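One way to see the stretching for yourself is with `np.broadcast_to()`, which returns a read-only view of an array expanded to a target shape. The following is a minimal sketch that rebuilds `A`, `B`, and `C` from the examples above. The name `C_stretched` is introduced here purely for illustration.

```python
import numpy as np

A = np.arange(24).reshape(2, 3, 4)
B = np.array(
    [
        [[-7, 11, 10, 2], [-6, 7, -2, 14], [7, 4, 4, -1]],
        [[18, 5, 22, 7], [25, 8, 15, 24], [31, 15, 19, 24]],
    ]
)
C = B[:, :1, :]

# Build the "stretched" version of C that broadcasting uses conceptually
C_stretched = np.broadcast_to(C, A.shape)
print(C_stretched.shape)

# Every row along the second axis repeats C's single row
rows_repeat = np.array_equal(C_stretched[:, 0, :], C_stretched[:, 2, :])
print(rows_repeat)

# Passing the stretched array explicitly matches the broadcast result
same_result = np.array_equal(np.maximum(A, C), np.maximum(A, C_stretched))
print(same_result)
```

In practice you rarely need to call `np.broadcast_to()` yourself, since `np.maximum()` applies the same stretching without copying any data.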

Now you can try an even more radical slicing of `B`:

Python
``````>>> D = B[:, :1, :1]
>>> D
array([[[-7]],[[18]]])

>>> D.shape
(2, 1, 1)

>>> np.maximum(A, D)
array([[[ 0,  1,  2,  3], [ 4,  5,  6,  7], [ 8,  9, 10, 11]],
[[18, 18, 18, 18], [18, 18, 18, 19], [20, 21, 22, 23]]])
``````

Once again, the trailing dimensions of `A` and `D` are all either equal or `1`, so the arrays are compatible and the broadcast works. The result has the same `.shape` as `A`.
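The arrays above all have three dimensions. As a sketch of the first rule, where one array has fewer dimensions than the other, you could try one-dimensional inputs. The arrays `F`, `G`, and `G_column` below are hypothetical values chosen for this illustration.

```python
import numpy as np

A = np.arange(24).reshape(2, 3, 4)

# F has shape (4,), which matches the last dimension of A
F = np.array([5, 10, 15, 20])
max_af = np.maximum(A, F)
print(max_af.shape)

# G has shape (3,), which doesn't match A's last dimension of 4
G = np.array([0, 6, 100])
failed = False
try:
    np.maximum(A, G)
except ValueError as error:
    failed = True
    print(f"Broadcast failed: {error}")

# Adding a trailing axis gives shape (3, 1), which lines up with
# the last two dimensions of A, (3, 4)
G_column = G[:, np.newaxis]
max_ag = np.maximum(A, G_column)
print(max_ag.shape)
```

Matching always starts from the rightmost dimension, so a one-dimensional array pairs with the last axis of `A` unless you add an axis to reposition it.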

Perhaps the most extreme type of broadcasting occurs when one of the array parameters is passed as a scalar:

Python
``````>>> np.maximum(A, 10)
array([[[10, 10, 10, 10], [10, 10, 10, 10], [10, 10, 10, 11]],
[[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]])
``````

NumPy automatically treats the second parameter, `10`, as a zero-dimensional array with `.shape` `()`, determines that this converted parameter is compatible with the first, and duly broadcasts it over the entire 2 × 3 × 4 array `A`.
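As a quick check, the sketch below compares the scalar call with a few explicit array forms of the same value. The variable names are chosen only for this illustration.

```python
import numpy as np

A = np.arange(24).reshape(2, 3, 4)

from_scalar = np.maximum(A, 10)
from_0d = np.maximum(A, np.array(10))  # zero-dimensional array
from_1d = np.maximum(A, np.array([10]))  # one-element array
from_full = np.maximum(A, np.full(A.shape, 10))  # explicit array of tens

# All four calls produce the same clamped array
print(np.array_equal(from_scalar, from_0d))
print(np.array_equal(from_scalar, from_1d))
print(np.array_equal(from_scalar, from_full))

# No element of the result is smaller than the threshold
print(from_scalar.min())
```

Passing the scalar directly is the most readable option, and it avoids building a full-size array of repeated values.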

Finally, here’s a case where broadcasting fails:

Python
``````>>> E = B[:, 1:, :]
>>> E
array([[[-6,  7, -2, 14], [ 7,  4,  4, -1]],
[[25,  8, 15, 24], [31, 15, 19, 24]]])

>>> E.shape
(2, 2, 4)

>>> np.maximum(A, E)
Traceback (most recent call last):
...
ValueError: operands could not be broadcast together with shapes (2,3,4) (2,2,4)
``````

If you refer back to the broadcasting rules above, you’ll see the problem: the second dimensions of `A` and `E` don’t match, and neither is equal to `1`, so the two arrays are incompatible.
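If you’d rather handle an incompatible pair gracefully than let the exception propagate, one option is to catch the `ValueError`. The sketch below rebuilds `A`, `B`, and `E` from the examples above and then shows one possible adjustment, which keeps only a single row of `E` so that its second dimension becomes `1`. The names `message` and `E_first` are introduced here for illustration, and whether that adjustment makes sense depends on your data.

```python
import numpy as np

A = np.arange(24).reshape(2, 3, 4)
B = np.array(
    [
        [[-7, 11, 10, 2], [-6, 7, -2, 14], [7, 4, 4, -1]],
        [[18, 5, 22, 7], [25, 8, 15, 24], [31, 15, 19, 24]],
    ]
)
E = B[:, 1:, :]

message = None
try:
    np.maximum(A, E)
except ValueError as error:
    # Shapes (2, 3, 4) and (2, 2, 4) clash in the second dimension
    message = str(error)
    print(f"Broadcast failed: {message}")

# Slicing with :1 keeps the axis but reduces its size to 1,
# which broadcasting can stretch to match A
E_first = E[:, :1, :]
result = np.maximum(A, E_first)
print(E_first.shape, "->", result.shape)
```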

You can read more about broadcasting in Look Ma, No `for` Loops: Array Programming With NumPy. There’s also a good description of the rules in the NumPy docs.

The broadcasting rules can be confusing, so it’s a good idea to play around with some toy arrays until you get a feel for how it works!

## Conclusion

In this tutorial, you’ve explored the NumPy library’s `max()` and `maximum()` operations to find the maximum values within or across arrays.

Here’s what you’ve learned:

• Why NumPy has its own `max()` function, and how you can use it
• How the `maximum()` function differs from `max()`, and when it’s needed
• Which practical applications exist for each function
• How you can handle missing data so your results make sense
• How you can apply your knowledge to the complementary task of finding minimum values

Along the way, you’ve learned or refreshed your knowledge of the basics of NumPy syntax. NumPy is a hugely popular library because of its powerful support for array operations.

Now that you’ve mastered the details of NumPy’s `max()` and `maximum()`, you’re ready to use them in your applications, or continue learning about more of the hundreds of array functions supported by NumPy.

If you’re interested in using NumPy for data science, then you’ll also want to investigate pandas, a very popular data-science library built on top of NumPy. You can learn about it in The Pandas DataFrame: Make Working With Data Delightful. And if you want to produce compelling images from data, take a look at Python Plotting With Matplotlib (Guide).

The applications of NumPy are limitless. Wherever your NumPy adventure takes you next, go forth and matrix-multiply!

About Charles de Villiers

Charles teaches Physics and Math. When he isn't teaching or coding, he spends way too much time playing online chess.
