NumPy
About NumPy
NumPy is an open-source Python library for working with large, multi-dimensional arrays and matrices, together with high-level mathematical functions that operate on them. Because Python is an interpreted language, looping over scalars is slow; NumPy, much like MATLAB, gains speed by applying operations to whole arrays or matrices at once.
NumPy relies on Basic Linear Algebra Subprograms (BLAS) and Linear Algebra Package (LAPACK) implementations to perform linear algebra computations efficiently. Note that NumPy runs solely on CPUs. However, libraries such as PyTorch and CuPy can use GPUs, and data can be exchanged with them through the DLPack protocol.
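As a minimal sketch of the kind of routine that is backed by LAPACK, the snippet below solves a small linear system with np.linalg.solve. The matrix and right-hand side are made-up values chosen for illustration only.

```python
import numpy as np

# Hypothetical 2x2 system: 3x + y = 9 and x + 2y = 8
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])

# np.linalg.solve hands the work to a LAPACK solver
x = np.linalg.solve(A, b)
print(x)  # approximately [2. 3.]
```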
Python Lists vs. NumPy Arrays
A Python list is a general-purpose data structure (it can hold elements of different data types), while a NumPy array is a specialized data structure for numerical operations and is more efficient for them.
import numpy as np

my_array = np.array([1, 2, 3, 4])
print(my_array)
#R. [1 2 3 4]

my_list = [1, 2, 3, 4]
print(my_list)
#R. [1, 2, 3, 4]
Note that, in the printed output above, the array values are not separated by commas (,), while the list values are.
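The difference goes beyond how the two are printed. The short sketch below (variable names are illustrative) shows that the same operator behaves differently: multiplying a list repeats it, while multiplying an array scales every element.

```python
import numpy as np

my_list = [1, 2, 3, 4]
my_array = np.array([1, 2, 3, 4])

# Multiplying a list repeats it; multiplying an array scales each element
doubled_list = my_list * 2
doubled_array = my_array * 2

print(doubled_list)   #R. [1, 2, 3, 4, 1, 2, 3, 4]
print(doubled_array)  #R. [2 4 6 8]
```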
Check out our blog post to discover more features of NumPy.
Quick Installation Guide
The steps in this section apply to any Python library. NumPy is often preinstalled in Jupyter Notebook environments. To list all locally installed packages and check whether it is present, use the following command:
pip list --format=columns
If the library is not already present in your Jupyter Notebook, run "!pip install library_name" in a code cell. In a regular shell, run the same command without the leading "!", or follow the appropriate installation procedure for the IDE you're using.
!pip install numpy
The import statement is the most common way to load a library, but it's not the only one. The importlib.import_module() function and the built-in __import__() function can also be used. To import the NumPy library, you just have to type:
import numpy as np
where np is its standard alias. You can use another alias, but np is the most widely adopted and the one you will see in most existing code.
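For comparison, here is a small sketch of the importlib route mentioned above. It produces the same module object as the import statement, with the module name given as a string.

```python
import importlib

# Equivalent to "import numpy as np", but the module name is a string
np = importlib.import_module("numpy")

print(np.__name__)     #R. numpy
print(np.__version__)  # installed version; the value depends on your environment
```

This form is mainly useful when the module name is only known at run time; for everyday use, the plain import statement is clearer.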
NumPy Fundamentals
Creating 1D, 2D and 3D Arrays
You can use the np.array() function to create a one-dimensional array and assign it to a variable (e.g., MyArray):
MyArray = np.array([1, 2, 3])
print(MyArray)
#R. [1 2 3]
But if you want N evenly spaced numbers over a specified interval [Int, Fnl], use the function np.linspace().
MyArray = np.linspace(1, 20, 10)
print(MyArray)
#R. [ 1.          3.11111111  5.22222222  7.33333333  9.44444444
#R.  11.55555556 13.66666667 15.77777778 17.88888889 20.        ]
where the increment is given by the equation: (Fnl - Int)/(N-1). In this example, (20 - 1)/(10 - 1) ≈ 2.11.
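If you want NumPy to report the increment it used, np.linspace() accepts a retstep argument. The sketch below reuses the interval from the example above.

```python
import numpy as np

# retstep=True returns the array together with the spacing between samples
MyArray, step = np.linspace(1, 20, 10, retstep=True)

print(step)                      # (20 - 1) / (10 - 1), roughly 2.11
print(MyArray[0], MyArray[-1])   #R. 1.0 20.0
```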
Other useful functions are:
# Create a 1D array of size 2 filled with zeros
MyArray = np.zeros(2)
print(MyArray)
#R. [0. 0.]

# Create a 1D array of size 2 filled with ones
MyArray = np.ones(2)
print(MyArray)
#R. [1. 1.]

# Create a 1D array with a range of elements (0 up to, but not including, 4)
MyArray = np.arange(4)
print(MyArray)
#R. [0 1 2 3]
A 2D array in NumPy has a shape of N rows (axis 0) by M columns (axis 1). We can create it with the same np.array function.
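Two related creation helpers are sketched below as a complement: np.arange() with an explicit start, stop, and step, and np.full() for a constant fill value. The variable names are illustrative.

```python
import numpy as np

# np.arange(start, stop, step): the stop value is excluded
evens = np.arange(2, 10, 2)
print(evens)   #R. [2 4 6 8]

# np.full(shape, value) creates an array filled with a chosen value
sevens = np.full(3, 7)
print(sevens)  #R. [7 7 7]
```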
# Create a 2D array of NxM=3x4
Matrix_1 = np.array([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
print(Matrix_1)
#R. [[1 2 3 4]
#R.  [1 2 3 4]
#R.  [1 2 3 4]]

# Another way to create a 2D array of NxM=3x4
my_array = np.array([1, 2, 3, 4])
Matrix_1 = np.array([my_array, my_array, my_array])
print(Matrix_1)
#R. [[1 2 3 4]
#R.  [1 2 3 4]
#R.  [1 2 3 4]]

# Create a 2D array of NxM=3x3 filled with zeros
Matrix_1 = np.zeros((3, 3))
print(Matrix_1)
#R. [[0. 0. 0.]
#R.  [0. 0. 0.]
#R.  [0. 0. 0.]]
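To confirm the dimensions of an array, you can inspect its shape, ndim, and size attributes. A minimal sketch using the 3x4 matrix from above:

```python
import numpy as np

Matrix_1 = np.array([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])

print(Matrix_1.shape)  #R. (3, 4)   rows, columns
print(Matrix_1.ndim)   #R. 2        number of axes
print(Matrix_1.size)   #R. 12       total number of elements
```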
Finally, a 3D array (or, more generally, a multidimensional array) is a collection of 2D arrays. Stacking K arrays of size NxM gives an array of shape KxNxM.
# Create a 3D array of shape KxNxM=2x3x4 (two 3x4 arrays)
Matrix_2 = np.array([[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
                     [[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]]])
print(Matrix_2)
#R. [[[ 1  2  3  4]
#R.   [ 5  6  7  8]
#R.   [ 9 10 11 12]]
#R.
#R.  [[13 14 15 16]
#R.   [17 18 19 20]
#R.   [21 22 23 24]]]
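The shape attribute makes the layout explicit. The sketch below builds an array with the same values as the example above (using np.arange and reshape as a shortcut) and picks out one block and one element.

```python
import numpy as np

# Same values as the 3D example: 1..24 arranged as two 3x4 blocks
Matrix_2 = np.arange(1, 25).reshape(2, 3, 4)

print(Matrix_2.shape)     #R. (2, 3, 4)
print(Matrix_2[1].shape)  #R. (3, 4)   the second 2D block
print(Matrix_2[1, 0, 2])  #R. 15       block 1, row 0, column 2
```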
Note: An RGB image is composed of three 2D arrays, each representing one channel (Red, Green, and Blue) of size NxM (pixels). For an 8-bit image, the value of each element ranges from 0 to 255 and represents the intensity of that channel.
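As a rough sketch of this idea, the snippet below builds a tiny all-red image. It assumes a channel-last layout (rows, columns, channels), which is a common convention but not the only one, and the 2x3 image size is made up for illustration.

```python
import numpy as np

# Hypothetical 2x3-pixel image with 3 channels (R, G, B), 8 bits per channel
image = np.zeros((2, 3, 3), dtype=np.uint8)
image[:, :, 0] = 255  # set the red channel to full intensity

red = image[:, :, 0]  # one 2D array per channel
print(image.shape)    #R. (2, 3, 3)
print(red.shape)      #R. (2, 3)
print(image[0, 0])    #R. [255   0   0]
```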
Indexing and Slicing
When working with arrays, you may want to take a section or a specific value from one. To do this, you index or slice the array through its variable name, as shown in the examples below.
data = np.array([1, 2, 3])
data[1]
#R. 2
data[0:2]
#R. array([1, 2])
Note that, similar to Python, NumPy uses 0-based indexing, which means that the first element is indexed at 0, unlike software such as MATLAB, which uses 1-based indexing.
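The same ideas extend to 2D arrays, where you give one index or slice per axis. A short sketch with an illustrative 3x3 array:

```python
import numpy as np

data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(data[0, 1])     #R. 2         row 0, column 1
print(data[-1])       #R. [7 8 9]   negative indices count from the end
print(data[:, 1])     #R. [2 5 8]   every row, column 1
print(data[0:2, 1:])  # rows 0-1, columns 1-2
#R. [[2 3]
#R.  [5 6]]
```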
Specifying Data Types
The data type (dtype) of an array is deduced from the type of its elements. Note that type() only reports the class of the container:
# Showing the type of a list and an array
my_list = [1, 2, 3, 4]
my_array = np.array([1, 2, 3, 4])
print(type(my_list))
print(type(my_array))
#R. <class 'list'>
#R. <class 'numpy.ndarray'>
If you want to control the data type of the elements, you can set it explicitly when creating the array.
my_array = np.array([[1, 2, 3], [1, 2, 3]], dtype=complex)
print(my_array)
#R. [[1.+0.j 2.+0.j 3.+0.j]
#R.  [1.+0.j 2.+0.j 3.+0.j]]

# Casting floats to int drops the fractional part
my_array = np.array([[1.1, 2.1, 3.1], [1.1, 2.1, 3.1]], dtype=int)
print(my_array)
#R. [[1 2 3]
#R.  [1 2 3]]
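You can also inspect the dtype of an existing array and convert it afterwards with astype(). A minimal sketch, with illustrative values:

```python
import numpy as np

my_array = np.array([1.7, 2.2, 3.9])
print(my_array.dtype)  #R. float64

# astype(int) truncates toward zero; it does not round
as_int = my_array.astype(int)
print(as_int)          #R. [1 2 3]

as_f32 = my_array.astype(np.float32)
print(as_f32.dtype)    #R. float32
```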
Manipulating Arrays
NumPy simplifies the execution of mathematical operations on arrays, like addition, subtraction, multiplication, and division. Here are some examples to illustrate this:
# Math operations (addition, subtraction, multiplication, and division)
import numpy as np

My_arrayA = np.array([[1, 1], [1, 1]])
My_arrayB = np.array([[3.0, 4.0], [5.0, 6.0]])

SUM_AB = My_arrayA + My_arrayB
print(SUM_AB)
#R. [[4. 5.]
#R.  [6. 7.]]

My_arrayC = np.array([[0.0, 0.0], [0.0, 0.0]])
SUM = np.add(5, My_arrayB, out=My_arrayC)  # the result is also stored in My_arrayC
print(SUM)
#R. [[ 8.  9.]
#R.  [10. 11.]]

SUBS_AB = My_arrayA - My_arrayB
print(SUBS_AB)
#R. [[-2. -3.]
#R.  [-4. -5.]]

MULT_AB = My_arrayA * My_arrayB  # Element-wise product
print(MULT_AB)
#R. [[3. 4.]
#R.  [5. 6.]]

DIV_AB = My_arrayA / My_arrayB
print(DIV_AB)
#R. [[0.33333333 0.25      ]
#R.  [0.2        0.16666667]]

MULdot_AB = np.dot(My_arrayA, My_arrayB)  # Matrix product
print(MULdot_AB)
#R. [[ 8. 10.]
#R.  [ 8. 10.]]

print(My_arrayA.sum())  # sum of all values in the array
#R. 4
print(My_arrayB.mean(axis=1))  # mean of each row
#R. [3.5 5.5]
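A few related operations are sketched below with the same input values: the @ operator as a shorthand for the matrix product, a sum along one axis, and the transpose.

```python
import numpy as np

A = np.array([[1, 1], [1, 1]])
B = np.array([[3.0, 4.0], [5.0, 6.0]])

product = A @ B  # same result as np.dot(A, B) for 2D arrays
print(product)
#R. [[ 8. 10.]
#R.  [ 8. 10.]]

column_sums = B.sum(axis=0)  # sum down each column
print(column_sums)
#R. [ 8. 10.]

print(B.T)  # transpose: rows become columns
#R. [[3. 5.]
#R.  [4. 6.]]
```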
For more information, refer to the NumPy documentation.
Highlights
Project Background
- Project: NumPy
- Author: Travis Oliphant
- Initial Release: as Numeric (1995), NumPy (2006)
- Type: Numerical Analysis
- License: BSD
- Contains: Numerical computing tools, random number generators, linear algebra routines (the SciPy library is a separate project built on top of NumPy)
- Language: Python, C
- GitHub: numpy/numpy
- Runs On: Windows, macOS, and Linux
- Twitter: /numpy_team
Main Features
- You can reshape an array to a new shape while preserving the existing data.
- It offers a wide range of functionality for working with arrays of homogeneous data, making it an essential tool for data scientists and engineers.
- It allows programmers to perform combined operations through tools like random number generators, linear algebra routines, Fourier transforms, array handling, and more.
- It is open source and runs only on CPUs.
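The first feature in the list above can be sketched with reshape(), which rearranges the same data into a new shape as long as the total number of elements stays the same. The array used here is illustrative.

```python
import numpy as np

flat = np.arange(6)         # [0 1 2 3 4 5]
grid = flat.reshape(2, 3)   # same six values, now 2 rows by 3 columns
print(grid)
#R. [[0 1 2]
#R.  [3 4 5]]

auto = flat.reshape(3, -1)  # -1 lets NumPy infer the remaining dimension
print(auto.shape)           #R. (3, 2)
```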
Differentiators
- Faster than Python lists for numerical operations
- Great for data analysis
- Lower memory usage than lists
- Easy to work with
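The memory claim can be checked with a rough comparison. The sketch below assumes CPython on a 64-bit system, and the list figure is only an estimate (the container plus the integer objects it references).

```python
import sys
import numpy as np

n = 1000
my_list = list(range(n))
my_array = np.arange(n, dtype=np.int64)

# Rough estimate: list container plus each int object it references
list_bytes = sys.getsizeof(my_list) + sum(sys.getsizeof(v) for v in my_list)
array_bytes = my_array.nbytes  # 1000 elements * 8 bytes each

print(array_bytes)               #R. 8000
print(list_bytes > array_bytes)  #R. True
```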
Projects and Libraries
- CuPy
- scipy.sparse
- Dask
- Xarray
- JAX (For more information, check our article)
- astropy.units
Community Benchmarks
- 22,500 Stars
- 7,700 Forks
- 1,433 Code contributors
- 95+ Releases
- Source: NumPy GitHub
Releases
- v1.24.1 (12-2022): Maintenance release that fixes bugs and regressions.
- v1.24.0 (12-2022): Improvements to the handling and promotion of data types, plus performance gains.
- v1.23.5 (11-19-2022): Maintenance release that fixes bugs discovered after the 1.23.4 release and keeps the build infrastructure current.
- v1.23.4 (10-12-2022): Maintenance release that fixes bugs discovered after the 1.23.3 release and keeps the build infrastructure current.
- Source: NumPy releases GitHub
Applications
- Comprehensive mathematical functions, random number generators, linear algebra routines, Fourier transforms.
- Vectorization, indexing, and broadcasting concepts of array computing
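A brief sketch of three of the items listed above: broadcasting, random number generation, and a Fourier transform. The seed and the input values are arbitrary choices made for illustration.

```python
import numpy as np

# Broadcasting: the 1D row is added to every row of the 2D array
table = np.array([[1, 2, 3], [4, 5, 6]])
row = np.array([10, 20, 30])
shifted = table + row
print(shifted)
#R. [[11 22 33]
#R.  [14 25 36]]

# Seeded random generator (the seed 0 is an arbitrary choice)
rng = np.random.default_rng(0)
samples = rng.random(3)  # three floats in the interval [0, 1)
print(samples.shape)     #R. (3,)

# Fourier transform of a constant signal: only the first bin is non-zero
spectrum = np.fft.fft(np.ones(4))
print(np.abs(spectrum))  # approximately 4, 0, 0, 0
```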
References
[1] NumPy documentation, 2023. https://numpy.org/
[2] NumPy, GitHub, 2023. https://github.com/numpy