The multi-dimensional array is a key feature for several AI frameworks. What is an array in programming? Let’s start by reviewing some basic programming terms.
- Arrays: a data structure that is a “list of data”
- Class and Object: class is a template (blueprint) that provides instructions on how to create objects. An object is a “combination of variables, functions or data structures”
- Floats: a number with a decimal
- Functions: a “program that performs a specific task” and returns a value, as opposed to a procedure, which “performs some operation but does not return a value”
- Integers: whole numbers with no decimals. Represented as ‘int’ in software code
- List: values stored in a sequence
- Loop: “code that runs itself repeatedly”
- Multi-dimensional Arrays: “an array with 2 or more dimensions” and “in a matrix, the 2 dimensions are represented by rows and columns”
- String: in a program, it’s usually text enclosed in quotations
- Tensor: “mathematical entity that lives in a structure and interacts with other mathematical entities” that can be represented as a 0-D scalar, a 1-D vector, a 2-D matrix, or a 3-D cube
- Variable: a value stored in a memory address
Online instructor Sam Lavigne explains in a video the purpose of an array. He creates a simple program where two bubbles float from the bottom of the screen to the top. In the program, the two bubbles (objects) are represented as follows:
Bubble b1;
Bubble b2;
If a programmer wants five bubbles in the program, they simply add more lines of code like this:
Bubble b1;
Bubble b2;
Bubble b3;
Bubble b4;
Bubble b5;
What happens if the programmer wants one million bubbles? Declaring each bubble on its own line is not feasible. Instead, the programmer can create an array, which is a “list of data”, in a single line of code.
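To make the idea concrete, here is a minimal Python sketch of replacing hand-written declarations with one collection. The `Bubble` class, its `x` and `y` fields, and the `rise` method are hypothetical stand-ins for the objects in Sam's program, not his actual code, and 1,000 bubbles are used only to keep the demo quick.

```python
from dataclasses import dataclass


@dataclass
class Bubble:
    """Hypothetical stand-in for the bubble object in the video."""
    x: float
    y: float

    def rise(self, step: float = 1.0) -> None:
        # Move the bubble toward the top of the screen (smaller y is higher).
        self.y -= step


# One line creates as many bubbles as needed instead of b1, b2, b3, ...
bubbles = [Bubble(x=float(i), y=400.0) for i in range(1000)]

# A single loop updates every bubble, however many there are.
for bubble in bubbles:
    bubble.rise()

print(len(bubbles), bubbles[0].y)
```

Changing `range(1000)` to `range(1_000_000)` is the only edit needed to scale up, which is the point of storing the objects in one collection.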
Sam also demonstrates how the values in the program are stored in RAM. In the illustration below (a recreation of Sam’s), on the left-hand side, a single integer (int) equal to 5 is stored in one location in RAM called val. On the right-hand side, several integers are stored in RAM as a list named values.
Although arrays are useful in some instances, there are cases when they become cumbersome. Joe James, an online instructor, describes Python's NumPy as a “high-performance multi-dimensional array library” that is used for “numerical analysis”. He states that the three primary benefits of NumPy arrays are that they 1) save coding time, 2) execute faster than standard arrays, and 3) use less memory.
One use case that works best with a NumPy array is when a programmer needs to apply a “mathematical operation to every single element in the array.” The illustration below shows code for performing this operation with a list and with a NumPy array.
Using Python List
for i in range(len(my_list)):
    my_list[i] *= 3
Using NumPy Array
my_array *= 3
As Joe demonstrates in the video, when a mathematical operation is done with a list, the code uses a “for loop to iterate through the list before it can multiply it by a 3 as it goes through each element” (represented by the code at top). With NumPy, only one line of code is required (code at the bottom).
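The two snippets can be run side by side as a small self-contained sketch. The sample values `[1, 2, 3, 4]` are an assumption chosen for illustration; the variable names follow the snippets above.

```python
import numpy as np

# Python list: visit each index and multiply the element in place.
my_list = [1, 2, 3, 4]
for i in range(len(my_list)):
    my_list[i] *= 3

# NumPy array: the multiplication is applied to every element at once.
my_array = np.array([1, 2, 3, 4])
my_array *= 3

print(my_list)
print(my_array.tolist())
```

Note that writing `my_list *= 3` on a plain list would not multiply the elements; it repeats the list three times, which is why the loop is needed there.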
Developer Keith Galli states that NumPy arrays can be used to store data in 1-D arrays, 2-D arrays, 3-D arrays, or as many dimensions as are required. Python lists can be nested to imitate extra dimensions, but they have no built-in notion of shape or multi-dimensional indexing.
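As a rough sketch of what 1-D, 2-D, and 3-D arrays look like in practice, the example below builds one of each and prints its number of dimensions and shape. The variable names and sample values are assumptions for illustration; a nested list is included only for comparison.

```python
import numpy as np

one_d = np.array([1, 2, 3])                   # 1-D: a single row of values
two_d = np.array([[1, 2, 3], [4, 5, 6]])      # 2-D: rows and columns
three_d = np.zeros((2, 3, 4))                 # 3-D: a stack of 2-D grids

for name, arr in [("one_d", one_d), ("two_d", two_d), ("three_d", three_d)]:
    # ndim is the number of dimensions; shape is the size along each one.
    print(name, arr.ndim, arr.shape)

# A nested list holds the same values as two_d, but each row is a separate
# list, so elements are reached by chaining indexes rather than with one
# multi-dimensional index.
nested = [[1, 2, 3], [4, 5, 6]]
print(nested[1][2], two_d[1, 2])
```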
The reason a NumPy array is faster is that it stores its data contiguously (illustrated below), whereas a list stores references to objects that can sit in different sections of RAM. Thus, a list takes longer to store and retrieve data.
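The layout difference can be inspected from Python. The sketch below assumes 1,000 integers and a 64-bit integer dtype, both chosen for illustration. The list figure is only a rough estimate: it adds the size of the list's reference table to the size of each integer object it points to.

```python
import sys
import numpy as np

n = 1000
my_list = list(range(n))
my_array = np.arange(n, dtype=np.int64)

# C_CONTIGUOUS is True when the elements sit in one unbroken block of memory.
print(my_array.flags["C_CONTIGUOUS"])

# Every element takes the same number of bytes, so the data size is n * itemsize.
print(my_array.itemsize, my_array.nbytes)

# Rough estimate for the list: the reference table plus each separate int object.
list_bytes = sys.getsizeof(my_list) + sum(sys.getsizeof(x) for x in my_list)
print(list_bytes)
```

Exact byte counts for the list vary by Python version and platform, so treat the comparison as indicative rather than a benchmark.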