# NumPy: Zero to Hero

GitHub: https://github.com/hasan-firat-data-and-business-analyst/Data-Science-Python/blob/main/NumPy.py

# What is NumPy?

NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

# Data Structures

The main data structure in NumPy is the **ndarray**, which is a shorthand name for N-dimensional array. When working with NumPy, data in an ndarray is simply referred to as an array. It is a fixed-sized array in memory that contains data of the same type, such as integers or floating point values.

# Data Types in Python

By default, Python has these data types:

- string: used to represent text, written inside quotation marks, e.g. "ABCD"
- integer: used to represent whole numbers, e.g. -1, -2, -3
- float: used to represent real numbers, e.g. 1.2, 42.42
- boolean: used to represent True or False
- complex: used to represent complex numbers, e.g. 1.0 + 2.0j, 1.5 + 2.5j

# Data Types in NumPy

NumPy has some extra data types, and refers to data types with one character, like `i` for integers, `u` for unsigned integers, etc.

Below is a list of all data types in NumPy and the characters used to represent them.

- i — integer
- b — boolean
- u — unsigned integer
- f — float
- c — complex float
- m — timedelta
- M — datetime
- O — object
- S — string
- U — unicode string
- V — fixed chunk of memory for other type ( void )

The most common way to work with numbers in NumPy is through ndarray objects. They are similar to Python lists, but can have any number of dimensions. Also, `ndarray` supports fast math operations.

Since it can have any number of dimensions, an `ndarray` can represent scalars, vectors, matrices, or tensors. (More information is given in the "Array's Dimension" section.)

# Installing NumPy

From the command line:

`pip install numpy`

In your Python editor or notebook:

`import numpy as np`

Note: `as` is the keyword that introduces an alias, and `np` is the alias itself. You can use any valid name after `as`, for example `import numpy as datascience` or `import numpy as material_data_science`, although `np` is the common convention.

# Visualising the form of an array

# Creating the arrays

The array object in NumPy is called `ndarray`.

We can create a NumPy `ndarray` object by using the `array()` function.
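As a minimal sketch of this step (the variable name `array` and the values are assumptions chosen to match the text, not necessarily the author's original example):

```python
import numpy as np

# Build a 1-D ndarray from a Python list.
array = np.array([1, 2, 3, 4])

print(array)        # the array's contents
print(type(array))  # the class of the object: numpy.ndarray
```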

**type():** `type()` is a built-in Python function that returns the type of an object. In the above example, the variable named `array` is of type `numpy.ndarray`.

# The second way to create the arrays

What is the difference between the first and the second example?

The resulting arrays do not differ. In the first example we used the `print()` function, as in `print(array)`, and got `[1 2 3 4]`. In the second example we did not use `print()`; the notebook displayed the value of the expression itself, and we got `array([11, 22, 33, 44, 55])`. The same applies to `type()` and `print(type())`. Swapping the calls in both examples makes the difference easy to see.
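The two displays correspond to Python's `str()` and `repr()` of the same object. A small sketch, assuming the second array holds the values quoted in the text (the variable names are illustrative):

```python
import numpy as np

second = np.array([11, 22, 33, 44, 55])

as_printed = str(second)     # what print(second) writes
as_displayed = repr(second)  # what a notebook or REPL echoes

print(as_printed)    # [11 22 33 44 55]
print(as_displayed)  # array([11, 22, 33, 44, 55])
```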

# Array’s Dimension

A dimension in an array is one level of array depth, also known as nesting. A nested array is an array that has arrays as its elements.

# 0-D Arrays or Scalars

A 0-D array, or scalar, is a single element of an array.

# 1-D Arrays or Uni-Dimensional

An array that has 0-D arrays as its elements is called a 1-D array.

# 2-D Arrays

An array that has 1-D arrays as its elements is called a 2-D array. This is common in matrix calculations.

Do not forget! Each inner array must contain the same number of elements. In the above example, each array has 4 elements.

# 3-D Arrays

An array that has 2-D arrays as its elements is called a 3-D array.

# The second example

As we can see, arrays do not have to contain only numbers.

# How to find the number of dimensions?
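NumPy arrays expose the `ndim` attribute, which reports the number of dimensions. A minimal sketch with illustrative values (the arrays below are assumptions, not the author's originals):

```python
import numpy as np

a = np.array(42)                              # 0-D (scalar)
b = np.array([1, 2, 3, 4, 5])                 # 1-D
c = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])    # 2-D
d = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]]])              # 3-D

dims = [a.ndim, b.ndim, c.ndim, d.ndim]
print(dims)  # [0, 1, 2, 3]
```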

# Shape Attribute in NumPy

The `shape` attribute returns a tuple with the number of elements in each dimension:

- A 0-D array has no rows or columns, so its shape is `()`.
- A 1-D array with one row of 5 elements has shape `(5,)`.
- A 2-D array with two rows of 4 elements each has shape `(2, 4)`.
- A 3-D array with 3 blocks, each holding 2 rows of 4 elements, has shape `(3, 2, 4)`.
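The shapes described above can be reproduced with a short sketch. The array contents are assumptions; only the shapes are taken from the text:

```python
import numpy as np

zero_d = np.array(7)
one_d = np.array([1, 2, 3, 4, 5])
two_d = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
three_d = np.zeros((3, 2, 4))  # 3 blocks, 2 rows each, 4 elements per row

shapes = [zero_d.shape, one_d.shape, two_d.shape, three_d.shape]
print(shapes)  # [(), (5,), (2, 4), (3, 2, 4)]
```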

# More information on shape

In the second example above, the shape tuple holds the value 4 at index 4, so the 5th (4 + 1) dimension has 4 elements.

# Reshape

Reshaping means changing the shape of an array. The shape of an array is the number of elements in each dimension. By reshaping we can add or remove dimensions, or change the number of elements in each dimension.

# 1-D to 2-D
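A minimal sketch, assuming a 12-element array (the values are illustrative). The product of the new dimensions must equal the number of elements:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

# 12 elements rearranged into 4 rows of 3 elements.
two_d = arr.reshape(4, 3)
print(two_d.shape)  # (4, 3)
```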

# 1-D to 3-D
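The same idea extends to three dimensions. A sketch with an assumed 12-element array:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

# 2 blocks, each with 3 rows of 2 elements (2 * 3 * 2 = 12).
three_d = arr.reshape(2, 3, 2)
print(three_d.shape)  # (2, 3, 2)
```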

# Size function in NumPy

`size` gives the total number of values in an array. It is available as the attribute `arr.size` and as the function `np.size(arr)`.
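A short sketch with an illustrative 2-D array:

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

total_attr = arr.size     # attribute form
total_func = np.size(arr)  # function form
print(total_attr, total_func)  # 8 8
```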

# N — Dimensional Arrays

An array can have any number of dimensions. When you create an array, you can set the minimum number of dimensions with the `ndmin` argument of `array()`, and read the resulting number of dimensions from `ndim`.
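A minimal sketch, assuming a 4-element vector promoted to five dimensions (values chosen to match the description in the text):

```python
import numpy as np

arr = np.array([1, 2, 3, 4], ndmin=5)

print(arr)        # [[[[[1 2 3 4]]]]]
print(arr.ndim)   # 5
print(arr.shape)  # (1, 1, 1, 1, 4)
```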

In this array the innermost dimension (5th dim) has 4 elements, the 4th dim has 1 element that is the vector, the 3rd dim has 1 element that is the matrix containing the vector, the 2nd dim has 1 element that is a 3-D array, and the 1st dim has 1 element that is a 4-D array.

# Indexing Arrays

# Access Array Elements

Array indexing is the same as accessing an array element. You can access an array element by referring to its index number. The indexes in NumPy arrays start with 0, meaning that the first element has index 0, and the second has index 1 etc.
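A small sketch of zero-based indexing on a 1-D array (illustrative values):

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

first = arr[0]            # index 0 is the first element
second = arr[1]           # index 1 is the second element
total = arr[2] + arr[3]   # numeric elements can be added
print(first, second, total)  # 1 2 7
```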

# Mathematical Operations

Arithmetic on indexed elements only works when the elements are numeric (int, float, or Boolean). When they are not, for example when the array holds strings, mathematical operations fail. Addition may still appear to work in the table below, but that is not mathematical addition; it is concatenation, joining the values together. Checking the examples below carefully makes this clear.

# 2-D Arrays
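To access an element of a 2-D array, pass two comma-separated integers: the row index and the position within the row. A sketch with assumed values:

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

second_in_first_row = arr[0, 1]   # row 0, column 1
fifth_in_second_row = arr[1, 4]   # row 1, column 4
print(second_in_first_row, fifth_in_second_row)  # 2 10
```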

# 3-D Arrays
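For a 3-D array, three comma-separated integers select the block, the row, and the element. A sketch with assumed values:

```python
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]],
                [[7, 8, 9], [10, 11, 12]]])

# Block 0 -> row 1 -> element 2.
value = arr[0, 1, 2]
print(value)  # 6
```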

# Negative Indexing
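Negative indexes count from the end of a dimension, so `-1` refers to the last element. A sketch with illustrative arrays:

```python
import numpy as np

one_d = np.array([1, 2, 3, 4])
two_d = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

last = one_d[-1]                    # last element of the 1-D array
last_of_second_row = two_d[1, -1]   # last element of row 1
print(last, last_of_second_row)     # 4 10
```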

# Slicing Arrays

Slicing in Python means taking elements from one given index to another given index.

We pass a slice instead of an index, like this: `[start:end]`.

We can also define the step, like this: `[start:end:step]`.

If we don't pass start, it is considered 0.

If we don't pass end, it is considered the length of the array in that dimension.

If we don't pass step, it is considered 1.
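The defaults above can be checked with a short sketch (the array values are assumptions). Note that the end index itself is excluded from the result:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

middle = arr[1:5]    # indexes 1, 2, 3, 4
to_end = arr[4:]     # end omitted: runs to the end of the array
from_start = arr[:4]  # start omitted: begins at index 0

print(middle, to_end, from_start)
```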

# Negative Slicing

# STEP

Use the step value to determine the step of the slicing.

Table 1. Return every other element from index 1 to index 5. In the first example, the step is 2 between index 1 and index 5; in the second, the step is 3 between index 1 and index 8.
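A sketch of both step values, assuming a nine-element array (the values are illustrative, not the contents of the original table):

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

step_two = arr[1:5:2]     # indexes 1 and 3
step_three = arr[1:8:3]   # indexes 1, 4 and 7
every_other = arr[::2]    # whole array, every second element

print(step_two, step_three, every_other)
```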

# Slicing multi-dimension

# Checking the Data Type of an Array

# Creating Arrays With a Defined Data Type

In the example below, the conversion fails because numbers and non-numeric strings cannot be combined in an array with a numeric data type.

The last example demonstrates changing the data type, for example from float to integer or from Boolean to float; other conversions work the same way.
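A minimal sketch of checking, setting, and converting data types. The values and variable names are assumptions; `dtype` and `astype()` are the relevant NumPy features:

```python
import numpy as np

floats = np.array([1.1, 2.1, 3.1])
print(floats.dtype)  # float64

as_float32 = np.array([1, 2, 3, 4], dtype='f')  # set the type at creation
ints = floats.astype('i')                       # float -> integer, fraction dropped
flags = np.array([1, 0, 3]).astype(bool)        # integer -> boolean

# A non-numeric string cannot be converted to an integer.
try:
    np.array(['a', '2', '3'], dtype='i')
    conversion_failed = False
except ValueError:
    conversion_failed = True

print(ints, flags, conversion_failed)
```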

# Copy and View

The main difference between a copy and a view of an array is that the copy is a new array, and the view is just a view of the original array.

# Copy

The copy *owns* the data. Changes made to the copy will not affect the original array, and changes made to the original array will not affect the copy.

# View

The view *does not own* the data and any changes made to the view will affect the original array, and any changes made to the original array will affect the view.
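Both behaviours can be seen side by side in a short sketch (illustrative values):

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
copied = arr.copy()   # independent data
viewed = arr.view()   # shares data with arr

arr[0] = 42           # change the original
viewed[1] = 31        # change the view

print(arr)     # [42 31  3  4  5]
print(copied)  # [1 2 3 4 5]
print(viewed)  # [42 31  3  4  5]
```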

# Status of data in arrays

Every NumPy array has the attribute `base`, which returns `None` if the array owns the data. Otherwise, the `base` attribute refers to the original object.

The copy returns `None`. The view returns the original array.
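A minimal sketch of checking ownership through `base` (illustrative values):

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
copied = arr.copy()
viewed = arr.view()

copy_owns_data = copied.base is None   # the copy owns its data
view_points_to_arr = viewed.base is arr  # the view refers back to arr

print(copy_owns_data, view_points_to_arr)  # True True
```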

# Joining

We pass a sequence of arrays that we want to join to the `concatenate()` function, along with the axis. If the axis is not explicitly passed, it is taken as 0.
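A sketch of joining along the default axis and along axis 1 (the arrays are assumptions chosen for illustration):

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
joined = np.concatenate((a, b))  # axis defaults to 0

m = np.array([[1, 2], [3, 4]])
n = np.array([[5, 6], [7, 8]])
joined_rows = np.concatenate((m, n), axis=1)  # extend each row

print(joined)
print(joined_rows)
```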

# Joining with the `stack()` function

Stacking is the same as concatenation; the only difference is that stacking is done along a new axis.
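A short sketch showing that `stack()` adds a dimension, using two assumed 1-D arrays:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

stacked_rows = np.stack((a, b), axis=0)  # each input becomes a row
stacked_cols = np.stack((a, b), axis=1)  # each input becomes a column

print(stacked_rows.shape, stacked_cols.shape)  # (2, 3) (3, 2)
```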

# Splitting

Splitting is the reverse operation of joining. We use `array_split()` for splitting arrays.
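A minimal sketch with an assumed six-element array. `array_split()` returns a list of arrays and adjusts the sizes when the array cannot be divided evenly:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

three_parts = np.array_split(arr, 3)  # even split
four_parts = np.array_split(arr, 4)   # uneven split: earlier parts are larger

print([p.tolist() for p in three_parts])  # [[1, 2], [3, 4], [5, 6]]
print([p.tolist() for p in four_parts])   # [[1, 2], [3, 4], [5], [6]]
```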

# Searching

You can search an array for a certain value and return the indexes where it matches. To search an array, use the `where()` function.
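A sketch of searching for a value, with illustrative data. `np.where()` returns a tuple holding one index array per dimension:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 4, 4])

matches = np.where(arr == 4)
print(matches[0])  # [3 5 6]
```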

# Search Sorted and Search from right side

There is a method called `searchsorted()` which performs a binary search in a sorted array and returns the index where the specified value would be inserted to maintain the sort order.

Example explained: The number 7 should be inserted at index 1 to preserve the sort order.

The method starts the search from the left and returns the first index where the number 7 is no longer larger than the next value.

By default the leftmost index is returned, but we can pass side='right' to return the rightmost index instead.

As you can see, with side='right' the search starts from the right side.
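A sketch of both search directions, assuming the sorted array `[6, 7, 8, 9]` (chosen so that 7 lands at index 1, as in the explanation above):

```python
import numpy as np

arr = np.array([6, 7, 8, 9])  # must already be sorted

from_left = np.searchsorted(arr, 7)                 # before the existing 7
from_right = np.searchsorted(arr, 7, side='right')  # after the existing 7

print(from_left, from_right)  # 1 2
```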

# Find

The `numpy.char.find()` function finds a substring in each element of a given array of strings, within the provided range **[start, end]**, and returns the **first index** at which the substring starts, or -1 if it is not found.

This function calls `str.find` internally in an element-wise manner.
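A minimal sketch using `np.char.find()` on an assumed array of strings; the words are illustrative:

```python
import numpy as np

words = np.array(['numpy array', 'python', 'array'])

anywhere = np.char.find(words, 'array')        # search the whole string
from_index_1 = np.char.find(words, 'array', 1)  # start searching at index 1

print(anywhere)      # [ 6 -1  0]
print(from_index_1)  # [ 6 -1 -1]
```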

# Sort

Sorting means putting elements in an *ordered sequence*.

*Ordered sequence* is any sequence that has an order corresponding to elements, like numeric or alphabetical, ascending or descending.

NumPy provides the function `np.sort()`, which returns a sorted copy of the specified array. The ndarray object also has a method called `sort()`, which sorts the array in place.
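A sketch of the two ways to sort, with illustrative values:

```python
import numpy as np

arr = np.array([3, 2, 0, 1])
sorted_copy = np.sort(arr)   # returns a new array; arr is unchanged
unchanged = arr.tolist()

fruits = np.array(['banana', 'cherry', 'apple'])
sorted_fruits = np.sort(fruits)  # alphabetical order

arr.sort()                   # sorts arr itself, returns None
print(sorted_copy, sorted_fruits, arr)
```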

# Filtering

Getting some elements out of an existing array and creating a new array out of them is called *filtering*.

In NumPy, you filter an array using a *boolean index list*.

A *boolean index list* is a list of booleans corresponding to indexes in the array.

If the value at an index is `True`, that element is included in the filtered array; if the value at that index is `False`, that element is excluded from the filtered array.

The examples above return [11, 22, 77] and [11, 33, 77, 'Toronto']. Why?

Because the new array contains only the values where the filter array had the value True: in the first case indexes 0, 2, and 4, and in the second indexes 0, 2, 4, and 5.

# Creating filter with a condition
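Instead of writing the boolean list by hand, a comparison on the array produces it directly. A sketch with assumed values (not the author's original data):

```python
import numpy as np

arr = np.array([11, 22, 33, 44, 77])

# Hand-written boolean index list.
manual_mask = [True, False, True, False, True]
picked = arr[manual_mask]

# The same kind of mask built from a condition.
condition_mask = arr > 30
filtered = arr[condition_mask]

print(picked)    # [11 33 77]
print(filtered)  # [33 44 77]
```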

# Zeros

# Ones
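`np.zeros()` and `np.ones()` create arrays of a given shape filled with 0 or 1. A minimal sketch covering both; the shapes are illustrative:

```python
import numpy as np

zeros_2d = np.zeros((2, 3))        # 2 rows, 3 columns, float64 by default
ones_1d = np.ones(4, dtype=int)    # 4 elements, integer type requested

print(zeros_2d)
print(ones_1d)  # [1 1 1 1]
```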

# Arange

`numpy.arange([start, ]stop, [step, ]dtype=None, *, like=None)`

Return evenly spaced values within a given interval.

Values are generated within the half-open interval [start, stop) (in other words, the interval including *start* but excluding *stop*). For integer arguments the function is equivalent to the Python built-in *range* function, but returns an ndarray rather than a list.
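A sketch of the half-open interval, with illustrative arguments:

```python
import numpy as np

up_to_five = np.arange(5)         # start defaults to 0, stop is excluded
stepped = np.arange(2, 10, 3)     # 2, 5, 8; the next value (11) is past stop
same_as_range = list(range(2, 10, 3))

print(up_to_five)  # [0 1 2 3 4]
print(stepped)     # [2 5 8]
```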

# Linspace

`numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)`

Return evenly spaced numbers over a specified interval.

Returns *num* evenly spaced samples, calculated over the interval [*start*, *stop*].

The endpoint of the interval can optionally be excluded.

Changed in version 1.16.0: Non-scalar *start* and *stop* are now supported.

Changed in version 1.20.0: Values are rounded towards -inf instead of 0 when an integer dtype is specified. The old behavior can still be obtained with np.linspace(start, stop, num).astype(int)

**start** : the starting value of the interval.

**stop** : the end value of the interval.

**num** : int, optional

Number of samples to generate. Default is 50. Must be non-negative.

**endpoint** : bool, optional

If True, *stop* is the last sample. Otherwise, it is not included. Default is True.

**retstep** : bool, optional

If True, return (*samples*, *step*), where *step* is the spacing between samples. Default is False.

**dtype** : the type of the output array.
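The parameters above can be tried in a short sketch; the interval and sample count are illustrative:

```python
import numpy as np

samples = np.linspace(0, 1, num=5)                    # stop is included
open_ended = np.linspace(0, 1, num=5, endpoint=False)  # stop is excluded
values, step = np.linspace(0, 1, num=5, retstep=True)  # also return the spacing

print(samples)     # [0.   0.25 0.5  0.75 1.  ]
print(open_ended)
print(step)        # 0.25
```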

# What is the difference between Arange and Linspace

`arange` is similar to `linspace`, but uses a step size (instead of the number of samples).

`arange()` returns values within a range, separated by a fixed step.

`linspace()` returns a set number of samples within a given interval.

In sum, `arange` lets you define the size of the step; `linspace` lets you define the number of samples.

# Random

The Generator provides access to a wide range of distributions, and serves as a replacement for RandomState. The main difference between the two is that Generator relies on an additional BitGenerator to manage state and generate the random bits, which are then transformed into random values from useful distributions. The default BitGenerator used by Generator is PCG64. The BitGenerator can be changed by passing an instantiated BitGenerator to Generator.

`numpy.random.default_rng()`
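A minimal sketch of creating a Generator and swapping its BitGenerator. The seed value 42 is an arbitrary assumption used only to make runs repeatable:

```python
import numpy as np

rng = np.random.default_rng(seed=42)
default_bits = type(rng.bit_generator).__name__  # PCG64 by default

floats = rng.random(3)                 # 3 floats in [0.0, 1.0)
ints = rng.integers(0, 10, size=5)     # 5 integers in [0, 10)

# Pass an instantiated BitGenerator to use a different one.
other_rng = np.random.Generator(np.random.MT19937(42))
other_bits = type(other_rng.bit_generator).__name__

print(default_bits, other_bits)
```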

# Random Sample

`numpy.random.rand(d0, d1, ..., dn)`: random values in a given shape.
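A sketch of the call, with an illustrative shape. Each argument is the length of one dimension, and the values are drawn uniformly from [0, 1):

```python
import numpy as np

sample = np.random.rand(2, 3)  # 2 rows, 3 columns

print(sample.shape)  # (2, 3)
```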

# Generate Random Floats

The methods in this and the following four sections are described as they behave in Python's built-in `random` module. `numpy.random` has functions with most of these names, but it has no `randrange()`, and `numpy.random.randint()` excludes its upper bound.

The `random.random()` method returns a random float in the range 0.0 (inclusive) to 1.0 (exclusive). It takes no arguments.

# Generate Random Integers

The `random.randint()` method returns a random integer between the two specified integers, both included.

# Generate Random Numbers within Range

The `random.randrange()` method returns a randomly selected element from the range created by the start, stop, and step arguments. The value of start is 0 by default. Similarly, the value of step is 1 by default.

# Select Random Elements

The `random.choice()` method returns a randomly selected element from a non-empty sequence. An empty sequence as argument raises an IndexError.

# Shuffle Elements Randomly

The `random.shuffle()` method randomly reorders the elements of a list in place.
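The five methods can be sketched together using Python's standard `random` module. The seed, ranges, and list contents are assumptions for illustration:

```python
import random

random.seed(0)  # arbitrary seed, only to make runs repeatable

x = random.random()              # float in [0.0, 1.0)
n = random.randint(1, 10)        # integer from 1 to 10, both included
r = random.randrange(0, 10, 2)   # one of 0, 2, 4, 6, 8
letters = ['a', 'b', 'c']
item = random.choice(letters)    # one element of the list

deck = [1, 2, 3, 4, 5]
random.shuffle(deck)             # reorders deck in place

print(x, n, r, item, deck)
```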

# Full

**numpy.full(shape, fill_value, dtype=None, order='C')** : Return a new array of the given shape and type, filled with fill_value.

**Parameters :**

**shape :** [int or sequence of ints] Shape of the new array, e.g. (2, 3) or 2.

**fill_value :** [scalar or array_like] Value to fill the array with.

**dtype :** [optional] Data type of the returned array. By default it is inferred from fill_value.

**order :** ['C' or 'F', optional] Whether to store the data in C-contiguous (row-major) or Fortran-contiguous (column-major) order.
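A short sketch of `np.full()` with illustrative shapes and fill values:

```python
import numpy as np

sevens = np.full((2, 3), 7)   # 2 rows, 3 columns, every entry 7
halves = np.full(3, 0.5)      # 1-D array with 3 entries of 0.5

print(sevens)
print(halves)  # [0.5 0.5 0.5]
```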

# Empty

Return a new array of given shape and type, without initializing entries.

empty, unlike zeros, does not set the array values to zero, and may therefore be marginally faster. On the other hand, it requires the user to manually set all the values in the array, and should be used with caution.
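A minimal sketch of the usual pattern: allocate with `np.empty()`, then assign every value before reading any. The shape and fill value are assumptions:

```python
import numpy as np

buffer = np.empty((2, 3))  # contents are arbitrary at this point

# Set all values explicitly before using the array.
buffer.fill(1.5)

print(buffer)
```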

# What is difference between Array and List

**Array can store elements of only one data type but List can store the elements of different data types too**. Hence, Array stores homogeneous data values, and the list can store heterogeneous data values
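A sketch of the difference, with illustrative values. When a NumPy array is given mixed input, it converts everything to one common type:

```python
import numpy as np

mixed_list = [1, 'two', 3.0]   # a list keeps each element's own type
list_types = [type(v).__name__ for v in mixed_list]

numbers = np.array([1, 2.5, 3])   # integers are promoted to float
texts = np.array([1, 'two'])      # everything becomes a string

print(list_types)     # ['int', 'str', 'float']
print(numbers.dtype)  # float64
print(texts.dtype.kind)  # U (unicode string)
```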

# What is the difference between ndarray and array in numpy?

1- using the `array()`, `zeros()` or `empty()` methods: *Arrays should be constructed using array, zeros or empty (refer to the See Also section below). The parameters given here refer to a low-level method (**ndarray(…)**) for instantiating an array.*

2- from the `ndarray` class directly: *There are two modes of creating an array using **__new__**: If buffer is None, then only shape, dtype, and order are used. If buffer is an object exposing the buffer interface, then all keywords are interpreted.*
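Both routes produce objects of the same class. A minimal sketch, with assumed shapes and values, showing `array()` next to the low-level `ndarray(...)` constructor in its two modes:

```python
import numpy as np

# Recommended route: the array() function returns an ndarray.
a = np.array([1, 2, 3])
is_ndarray = type(a) is np.ndarray

# Low-level route without a buffer: only shape, dtype and order are used,
# and the contents are uninitialised.
raw = np.ndarray(shape=(2, 2), dtype=float)

# Low-level route with a buffer: the new array reads the buffer's memory.
source = np.array([10, 20, 30], dtype=np.int64)
wrapped = np.ndarray(shape=(3,), dtype=np.int64, buffer=source)

print(is_ndarray, raw.shape, wrapped)
```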