Numpy - multidimensional data arrays

Based on lectures from

The numpy package (module) is used in almost all numerical computation using Python. It provides high-performance vector, matrix and higher-dimensional data structures for Python. It is implemented in C and Fortran, so when calculations are vectorized (formulated with vectors and matrices), performance is very good.

Creating numpy arrays

There are a number of ways to initialize new numpy arrays, for example from

Random number generators
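As a minimal sketch of creating arrays from random numbers, the example below uses the `numpy.random.default_rng` generator interface. The seed value and array shapes are arbitrary choices for illustration; the older `numpy.random.rand` style functions would work similarly.

```python
import numpy as np

# Seeded generator so the example is reproducible (the seed is arbitrary).
rng = np.random.default_rng(0)

# 3x3 array of uniform random numbers in the interval [0, 1)
u = rng.random((3, 3))

# 3x3 array of standard normal (mean 0, variance 1) random numbers
n = rng.standard_normal((3, 3))

# 5 random integers from 0 to 9 (the upper bound 10 is exclusive)
k = rng.integers(0, 10, size=5)

print(u.shape, n.shape, k.shape)
```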

File I/O

A very common file format is comma-separated values (CSV) or tab-separated values (TSV). To read data from such files into Numpy arrays we can use numpy.loadtxt or numpy.genfromtxt.
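The following sketch shows both readers on a few lines of made-up CSV text. To keep it self-contained, the text is wrapped in `io.StringIO` instead of being read from disk; a file name could be passed in the same place. The values are hypothetical.

```python
import io
import numpy as np

# Hypothetical CSV content standing in for a file on disk.
csv_text = "1800,1,1,-6.1\n1800,1,2,-15.4\n1800,1,3,-15.0\n"

# loadtxt accepts a file name or a file-like object.
data = np.loadtxt(io.StringIO(csv_text), delimiter=",")

# genfromtxt works similarly but tolerates missing entries,
# which are returned as nan for floating-point data.
csv_missing = "1800,1,2,-15.4\n1800,1,,-15.0\n"
data2 = np.genfromtxt(io.StringIO(csv_missing), delimiter=",")

print(data.shape)
print(data2)
```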

The file stockholm_td_adj.dat.txt contains daily temperatures in Stockholm over many years. The columns are [$year$,$month$,$day$,$T_{average}$,$T_{min}$,$T_{max}$]

Let's save the data in the form [t,$T_{average}$]
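One way to do this is sketched below with `numpy.savetxt`. The data are synthetic, the output file name is hypothetical, and the fractional-year definition of t is an assumption made for illustration (the notes do not specify how t is built).

```python
import os
import tempfile
import numpy as np

# Synthetic stand-in for the temperature table:
# columns are year, month, day, T_average.
data = np.array([[1800, 1, 1, -6.1],
                 [1800, 1, 2, -15.4],
                 [1800, 1, 3, -15.0]])

# Assumed (approximate) definition of time in fractional years.
t = data[:, 0] + (data[:, 1] - 1) / 12.0 + (data[:, 2] - 1) / 365.0

# Put t and T_average side by side: one row per day, two columns.
out = np.column_stack((t, data[:, 3]))

# Hypothetical output file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "T_avg.dat")
np.savetxt(path, out, fmt="%.6f")

# Read the file back to check the round trip.
back = np.loadtxt(path)
print(back)
```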

More efficient binary storage of data on disk
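A minimal sketch of binary storage uses `numpy.save` and `numpy.load`, which write and read numpy's `.npy` format. The file name and array contents are arbitrary.

```python
import os
import tempfile
import numpy as np

a = np.arange(12, dtype=float).reshape(3, 4)

# Hypothetical file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "a.npy")

np.save(path, a)     # binary .npy file, keeps dtype and shape
b = np.load(path)    # exact copy of the original array

print(b.dtype, b.shape)
```

Unlike text output, the binary format preserves the values exactly, with no rounding from number formatting.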

Manipulating data

Indexing and slicing

data[lower:upper:step, lower:upper:step]
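The slice syntax above can be tried on a small array. The 5x5 array here is made up so that entry (i, j) equals 10*i + j, which makes the selected elements easy to recognize.

```python
import numpy as np

# 5x5 array whose entry (i, j) is 10*i + j
data = np.array([[10 * i + j for j in range(5)] for i in range(5)])

block = data[1:4, 1:4]        # rows 1..3 and columns 1..3
every_other = data[::2, ::2]  # rows 0, 2, 4 and columns 0, 2, 4
first_col = data[:, 0]        # all rows, column 0

print(block)
print(every_other)
print(first_col)
```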

Fancy indexing: the index is itself an array of integers that specifies which rows or columns to pick.
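A short sketch of fancy indexing on a made-up 5x5 array, whose entry (i, j) equals 10*i + j:

```python
import numpy as np

# 5x5 array whose entry (i, j) is 10*i + j
data = np.array([[10 * i + j for j in range(5)] for i in range(5)])

picked_rows = data[[1, 3]]     # rows 1 and 3, all columns
picked_cols = data[:, [0, 4]]  # all rows, columns 0 and 4

# Two index lists are paired element by element:
# this selects entries (1, 0) and (3, 4).
pairs = data[[1, 3], [0, 4]]

print(picked_rows)
print(picked_cols)
print(pairs)
```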

Using a mask to pick data

Create a Boolean mask of [True, ..., False, ...] values and pick from the array only the rows or columns where the mask is True.
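A minimal sketch of masking, using small made-up arrays:

```python
import numpy as np

x = np.arange(10)

# Element-wise comparisons produce Boolean arrays;
# combine them with & (and) or | (or).
mask = (x > 2) & (x < 7)
selected = x[mask]

# The same idea selects whole rows of a 2D array.
data = np.arange(12).reshape(4, 3)
row_mask = data[:, 0] >= 6     # True for rows whose first entry is >= 6
rows = data[row_mask]

print(selected)
print(rows)
```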

How do we compute the average temperature in the year 1973?
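One possible approach is to mask on the year column and average the temperature column. The sketch below uses a tiny synthetic table (columns year, month, day, T_average) instead of the real data file, so the numbers are illustrative only.

```python
import numpy as np

# Synthetic stand-in: columns are year, month, day, T_average.
data = np.array([
    [1972, 12, 31, -1.0],
    [1973,  1,  1,  2.0],
    [1973,  1,  2,  4.0],
    [1974,  1,  1, 10.0],
])

mask = data[:, 0] == 1973          # True for rows from 1973
mean_1973 = data[mask, 3].mean()   # average of T_average over those rows

print(mean_1973)
```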

What are the mean temperatures for each month of the year?

Let's do February first

Now loop over all months
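The two steps can be sketched as follows, again on a small synthetic table (columns year, month, day, T_average). Only the months present in the toy data are looped over; with a full data set one would loop over months 1 to 12.

```python
import numpy as np

# Synthetic stand-in: columns are year, month, day, T_average.
data = np.array([
    [1800, 1, 1, -6.0],
    [1800, 2, 1, -4.0],
    [1801, 1, 1, -2.0],
    [1801, 2, 1,  0.0],
    [1801, 3, 1,  3.0],
])

# February only: mask on the month column, then average T_average.
feb_mean = data[data[:, 1] == 2, 3].mean()

# All months present in the data.
months = np.unique(data[:, 1]).astype(int)
monthly_mean = np.array([data[data[:, 1] == m, 3].mean() for m in months])

print(feb_mean)
print(months, monthly_mean)
```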

Linear Algebra

Linear algebra in numpy is implemented in low-level Fortran/C code and is much more efficient than equivalent code written in pure Python.

A matrix product or matrix-vector product can be computed with the dot function.
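A brief sketch with a made-up 2x2 matrix and vector. It also shows that `*` on arrays is element-wise, and that the `@` operator gives the same result as `dot` for these 2D and 1D inputs.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
v = np.array([1.0, 1.0])

Av = np.dot(A, v)   # matrix-vector product
AA = np.dot(A, A)   # matrix-matrix product
elementwise = A * A # NOT a matrix product: multiplies entry by entry

# The @ operator is equivalent to dot here.
same = np.allclose(A @ A, AA)

print(Av)
print(AA)
print(elementwise)
```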

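Slightly less efficient but nicer-looking code can be obtained with the matrix class. Note that the current NumPy documentation no longer recommends this class; ordinary arrays with the @ operator give similarly readable code.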

Array/Matrix transformations
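The notes do not list which transformations are meant here. Assuming they include the transpose and complex conjugation, a minimal sketch on a made-up complex matrix is:

```python
import numpy as np

C = np.array([[1j, 2j],
              [3j, 4j]])

Ct = C.T            # transpose
Cc = C.conj()       # complex conjugate, element by element
Ch = C.conj().T     # Hermitian conjugate (conjugate transpose)

re = C.real         # real parts
im = C.imag         # imaginary parts
mag = np.abs(C)     # absolute values

print(Ct)
print(Ch)
print(mag)
```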

More advanced linear algebra operations

Library linalg:
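As a small illustration of `numpy.linalg`, the sketch below computes a determinant, an inverse, and the solution of a linear system. The matrix and right-hand side are made up; the choice of these three functions is an assumption about what this section covers.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([5.0, 11.0])

det = np.linalg.det(A)       # determinant
Ainv = np.linalg.inv(A)      # matrix inverse
x = np.linalg.solve(A, b)    # solves A x = b without forming the inverse

print(det)
print(Ainv)
print(x)
```

For solving linear systems, `solve` is generally preferable to multiplying by the explicit inverse.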

The eigenvalue problem for a matrix $A$:

$\displaystyle A v_n = \lambda_n v_n$

where $v_n$ is the $n$th eigenvector and $\lambda_n$ is the $n$th eigenvalue.

To calculate the eigenvalues of a matrix, use eigvals (or eigvalsh for symmetric/Hermitian matrices). To calculate both eigenvalues and eigenvectors, use eig (or eigh).
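A minimal sketch for a made-up symmetric 2x2 matrix. It also checks the defining relation $A v_n = \lambda_n v_n$ numerically; the eigenvectors are the columns of the returned matrix.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Symmetric matrix: eigvalsh returns real eigenvalues in ascending order.
w = np.linalg.eigvalsh(A)

# Eigenvalues and eigenvectors; column V[:, n] belongs to w2[n].
w2, V = np.linalg.eigh(A)

# Check A v_n = lambda_n v_n for all n at once:
# V * w2 scales column n of V by w2[n].
ok = np.allclose(A @ V, V * w2)

# General (not necessarily symmetric) version; the order is not guaranteed.
w_general = np.linalg.eigvals(A)

print(w)
print(ok)
```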

sum, cumsum, trace, diag
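These functions can be illustrated on a small made-up 3x3 array:

```python
import numpy as np

A = np.arange(1, 10).reshape(3, 3)   # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

total = A.sum()            # sum of all elements
col_sums = A.sum(axis=0)   # sum down each column
row_sums = A.sum(axis=1)   # sum along each row
running = A.cumsum()       # cumulative sum over the flattened array
tr = np.trace(A)           # sum of the diagonal elements
d = np.diag(A)             # diagonal of a 2D array, as a 1D array
D = np.diag([1, 2, 3])     # 1D input instead builds a diagonal matrix

print(total, tr)
print(col_sums, row_sums)
print(running)
```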

Reshaping, resizing, and stacking arrays
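A short sketch of common reshaping and stacking calls, using made-up arrays. The selection of functions is an assumption about what this section covers.

```python
import numpy as np

a = np.arange(6)             # shape (6,)

b = a.reshape(2, 3)          # same data viewed as 2 rows, 3 columns
flat = b.flatten()           # back to 1D (flatten always returns a copy)
col = a[:, np.newaxis]       # add an axis: shape (6, 1)

v = np.vstack((b, b))        # stack vertically: shape (4, 3)
h = np.hstack((b, b))        # stack horizontally: shape (2, 6)
c = np.concatenate((b, b), axis=0)   # same result as vstack here

print(b.shape, col.shape, v.shape, h.shape)
```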

Vectorizing functions

Functions written in pure Python that process array elements one at a time are slow. Numpy array operations, however, are fast because they are implemented in Fortran/C.

What if we have a function that does not work directly on arrays?

For example, a theta (step) function?

We can vectorize Theta, to make it applicable to arrays.

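This is achieved by a call to the numpy function vectorize, which wraps your function so that it is applied element by element to array arguments. Note that vectorize is provided for convenience, not for performance: internally it is essentially a Python loop, so it does not turn your function into a low-level routine.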
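A minimal sketch of the idea follows. The definition of `Theta` used here (1 for x >= 0, otherwise 0) is an assumption, since the notes do not spell it out. An array-aware alternative using a comparison is shown for contrast.

```python
import numpy as np

def Theta(x):
    """Scalar step function (assumed definition): 1 if x >= 0, else 0."""
    if x >= 0:
        return 1
    return 0

x = np.array([-3, -2, -1, 0, 1, 2, 3])

# Theta(x) would raise an error here, because "if x >= 0" is ambiguous
# for an array. The vectorized wrapper applies Theta to each element.
Theta_vec = np.vectorize(Theta)
y = Theta_vec(x)

# Array-aware alternative that avoids the per-element Python call.
y_fast = (x >= 0).astype(int)

print(y)
print(y_fast)
```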

How do we calculate the number of days in a year with positive temperatures?
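One possible answer combines two Boolean masks and counts the True entries. The sketch uses a tiny synthetic table (columns year, month, day, T_average) and the year 1990 as an arbitrary example; summing a vectorized step function over the temperatures of that year would give the same count.

```python
import numpy as np

# Synthetic stand-in: columns are year, month, day, T_average.
data = np.array([
    [1990, 1, 1, -2.0],
    [1990, 1, 2,  0.5],
    [1990, 7, 1, 18.0],
    [1991, 1, 1, -5.0],
    [1991, 7, 1, 20.0],
])

in_year = data[:, 0] == 1990     # rows belonging to the chosen year
positive = data[:, 3] > 0        # rows with T_average above zero

# Count the rows where both conditions hold.
n_days = np.count_nonzero(in_year & positive)

print(n_days)
```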