
NumPy

NumPy (Numeric Python) is an open-source add-on module for Python that provides common mathematical
and numerical routines as fast, pre-compiled functions. The NumPy package provides basic routines for
manipulating large arrays and matrices of numeric data. You can find more tutorials at
http://wiki.scipy.org/Tentative_NumPy_Tutorial. Also check http://www.numpy.org for additional
information.

Importing the NumPy module
Array Creation
Accessing Array Elements
Masked Array
Array Operations
Broadcasting
Reading text files
Exercise time

Importing the NumPy module


There are several ways to import NumPy. The standard approach is to use a simple import statement:

In [3]: import numpy

However, when a program makes many calls to NumPy functions, it becomes tedious to write numpy.X over
and over again. Instead, it is common to import the module under the shorter name np:

In [4]: import numpy as np

Array Creation

In [3]: # 1D array
np.array([1, 2, 3, 4])

Out[3]: array([1, 2, 3, 4])

In [4]: # 2D array
np.array([[1, 2], [3, 4]])

Out[4]: array([[1, 2],
               [3, 4]])

In [5]: a = np.array([[1, 2], [3, 4]])
print(a, type(a))

[[1 2]
 [3 4]] <class 'numpy.ndarray'>

A NumPy array is an object with many attributes:

In [7]: print('the numpy array a has a :')
print('\t* number of elements (a.size) : %i' % a.size)
print('\t* shape (a.shape) : %s' % str(a.shape))
print('\t* data type (a.dtype) : %s' % a.dtype)

the numpy array a has a :
    * number of elements (a.size) : 4
    * shape (a.shape) : (2, 2)
    * data type (a.dtype) : int64
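
A few related attributes are easy to confuse with one another. The sketch below is only an illustration (it rebuilds the same 2x2 array) of how ndim, shape, size and len() differ; note in particular that len() reports the length of the first dimension only.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

print(a.ndim)      # number of dimensions: 2
print(a.shape)     # size along each dimension: (2, 2)
print(a.size)      # total number of elements: 4
print(len(a))      # length of the first dimension only: 2
print(a.itemsize)  # bytes per element; depends on the dtype
```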

zeros and ones


To quickly initialize a NumPy array, there are a few convenience functions that only require a shape as
input (and optionally a dtype).

In [8]: np.zeros((2, 2), dtype=float)

Out[8]: array([[0., 0.],
               [0., 0.]])

In [9]: np.ones((3, 5), dtype=int)

Out[9]: array([[1, 1, 1, 1, 1],
               [1, 1, 1, 1, 1],
               [1, 1, 1, 1, 1]])
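
Beyond zeros and ones, NumPy offers a few similar constructors. The following sketch is not part of the original notebook; the variable names are arbitrary and it only shows three functions that follow the same shape-based pattern.

```python
import numpy as np

filled = np.full((2, 3), 7.5)          # every element set to 7.5
template = np.arange(6).reshape(2, 3)  # an existing integer array
zeros_like = np.zeros_like(template)   # same shape and dtype as template
identity = np.eye(3)                   # 3x3 identity matrix

print(filled)
print(zeros_like.dtype == template.dtype)  # True
print(identity)
```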

arange

A range of values can be created quickly with the arange function (similar to indgen in IDL).

In [10]: np.arange(10)

Out[10]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Additional arguments set the lower bound, the upper bound (which is excluded) and the step.

In [11]: np.arange(2, 3, 0.25)

Out[11]: array([2. , 2.25, 2.5 , 2.75])
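
Because the upper bound is excluded, it can be more convenient to specify the number of points rather than the step. As a hedged aside (np.linspace is not used elsewhere in this notebook), the sketch below contrasts the two approaches.

```python
import numpy as np

r = np.arange(2, 3, 0.25)   # step of 0.25, the stop value 3 is excluded
print(r)                    # values 2.0, 2.25, 2.5, 2.75

# np.linspace takes the number of points instead of a step
# and includes the end point by default.
s = np.linspace(2, 3, 5)
print(s)                    # values 2.0, 2.25, 2.5, 2.75, 3.0
```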


Accessing Array Elements

Elements of 1D NumPy arrays are accessed with the same indexing and slicing syntax as lists.

Reminder: [start:end:step]

In [12]: b = np.arange(10)
b[0:9:2]

Out[12]: array([0, 2, 4, 6, 8])
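
A few more slicing forms are sketched below for illustration. One difference from lists is worth noting: a slice of a NumPy array is a view on the same data, so modifying the slice also modifies the original array.

```python
import numpy as np

b = np.arange(10)
print(b[2:5])    # [2 3 4]
print(b[:3])     # [0 1 2]
print(b[-3:])    # [7 8 9]
print(b[::-1])   # [9 8 7 6 5 4 3 2 1 0]

view = b[0:9:2]  # slices of arrays are views, unlike list slices
view[0] = 100    # this also changes b[0]
print(b[0])      # 100
```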

Indexing in n dimensions: the first index represents the row, the second represents the column.
Dimensions are separated with commas ','.

In [13]: # Square array from arange with reshape method
c = np.arange(5 * 5).reshape(5, 5)
print(c)
print(c[2, 3]) # Third row, fourth column
print(c[-1, 1]) # Last row, second column

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]
13
21
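
Slices can be combined with the comma-separated indices to extract whole rows, columns or sub-blocks. The sketch below reuses the same 5x5 array as an illustration.

```python
import numpy as np

c = np.arange(5 * 5).reshape(5, 5)

print(c[1])          # second row: [5 6 7 8 9]
print(c[:, 0])       # first column: [ 0  5 10 15 20]
print(c[1:3, 2:4])   # 2x2 block from rows 1-2 and columns 2-3
print(c[::2, ::2])   # every other row and column, shape (3, 3)
```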

Matrices and vectors

Operations on vectors and matrices are handled easily. When integer and floating-point arrays are mixed, the result is upcast (int => float).

In [19]: a = np.array([[1, 1],[0, 0]], dtype=float)
b = np.array([[1, 1],[1, 1]])
c = np.array([[1, 2],[3, 4]])
(a + 2) + (b * 4) * c

Out[19]: array([[ 7., 11.],
                [14., 18.]])
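
To see when upcasting actually occurs, the result dtype can be inspected directly. This is a small sketch with made-up arrays i and f; the integer dtype shown by the first line depends on the platform.

```python
import numpy as np

i = np.array([1, 2, 3])          # integer array
f = np.array([0.5, 0.5, 0.5])    # float array

print((i + i).dtype)         # stays an integer type
print((i + f).dtype)         # float64
print((i / 2).dtype)         # true division gives float64
print((i // 2).dtype)        # floor division keeps the integer type
print(np.result_type(i, f))  # float64
```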

Between matrices, there is the element-wise product *

In [20]: print(a)
print(c)
a * c

[[1. 1.]
 [0. 0.]]
[[1 2]
 [3 4]]

Out[20]: array([[1., 2.],
                [0., 0.]])

and the matrix product np.dot.

In [21]: np.dot(a, c)

Out[21]: array([[4., 6.],
                [0., 0.]])
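
In Python 3.5 and later, the @ operator can be used for the matrix product of 2D arrays. The sketch below (reusing the arrays a and c from above) checks that it agrees with np.dot and shows that the order of the operands matters.

```python
import numpy as np

a = np.array([[1, 1], [0, 0]], dtype=float)
c = np.array([[1, 2], [3, 4]])

print(a @ c)                                # same result as np.dot(a, c)
print(np.array_equal(a @ c, np.dot(a, c)))  # True
print(np.dot(c, a))                         # the order matters
```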

For more complex operations on vectors and matrices, NumPy has a submodule dedicated to
linear algebra, np.linalg, which provides eigenvalue decomposition, matrix inversion, etc.
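
As a minimal sketch of what np.linalg offers, the example below uses a small diagonal matrix m (chosen here only because its inverse and eigenvalues are easy to check by hand).

```python
import numpy as np

m = np.array([[2.0, 0.0], [0.0, 4.0]])

m_inv = np.linalg.inv(m)                  # matrix inverse
print(np.allclose(m @ m_inv, np.eye(2)))  # True

values, vectors = np.linalg.eig(m)        # eigenvalues and eigenvectors
print(values)                             # 2 and 4 for this diagonal matrix

x = np.linalg.solve(m, np.array([2.0, 8.0]))  # solves m @ x = [2, 8]
print(x)                                      # [1. 2.]
```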

Masked Array

You can associate a mask with an array to create a masked array:

In [22]: a = np.array([1,2,3,4])
mask_a = np.array(a == 2)
print(a)
print(mask_a) # True when masked

[1 2 3 4]
[False True False False]

In [23]: masked_a = np.ma.array(a, mask=mask_a)
masked_a

Out[23]: masked_array(data=[1, --, 3, 4],
                      mask=[False,  True, False, False],
                fill_value=999999)

In [24]: print(a.sum())
print(masked_a.sum()) # The mask is taken into account

10
8

In [25]: masked_b = np.ma.array(a, mask=(a==1))
masked_b

Out[25]: masked_array(data=[--, 2, 3, 4],
                      mask=[ True, False, False, False],
                fill_value=999999)

In [26]: masked_a+masked_b # Will create a common mask

Out[26]: masked_array(data=[--, --, 6, 8],
                      mask=[ True,  True, False, False],
                fill_value=999999)
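
Masked arrays can also be built directly from a condition, and the unmasked values can be counted, averaged or extracted. The following sketch is an illustration with an arbitrary condition (a > 2), not part of the original notebook.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

# Mask every element where the condition is True
masked = np.ma.masked_where(a > 2, a)

print(masked.count())       # number of unmasked elements: 2
print(masked.mean())        # mean of the unmasked values: 1.5
print(masked.filled(0))     # masked entries replaced by 0: [1 2 0 0]
print(masked.compressed())  # only the unmasked values: [1 2]
```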

Array operations

Basic operations

In [27]: c = np.arange(5 * 5).reshape(5, 5)
print(c)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]

In [28]: c + 1

Out[28]: array([[ 1,  2,  3,  4,  5],
                [ 6,  7,  8,  9, 10],
                [11, 12, 13, 14, 15],
                [16, 17, 18, 19, 20],
                [21, 22, 23, 24, 25]])

In [29]: c * 5 - 3

Out[29]: array([[ -3,   2,   7,  12,  17],
                [ 22,  27,  32,  37,  42],
                [ 47,  52,  57,  62,  67],
                [ 72,  77,  82,  87,  92],
                [ 97, 102, 107, 112, 117]])

These operations also work in place:


In [30]: c += 3
c

Out[30]: array([[ 3,  4,  5,  6,  7],
                [ 8,  9, 10, 11, 12],
                [13, 14, 15, 16, 17],
                [18, 19, 20, 21, 22],
                [23, 24, 25, 26, 27]])

ufuncs

All common arithmetic operations are implemented efficiently in NumPy and take advantage of the C
machinery underneath. This means that squaring an array or computing its sum is much more efficient in
NumPy than doing the same with lists. Element-wise operations are implemented as universal functions
(ufuncs, see http://docs.scipy.org/doc/numpy/reference/ufuncs.html). Reduction methods such as min, max,
mean, std, etc. can provide information on the whole array:

In [31]: c

Out[31]: array([[ 3,  4,  5,  6,  7],
                [ 8,  9, 10, 11, 12],
                [13, 14, 15, 16, 17],
                [18, 19, 20, 21, 22],
                [23, 24, 25, 26, 27]])

In [32]: c.mean()

Out[32]: 15.0

In [33]: c.std()

Out[33]: 7.211102550927978

For multidimensional data, these methods can also work along a given axis (axis=0 runs down the rows, giving one result per column; axis=1 runs across the columns, giving one result per row):

In [34]: c.sum(axis=1)

Out[34]: array([ 25, 50, 75, 100, 125])

In [35]: c.max(axis=0)

Out[35]: array([23, 24, 25, 26, 27])

These methods are shorthand for calling the corresponding functions, such as np.sum() or np.max(), on the array.
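
The relationship between element-wise operations, ufuncs and reductions can be sketched as follows. This is an illustration with small made-up inputs; it checks equivalence only and does not measure speed.

```python
import numpy as np

values = list(range(5))
arr = np.arange(5)

# Element-wise square: list comprehension versus array expression
squares_list = [v ** 2 for v in values]
squares_arr = arr ** 2                  # element-wise, no Python-level loop
print(squares_arr)                      # [ 0  1  4  9 16]

# np.add is a ufunc; reducing with it is equivalent to a sum
print(isinstance(np.add, np.ufunc))     # True
print(np.add.reduce(arr), arr.sum())    # 10 10

# Method and function forms give the same result
c = np.arange(25).reshape(5, 5) + 3
print(np.array_equal(c.max(axis=0), np.max(c, axis=0)))  # True
print(c.sum(axis=1, keepdims=True).shape)                # (5, 1)
```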

Broadcasting

One of the major features of NumPy is array broadcasting. Broadcasting allows operations (such
as addition, multiplication, etc.) which are normally element-wise to be carried out on arrays of different
shapes. It is a virtual replication of the arrays along the missing dimensions. It can be seen as a
generalization of operations involving an array and a scalar.

In [36]: matrix = np.zeros((4, 5))
matrix

Out[36]: array([[0., 0., 0., 0., 0.],
                [0., 0., 0., 0., 0.],
                [0., 0., 0., 0., 0.],
                [0., 0., 0., 0., 0.]])

The addition of a scalar to a matrix can be seen as the addition of a matrix with identical elements (and
the same dimensions).

In [37]: matrix + 6

Out[37]: array([[6., 6., 6., 6., 6.],
                [6., 6., 6., 6., 6.],
                [6., 6., 6., 6., 6.],
                [6., 6., 6., 6., 6.]])

The addition of a row to a matrix is treated as the addition of a matrix with replicated rows (the
number of columns must match).

In [38]: row = np.arange(5)
row

Out[38]: array([0, 1, 2, 3, 4])

In [39]: matrix + row

Out[39]: array([[0., 1., 2., 3., 4.],
                [0., 1., 2., 3., 4.],
                [0., 1., 2., 3., 4.],
                [0., 1., 2., 3., 4.]])

The addition of a column to a matrix is treated as the addition of a matrix with replicated columns (the
number of rows must match).

In [40]: column = np.ones(4)
column

Out[40]: array([1., 1., 1., 1.])

In [41]: matrix + column # This will fail

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-41-b29c50e50df5> in <module>()
----> 1 matrix + column # This will fail

ValueError: operands could not be broadcast together with shapes (4,5) (4,)

This one failed because the rightmost dimensions are different. So for columns, an additional dimension must
be added on the right by indexing the array with an additional np.newaxis or simply None.

In [42]: column = np.arange(4)[:, None] # or np.arange(4)[:, np.newaxis]
column

Out[42]: array([[0],
                [1],
                [2],
                [3]])

In [43]: matrix + column

Out[43]: array([[0., 0., 0., 0., 0.],
                [1., 1., 1., 1., 1.],
                [2., 2., 2., 2., 2.],
                [3., 3., 3., 3., 3.]])

NOTE: In the row case above, the shapes also did not match: (4, 5) for the matrix and (5,) for the row. The
actual rule of broadcasting is that for arrays of different rank, dimensions of length 1 are prepended
(added on the left of the array shape) until the two arrays have the same rank. Each pair of dimensions
must then either be equal or contain a 1. For this reason, arrays with the following shapes can be
broadcast together: (1, 1, 1, 8) and (9, 1), or (4, 1, 9) and (3, 1).
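
The rule can be checked without building large arrays. The sketch below assumes NumPy 1.20 or later, where np.broadcast_shapes is available, and applies the rule to the shapes quoted in the note.

```python
import numpy as np

# np.broadcast_shapes (NumPy >= 1.20) applies the rule to shapes only
print(np.broadcast_shapes((1, 1, 1, 8), (9, 1)))  # (1, 1, 9, 8)
print(np.broadcast_shapes((4, 1, 9), (3, 1)))     # (4, 3, 9)

matrix = np.zeros((4, 5))
row = np.arange(5)              # shape (5,) is treated as (1, 5)
column = np.arange(4)[:, None]  # shape (4, 1)

print((matrix + row).shape)     # (4, 5)
print((row + column).shape)     # (4, 5): both operands are stretched

try:
    np.broadcast_shapes((4, 5), (4,))  # (4,) would become (1, 4)
except ValueError as err:
    print("cannot broadcast:", err)
```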

For loop

Generally, we want to avoid iterating over the elements of arrays whenever we can. The
reason is that in an interpreted language like Python (or MATLAB), iterations are really slow compared to
vectorized operations. However, sometimes iterations are unavoidable. For such cases, the Python for
loop is the most convenient way to iterate over an array:

In [5]: v = np.array([1,2,3,4])
for element in v:
    print(element)

1
2
3
4
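
To make the contrast with vectorized code concrete, the sketch below computes a sum of squares both ways. It also shows np.ndenumerate, one possible way to loop over a multidimensional array together with its indices; the array m is made up for the example.

```python
import numpy as np

v = np.array([1, 2, 3, 4])

# Explicit loop: accumulate the sum of squares element by element
total = 0
for element in v:
    total += element ** 2

# Vectorized equivalent: no Python-level loop
total_vec = (v ** 2).sum()
print(total, total_vec)   # 30 30

# np.ndenumerate yields (index, value) pairs, handy for n-d arrays
m = np.array([[1, 2], [3, 4]])
for index, value in np.ndenumerate(m):
    print(index, value)   # (0, 0) 1, then (0, 1) 2, ...
```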

Reading text files

NumPy has basic capabilities to read and write text files.

In [44]: %%file numpy_data.txt
1 2 3
4 5 6

Writing numpy_data.txt

In [45]: data = np.loadtxt('numpy_data.txt')
data

Out[45]: array([[1., 2., 3.],
                [4., 5., 6.]])

In [46]: np.savetxt('test.txt', data, fmt='%.2f')
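
Real data files often use other delimiters and contain comment lines. The sketch below is an illustration only: it uses io.StringIO buffers in place of files on disk, and the comma-separated content is made up for the example.

```python
import io
import numpy as np

# io.StringIO stands in for a file on disk in this sketch
text = io.StringIO("# x,y,z\n1,2,3\n4,5,6\n")

# Lines starting with '#' are treated as comments by default
data = np.loadtxt(text, delimiter=",")
print(data.shape)          # (2, 3)

# usecols selects columns, dtype controls the element type
text.seek(0)
first_and_last = np.loadtxt(text, delimiter=",", usecols=(0, 2), dtype=int)
print(first_and_last)      # rows [1 3] and [4 6]

# Write to an in-memory buffer and read the text back
out = io.StringIO()
np.savetxt(out, data, fmt="%.2f", delimiter=",")
print(out.getvalue())      # lines "1.00,2.00,3.00" and "4.00,5.00,6.00"
```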

Looking for a particular function in NumPy?

Simply use the lookfor function (available in NumPy 1.x; it was removed in NumPy 2.0).


In [47]: np.lookfor('fourier transform')

Search results for 'fourier transform'
--------------------------------------
numpy.fft.fft
    Compute the one-dimensional discrete Fourier Transform.
numpy.fft.fft2
    Compute the 2-dimensional discrete Fourier Transform
numpy.fft.fftn
    Compute the N-dimensional discrete Fourier Transform.
numpy.fft.ifft
    Compute the one-dimensional inverse discrete Fourier Transform.
numpy.fft.rfft
    Compute the one-dimensional discrete Fourier Transform for real input.
numpy.fft.ifft2
    Compute the 2-dimensional inverse discrete Fourier Transform.
numpy.fft.ifftn
    Compute the N-dimensional inverse discrete Fourier Transform.
numpy.fft.rfftn
    Compute the N-dimensional discrete Fourier Transform for real input.
numpy.fft.fftfreq
    Return the Discrete Fourier Transform sample frequencies.
numpy.fft.rfftfreq
    Return the Discrete Fourier Transform sample frequencies
numpy.bartlett
    Return the Bartlett window.
numpy.convolve
    Returns the discrete, linear convolution of two one-dimensional sequences.
numpy.fft.irfft
    Compute the inverse of the n-point DFT for real input.
numpy.fft.rfft2
    Compute the 2-dimensional FFT of a real array.
numpy.fft.irfftn
    Compute the inverse of the N-dimensional FFT of real input.

Exercise time!
1. Write a function called isprime() that determines whether a number is prime or not, and returns
   either True or False accordingly.
2. Write a Python program to print out the first N numbers in the Fibonacci sequence. The program
   should ask the user for N, and should require that N be greater than 2.
3. Write a program that accepts the temperature in Celsius as a command line argument and prints the
   Fahrenheit equivalent.
4. Write a program that accepts a file name and checks whether the file name is bad or not. A file
   name is defined to be bad if it contains a space or special symbols such as $, %, ^, &, *, ( or ).