
UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA SCHOOL OF SOFTWARE ENGINEERING OF USTC

Matplotlib
A tutorial

Devert Alexandre
School of Software Engineering of USTC

30 November 2012 — Slide 1/44



Table of Contents

1 First steps

2 Curve plots

3 Scatter plots

4 Boxplots

5 Histograms

6 Usage example

Devert Alexandre (School of Software Engineering of USTC) — Matplotlib — Slide 2/44



Curve plot
Let’s plot a curve

import math
import matplotlib.pyplot as plt

# Generate a sinusoid
nbSamples = 256
xRange = (-math.pi, math.pi)

x, y = [], []
for n in range(nbSamples):
    k = (n + 0.5) / nbSamples
    x.append(xRange[0] + (xRange[1] - xRange[0]) * k)
    y.append(math.sin(x[-1]))

# Plot the sinusoid
plt.plot(x, y)
plt.show()

This will show you something like this:

[Figure: one period of a sine curve; x axis from -4 to 4, y axis from -1.0 to 1.0]

numpy
matplotlib can work with numpy arrays

import math
import numpy
import matplotlib.pyplot as plt

# Generate a sinusoid
nbSamples = 256
xRange = (-math.pi, math.pi)

x, y = numpy.zeros(nbSamples), numpy.zeros(nbSamples)
for n in range(nbSamples):
    k = (n + 0.5) / nbSamples
    x[n] = xRange[0] + (xRange[1] - xRange[0]) * k
    y[n] = math.sin(x[n])

# Plot the sinusoid
plt.plot(x, y)
plt.show()

numpy provides many functions and is efficient



numpy
• zeros builds an array filled with 0
• linspace builds an array filled with an arithmetic sequence

import math
import numpy
import matplotlib.pyplot as plt

# Generate a sinusoid
nbSamples = 256
x = numpy.linspace(-math.pi, math.pi, num=nbSamples)
y = numpy.zeros(nbSamples)
for n in range(nbSamples):
    y[n] = math.sin(x[n])

# Plot the sinusoid
plt.plot(x, y)
plt.show()
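
The two ways of building the x values are close but not identical. The sketch below compares them (the names `midpoints` and `endpoints` are introduced here for illustration). The explicit loop samples the middle of each of the 256 sub-intervals, whereas `numpy.linspace` includes both end points by default.

```python
import math
import numpy

nbSamples = 256
lo, hi = -math.pi, math.pi

# Midpoint sampling, as in the explicit loop: k = (n + 0.5) / nbSamples
midpoints = lo + (hi - lo) * (numpy.arange(nbSamples) + 0.5) / nbSamples

# linspace includes both end points by default (endpoint=True)
endpoints = numpy.linspace(lo, hi, num=nbSamples)

print(midpoints[0], endpoints[0])    # the first midpoint lies slightly inside the range
print(endpoints[1] - endpoints[0])   # the step is (hi - lo) / (nbSamples - 1)
```

For plotting a smooth curve the difference is not visible; it matters mostly when the exact sample positions are reused in later computations.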


numpy

numpy functions can work on entire arrays

import math
import numpy
import matplotlib.pyplot as plt

# Generate a sinusoid
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)

# Plot the sinusoid
plt.plot(x, y)
plt.show()
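
As a quick check that the call on the whole array computes the same values as an element-by-element loop, the following sketch compares both results (the name `y_loop` is introduced here for illustration).

```python
import math
import numpy

x = numpy.linspace(-math.pi, math.pi, num=256)

# Element-by-element version, using the scalar math.sin
y_loop = numpy.zeros(len(x))
for n in range(len(x)):
    y_loop[n] = math.sin(x[n])

# Whole-array version: numpy.sin is applied to every element at once
y = numpy.sin(x)

# allclose compares the two arrays up to a small floating-point tolerance
print(numpy.allclose(y, y_loop))
```

The whole-array form is shorter and usually much faster, since the loop runs inside numpy instead of in Python.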


PDF output

Exporting to a PDF file is just one change

import math
import numpy
import matplotlib.pyplot as plt

# Generate a sinusoid
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)

# Plot the sinusoid
plt.plot(x, y)
plt.savefig('sin-plot.pdf', transparent=True)
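
`savefig` infers the output format from the file extension, so other formats usually need no more than a different file name. The sketch below writes to in-memory buffers rather than files so that nothing is left on disk; since a buffer has no extension, the format is then given explicitly. Selecting the `Agg` backend is an assumption made here so the example runs without a display.

```python
import io
import math

import matplotlib
matplotlib.use("Agg")  # assumption: headless run, no window needed
import matplotlib.pyplot as plt
import numpy

x = numpy.linspace(-math.pi, math.pi, num=256)
plt.plot(x, numpy.sin(x))

# Vector output (PDF) and raster output (PNG) of the same figure
pdf_buffer, png_buffer = io.BytesIO(), io.BytesIO()
plt.savefig(pdf_buffer, format="pdf", transparent=True)
plt.savefig(png_buffer, format="png", dpi=150)

print(len(pdf_buffer.getvalue()), len(png_buffer.getvalue()))
```

PDF is a good fit for printed documents because it stays sharp at any zoom level; PNG is convenient for web pages and slides.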

2 Curve plots

Multiple curves

It’s often convenient to show several curves in one figure

import math
import numpy
import matplotlib.pyplot as plt

# Generate a sine and a cosine curve
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Plot the curves
plt.plot(x, y)
plt.plot(x, z)
plt.show()

[Figure: the sine and cosine curves on the same axes; x axis from -4 to 4, y axis from -1.0 to 1.0]

Custom colors

Changing colors can help to make nice documents

import math
import numpy
import matplotlib.pyplot as plt

# Generate a sine and a cosine curve
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Plot the curves
plt.plot(x, y, c='#FF4500')
plt.plot(x, z, c='#4682B4')
plt.show()

[Figure: the sine curve in orange-red (#FF4500) and the cosine curve in steel blue (#4682B4)]

Line thickness

Line thickness can be changed as well

import math
import numpy
import matplotlib.pyplot as plt

# Generate a sine and a cosine curve
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Plot the curves
plt.plot(x, y, linewidth=3, c='#FF4500')
plt.plot(x, z, c='#4682B4')
plt.show()

[Figure: the sine curve drawn with a thicker line than the cosine curve]

Line patterns

For printed documents, colors can be replaced by line patterns

import math
import numpy
import matplotlib.pyplot as plt

# Linestyles can be '-', '--', '-.', ':'

# Generate a sine and a cosine curve
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Plot the curves
plt.plot(x, y, linestyle='--', c='#000000')
plt.plot(x, z, c='#808080')
plt.show()

[Figure: the sine curve as a black dashed line and the cosine curve as a solid gray line]

Markers

It is sometimes relevant to show the data points

import math
import numpy
import matplotlib.pyplot as plt

# Markers can be '.', ',', 'o', '1' and more

# Generate a sine and a cosine curve
x = numpy.linspace(-math.pi, math.pi, num=64)
y = numpy.sin(x)
z = numpy.cos(x)

# Plot the curves
plt.plot(x, y, marker='1', markersize=15, c='#000000')
plt.plot(x, z, c='#000000')
plt.show()

[Figure: the sine curve with a marker at each of the 64 data points and the cosine curve as a plain line]

Legend

A legend can help to make self-explanatory figures

import math
import numpy
import matplotlib.pyplot as plt

# Legend location can be 'best', 'center', 'left', 'right', etc.

# Generate a sine and a cosine curve
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Plot the curves
plt.plot(x, y, c='#FF4500', label='sin(x)')
plt.plot(x, z, c='#4682B4', label='cos(x)')
plt.legend(loc='best')
plt.show()

[Figure: the sine and cosine curves with a legend listing sin(x) and cos(x)]

Custom axis scale


Changing the axis scale can improve readability

import math
import numpy
import matplotlib.pyplot as plt

# Legend location can be 'best', 'center', 'left', 'right', etc.

# Generate a sine and a cosine curve
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

axis.set_ylim(-0.5 * math.pi, 0.5 * math.pi)

# Plot the curves
plt.plot(x, y, c='#FF4500', label='sin(x)')
plt.plot(x, z, c='#4682B4', label='cos(x)')
plt.legend(loc='best')
plt.show()

[Figure: the same curves and legend, with the y axis now ranging from about -1.5 to 1.5]

Grid
A grid can be helpful as well

import math
import numpy
import matplotlib.pyplot as plt

# Legend location can be 'best', 'center', 'left', 'right', etc.

# Generate a sine and a cosine curve
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

axis.set_ylim(-0.5 * math.pi, 0.5 * math.pi)

axis.grid(True)

# Plot the curves
plt.plot(x, y, c='#FF4500', label='sin(x)')
plt.plot(x, z, c='#4682B4', label='cos(x)')
plt.legend(loc='best')
plt.show()

[Figure: the same curves and legend, with a grid drawn behind them]
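
The slides above create an axis object for the setup and then draw through `plt`. The same figure can also be built entirely through the axis object, which may be easier to follow once a figure holds several subplots. The following is a sketch of that alternative style, not the slides' own code; the use of `plt.subplots` and of the `Agg` backend are choices made here.

```python
import math

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy

x = numpy.linspace(-math.pi, math.pi, num=256)

# Create the figure and its single axis in one call
fig, axis = plt.subplots()

# Axis setup: vertical range and grid
axis.set_ylim(-0.5 * math.pi, 0.5 * math.pi)
axis.grid(True)

# Draw through the axis object instead of plt
axis.plot(x, numpy.sin(x), c='#FF4500', label='sin(x)')
axis.plot(x, numpy.cos(x), c='#4682B4', label='cos(x)')
axis.legend(loc='best')

# Finish with plt.show() or fig.savefig(...) as in the other examples
```

Both styles produce the same picture; mixing them, as the slides do, also works because `plt` draws on the most recently created axis.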

Error bars
Your data might come with a known measurement error

import math
import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate a noisy sinusoid
x = numpy.linspace(-math.pi, math.pi, num=48)
y = numpy.sin(x + 0.05 * numpy.random.standard_normal(len(x)))
# Error bar sizes should be non-negative, hence the absolute value
yerror = 0.1 * numpy.abs(numpy.random.standard_normal(len(x)))

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

axis.set_ylim(-0.5 * math.pi, 0.5 * math.pi)

# Plot the curve and its error bars
plt.plot(x, y, c='#FF4500')
plt.errorbar(x, y, yerr=yerror)
plt.show()

[Figure: a noisy sine curve with a vertical error bar at each of the 48 points]

3 Scatter plots

Scatter plot

A scatter plot just shows one point for each dataset entry

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate a 2d normal distribution
nbPoints = 100
x = numpy.random.standard_normal(nbPoints)
y = numpy.random.standard_normal(nbPoints)

# Plot the points
plt.scatter(x, y)
plt.show()

[Figure: scatter plot of the 100 points]

Aspect ratio
It can be very important to have the same scale on both axes

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate a 2d normal distribution
nbPoints = 100
x = numpy.random.standard_normal(nbPoints)
y = 0.1 * numpy.random.standard_normal(nbPoints)

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111, aspect='equal')

# Plot the points
plt.scatter(x, y, c='#FF4500')
plt.show()

[Figure: scatter plot with equal scales on both axes; x axis from -3 to 3, y axis from -0.3 to 0.3, giving a wide, flat plot]

Aspect ratio
An alternative way to keep the same scale on both axes

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate a 2d normal distribution
nbPoints = 100
x = numpy.random.standard_normal(nbPoints)
y = 0.1 * numpy.random.standard_normal(nbPoints)

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

# Use the same limits on both axes, with a 5% margin on each side
cmin, cmax = min(min(x), min(y)), max(max(x), max(y))
margin = 0.05 * (cmax - cmin)
cmin -= margin
cmax += margin

axis.set_xlim(cmin, cmax)
axis.set_ylim(cmin, cmax)

# Plot the points
plt.scatter(x, y, c='#FF4500')
plt.show()

[Figure: the same scatter plot with both axes covering the same range]
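
The limit computation above can be wrapped in a small helper when several figures need it. This is only a sketch: the function name `common_limits` and its `margin` parameter are made up for illustration. It computes the padding once, so both ends of the range get the same margin.

```python
import numpy

def common_limits(x, y, margin=0.05):
    """Return (low, high) limits covering both x and y, padded on each
    side by `margin` times the overall data range."""
    low = min(numpy.min(x), numpy.min(y))
    high = max(numpy.max(x), numpy.max(y))
    pad = margin * (high - low)
    return low - pad, high + pad

# Usage with a tiny made-up dataset
x = numpy.array([-2.0, 0.0, 2.0])
y = numpy.array([-0.1, 0.0, 0.1])
cmin, cmax = common_limits(x, y)
print(cmin, cmax)

# The same limits are then applied to both axes:
# axis.set_xlim(cmin, cmax)
# axis.set_ylim(cmin, cmax)
```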

Multiple scatter plots


As with curves, you can show 2 datasets on one figure

import numpy
import numpy.random
import matplotlib.pyplot as plt

colors = ('#FF4500', '#3CB371', '#4682B4', '#DB7093', '#FFD700')

# Generate two 2d normal distributions
nbPoints = 100

x, y = [], []

x += [numpy.random.standard_normal(nbPoints)]
y += [0.25 * numpy.random.standard_normal(nbPoints)]

x += [0.5 * numpy.random.standard_normal(nbPoints) + 3.0]
y += [2 * numpy.random.standard_normal(nbPoints) + 2.0]

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111, aspect='equal')

# Plot the points, one color per dataset
for i in range(len(x)):
    plt.scatter(x[i], y[i], c=colors[i % len(colors)])

plt.show()

[Figure: two point clouds in different colors on the same axes]

Showing centers
It can help to see the centers or the median points

import numpy
import numpy.random
import matplotlib.pyplot as plt

colors = ('#FF4500', '#3CB371', '#4682B4', '#DB7093', '#FFD700')

# Generate two 2d normal distributions
nbPoints = 100

x, y = [], []

x += [numpy.random.standard_normal(nbPoints)]
y += [0.25 * numpy.random.standard_normal(nbPoints)]

x += [0.5 * numpy.random.standard_normal(nbPoints) + 3.0]
y += [2 * numpy.random.standard_normal(nbPoints) + 2.0]

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111, aspect='equal')

# Plot the points, then a large dot at the median of each dataset
for i in range(len(x)):
    col = colors[i % len(colors)]
    plt.scatter(x[i], y[i], c=col)
    plt.scatter([numpy.median(x[i])], [numpy.median(y[i])], c=col, s=250)

plt.show()

[Figure: the two point clouds, each with a large dot at its median point]
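
One reason to mark the median rather than the mean is that the median is less sensitive to outliers. A tiny illustration with made-up numbers:

```python
import numpy

# Four ordinary values and one outlier (made-up numbers)
values = numpy.array([1.0, 2.0, 3.0, 4.0, 100.0])

print(numpy.mean(values))    # 22.0, pulled toward the outlier
print(numpy.median(values))  # 3.0, does not depend on how large the outlier is
```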

Marker styles
You can use different marker styles

import numpy
import numpy.random
import matplotlib.pyplot as plt

markers = ('+', '^', '.')

# Generate two 2d normal distributions
nbPoints = 100

x, y = [], []

x += [numpy.random.standard_normal(nbPoints)]
y += [0.25 * numpy.random.standard_normal(nbPoints)]

x += [0.5 * numpy.random.standard_normal(nbPoints) + 3.0]
y += [2 * numpy.random.standard_normal(nbPoints) + 2.0]

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111, aspect='equal')

# Plot the points, one marker style per dataset
for i in range(len(x)):
    m = markers[i % len(markers)]
    plt.scatter(x[i], y[i], marker=m, c='#000000')
    plt.scatter([numpy.median(x[i])], [numpy.median(y[i])], marker=m, s=250, c='#000000')

plt.show()

[Figure: the two point clouds in black, distinguished by marker shape]

4 Boxplots

Boxplots

Let’s do a simple boxplot

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate normal distribution data
x = numpy.random.standard_normal(256)

# Show a boxplot of the data
plt.boxplot(x)
plt.show()

[Figure: a single boxplot of the 256 samples]
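
For reference, the box spans the first to the third quartile with a line at the median. With matplotlib's default setting (`whis=1.5`), points farther than 1.5 times the interquartile range from the box are drawn individually as outliers. The numbers behind the drawing can be sketched with numpy; the variable names below are illustrative.

```python
import numpy

data = numpy.arange(1, 10)  # 1, 2, ..., 9: a small deterministic example

q1, median, q3 = numpy.percentile(data, [25, 50, 75])
iqr = q3 - q1

# Points outside these bounds would be drawn individually as outliers
lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr

print(q1, median, q3)            # 3.0 5.0 7.0
print(lower_bound, upper_bound)  # -3.0 13.0
```

Here every value lies within the bounds, so a boxplot of this data would show no outlier points.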

Boxplots

You might want to show the original data at the same time

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate normal distribution data
x = numpy.random.standard_normal(256)

# Show the data points and a boxplot of the data
plt.scatter([0] * len(x), x, c='#4682B4')
plt.boxplot(x, positions=[0])
plt.show()

[Figure: the boxplot drawn over the individual data points]

Multiple boxplots
Boxplots are often used to show various distributions side by side

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate normal distribution data
data = []
for i in range(5):
    mu = 10 * numpy.random.random_sample()
    sigma = 2 * numpy.random.random_sample() + 0.1
    data.append(numpy.random.normal(mu, sigma, 256))

# Show a boxplot of the data
plt.boxplot(data)
plt.show()

[Figure: five boxplots side by side, numbered 1 to 5 on the x axis]

Orientation

Changing the orientation is easily done

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate normal distribution data
data = []
for i in range(5):
    mu = 10 * numpy.random.random_sample()
    sigma = 2 * numpy.random.random_sample() + 0.1
    data.append(numpy.random.normal(mu, sigma, 256))

# Show a horizontal boxplot of the data
plt.boxplot(data, vert=False)
plt.show()

[Figure: the five boxplots drawn horizontally]

Legend
Good graphics have a proper legend

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate normal distribution data
labels = ['mercury', 'lead', 'lithium', 'tungsten', 'cadmium']
data = []
for i in range(len(labels)):
    mu = 10 * numpy.random.random_sample() + 100
    sigma = 2 * numpy.random.random_sample() + 0.1
    data.append(numpy.random.normal(mu, sigma, 256))

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

axis.set_title('Alien nodules composition')
axis.set_ylabel('concentration (ppm)')

# Show a boxplot of the data
plt.boxplot(data)

# Name the boxes, once boxplot() has placed one tick per box
xtickNames = plt.setp(axis, xticklabels=labels)
plt.show()

[Figure: five boxplots titled "Alien nodules composition", one per element name on the x axis; y axis: concentration (ppm), from 100 to 112]

5 Histograms

Histograms

Histograms are convenient to sum up results

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Some data
data = numpy.abs(numpy.random.standard_normal(30))

# Show a histogram
plt.barh(range(len(data)), data, color='#4682B4')
plt.show()

[Figure: horizontal bar chart with one bar for each of the 30 values; x axis from 0.0 to 2.5]
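
The plots in this section draw one bar per item. When the goal is instead to show how values are distributed, the values are first counted into bins. Below is a sketch of that binning step with `numpy.histogram`; the seed, the sample size, and the bin count are arbitrary choices made here.

```python
import numpy

rng = numpy.random.RandomState(0)  # fixed seed, only to make the run repeatable
data = numpy.abs(rng.standard_normal(300))

# Count how many values fall into each of 10 equal-width bins
counts, edges = numpy.histogram(data, bins=10)

print(counts)  # number of values in each bin
print(edges)   # the 11 bin boundaries

# The counts can then be drawn as bars, one per bin, for example with
# plt.bar(edges[:-1], counts, width=edges[1] - edges[0], align='edge')
```

`plt.hist(data, bins=10)` performs the binning and the drawing in one call.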

Histograms

A variant to show 2 quantities per item on 1 figure

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Some data
data = [[], []]
for i in range(len(data)):
    data[i] = numpy.abs(numpy.random.standard_normal(30))

# Show a histogram: first quantity to the right, second to the left
labels = range(len(data[0]))

plt.barh(labels, data[0], color='#4682B4')
plt.barh(labels, -1 * data[1], color='#FF4500')
plt.show()

[Figure: horizontal bars extending to the right for the first quantity and to the left for the second; x axis from -3 to 3]

Labels
Very often, we need to name items on a histogram

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Some data
names = ['Wang Bu', 'Cheng Cao', 'Zhang Xue Li', 'Tang Wei', 'Sun Wu Kong']
marks = 7 * numpy.random.random_sample(len(names)) + 3

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

axis.set_xlim(0, 10)

plt.yticks(numpy.arange(len(marks)) + 0.5, names)

axis.set_title('Data-mining marks')
axis.set_xlabel('mark')

# Show a histogram
# (align='edge' starts each bar at its index, so the names placed at
# index + 0.5 line up with the bars)
plt.barh(range(len(marks)), marks, align='edge', color='#4682B4')
plt.show()

[Figure: horizontal bar chart titled "Data-mining marks", one bar per student name; x axis: mark, from 0 to 10]

Error bars
Error bars, to indicate the accuracy of values

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Some data
names = ['6809', '6502', '8086', 'Z80', 'RCA1802']
speed = 70 * numpy.random.random_sample(len(names)) + 30
error = 9 * numpy.random.random_sample(len(names)) + 1

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

plt.yticks(numpy.arange(len(names)) + 0.5, names)

axis.set_title('8 bits CPU benchmark - string test')
axis.set_xlabel('score')

# Show a histogram with horizontal error bars
# (align='edge' starts each bar at its index, so the names placed at
# index + 0.5 line up with the bars)
plt.barh(range(len(names)), speed, xerr=error, align='edge', color='#4682B4')
plt.show()

[Figure: horizontal bar chart titled "8 bits CPU benchmark - string test", one bar with an error bar per CPU; x axis: score, from 0 to 100]

6 Usage example

Old Faithful

Let’s display some real data: Old Faithful geyser


Old Faithful

This works, but it is a good example of a half-done job

import numpy
import matplotlib.pyplot as plt

# Read the data
data = numpy.loadtxt('./datasets/geyser.dat')

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111, aspect='equal')

# Plot the points
plt.scatter(data[:, 0], data[:, 1], c='#FF4500')
plt.show()

[Figure: scatter plot of the geyser data, without title or axis labels; x axis from 30 to 100, y axis from 1.5 to 5.5.]

Old Faithful

Let’s make a more readable figure

import numpy
import matplotlib.pyplot as plt

# Read the data
data = numpy.loadtxt('./datasets/geyser.dat')

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

axis.set_title('Old Faithful geyser dataset')
axis.set_xlabel('waiting time (mn.)')
axis.set_ylabel('eruption duration (mn.)')

# Plot the points
plt.scatter(data[:, 0], data[:, 1], c='#FF4500')
plt.show()

[Figure: scatter plot titled "Old Faithful geyser dataset"; x axis "waiting time (mn.)", from 30 to 100; y axis "eruption duration (mn.)", from 1.5 to 5.5.]
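
The scripts above plot `data[:, 0]` against `data[:, 1]`, and the axis labels suggest that the first column holds the waiting time and the second the eruption duration. A small sketch of what `numpy.loadtxt` returns for such a file; the in-memory file and its three rows are made up for illustration, and the column order is an assumption read off the axis labels.

```python
import io
import numpy

# Made-up rows in the assumed layout: waiting time, then eruption duration
sample = io.StringIO(
    "80 4.0\n"
    "50 2.0\n"
    "75 3.5\n"
)

data = numpy.loadtxt(sample)  # 2-D array, one row per line of text
waiting = data[:, 0]          # every row, first column
duration = data[:, 1]         # every row, second column
print(data.shape)             # (3, 2)
```

`numpy.loadtxt` splits on whitespace by default, so a plain text file with one measurement per line needs no extra arguments.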

Mercury & fishes

Let’s display more complex data: fish and mercury poisoning

Mercury & fishes


A first try

import numpy
import matplotlib.pyplot as plt

# Read the data
data = numpy.loadtxt('./datasets/fish.dat')

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

axis.set_title('North Carolina fishes mercury concentration')
axis.set_xlabel('weight (g.)')
axis.set_ylabel('mercury concentration (ppm)')

# Plot the points
plt.scatter(data[:, 3], data[:, 4], c='#FF4500')
plt.show()

[Figure: scatter plot titled "North Carolina fishes mercury concentration"; x axis "weight (g.)", y axis "mercury concentration (ppm)".]

Mercury & fishes


Show more information with subplots

[Figure: "North Carolina fishes mercury concentration", three scatter subplots: weight (g.) against length (cm.), mercury concentration (ppm) against weight (g.), and mercury concentration (ppm) against length (cm.).]
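
The slides give no code for this figure. One possible sketch uses `add_subplot(rows, columns, index)` once per panel. The three-row layout, the off-screen Agg backend and the random stand-in measurements are assumptions of this sketch; with the real dataset, the arrays would be columns of `data`.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend, so no display is needed
import matplotlib.pyplot as plt
import numpy

# Made-up measurements standing in for the fish dataset
rng = numpy.random.RandomState(0)
length = 25 + 40 * rng.random_sample(50)   # cm
weight = 5000 * rng.random_sample(50)      # g
mercury = 3.5 * rng.random_sample(50)      # ppm

fig = plt.figure()
fig.suptitle('North Carolina fishes mercury concentration')

# One entry per panel: x values, y values, x label, y label
panels = [
    (length, weight, 'length (cm.)', 'weight (g.)'),
    (weight, mercury, 'weight (g.)', 'mercury concentration (ppm)'),
    (length, mercury, 'length (cm.)', 'mercury concentration (ppm)'),
]
for i, (x, y, xlabel, ylabel) in enumerate(panels):
    # 3 rows, 1 column; subplot indices start at 1
    axis = fig.add_subplot(3, 1, i + 1)
    axis.scatter(x, y, c='#FF4500')
    axis.set_xlabel(xlabel)
    axis.set_ylabel(ylabel)

# Leave some vertical room so the labels of stacked panels do not collide
fig.subplots_adjust(hspace=0.8)
```

Call `plt.show()` with an interactive backend, or `fig.savefig(...)`, to look at the result.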

Mercury & fishes


Data from one river with its own color
[Figure: "North Carolina fishes mercury concentration", the same three scatter subplots (weight against length, mercury concentration against weight, mercury concentration against length), with the data from one river drawn in its own color.]
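
The slides give no code for this step either. A minimal sketch of one way to do it: build a boolean mask for the river of interest, then call `scatter` once per group with a different color. The `river` array, its 0/1 coding and the random measurements are hypothetical stand-ins, since the layout of the real file is not shown.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend, so no display is needed
import matplotlib.pyplot as plt
import numpy

# Made-up data: a hypothetical river id (0 or 1) for each fish
rng = numpy.random.RandomState(1)
river = rng.randint(0, 2, size=60)
weight = 5000 * rng.random_sample(60)   # g
mercury = 3.5 * rng.random_sample(60)   # ppm

fig = plt.figure()
axis = fig.add_subplot(111)
axis.set_xlabel('weight (g.)')
axis.set_ylabel('mercury concentration (ppm)')

# Boolean mask: True for the fish caught in the river to highlight
selected = (river == 0)

# One scatter call per group, each with its own color
axis.scatter(weight[~selected], mercury[~selected], c='#FF4500')
axis.scatter(weight[selected], mercury[selected], c='#4682B4')
```

The same mask can be reused on every subplot, so that a given river keeps the same color across panels.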
