Beruflich Dokumente
Kultur Dokumente
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Introduction
Objectives
Theoretical part
Small examples
2 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Motivation
Basic Usage
Overview
Motivation
Basic Usage
Graph definition and Syntax
Graph structure
Strong typing
Differences from Python/NumPy
Graph Transformations
Substitution and Cloning
Gradient
Shared variables
Make it fast!
Optimizations
Code Generation
GPU
Advanced Topics
Looping: the scan operation
Debugging
Extending Theano
New features
3 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Motivation
Basic Usage
Theano vision
I
I
I
Substitutions
Gradient, R operator
Stability optimizations
Speed optimizations
Use fast back-ends (CUDA, BLAS, custom C code)
4 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Motivation
Basic Usage
Current status
Mature: Theano has been developed and used since January 2008 (8 years
old)
Theano: deeplearning.net/software/theano/
Deep Learning Tutorials: deeplearning.net/tutorial/
5 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Motivation
Basic Usage
Related projects
Blocks
Keras
Lasagne
PyMC 3
sklearn-theano
Platoon
Theano-MPI
...
6 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Motivation
Basic Usage
Basic usage
7 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Motivation
Basic Usage
Defining an expression
8 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Motivation
Basic Usage
debugprint(dot)
dot [id A] ''
|x [id B]
|W [id C]
debugprint(out)
sigmoid [id A] ''
|Elemwise{add,no_inplace} [id B] ''
|dot [id C] ''
| |x [id D]
| |W [id E]
|b [id F]
9 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Motivation
Basic Usage
=
=
=
=
theano.function(inputs=[x,
theano.function([x, W, b],
theano.function([x, W, b],
theano.function([x, W, b],
W], outputs=dot)
out)
[dot, out])
[dot + b, out])
10 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Motivation
Basic Usage
theano.printing.debugprint(f)
CGemv{inplace} [id A] '' 3
|AllocEmpty{dtype='float64'} [id B] '' 2
| |Shape_i{1} [id C] '' 1
|
|W [id D]
|TensorConstant{1.0} [id E]
|InplaceDimShuffle{1,0} [id F] 'W.T' 0
| |W [id D]
|x [id G]
|TensorConstant{0.0} [id H]
theano.printing.pydotprint(f)
theano.printing.debugprint(g)
Elemwise{ScalarSigmoid}[(0, 0)] [id A] '' 2
|CGemv{no_inplace} [id B] '' 1
|b [id C]
|TensorConstant{1.0} [id D]
|InplaceDimShuffle{1,0} [id E] 'W.T' 0
| |W [id F]
|x [id G]
|TensorConstant{1.0} [id D]
theano.printing.pydotprint(g)
11 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Motivation
Basic Usage
pydotprint(f)
InplaceDimShuffle{1,0}
Shape_i{1}
TensorType(float64, matrix)
name=W.T TensorType(float64, matrix)
TensorType(int64, scalar)
AllocEmpty{dtype='float64'}
2
CGemv{inplace}
TensorType(float64, vector)
12 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Motivation
Basic Usage
pydotprint(g)
InplaceDimShuffle{1,0}
TensorType(float64, matrix)
name=b TensorType(float64, vector)
1 4
CGemv{no_inplace}
TensorType(float64, vector)
Elemwise{ScalarSigmoid}[(0, 0)]
TensorType(float64, vector)
13 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Motivation
Basic Usage
pydotprint(h)
InplaceDimShuffle{1,0}
Shape_i{1}
TensorType(float64, matrix)
name=W.T TensorType(float64, matrix)
TensorType(int64, scalar)
AllocEmpty{dtype='float64'}
2
CGemv{inplace}
TensorType(float64, vector)
Elemwise{Composite{scalar_sigmoid((i0 + i1))}}
TensorType(float64, vector)
14 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Motivation
Basic Usage
d3viz
15 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Motivation
Basic Usage
0.03158954, -0.26423186])
0.73722395,
0.67606977])
0.03158954, -0.26423186]),
0.73722395, 0.67606977])]
1.03158954,
0.73722395,
0.73576814]),
0.67606977])]
16 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Graph structure
Strong typing
Differences from Python/NumPy
Overview
Motivation
Basic Usage
Graph definition and Syntax
Graph structure
Strong typing
Differences from Python/NumPy
Graph Transformations
Substitution and Cloning
Gradient
Shared variables
Make it fast!
Optimizations
Code Generation
GPU
Advanced Topics
Looping: the scan operation
Debugging
Extending Theano
New features
17 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Graph structure
Strong typing
Differences from Python/NumPy
Graph structure
The graph that represents mathematical operations is bipartite, and has two
sorts of nodes:
I
In practice:
I
Variables are used for the graph inputs and outputs, and intermediate
values
18 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Graph structure
Strong typing
Differences from Python/NumPy
None
matrix
type
None
matrix
owner
y
inputs
op
Apply
outputs
owner
matrix
z
19 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Graph structure
Strong typing
Differences from Python/NumPy
pydotprint(f, compact=False)
InplaceDimShuffle{1,0}
Shape_i{1}
TensorType(int64, scalar)
AllocEmpty{dtype='float64'}
2
TensorType(float64, vector)
CGemv{inplace}
TensorType(float64, vector)
20 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Graph structure
Strong typing
Differences from Python/NumPy
Strong typing
I
I
21 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Graph structure
Strong typing
Differences from Python/NumPy
Broadcasting tensors
r = T.row('r')
print(r.broadcastable)
c = T.col('c')
print(c.broadcastable)
# (True, False)
# (False, True)
f = theano.function([r, c], r + c)
print(f([[1, 2, 3]], [[.1], [.2]]))
# [[ 1.1 2.1 3.1]
# [ 1.2 2.2 3.2]]
22 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Graph structure
Strong typing
Differences from Python/NumPy
No side effects
a = T.inc_subtensor(a[:], 1) or a = T.set_subtensor(a[:], 0)
Exceptions:
I
The Print() Op
The Assert() Op
23 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Graph structure
Strong typing
Differences from Python/NumPy
Python keywords
We cannot redefine Pythons keywords: they affect the flow when building the
graph, not when executing it.
I
I
I
24 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Overview
Motivation
Basic Usage
Graph definition and Syntax
Graph structure
Strong typing
Differences from Python/NumPy
Graph Transformations
Substitution and Cloning
Gradient
Shared variables
Make it fast!
Optimizations
Code Generation
GPU
Advanced Topics
Looping: the scan operation
Debugging
Extending Theano
New features
25 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
T.vector('x')
T.matrix('W')
T.vector('b')
= T.dot(x, W)
= T.nnet.sigmoid(dot + b)
26 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
27 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
C : RN R
f : RM R
g : RN RM
C (x) = f (g (x))
C
f
= g
x x
g (x)
g
x x
is not needed.
The whole M N Jacobian matrix g
x x
M
N
We only need gx : R R , v 7 v g
x x
28 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Using theano.grad
y = T.vector('y')
C = ((out - y) ** 2).sum()
dC_dW = theano.grad(C, W)
dC_db = theano.grad(C, b)
# or dC_dW, dC_db = theano.grad(C, [W, b])
I
29 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
30 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
pydotprint(cost_and_upd)
InplaceDimShuffle{1,0}
TensorType(float64, matrix)
name=b TensorType(float64, vector)
CGemv{no_inplace}
TensorType(float64, vector)
0
Elemwise{ScalarSigmoid}[(0, 0)]
1
3 TensorType(float64, vector)
1
0 TensorType(float64, vector)
Elemwise{sub,no_inplace}
1 TensorType(float64, vector)
1 TensorType(float64, vector)
Elemwise{sub}
4 TensorType(float64, vector)
2 TensorType(float64, vector)
3 TensorType(float64, vector)
Elemwise{sqr,no_inplace}
TensorType(float64, vector)
Sum{acc_dtype=float64}
CGer{non-destructive}
TensorType(float64, scalar)
TensorType(float64, matrix)
TensorType(float64, vector)
Elemwise{mul,no_inplace}
3 TensorType(float64, vector)
31 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Update values
Cumbersome
32 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Shared variables
33 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
x = T.vector('x')
y = T.vector('y')
W = theano.shared(W_val)
b = theano.shared(b_val)
dot = T.dot(x, W)
out = T.nnet.sigmoid(dot + b)
f = theano.function([x], dot) # W is an implicit input
g = theano.function([x], out) # W and b are implicit inputs
print(f(x_val))
# [ 1.79048354 0.03158954 -0.26423186]
print(g(x_val))
# [ 0.9421594
0.73722395 0.67606977]
I
34 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
All outputs, including the update expressions, are computed before the
updates are performed
35 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
pydotprint(cost_and_perform_updates)
InplaceDimShuffle{1,0}
2 TensorType(float64, matrix) 4 1
CGemv{no_inplace}
TensorType(float64, vector)
name=y TensorType(float64, vector)
Elemwise{ScalarSigmoid}[(0, 0)]
1
3 TensorType(float64, vector)
2 TensorType(float64, vector)
0 TensorType(float64, vector)
1 TensorType(float64, vector) 0
Elemwise{sub,no_inplace}
2 TensorType(float64, vector)
1 TensorType(float64, vector)
TensorType(float64, vector)
0
TensorType(float64, vector)
Sum{acc_dtype=float64}
Elemwise{sub}
3 TensorType(float64, vector) 0
Elemwise{Mul}[(0, 1)]
3 TensorType(float64, vector)
CGer{destructive}
UPDATE
TensorType(float64, vector)
TensorType(float64, scalar)
TensorType(float64, matrix)
UPDATE
TensorType(float64, matrix)
36 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Optimizations
Code Generation
GPU
Overview
Motivation
Basic Usage
Graph definition and Syntax
Graph structure
Strong typing
Differences from Python/NumPy
Graph Transformations
Substitution and Cloning
Gradient
Shared variables
Make it fast!
Optimizations
Code Generation
GPU
Advanced Topics
Looping: the scan operation
Debugging
Extending Theano
New features
37 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Optimizations
Code Generation
GPU
Graph optimizations
An optimization replaces a part of the graph with different nodes
I
Shape inference
Constant folding
Transfer to GPU
38 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Optimizations
Code Generation
GPU
Enabling/disabling optimizations
39 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Optimizations
Code Generation
GPU
Each operator can define C code computing the outputs given the inputs
For GPU code, same process, using CUDA and nvcc instead.
40 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Optimizations
Code Generation
GPU
At runtime, if all operations have C code, calling the pointers will be fast
41 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Optimizations
Code Generation
GPU
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Optimizations
Code Generation
GPU
Configuration flags
43 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Overview
Motivation
Basic Usage
Graph definition and Syntax
Graph structure
Strong typing
Differences from Python/NumPy
Graph Transformations
Substitution and Cloning
Gradient
Shared variables
Make it fast!
Optimizations
Code Generation
GPU
Advanced Topics
Looping: the scan operation
Debugging
Extending Theano
New features
44 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Overview of scan
Symbolic looping
I
45 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
49.
64.
81.]
1.60000000e+01
1.29600000e+03
8.10000000e+01
2.40100000e+03
46 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Self-diagnostic tools
47 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
To define the gradient, have to actually define a class deriving from Op, and
define the grad method.
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
49 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Performance improvements
I
I
I
I
Faster compilation
I
I
I
I
Diagnostic tools
I
I
I
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
51 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
Acknowledgements
52 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
53 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
http://github.com/mila-udem/summerschool2016/
I
Slides: theano/course/intro_theano.pdf
54 / 55
Overview
Graph definition and Syntax
Graph Transformations
Make it fast!
Advanced Topics
http://github.com/mila-udem/summerschool2016/
I
Slides: theano/course/intro_theano.pdf
More resources
I
Documentation: http://deeplearning.net/software/theano/
Code: http://github.com/Theano/Theano/