
Capsule networks recap

Capsule networks - CNNs don’t capture spatial relationships
Dynamic routing intuition with example

Example: classify a digit as 2 or 3 (for simplicity).

Our hidden layer (l) consists of a capsule layer with six capsules, corresponding to low-level features, and our final layer (l+1) consists of 2 capsules, corresponding to the digits 2 and 3.
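
The shapes in this example can be sketched as follows. This is a minimal illustration assuming NumPy; the vector sizes (8 for the lower capsules, 16 for the digit capsules) and the random values are assumptions for illustration and do not come from the slides. Each lower capsule i makes one prediction for each higher capsule j through a transformation matrix W[i, j], which is the learned part of the layer.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_LOWER = 6    # capsules in layer l (low-level features)
NUM_UPPER = 2    # capsules in layer l+1 (digits "2" and "3")
DIM_LOWER = 8    # assumed size of each lower capsule's output vector
DIM_UPPER = 16   # assumed size of each upper capsule's output vector

# Outputs of the six lower capsules (random stand-ins for real activations).
u = rng.normal(size=(NUM_LOWER, DIM_LOWER))

# One transformation matrix per (lower, upper) capsule pair.
W = rng.normal(size=(NUM_LOWER, NUM_UPPER, DIM_UPPER, DIM_LOWER))

# u_hat[i, j] is capsule i's prediction for the output of capsule j.
u_hat = np.einsum("ijkd,id->ijk", W, u)

print(u_hat.shape)  # (6, 2, 16)
```

Every lower capsule therefore contributes two prediction vectors, one for "2" and one for "3"; routing then decides how much weight each prediction receives.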
Agreement and Assignment
Find agreement between predictions for capsule coordinate frames.

[Figure labels: Layerwise Non-linearity; Smart Sparsity]
Dynamic routing intuition - Routing by agreement
These routing coefficients are not learned parameters of the network!

Consequently, they are not fixed weights.

They are recomputed on every single forward pass of the network, and depend on the individual image.

This is why the procedure is called dynamic routing: it is part of the forward pass.
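
A minimal sketch of this idea, assuming NumPy and following the routing-by-agreement procedure as it is commonly described for vector capsules (Sabour, Frosst and Hinton, 2017). The function names, the three iterations and the random predictions are assumptions for illustration. Note that the routing logits `b` start at zero on every call, so nothing about them is stored between images.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Shrink each vector's length into [0, 1) while keeping its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def route(u_hat, num_iters=3):
    """u_hat has shape (num_lower, num_upper, dim). Returns (v, c)."""
    num_lower, num_upper, _ = u_hat.shape
    b = np.zeros((num_lower, num_upper))           # routing logits, reset on every call
    for _ in range(num_iters):
        c = softmax(b, axis=1)                     # each lower capsule's coefficients sum to 1
        s = np.einsum("ij,ijk->jk", c, u_hat)      # weighted sum of predictions per upper capsule
        v = squash(s)                              # upper capsule outputs
        b = b + np.einsum("ijk,jk->ij", u_hat, v)  # agreement: dot product of prediction and output
    return v, c

rng = np.random.default_rng(0)
u_hat = rng.normal(size=(6, 2, 16))  # stand-in predictions: 6 lower capsules, 2 upper capsules
v, c = route(u_hat)
```

Here `c` holds the routing coefficients for this particular input. A different `u_hat` (a different image) yields different coefficients, while any learned transformation matrices that produced `u_hat` stay unchanged.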
Capsule networks: 4x4 pose matrix
Capsules are groups of neurons whose output represents different properties of the same feature.

By grouping neurons, the output of a capsule is a 4x4 matrix (or 16-dimensional vector), whose entries represent information such as the x- and y-coordinates of the feature, the angle of rotation of the object, and other characteristics.

For example, in digit recognition, one layer of capsules corresponds to digits. One capsule in the output layer may correspond to a 6. Different entries in the 4x4 matrix will correspond to different properties of the 6. As seen below, if you modify one entry of the matrix, the 6 changes in scale and thickness, and if you modify another entry of the matrix, the top hook of the six changes.
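
To make the equivalence between a 4x4 matrix and a 16-dimensional vector concrete, here is a small sketch assuming NumPy. The specific matrix (an in-plane rotation plus an x/y translation in homogeneous coordinates) and the helper `pose_matrix` are hypothetical choices for illustration; in a trained network the entries are learned and need not have such a clean geometric meaning.

```python
import numpy as np

def pose_matrix(x, y, angle):
    """Hypothetical 4x4 pose: rotate by `angle` (radians) in-plane, then translate by (x, y)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([
        [c,   -s,  0.0, x],
        [s,    c,  0.0, y],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ])

pose = pose_matrix(x=2.0, y=-1.0, angle=np.pi / 6)

as_vector = pose.reshape(16)    # the same 16 numbers, viewed as a vector
back = as_vector.reshape(4, 4)  # and back to the matrix view

# Changing a single entry changes a single property: here the x-coordinate.
shifted = pose.copy()
shifted[0, 3] += 1.0
```

In this toy pose, entry `[0, 3]` carries the x-coordinate, so editing it moves the feature horizontally and leaves rotation untouched.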
Dynamic Routing in depth
Capsule network intuition
If three predictions from lower-level features point at the same position and state of a face, then there must be a face there.

Hinton proposed that we use a process called “routing-by-agreement”. This means that lower-level features (fingers, eyes, mouth) will only get sent to a higher-level capsule that matches their contents. If the features resemble an eye or a mouth, they will be sent to “face”; if they resemble fingers and a palm, they will be sent to “hand”.
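
The "three predictions point at the same pose" idea can be illustrated with a toy check, assuming NumPy. The function `votes_agree`, its tolerance and the example numbers are all hypothetical; real capsule networks measure agreement inside the routing procedure and do not use a hard threshold like this.

```python
import numpy as np

def votes_agree(votes, tol=0.5):
    """True if every vote lies within `tol` of the mean vote (hypothetical criterion)."""
    votes = np.asarray(votes, dtype=float)
    center = votes.mean(axis=0)
    return bool(np.all(np.linalg.norm(votes - center, axis=1) <= tol))

# Each row is one part's prediction of the face pose: (x, y, rotation in radians).
consistent = [[10.0, 20.0, 0.10],    # left eye's prediction
              [10.1, 19.9, 0.12],    # right eye's prediction
              [9.9, 20.1, 0.09]]     # mouth's prediction

scattered = [[10.0, 20.0, 0.10],
             [25.0, 5.0, 1.50],
             [2.0, 30.0, -0.80]]

print(votes_agree(consistent))  # True
print(votes_agree(scattered))   # False
```

When the parts agree on where the face is and how it is oriented, the face hypothesis is supported; when their predictions scatter, the same parts are more likely an accidental arrangement.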
How Capsules work
How a Capsule Network would classify this face
Four computational steps happen inside the capsule.

2. Scalar Weighting of Input Vectors

● The scalar weights are determined using “dynamic routing”, which is a novel way to determine where each capsule’s output goes.
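
In the vector-capsule formulation of dynamic routing (Sabour, Frosst and Hinton, 2017), this weighting step is usually written as below. The symbols are not defined in the slides and are introduced here as assumptions: $b_{ij}$ is the routing logit between lower capsule $i$ and higher capsule $j$, $c_{ij}$ is the resulting scalar weight, $\hat{u}_{j|i}$ is capsule $i$'s prediction for capsule $j$, and $s_j$ is the weighted input to capsule $j$.

```latex
c_{ij} = \frac{\exp(b_{ij})}{\sum_{k} \exp(b_{ik})},
\qquad
s_j = \sum_{i} c_{ij}\, \hat{u}_{j|i}
```

Because the softmax runs over the higher capsules $k$, the weights of each lower capsule sum to 1: a capsule distributes its output among the candidates in the layer above.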
