
Learning to Propagate for Graph Meta-Learning

(NeurIPS’19)

Krishna Prasad Neupane

kpn3569@rit.edu

July 11, 2020



Motivation

In existing meta-learning methods, tasks are implicitly related through shared meta-parameters.

This paper shows that a meta-learner that explicitly relates tasks on a graph can significantly improve few-shot learning by capturing the relations among their outputs.



Graph Meta-Learning: Introduction

The paper develops a meta-learner called Gated Propagation Network (GPN) that learns how to pass messages between nodes (i.e., classes) on a graph for more effective few-shot learning and knowledge sharing.

The graph is defined with each class as a node and each edge connecting a class to one of its sub-classes (i.e., child classes) or to its parent class.

For a task, the meta-learner generates a prototype representing each class using only the few-shot training data; during testing, a new sample is classified to the class of its nearest prototype.

GPN learns an attention mechanism to compute the combination weights and a gate to filter the messages coming from a class's neighbors and from itself.



Graph Meta-Learning: Introduction

Figure: t-SNE plot of class prototypes produced by GPN (left) and the GPN procedure (right).
Graph Meta-Learning: Problem Setup

Given a directed acyclic graph G = (Y, E), where Y is the set of classes for all possible tasks, each node denotes a class y ∈ Y, and each directed edge y_i → y_j ∈ E connects a parent class y_i ∈ Y to one of its child classes y_j ∈ Y on the graph G.

The paper assumes that each learning task T is defined by a subset of classes Y^T ⊂ Y drawn from a distribution T(G) defined on the graph.

The goal of the paper is to learn a meta-learner M(·; Θ), parameterized by Θ, that can produce a learner model M(T; Θ) for each task.
This problem can be formulated as the risk minimization of learning-to-learn:

\[ \min_{\Theta}\; \mathbb{E}_{T \sim \mathcal{T}(G)} \big[\, \mathbb{E}_{(x,y) \in D^T} -\log p(y \mid x; M(T; \Theta)) \,\big] \tag{1} \]
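As an illustration of this objective, a minimal episodic training loop in PyTorch might look as follows; `sample_task` and `meta_learner` are hypothetical placeholders (not from the paper's code) standing in for drawing a task from T(G) and for the model M(·; Θ):

```python
import torch
import torch.nn.functional as F

# Minimal sketch of the episodic objective in Eq. (1).
# `meta_learner` and `sample_task` are assumed placeholders, not the paper's code.
def meta_train(meta_learner, sample_task, num_episodes=1000, lr=1e-3):
    optimizer = torch.optim.Adam(meta_learner.parameters(), lr=lr)
    for _ in range(num_episodes):
        # One task T ~ T(G): few-shot support set plus a query set to evaluate on.
        support_x, support_y, query_x, query_y = sample_task()
        log_probs = meta_learner(support_x, support_y, query_x)  # log p(y|x; M(T; Θ))
        loss = F.nll_loss(log_probs, query_y)                    # -log p(y|x), averaged
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```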



Graph Meta-Learning: Problem Setup

Given a sample x, M(T; Θ) produces the probability of each class y ∈ Y^T using a soft version of kNN. The probability is computed by an RBF kernel over the Euclidean distance between f(x) and the prototype P_y:

\[ p(y \mid x; M(T; \Theta)) = \frac{\exp(-\|f(x) - P_y\|^2)}{\sum_{z \in Y^T} \exp(-\|f(x) - P_z\|^2)} \tag{2} \]

where f(·) is a learnable representation model for the input x.
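A minimal PyTorch sketch of Eq. (2), assuming the query embeddings f(x) and the task's prototypes have already been computed:

```python
import torch

# Sketch of Eq. (2): a softmax over negative squared Euclidean distances
# between query embeddings f(x) and the prototypes P_y of the task's classes.
def class_probabilities(query_embeddings, prototypes):
    # query_embeddings: (B, d); prototypes: (N, d), one row per class in Y^T
    sq_dists = torch.cdist(query_embeddings, prototypes) ** 2  # (B, N)
    return torch.softmax(-sq_dists, dim=1)                     # p(y | x; M(T; Θ))
```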


The main idea of graph meta-learning is to improve the prototype P_y of each class by assimilating its neighbors' prototypes on the graph.



Graph Meta-Learning: Gated Propagation Network (GPN)

GPN is a meta-learning model that learns how to send and aggregate messages between classes in order to generate class prototypes that yield high kNN prediction accuracy across different N-way K-shot classification tasks.

A multi-head dot-product attention mechanism measures the similarity between each class and its neighbors on the graph, and the obtained similarities are used as weights to aggregate the messages (prototypes) from the neighbors.

In each head, a gate is applied to determine whether to accept the aggregated message from the neighbors or the message from the class itself.

Figure: Prototype propagation in GPN.
Graph Meta-Learning: Gated Propagation Network (GPN)
Given a task T associated with a subset of classes Y^T ⊂ Y and an N-way K-shot training set D^T, an initial prototype for each class y ∈ Y^T is computed by averaging over all of its K-shot samples, i.e.,

\[ P_y^0 = \frac{1}{|\{(x_i, y_i) \in D^T : y_i = y\}|} \sum_{(x_i, y_i) \in D^T,\ y_i = y} f(x_i) \tag{3} \]
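A short sketch of Eq. (3); the tensor shapes are illustrative assumptions:

```python
import torch

# Sketch of Eq. (3): the initial prototype P_y^0 is the mean embedding
# of the K support samples that belong to class y.
def init_prototypes(support_embeddings, support_labels, num_classes):
    # support_embeddings: (N*K, d) outputs of f(); support_labels: (N*K,) in [0, num_classes)
    return torch.stack([
        support_embeddings[support_labels == y].mean(dim=0)
        for y in range(num_classes)
    ])  # (num_classes, d)
```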

Update of the prototype from its neighbors N_y:

\[ P_{\mathcal{N}_y \to y}^{t+1} = \sum_{z \in \mathcal{N}_y} a(P_y^t, P_z^t) \times P_z^t, \qquad a(p, q) = \frac{\langle h_1(p), h_2(q) \rangle}{\|h_1(p)\| \times \|h_2(q)\|} \tag{4} \]

where h_1(·) and h_2(·) are learnable transformations.
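A single head of the aggregation in Eq. (4) could be sketched as below; using linear layers for h_1 and h_2 and a dimension of 64 are assumptions for illustration:

```python
import torch
import torch.nn as nn

# Sketch of Eq. (4): cosine attention between transformed prototypes,
# used as weights to aggregate the neighbors' prototypes.
h1 = nn.Linear(64, 64)  # stands in for the learnable transformation h_1
h2 = nn.Linear(64, 64)  # stands in for the learnable transformation h_2

def neighbor_message(p_y, p_neighbors):
    # p_y: (d,) prototype of class y; p_neighbors: (|N_y|, d) neighbor prototypes
    weights = torch.cosine_similarity(h1(p_y).unsqueeze(0), h2(p_neighbors), dim=1)
    return (weights.unsqueeze(1) * p_neighbors).sum(dim=0)  # P_{N_y -> y}^{t+1}
```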


The prototype update also incorporates the class's own message P_{y→y}^{t+1} = a(P_y^t, P_y^t) P_y^t, combined with the neighbor message through a gate g:

\[ P_y^{t+1} = g \, P_{y \to y}^{t+1} + (1 - g) \, P_{\mathcal{N}_y \to y}^{t+1}, \qquad g = \frac{\exp[\gamma \cos(P_y^0, P_{y \to y}^{t+1})]}{\exp[\gamma \cos(P_y^0, P_{y \to y}^{t+1})] + \exp[\gamma \cos(P_y^0, P_{\mathcal{N}_y \to y}^{t+1})]} \tag{5} \]
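A sketch of the gate in Eq. (5); the temperature γ = 5 is an illustrative value rather than the paper's setting:

```python
import torch

# Sketch of Eq. (5): the gate g weighs the self message against the aggregated
# neighbor message by their cosine similarity to the initial prototype P_y^0.
def gated_update(p0_y, self_msg, neighbor_msg, gamma=5.0):
    scores = torch.stack([
        gamma * torch.cosine_similarity(p0_y, self_msg, dim=0),
        gamma * torch.cosine_similarity(p0_y, neighbor_msg, dim=0),
    ])
    g = torch.softmax(scores, dim=0)[0]            # weight on the self message
    return g * self_msg + (1 - g) * neighbor_msg   # P_y^{t+1}
```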
Graph Meta-Learning: Gated Propagation Network (GPN)

To capture different types of relations, the paper applies k modules of the attentive and gated propagation as multi-head attention:

\[ P_y^{t+1} = \frac{1}{k} \sum_{i=1}^{k} P_y^{t+1}[i] \tag{6} \]

The final prototype of class y, after T propagation steps, is given by:

\[ P_y = \lambda \times P_y^0 + (1 - \lambda) \times P_y^T \tag{7} \]
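Eqs. (6) and (7) amount to a per-head average followed by a convex combination with the initial prototype; a minimal sketch:

```python
import torch

# Sketch of Eqs. (6)-(7): average the k per-head prototypes, then mix the
# propagated prototype after T steps with the initial one via lambda.
def multi_head_average(head_prototypes):
    # head_prototypes: (k, d), one updated prototype per attention head
    return head_prototypes.mean(dim=0)          # Eq. (6)

def final_prototype(p0_y, pT_y, lam=0.5):
    # lam is a hyperparameter; 0.5 is an arbitrary illustrative value
    return lam * p0_y + (1 - lam) * pT_y        # Eq. (7)
```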



GPN Training

Generating training tasks by


subgraph sampling: Random
sampling (Randomly selects N
classes) and Snowball sampling
(Sequentially selects
N-classes).
Building propagation
pathways by maximum
spanning tree: Maximum
spanning tree (MST) build
through weighted cosine
similarity of the prototypes.
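One way to realize the MST step, sketched with SciPy by negating (and shifting) the cosine-similarity weights so that a minimum-spanning-tree routine yields the maximum spanning tree; the exact construction in the paper may differ:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

# Sketch: build a maximum spanning tree over pairwise cosine similarities
# of class prototypes to define the propagation pathways.
def propagation_pathways(prototypes):
    # prototypes: (n, d) array of class prototypes
    normed = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sim = normed @ normed.T                  # pairwise cosine similarity in [-1, 1]
    weights = -(sim + 2.0)                   # shift + negate: min-ST becomes max-ST
    np.fill_diagonal(weights, 0.0)           # zero entries mean "no edge" for SciPy
    mst = minimum_spanning_tree(weights)     # MST on the dense similarity graph
    return mst.toarray() != 0                # boolean adjacency of the tree edges
```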



Experiments



Ablation Study

The figure shows that GPN is able to map prototypes of the same classes close together even when they come from different tasks, in contrast to prototypical networks.

Figure: Prototypical Networks vs. GPN



The End

