
Lecture Series: AI is the New Electricity

Deep Learning - SCOPING, EVOLUTION & FUTURE TRENDS

Dr. Chiranjit Acharya


AILABS Academy
J-3, GP Block, Sector V, Salt Lake City, Kolkata, West Bengal 700091

Presented at AILABS Academy, Kolkata on April 18th, 2018

Confidential, unpublished property of aiLabs. Do not duplicate or distribute. Use and distribution limited solely to authorized personnel. (c) Copyright 2018
A Journey into Deep Learning

▪Cutting-edge technology

▪Has garnered traction in both industry and academia
▪Achieves near-human-level performance in many pattern recognition tasks
▪Excels in
▪structured, relational data
▪unstructured rich-media data such as image, video, audio and text

AILABS (c) Copyright 2018 2


A Journey into Deep Learning

▪What is Deep Learning? Where is the “deepness”?

▪Where does Deep Learning come from?

▪What are the models and algorithms of Deep Learning?

▪What is the trajectory of evolution of Deep Learning?

▪What are the future trends of Deep Learning?

AILABS (c) Copyright 2018 3


A Journey into Deep Learning

AILABS (c) Copyright 2018 4


Artificial Intelligence

Holy Grail of AI Research

▪Understanding the neuro-biological and neuro-physical basis of human intelligence
▪science of intelligence
▪Building intelligent machines which can think and act like humans
▪engineering of intelligence

AILABS (c) Copyright 2018 5


Artificial Intelligence

Facets of AI Research
▪knowledge representation
▪reasoning
▪natural language understanding
▪natural scene understanding

AILABS (c) Copyright 2018 6


Artificial Intelligence

Facets of AI Research
▪natural speech understanding
▪problem solving
▪perception
▪learning
▪planning

AILABS (c) Copyright 2018 7


Machine Learning

Basic Doctrine of Learning


▪learning from examples
Outcome of Learning
▪rules of inference for some predictive task
▪embodiment of the rules = the model
▪a model is an abstract computing device
•e.g. kernel machine, decision tree, neural network

AILABS (c) Copyright 2018 8


Machine Learning

Connotations of Learning

▪process of generalization

▪discovering nature/traits of data

▪unraveling patterns and anti-patterns in data

AILABS (c) Copyright 2018 9


Machine Learning

Connotations of Learning

▪knowing distributional characteristics of data

▪identifying causal effects and propagation

▪identifying non-causal co-variations & correlations

AILABS (c) Copyright 2018 10


Machine Learning

Design Aspects of Learning System


▪ Choose the training experience

▪ Choose exactly what is to be learned, i.e. the target function / machine

▪ Choose the objective function & optimality criteria

▪ Choose a learning algorithm to infer the target function from the experience

AILABS (c) Copyright 2018 11


Learning Work Flow

▪Stage 1: Feature Extraction, Feature Subset Selection, Feature Vector Representation

▪Stage 2: Training / Testing Set Creation and Augmentation

▪Stage 3: Training the Inference Machine

▪Stage 4: Running the Inference Machine on Test Set

▪Stage 5: Stratified Sampling and Validation


AILABS (c) Copyright 2018 12
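A minimal sketch of the five-stage workflow above, assuming a scikit-learn toolchain; the digits dataset, the k=32 feature selection, and the decision-tree classifier are illustrative choices, not taken from the slides.

```python
# Illustrative five-stage learning workflow (assumed scikit-learn toolchain).
from sklearn.datasets import load_digits
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import precision_score, recall_score

# Stage 1: feature extraction / subset selection / feature vector representation
X_raw, y = load_digits(return_X_y=True)           # each row is already a feature vector
selector = SelectKBest(f_classif, k=32)            # keep the 32 most informative features
X = selector.fit_transform(X_raw, y)

# Stage 2: training / testing set creation (augmentation omitted for brevity)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Stage 3: train the inference machine
model = DecisionTreeClassifier(max_depth=10, random_state=0)
model.fit(X_train, y_train)

# Stage 4: run the inference machine on the test set
y_pred = model.predict(X_test)

# Stage 5: stratified sampling and validation (precision & recall per stratum)
for fold, (_, idx) in enumerate(StratifiedKFold(n_splits=3).split(X_test, y_test)):
    p = precision_score(y_test[idx], y_pred[idx], average="macro")
    r = recall_score(y_test[idx], y_pred[idx], average="macro")
    print(f"stratified sub-sample {fold}: precision={p:.3f} recall={r:.3f}")
```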
Feature Extraction / Selection

(Diagram: a domain expert and a knowledge engineer extract cognitive elements from the corpus (low-level, mid-level and high-level parts plus additional descriptors); a sparse coder turns these into a sparse representation.)

AILABS (c) Copyright 2018 13


Training Set Augmentation

(Diagram: a random sampler draws samples from the sparse representation; after review, the samples are merged with the existing training set to form the augmented training set.)

AILABS (c) Copyright 2018 14


Training and Prediction / Recognition

(Diagram: the training set feeds an adaptive learner that produces a prediction / recognition model; the model is then run on the unlabelled residual corpus to yield the predicted / recognized corpus.)

AILABS (c) Copyright 2018 15


Sampling , Validation & Convergence

(Diagram: a stratified sampler draws stratified sub-samples from the predicted corpus; a human reviewer produces reviewed sub-samples; a precision & recall calculator compares the two. If the scores have converged, relevance scoring ends; otherwise the loop goes back to training set augmentation.)
AILABS (c) Copyright 2018 16
Evolution of Connectionist Models

1943: Artificial neuron model (McCulloch & Pitts)

▪ "A logical calculus of the ideas immanent in nervous activity"

▪ simple artificial “neurons” could be made to perform basic logical operations such as AND, OR and NOT

▪ known as Linear Threshold Gate

▪ NO learning

AILABS (c) Copyright 2018 17


Evolution of Connectionist Models

1943: Artificial neuron model (McCulloch & Pitts)


(Diagram: inputs x_1 … x_n feed neuron j through weights w_1j, w_2j, …, w_nj, with bias b_j; the neuron emits output y_j.)

  s_j = \sum_{i=0}^{n} w_{ij} x_i + b_j, \qquad y_j = f(s_j)

AILABS (c) Copyright 2018 18
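A small sketch of the Linear Threshold Gate above: with hand-picked weights and bias (no learning, as the slide notes) it reproduces AND, OR and NOT. The particular weight values are illustrative assumptions.

```python
# McCulloch-Pitts linear threshold gate: y = 1 if sum(w_i * x_i) + b >= 0 else 0.
# Weights and biases are hand-picked (no learning), purely to illustrate the idea.
def threshold_gate(weights, bias):
    def gate(*inputs):
        s = sum(w * x for w, x in zip(weights, inputs)) + bias
        return 1 if s >= 0 else 0
    return gate

AND = threshold_gate([1, 1], bias=-2)   # fires only when both inputs are 1
OR  = threshold_gate([1, 1], bias=-1)   # fires when at least one input is 1
NOT = threshold_gate([-1], bias=0)      # inverts its single input

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", AND(a, b), "OR:", OR(a, b))
print("NOT 0:", NOT(0), "NOT 1:", NOT(1))
```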
Evolution of Connectionist Models

1957: Perceptron model (Rosenblatt)


▪ invention of learning rules inspired by ideas from neuroscience

  if Σ_i (input_i × weight_i) > threshold, output = +1
  if Σ_i (input_i × weight_i) < threshold, output = −1

▪ learns to classify input into two output classes
▪ Sigmoid transfer function: boundedness, graduality

  y → 1 as x → +∞
  y → 0 as x → −∞

AILABS (c) Copyright 2018 19
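A sketch of the perceptron's mistake-driven learning rule on a toy linearly separable problem (logical AND in ±1 form); the data, learning rate and epoch count are illustrative assumptions, not from the slides.

```python
# Rosenblatt-style perceptron update on a toy linearly separable problem.
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([-1, -1, -1, +1])           # targets: logical AND in +/-1 form
w, b, lr = np.zeros(2), 0.0, 0.1

for epoch in range(20):
    for x, target in zip(X, t):
        y = 1 if w @ x + b > 0 else -1   # thresholded output
        if y != target:                  # update weights only on mistakes
            w += lr * target * x
            b += lr * target

print("weights:", w, "bias:", b)
print("predictions:", [1 if w @ x + b > 0 else -1 for x in X])
```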


Evolution of Connectionist Models

1957: Perceptron model (Rosenblatt), with sigmoid transfer function

(Diagram: inputs x_1 … x_n feed neuron j through weights w_1j, w_2j, …, w_nj, with bias b_j; the neuron emits output y_j.)

  s_j = \sum_{i=0}^{n} w_{ij} x_i + b_j, \qquad y_j = f(s_j) = \frac{1}{1 + e^{-s_j}}

AILABS (c) Copyright 2018 20
Evolution of Connectionist Models

1960s: Delta Learning Rule (Widrow & Hoff)


▪ Define the error as the squared residuals summed over all training cases:

  E = \frac{1}{2} \sum_{n} (y_n - \hat{y}_n)^2

▪ Now differentiate to get error derivatives for the weights:

  \frac{\partial E}{\partial w_i} = \frac{1}{2} \sum_{n} \frac{\partial \hat{y}_n}{\partial w_i} \frac{\partial E_n}{\partial \hat{y}_n} = -\sum_{n} x_{i,n} \, (y_n - \hat{y}_n)

▪ The batch delta rule changes the weights in proportion to their error derivatives summed over all training cases:

  \Delta w_i = -\varepsilon \, \frac{\partial E}{\partial w_i}
AILABS (c) Copyright 2018 21
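A sketch of the batch delta rule written above, applied to an assumed toy linear problem; the data, learning rate ε and epoch count are illustrative choices.

```python
# Batch delta rule (Widrow-Hoff / LMS) on a toy linear regression problem.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                  # 100 training cases, 3 inputs
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100)    # noisy linear targets

w = np.zeros(3)
eps = 0.002                                    # learning rate (epsilon)
for epoch in range(500):
    y_hat = X @ w                              # linear neuron output
    dE_dw = -X.T @ (y - y_hat)                 # error derivatives summed over all cases
    w -= eps * dE_dw                           # delta rule: w += -eps * dE/dw

print("learned weights:", np.round(w, 3))      # should approach true_w
```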
Evolution of Connectionist Models

1969: Minsky's objection to Perceptrons

▪ Marvin Minsky & Seymour Papert: Perceptrons

▪ Unless input categories are linearly separable, a perceptron cannot learn to discriminate between them.

▪ Unfortunately, it appeared that many important categories were not linearly separable.

AILABS (c) Copyright 2018 22


Evolution of Connectionist Models
1969: Minsky's objection to Perceptrons
Perceptrons are good at linear classification but ...
(Figure: points of class +1 and class −1 in the x1-x2 plane, separable by a straight line.)

AILABS (c) Copyright 2018 23


Evolution of Connectionist Models

1969: Minsky's objection to Perceptrons


Perceptrons are incapable of simple nonlinear classification like XOR

X1  X2  Output
0   0   0
0   1   1
1   0   1
1   1   0
(XOR operation)

(Figure: the four XOR points plotted in the x1-x2 plane; the two classes cannot be separated by a single straight line.)

AILABS (c) Copyright 2018 24


Universal Approximation Theorem

Existential Version (Kolmogorov)

▪ There exists a finite combination of superposition and addition of continuous functions of single variables which can approximate any continuous, multivariate function on compact subsets of R^d.

Constructive Version (Cybenko)


▪ The standard multilayer feed-forward network with a single hidden layer, containing a finite number of hidden neurons, is a universal approximator among continuous functions on compact subsets of R^d, under mild assumptions on the activation function.

AILABS (c) Copyright 2018 25
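In symbols, the constructive version says the approximator can be written in the single-hidden-layer form below; this is the standard notation for Cybenko's result, not taken from the slides.

```latex
% Single-hidden-layer approximator: for a continuous f on a compact set
% K \subset \mathbb{R}^d and any \varepsilon > 0 there exist N, v_i, w_i, b_i with
F(x) = \sum_{i=1}^{N} v_i \, \sigma\!\left( w_i^{\top} x + b_i \right),
\qquad
\sup_{x \in K} \bigl| F(x) - f(x) \bigr| < \varepsilon .
```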


Evolution of Connectionist Models

1986: Backpropagation for Multi-Layer Perceptrons


(Rumelhart, Hinton & Williams)
▪ solution to Minsky's objection regarding the perceptron's limitation

▪ nonlinear classification is achieved by fully connected, multilayer, feedforward networks of perceptrons (MLP)

▪ MLP can be trained by backpropagation

▪ Two-pass algorithm
▪ forward propagation of activation signals from input to output
▪ backward propagation of error derivatives from output to input

AILABS (c) Copyright 2018 26
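A minimal two-pass backpropagation sketch: a small sigmoid MLP learning XOR, the very problem a single perceptron cannot solve. The 2-4-1 architecture, learning rate and epoch count are illustrative choices, not from the slides.

```python
# Two-pass backpropagation on a tiny MLP (2-4-1, sigmoid units) solving XOR.
import numpy as np

rng = np.random.default_rng(42)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)        # XOR targets

W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

lr = 0.5
for _ in range(20000):
    # forward pass: propagate activation signals from input to output
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # backward pass: propagate error derivatives from output to input
    d_out = (out - y) * out * (1 - out)                 # dE/ds at the output layer
    d_h = (d_out @ W2.T) * h * (1 - h)                  # dE/ds at the hidden layer

    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0, keepdims=True)

print(np.round(out, 2))   # typically close to [[0], [1], [1], [0]]
```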


Evolution of Connectionist Models

1986: Backpropagation for Multi-Layer Perceptrons


(Rumelhart, Hinton & Williams)

(Diagram: inputs x_1 … x_N pass through an input layer, hidden layers (Layer 1, Layer 2) and an output layer producing y_1 … y_M; the layers are fully connected.)

AILABS (c) Copyright 2018 27




Machine Learning Example

Handwritten Digit Recognition

(Figure: an image of a handwritten digit is fed to the machine, which outputs “2”.)

AILABS (c) Copyright 2018 29


Handwritten Digit Recognition

Input: a 16 × 16 image flattened into 256 values x_1 … x_256 (inked pixel → 1, blank pixel → 0).
Output: ten values y_1 … y_10, one per digit; each output represents the confidence of a digit.
Example: y_1 = 0.1 (“is 1”), y_2 = 0.7 (“is 2”), …, y_10 = 0.2 (“is 0”); the image is recognized as “2”.

AILABS (c) Copyright 2018 30
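A sketch of the forward pass of such a 256-input, 10-output network. The 30-unit hidden layer and the random (untrained) weights are placeholders for illustration; a real recognizer would learn these weights by backpropagation.

```python
# Forward pass of an illustrative 256-30-10 network for 16x16 digit images.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(scale=0.1, size=(256, 30)), np.zeros(30)
W2, b2 = rng.normal(scale=0.1, size=(30, 10)), np.zeros(10)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def predict(image_16x16):
    x = image_16x16.reshape(256)          # flatten: inked pixel = 1, blank = 0
    h = np.tanh(x @ W1 + b1)              # hidden layer
    return softmax(h @ W2 + b2)           # 10 confidences, one per digit

x = (rng.random((16, 16)) > 0.5).astype(float)   # stand-in for a real digit image
y = predict(x)
print("confidences:", np.round(y, 2), "-> digit index", int(np.argmax(y)))
```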


Example Application

Handwritten Digit Recognition

(Figure: the 256 inputs x_1 … x_256 go into the machine, which produces the ten outputs y_1 … y_10 and recognizes the digit “2”.)

AILABS (c) Copyright 2018 31


Evolution of Connectionist Models
1989: Convolutional Neural Network (LeCun)
(Diagram: inputs x_1 … x_N pass through hidden layers Layer 1, Layer 2, …, Layer L to outputs y_1 … y_M; each node is a neuron.)

“Deep” means many hidden layers.


AILABS (c) Copyright 2018 32
Convolutional Neural Network

▪ The input can have very high dimension.
▪ Using a fully-connected neural network would need a large number of parameters.
▪ CNNs are a special type of neural network whose hidden units are connected only to a local receptive field.
▪ The number of parameters needed by CNNs is much smaller.

Example: 200 × 200 image
a) fully connected: 40,000 hidden units => 1.6 billion parameters
b) CNN: 5 × 5 kernel (filter), 100 feature maps => 2,500 parameters

AILABS (c) Copyright 2018 33
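The counts in the example follow from simple products (bias terms ignored):

```latex
% fully connected: each of the 40,000 hidden units connects to all 200 x 200 pixels
200 \times 200 \times 40\,000 = 1.6 \times 10^{9} \ \text{parameters}
% CNN: 100 feature maps, each sharing a single 5 x 5 kernel across the whole image
5 \times 5 \times 100 = 2\,500 \ \text{parameters}
```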


Convolution Operation

(Figure: a convolution kernel slides over the input image one patch at a time.)

AILABS (c) Copyright 2018 34


Convolution Operation in CNN
▪ Input: an image (2-D array): x
▪ Convolution kernel (2-D array of learnable parameters): w
▪ Feature map (2-D array of processed data): s
▪ Convolution operation in 2-D domains:

  s(i, j) = (x * w)(i, j) = \sum_{m} \sum_{n} x(m, n) \, w(i - m, j - n)

AILABS (c) Copyright 2018 35
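A minimal sketch of the 2-D operation above, implemented as cross-correlation (the convention most CNN frameworks use and still call "convolution"); the 6 × 6 image and 3 × 3 edge-detecting kernel are illustrative.

```python
# Minimal 2-D "convolution" (cross-correlation) producing a feature map.
import numpy as np

def conv2d(x, w):
    H, W = x.shape
    kH, kW = w.shape
    out = np.zeros((H - kH + 1, W - kW + 1))        # "valid" output size
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kH, j:j + kW] * w)
    return out

image = np.zeros((6, 6)); image[:, 3:] = 1.0        # left half dark, right half bright
kernel = np.array([[1., 0., -1.],
                   [1., 0., -1.],
                   [1., 0., -1.]])                   # vertical-edge detector

print(conv2d(image, kernel))                         # non-zero response along the edge
```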


Convolution Filters

AILABS (c) Copyright 2018 36


Convolution Operation with Filters

AILABS (c) Copyright 2018 37


Convolution Layers

(Figure: a convolution layer maps the input channels to output feature maps.)

AILABS (c) Copyright 2018 38


3 Stages of a Convolutional Layer

(Figure: the three stages: convolution, nonlinear activation (detector), pooling.)

AILABS (c) Copyright 2018 39


Non Linear Stage

(Figure: plots of the tanh(x) and ReLU activation functions.)

AILABS (c) Copyright 2018 40
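The two nonlinearities in the figure, written out for reference (plotting omitted):

```python
# The tanh and ReLU nonlinearities used in the detector stage.
import numpy as np

def tanh(x):
    return np.tanh(x)

def relu(x):
    return np.maximum(0.0, x)      # rectified linear unit: max(0, x)

xs = np.linspace(-3, 3, 7)
print("x:   ", xs)
print("tanh:", np.round(tanh(xs), 2))
print("relu:", relu(xs))
```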


Evolution of Connectionist Models
2006: Deep Belief Networks (Hinton), Stacked Auto-Encoders
(Bengio)
(Diagram: inputs x_1 … x_N pass through hidden layers Layer 1, Layer 2, …, Layer L to outputs y_1 … y_M; each node is a neuron.)

“Deep” means many hidden layers.


AILABS (c) Copyright 2018 41
Deep Learning

Traditional pattern recognition models use hand-crafted features and a relatively simple trainable classifier.

(Diagram: input → hand-crafted feature extractor → “simple” trainable classifier → output)

This approach has the following limitations:

• It is very tedious and costly to develop hand-crafted features
• The hand-crafted features are usually highly dependent on one application, and cannot be transferred easily to other applications

AILABS (c) Copyright 2018 42


Deep Learning

Deep learning = representation learning


Seeks to learn hierarchical representations (i.e. features) automatically through a multi-stage feature learning process.

(Diagram: low-level features → mid-level features → high-level features → trainable classifier → output)

Feature visualization of convolutional net trained on ImageNet (Zeiler and Fergus, 2013)

AILABS (c) Copyright 2018 43


Learning Hierarchical Representations

(Diagram: low-level features → mid-level features → high-level features → trainable classifier → output, with increasing level of abstraction.)

Hierarchy of representations with increasing level of abstraction.


Each stage is a kind of trainable nonlinear feature transformation
Image recognition
Pixel → edge → motif → part → object
Text
Character → word → word group → clause → sentence → story

AILABS (c) Copyright 2018 44


Pooling
Common pooling operations:
Max pooling
Report the maximum output within a rectangular neighborhood.
Average pooling
Report the average output of a rectangular neighborhood (possibly weighted by the distance from the central pixel).

AILABS (c) Copyright 2018 45
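A sketch of both pooling operations on an assumed 4 × 4 feature map, with a 2 × 2 window and stride 2; the values are illustrative.

```python
# 2x2 max pooling and average pooling with stride 2 on a toy feature map.
import numpy as np

def pool2d(s, size=2, mode="max"):
    H, W = s.shape
    out = np.zeros((H // size, W // size))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            window = s[i * size:(i + 1) * size, j * size:(j + 1) * size]
            out[i, j] = window.max() if mode == "max" else window.mean()
    return out

feature_map = np.array([[1., 3., 2., 0.],
                        [4., 6., 1., 1.],
                        [0., 2., 5., 7.],
                        [1., 1., 8., 6.]])

print("max pooled:\n", pool2d(feature_map, mode="max"))
print("avg pooled:\n", pool2d(feature_map, mode="avg"))
```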


CIFAR-10

AILABS (c) Copyright 2018 46


Deep CNN on CIFAR-10

AILABS (c) Copyright 2018 47


Deep CNN on CIFAR-10

AILABS (c) Copyright 2018 48


Deep CNN on CIFAR-10

AILABS (c) Copyright 2018 49


Future Trends

▪ A different and wider range of problems is being addressed
▪ natural language understanding
▪ natural scene understanding
▪ natural speech understanding

▪ Feature learning is being investigated at a deeper level

▪ Manifold learning

▪ Reinforcement learning

▪ Integration with other paradigms of machine learning

AILABS (c) Copyright 2018 50


Thank You
