
Bhujbal Knowledge City

MET Institute of Engineering

Modelling and designing of CNN for feature abstraction

Presented By :- Krusha Sandip Joshi

Guided By – Dr. Kalpana V. Metre

Department of Information Technology



Contents :

• Introduction

INTRODUCTION
The task of image captioning can be divided logically into two modules: an image-based
model, which extracts the features and nuances of the image, and a language-based
model, which translates the features and objects produced by the image-based model
into a natural-language sentence.
For the image-based model (the encoder) we usually rely on a Convolutional Neural
Network. For the language-based model (the decoder) we rely on a Recurrent Neural
Network.

CNN
A convolutional neural network (CNN) is a specific type of artificial neural
network built from perceptron-like units (neurons) and typically trained with
supervised learning to analyze data.
CNNs are applied to image processing, natural language processing and other
kinds of cognitive tasks.
CNNs can encode abstract features from images. These can then be used
for classification, object detection, segmentation, captioning and various
other tasks.

CNN
Convolutional networks are trainable multistage architectures, with each
stage consisting of multiple layers.
The input and output of each stage are sets of arrays called feature maps.
For a colored image, each input feature map is a 2D array containing one
color channel of the image. For a video, each feature map is a 3D array, and
for an audio input a 1D array.
E.g., an RGB image of size 6 x 6 is a 6 x 6 x 3 array: three 6 x 6 matrices,
one per color channel.
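
The 6 x 6 x 3 example can be sketched in NumPy as follows. This is only an illustration: the pixel values are random and the variable names are assumptions, not part of the original slides.

```python
import numpy as np

# Hypothetical 6 x 6 RGB image stored as height x width x channels.
# The pixel values are random placeholders in the range 0..255.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(6, 6, 3), dtype=np.uint8)

# Each color channel is one 2D feature map of the input.
red, green, blue = (image[:, :, c] for c in range(3))

print(image.shape)  # (6, 6, 3)
print(red.shape)    # (6, 6)
```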

Layers in CNN model


The figure below shows the complete flow of a CNN that processes an input image
and classifies the objects based on the computed values.

Convolution Layer
Convolution is the first layer to extract features from an input image.
This layer is the core building block of a CNN. The layer's parameters consist
of learnable kernels or filters, which extend through the full depth of the input.
Convolution is a mathematical operation that takes two inputs: an image matrix
and a filter (kernel).
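
A minimal sketch of this operation, assuming no padding and a stride of 1 (neither is specified on the slide). The function name `conv2d` and the all-ones inputs are illustrative only. As in most CNN libraries, the kernel is not flipped, so strictly speaking this computes a cross-correlation.

```python
import numpy as np

def conv2d(image, kernel):
    """Slide one kernel over the image (no padding, stride 1).

    image:  array of shape (H, W, C)
    kernel: array of shape (kH, kW, C), spanning the full input depth
    Returns one 2D feature map of shape (H - kH + 1, W - kW + 1).
    """
    H, W, C = image.shape
    kH, kW, kC = kernel.shape
    assert C == kC, "kernel must extend through the full depth of the input"
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kH, j:j + kW, :]
            # Element-wise product of patch and kernel, summed to one number.
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.ones((6, 6, 3))    # placeholder 6 x 6 RGB input
kernel = np.ones((3, 3, 3))   # placeholder 3 x 3 filter over 3 channels
feature_map = conv2d(image, kernel)
print(feature_map.shape)      # (4, 4)
print(feature_map[0, 0])      # 27.0
```

In a trained network the kernel values are learned, and a layer applies many such kernels to produce one feature map per kernel.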


Non-linearity Layer
This is a layer of neurons that apply an activation function.
The activation functions are typically sigmoid, tanh and ReLU.
ReLU stands for rectified linear unit and is a type of activation function.
Mathematically, it is defined as y = max(0, x).
Activation functions help the network make sense of, and extract knowledge
from, large and complicated datasets.
They make the network more powerful: they give it the ability to learn
complex patterns from data and to represent arbitrary non-linear functional
mappings between inputs and outputs.
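
The three activation functions named above can be written in a few lines of NumPy. This is a sketch for illustration, and the sample input `x` is arbitrary.

```python
import numpy as np

def relu(x):
    # y = max(0, x), applied element-wise
    return np.maximum(0, x)

def sigmoid(x):
    # Squashes any real value into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

# tanh is available directly as np.tanh and maps values into (-1, 1).

x = np.array([-2.0, 0.0, 3.0])
print(relu(x))      # [0. 0. 3.]
print(sigmoid(x))
print(np.tanh(x))
```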

Pooling Layer
Pooling layers reduce the number of parameters in the layers that follow when
the images are too large.
Pooling is done mainly to reduce the spatial size of the feature maps.
Spatial pooling, also called subsampling or downsampling, reduces the
dimensionality of each map but retains the important information.
Spatial pooling can be of different types:
• Max Pooling
• Average Pooling
• Sum Pooling
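
The three pooling types can be sketched as below, assuming non-overlapping 2 x 2 windows (the window size is not given on the slide). The helper `pool2d` and the sample feature map are hypothetical.

```python
import numpy as np

def pool2d(fmap, size=2, mode="max"):
    """Pool a 2D feature map over non-overlapping size x size windows."""
    H, W = fmap.shape
    H2, W2 = H // size, W // size
    # Crop to a multiple of the window size, then split into blocks so that
    # blocks[i, a, j, b] == fmap[i * size + a, j * size + b].
    blocks = fmap[:H2 * size, :W2 * size].reshape(H2, size, W2, size)
    reducers = {"max": np.max, "average": np.mean, "sum": np.sum}
    return reducers[mode](blocks, axis=(1, 3))

fmap = np.array([[1, 3, 2, 4],
                 [5, 7, 6, 8],
                 [9, 2, 1, 0],
                 [3, 4, 5, 6]], dtype=float)

print(pool2d(fmap, mode="max"))      # [[7. 8.] [9. 6.]]
print(pool2d(fmap, mode="average"))  # [[4. 5.] [4.5 3.]]
print(pool2d(fmap, mode="sum"))      # [[16. 20.] [18. 12.]]
```

Each 4 x 4 map shrinks to 2 x 2, so the spatial size is halved in both directions while the strongest (or average, or total) response of each window is kept.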

Fully Connected Layer


In the layer we call the FC layer, we flatten the feature-map matrix into a
vector and feed it into a fully connected layer, as in an ordinary neural network.
With the fully connected layers, we combine these features to create a model.
It generates global semantic information.
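
A minimal sketch of the flatten-then-FC step. The shapes (2 x 2 x 8 pooled maps, 10 output classes), the random weights and the softmax at the end are all assumptions made for illustration; the slides do not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical output of the last pooling stage: 8 feature maps of size 2 x 2.
pooled = rng.standard_normal((2, 2, 8))

# Flatten the feature maps into a single vector of length 2 * 2 * 8 = 32.
x = pooled.reshape(-1)

# Fully connected layer: every output unit sees every input value.
# Weights are random placeholders; in practice they are learned.
W = rng.standard_normal((10, x.size)) * 0.1
b = np.zeros(10)
scores = W @ x + b

def softmax(z):
    # Assumed output normalisation: turns scores into class probabilities.
    e = np.exp(z - z.max())
    return e / e.sum()

probs = softmax(scores)
print(x.shape, scores.shape)  # (32,) (10,)
```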

Advantages of Using CNN:



The use of CNNs is motivated by the fact that they can capture relevant features
from an image at different levels, similar to a human brain. This is feature
learning. Conventional fully connected neural networks do not exploit the spatial
structure of an image in this way.
Another main feature of CNNs is weight sharing: the same filter weights are applied
at every position of the input.
As a result, a CNN is more efficient in terms of memory and complexity.
For a completely new task or problem, CNNs trained beforehand on another dataset are
very good feature extractors. This is called pre-training, and CNNs are very
effective in such tasks.
Another advantage of pre-training is that we avoid training the CNN itself and save
memory and time. The only thing you have to train is the classifier at the end for
your labels.
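
The memory argument behind weight sharing can be made concrete with a rough parameter count. The sizes used here (a 32 x 32 x 3 input, 16 filters of size 3 x 3) are assumptions chosen for illustration, and the two helper functions are hypothetical.

```python
def dense_params(n_in, n_out):
    # Fully connected layer: one weight per input-output pair, plus biases.
    return n_in * n_out + n_out

def conv_params(k_h, k_w, c_in, n_filters):
    # Convolution layer: each filter has k_h * k_w * c_in shared weights
    # plus one bias, regardless of the image size.
    return (k_h * k_w * c_in + 1) * n_filters

h, w, c = 32, 32, 3   # assumed input size
n_filters = 16        # assumed number of filters / output maps

# Both layers map the input to 16 output maps of size 32 x 32.
dense = dense_params(h * w * c, h * w * n_filters)
conv = conv_params(3, 3, c, n_filters)

print(conv)   # 448
print(dense)  # 50348032
```

Under these assumptions the convolution layer needs a few hundred parameters where a fully connected layer of the same output size needs tens of millions.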

Applications of CNN:
1) Speech Recognition: Convolutional Neural Networks have recently been used in
Speech Recognition and have been reported to give better results than fully
connected Deep Neural Networks (DNNs).
2) They are also used in object tracking and video classification.
3) They help in iterative image reconstruction and super-resolution of low-resolution images.
4) They support edge detection and semantic segmentation.

Thank You …
