B. TECH PROJECT
MID TERM REPORT

HANDWRITTEN DIGIT
RECOGNITION USING NEURAL
NETWORKS

Submitted To: Prof. M. V. Joshi


Submitted by:
- Ketan Mittal (201301141)
- Rahul Rathod (201301213)
Index

1. Introduction
2. Machine Learning
2.1 Supervised Learning
2.2 Unsupervised Learning
2.3 Linear Regression
2.4 Logistic Regression
3. Neural Networks
3.1 Introduction
3.2 Transfer Function
3.3 Perceptron
3.3.1 Implementation of a NAND Gate Using a Perceptron
3.4 Sigmoid Neuron
3.5 Cost Function
3.6 Gradient Descent
3.7 Backpropagation Algorithm
4. Future Timeline of the Project
5. References
INTRODUCTION

Handwritten digit recognition is the ability of a computer to infer digits that
have been manually written by a user. The purpose of the project is to
train the computer using a set of predefined techniques, based on which it
will interpret handwritten digits from various documents. The project
involves an extensive study of machine learning and neural network
principles, which we have covered in this report. Our primary motivation
for taking up the project was a curiosity to build something substantial
after completing the course Introduction to Data Mining and Warehousing
(IT 633).

Although the topic of the project is not new, we decided to go ahead with
it and learn as much as we can over the course of the project. Before going
into the advanced concepts of neural networks involved in the project, we
describe the basics of some machine learning concepts that the project will
involve.

Machine Learning
Machine learning is the field that gives computers the ability to learn
without being explicitly programmed. It is closely related to data mining,
except that machine learning looks for patterns in data in order to make
predictions, while data mining focuses on extracting information from data
for human use.

Machine learning tasks are broadly classified into two categories:

● Supervised Learning
● Unsupervised Learning

Supervised Learning:
Supervised learning is the process of an algorithm learning from a
training dataset, where you have input variables (X) and an output variable
(Y), and you use the algorithm to learn the mapping function from the input
to the output:
Y = f(X)

Supervised learning problems can be further grouped into regression and
classification problems.
● Classification: In a classification problem, we try to predict results in
a discrete output, such as “0” or “1”.
● Regression: In a regression problem, we try to predict results in
a continuous output.

Examples of supervised machine learning algorithms are:

● Linear regression for regression problems.
● Random forests for classification and regression problems.
● Support vector machines for classification problems.

Unsupervised Learning:

Unsupervised learning is where you only have input data (X) and no
corresponding output variables. The system itself recognises correlations
and organises patterns into categories accordingly. The goal of
unsupervised learning is to model the structure in the data in order to
learn more about the data.

Unsupervised learning problems can be further grouped into clustering and
association problems.
● Clustering: A process of dividing a dataset into groups such that the
members of each group are as similar to one another as possible.
● Association: A process where you want to discover rules that
describe large portions of your data.

Examples of unsupervised learning algorithms are:

● Recommendation systems.
● k-means for clustering problems.
● The Apriori algorithm for association rule learning problems.

Linear Regression
A machine learning technique used to fit an equation, such as y = mx, to a
dataset. When the inputs are represented as vectors, the ‘theta’ matrix ϴ
satisfies:
Xϴ = Y
Multiplying both sides by the transpose of X and then making ϴ the subject
of our equation, we get the normal equation:
ϴ = (X^T X)^-1 X^T Y
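
As an illustration, here is a minimal Python/NumPy sketch of solving this
normal equation on made-up data:

    import numpy as np

    # Hypothetical training data: 4 examples with 2 features each
    X = np.array([[1.0, 2.0],
                  [2.0, 1.0],
                  [3.0, 4.0],
                  [4.0, 3.0]])
    Y = np.array([5.0, 4.0, 11.0, 10.0])

    # Normal equation: theta = (X^T X)^-1 X^T Y
    theta = np.linalg.inv(X.T @ X) @ X.T @ Y
    print(theta)      # parameters that minimise the squared error
    print(X @ theta)  # predictions for the training inputs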

Logistic Regression:
A machine learning technique used for binary classification, in which we
obtain a best-fit logistic function. In linear regression, the outcome
(dependent variable) is continuous: it can take any one of an infinite
number of possible values. In logistic regression, the outcome (dependent
variable) has only a limited number of possible values.
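
As a small sketch of how such a model yields a discrete outcome (the fitted
parameters below are made up; fitting itself is done by methods such as the
gradient descent described later):

    import numpy as np

    def predict(X, theta):
        # Logistic function applied to the linear combination X.theta
        p = 1.0 / (1.0 + np.exp(-(X @ theta)))
        # Threshold the probability to obtain a discrete 0/1 outcome
        return (p >= 0.5).astype(int)

    # Hypothetical fitted parameters and inputs
    theta = np.array([1.5, -2.0])
    X = np.array([[2.0, 1.0], [0.5, 2.0]])
    print(predict(X, theta))  # prints [1 0]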
NEURAL NETWORKS

The project involves an extensive study of neural networks. It is
therefore essential to know the basics of neural networks and to
understand their modelling before we proceed to implement our own
network.

Introduction

Neural networks are modelled on the human nervous system. They solve
problems just as a human brain does, with large clusters of nerve cells
(called neurons) connected to each other by axons. Each neural unit is
connected with many others, and links can be excitatory or inhibitory in
their effect on the activation state of connected neural units. Each
individual neural unit may have a summation function which combines the
values of all its inputs together.

Transfer Function

The behaviour of an artificial neural network depends on both the weights
and the input-output (transfer) function that is specified for the units.
We have studied two such functions in detail: the perceptron, where the
output is set at one of two levels depending on whether the total input is
greater than or less than some threshold value, and the sigmoid neuron,
where the output varies continuously, but not linearly, as the input
changes.

PERCEPTRON

A perceptron takes several binary values as input and gives a single binary
value as output; binary values are either 0 or 1. Consider a perceptron
with three inputs x1, x2 and x3, each of which is either 0 or 1, and one
output, which also has a binary value. There is a weight associated with
each input, used to scale the importance of that input up or down. These
weights are real numbers, and the output value is decided by whether the
weighted sum ∑ wj·xj exceeds a certain threshold value: the output is 0 if
∑ wj·xj is less than or equal to the threshold, and 1 otherwise.
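
As an illustration, this threshold rule can be sketched in a few lines of
Python (the weights and threshold below are made up):

    def perceptron(inputs, weights, threshold):
        # Weighted sum of the binary inputs
        weighted_sum = sum(w * x for w, x in zip(weights, inputs))
        # Output 1 only if the weighted sum exceeds the threshold
        return 1 if weighted_sum > threshold else 0

    # Illustrative example: three inputs with different importances
    print(perceptron([1, 0, 1], weights=[0.6, 0.2, 0.4], threshold=0.5))  # 1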

The perceptron can be modelled to have multiple layers in order to produce
more complex outputs. In the following figure, there are three layers: the
input layer, the output layer, and the hidden layer in between them. The
output of the input layer is fed as an input to the hidden layer, where
more complex modelling is done.

IMPLEMENTATION OF A NAND GATE USING A PERCEPTRON

To implement a NAND gate, it is essential that the output is 0 only when
both inputs are 1. The truth table of the NAND gate is as under:

x1  x2  Output
0   0   1
0   1   1
1   0   1
1   1   0

Consider two inputs x1 and x2, with a threshold value of -3, and let the
weight assigned to each input be -2.

When x1 and x2 are both 0, the weighted sum is -2·0 + -2·0 = 0. Since this
is greater than the threshold value, the output is 1. When one of the
inputs (say x1) is 1 and x2 is 0, the weighted sum is -2(1) + -2(0) = -2.
Since this is again greater than -3, the output is again 1. When both
inputs are equal to 1, the weighted sum is -4, which is less than our
threshold value, so the output is 0. Thus we have implemented the NAND
gate.
[Figure: a perceptron implementing the NAND gate, with inputs x1 and x2, weights -2 each, and threshold -3.]
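
As a check, here is a minimal Python sketch that runs the full NAND truth
table through the threshold rule with these weights and this threshold:

    def perceptron(inputs, weights, threshold):
        weighted_sum = sum(w * x for w, x in zip(weights, inputs))
        return 1 if weighted_sum > threshold else 0

    # NAND gate: weights of -2 on each input, threshold of -3
    for x1 in (0, 1):
        for x2 in (0, 1):
            out = perceptron([x1, x2], weights=[-2, -2], threshold=-3)
            print(x1, x2, "->", out)
    # Prints 1 for (0,0), (0,1), (1,0) and 0 for (1,1)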

Thus, by attaching different weights to the inputs and taking a suitable
value of the threshold, we can train/model our neural network as per our
needs. It should be noted that a small change in the weights or the
threshold value could completely reverse the output. However, in our
project, we desire that a small change in the weights should produce only
a small change in the output. That is to say, in our handwritten digit
recognition project, if our output produces 3 instead of 4, we want to be
able to make small changes in the weights to produce 4 as the output. In
the case of a perceptron, however, since the output is binary, a small
change can completely reverse the output. Hence, there is a need to
address this issue, which is done by the sigmoid neuron.

SIGMOID NEURON

The inputs here can be any real value ranging from 0 to 1, unlike the
perceptron, where we had binary inputs. The output is σ(w·x − t), where σ
is known as the sigmoid function, defined by:
σ(z) = 1/(1 + e^-z)

The sigmoid neuron closely resembles the perceptron. When the weighted
sum is a very large positive quantity, e^-z tends to 0 and the output
approaches 1. When the weighted sum is a very large negative quantity,
e^-z tends to infinity and the output approaches 0. For other values, the
output is a real number between 0 and 1.

Here is the graph of the sigmoid function, where z is the weighted sum and
the threshold value taken together.
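
As a small sketch, tabulating σ(z) in Python shows this limiting
behaviour:

    import numpy as np

    def sigmoid(z):
        # sigma(z) = 1 / (1 + e^-z)
        return 1.0 / (1.0 + np.exp(-z))

    # Large negative z gives ~0, large positive z gives ~1,
    # and values in between vary smoothly
    for z in (-10, -2, 0, 2, 10):
        print(z, round(float(sigmoid(z)), 4))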

COST FUNCTION

In order to come up with a correct output, our neural network should
produce a solution that is as close as possible to the desired solution.
That is, the difference between the actual solution and the solution
provided by our neural network should be as low as possible. To capture
this mathematically, we define a cost function, given by:

C(w, b) = (1/2n) ∑x ||y(x) − a||²

Here n is the total number of training inputs, y(x) is the actual desired
output, and a is the output produced by our neural network. Clearly, the
difference between y(x) and a should be as low as possible. We are finding
the Mean Square Error, which is why the terms are squared, and this value
should be as close to 0 as possible. Note that a perfectly accurate output
gives a cost of 0.
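
A minimal Python sketch of this cost function (the outputs below are
made-up values):

    import numpy as np

    def quadratic_cost(y, a):
        # Mean squared error: (1/2n) * sum of squared differences
        n = len(y)
        return np.sum((y - a) ** 2) / (2 * n)

    y = np.array([1.0, 0.0, 1.0])   # desired outputs
    a = np.array([0.9, 0.2, 0.8])   # hypothetical network outputs
    print(quadratic_cost(y, a))     # small positive; 0 for a perfect network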

We need to find a set of weights for which the cost function is as small as
possible. We will do this using an algorithm known as gradient descent.

GRADIENT DESCENT

Gradient descent is an iterative approach in which we keep decreasing the
value of the cost function until it no longer decreases. It relies on the
fact that if a function is defined and differentiable in the neighbourhood
of a point, then the function decreases fastest if one moves from that
point in the direction of the negative gradient of the function at that
point. That is, the update at each step is -𝛾∇F(x), where 𝛾 is known as
the learning rate. We keep decreasing the value of the cost function until
we reach a minimum (ideally the global minimum).
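
As a simple illustration, here is a Python sketch of gradient descent on
the one-variable function F(x) = (x - 3)^2, whose minimum is at x = 3 (the
starting point and learning rate are arbitrary choices):

    def grad_F(x):
        # Gradient of F(x) = (x - 3)^2
        return 2 * (x - 3)

    x = 0.0          # starting point
    gamma = 0.1      # learning rate
    for step in range(100):
        x = x - gamma * grad_F(x)  # step along the negative gradient
    print(x)  # converges towards 3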
It is necessary to understand how to compute the gradient of the cost
function. The algorithm used to compute it is known as Back-Propagation.

BACK-PROPAGATION ALGORITHM

The back-propagation algorithm calculates how the error changes as each
weight is increased or decreased slightly. This algorithm is primarily
responsible for two things:
1) Propagation
2) Weight update

After the output layer has produced its output, the cost function is
computed for each of the outputs produced by each of the neurons of the
output layer. Once that is done, the error values are propagated backwards
towards the initial layers so as to train the neural network further.
These error values are then used to compute the gradient of the cost
function with respect to the assigned weights. Consequently, the weights
are updated so as to minimise the cost function, and in the process of
going back and forth, our network becomes trained after having received
and propagated the input sets a number of times.
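
To make this back-and-forth concrete, here is a minimal Python/NumPy
sketch of one possible implementation: a tiny 2-3-1 sigmoid network
trained on the NAND truth table, using the quadratic cost and gradient
descent updates described above (architecture, learning rate, and epoch
count are all illustrative choices):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # NAND truth table: inputs as columns, desired outputs below
    X = np.array([[0, 0, 1, 1],
                  [0, 1, 0, 1]], dtype=float)   # shape (2, 4)
    Y = np.array([[1, 1, 1, 0]], dtype=float)   # shape (1, 4)

    # Random initial weights for a 2-3-1 network
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(3, 2)), np.zeros((3, 1))
    W2, b2 = rng.normal(size=(1, 3)), np.zeros((1, 1))

    lr, n = 1.0, X.shape[1]   # learning rate, number of training inputs
    for epoch in range(5000):
        # Forward pass: propagate the inputs through both layers
        A1 = sigmoid(W1 @ X + b1)
        A2 = sigmoid(W2 @ A1 + b2)
        # Backward pass: propagate the error of the quadratic cost
        d2 = (A2 - Y) * A2 * (1 - A2)      # error at the output layer
        d1 = (W2.T @ d2) * A1 * (1 - A1)   # error at the hidden layer
        # Weight update: gradient descent step on each parameter
        W2 -= lr * (d2 @ A1.T) / n
        b2 -= lr * d2.mean(axis=1, keepdims=True)
        W1 -= lr * (d1 @ X.T) / n
        b1 -= lr * d1.mean(axis=1, keepdims=True)

    print(np.round(A2, 2))  # approaches [[1, 1, 1, 0]], the NAND outputs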

FUTURE TIMELINE OF OUR PROJECT

We believe that we now have a firm grasp of the concepts that will be
employed in building our neural network. We would like to finish the
complete coding/implementation part of our project by the end of the first
week of April. If everything goes well, we intend to submit the final
project report along with the code files by 15th April at the latest. This
is a tentative timeline; however, we will try to make sure that we stick
to it. We would also like to take this opportunity to thank you for your
guidance and support in our project.

REFERENCES

1. Y. Le Cun, O. Matan, B. Boser, J. S. Denker, D. Henderson, R. E.
Howard, W. Hubbard, L. D. Jackel, and H. S. Baird, “Handwritten zip
code recognition with multilayer networks,” Proceedings of the 10th
International Conference on Pattern Recognition, vol. ii, pp. 35-40,
1990.
2. S. B. Maind and P. Wankar, “Research Paper on Basic of Artificial
Neural Network,” International Journal on Recent and Innovation Trends
in Computing and Communication.
3. Machine Learning video lectures by Andrew Ng.
4. C. I. Patel, R. Patel, and P. Patel, “Handwritten Character
Recognition using Neural Network,” International Journal of Scientific &
Engineering Research, vol. 2, issue 5, May 2011.
-x-x-x-x-x-x-x-x-x-x-x-x-x-x-
