Beruflich Dokumente
Kultur Dokumente
ABSTRACT
In this paper, we present a new neural network (NN) based method for optical character recognition (OCR)
as well as handwritten character recognition (HCR). Experimental results show that our proposed method
achieves increased accuracy in optical character recognition as well as handwritten character recognition.
We present through an overview of existing handwritten character recognition techniques. All the
algorithms describes more or less on their own. Handwritten character recognition is a very popular and
computationally expensive task; we describe advanced approaches for handwritten character recognition.
In the present work, we would like to compare the most important once out of the variety of advanced
existing techniques, and we will systematize the techniques by their characteristic considerations. It leads to
the behaviour of the algorithms reaches to the expected similarities.
275
Journal of Theoretical and Applied Information Technology
20th January 2016. Vol.83. No.2
2005 - 2015 JATIT & LLS. All rights reserved.
be applied on optical character (Fig 2. (a)) or Features extracted can be either low level
handwritten characters (Fig 2.(b)). Based on that, or high level. Low level features include width,
systems can be classified as OCR or HCR height, curliness and aspect ratio etc., of the
respectively. Online methods are superiors to their character. These alone cannot be used to distinguish
counterparts i.e. offline methods due to the one character from another in the character set of
temporal information present in the character the languages [11]. So, there are the number of
generation [4]. other high level features which includes number
Accuracy of HCR is still limited to 90 and position of loops, straight lines, headlines and
percent due to presence of large variation in shape, curves etc. Tirthraj Dash et al discussed HCR using
scale, style and orientation etc., [8]. Character associative memory net (AMN) in their paper [12],
processing systems are domain and application represents the direct work at pixel level. Dataset
specific, like the other systems it is not possible to was designed in MS Paint 6.1 with normal Arial
design generic systems which can process all kinds font size 28(twenty eight). Dimension of image was
of scripts and languages. Lots of work has been kept 31X39. Once characters are extracted, their
carried out on European languages and Arabic binary pixel values are directly used to train AMN.
(Urdu) language. Whereas domestic languages like I. K. Pathan et al have proposed offline approach
Hindi, Punjabi, Bangla, Tamil and Gujarati etc., are for handwritten isolated Urdu characters in their
very less explored due to limited usage. In this wok [13]. Urdu character may contain one, two,
paper, our focus is to carry out in depth literature three or four segments, in which one component is
survey on handwritten character recognition known as primary (generally represents large
methods. continuous stroke) and rest of all are known as
secondary components (generally represents small
stroke or dots). Authors have used moment
invariant (MI) feature to recognize the characters.
MI features are well known to be invariant under
rotation, translation, scaling and reflection. MI
features are the measure of pixel distribution
around the centre of gravity of character and it
Figure 2. (a) Optical character (b) Handwritten captures the global character shape information. If
character. character image is single component than it is
normalized in 60X 60 pixels and horizontally
Image processing and pattern recognition divided into three (3) equal parts. 7 MIs are
plays significant role in handwritten character extracted from each zone and 7 MIs are calculated
recognition. Rajbala et al [10], have discussed from overall image, so, total of 28 features are used
various types of classification of feature extraction to train support vector machines (SVM), if image is
methods like statistical feature based methods, having multi components then 28 MIs are extracted
structural feature based methods and global from primary component (60 X 60) and 21 MIs are
transformation techniques. Statistical methods are extracted from secondary component (22 X 22).
based on the planning of how the data should be Separate SVMs are trained for both and decision is
selected. It uses the information of statistical taken based on the rules satisfying some criteria.
distribution of pixels in image, it can be mainly Proposed system claim to get highest accuracy of
classified in three categories: 1). Partitioning in 93.59 %. In paper [4], Pradeep et al has proposed
regions, 2). Profile generation and projections 3) neural network based classification of handwritten
distances and crossing. Structural features are character recognition system. Each individual
extracted from structure and geometry of character character is resized to 30 X 20 pixels for
like number of horizontal and vertical lines, aspect processing; they have used the binary features to
ratio, number of cross points, number of loops, train the neural network. However, such features
number of branch points, number of strokes and are not robust. In post processing stage, recognized
number of curves etc. Global transformation characters are converted to ASCII format. Input
features are calculated by converting the image in layer has 600 neurons equal to number of pixels.
frequency domains like Discrete Fourier Output layer has 26 neurons as English has 26
Transformation (DFT), Discrete Cosine alphabets. The proposed ANN uses back
Transformation (DCT), Discrete Wavelet propagation algorithm.
Transformation (DWT), Gabor filtering, and 2. COMPARISION OF OCR TECHNIQUES
Walsh-Hadamard transformation etc.
276
Journal of Theoretical and Applied Information Technology
20th January 2016. Vol.83. No.2
2005 - 2015 JATIT & LLS. All rights reserved.
Various techniques used for the design of A neural network is a powerful data modelling tool
OCR by their characteristics. that is able to capture and represent complex
input/output relationships. Motivation for the
Matrix Matching development of neural network technology
stemmed from the desire to develop an artificial
Matrix Matching converts each character system that could perform "intelligent" tasks similar
into a pattern within a matrix, and then compares to those performed by the human brain. Neural
the pattern with an index of known characters. Its networks resemble the human brain in the
recognition is strongest on monotype and uniform following two ways: (1) A neural network acquires
single column pages. knowledge through learning; (2) A neural network's
knowledge is stored within inter-neuron connection
Fuzzy Logic strengths known as synaptic weights.
Feature Extraction
277
Journal of Theoretical and Applied Information Technology
20th January 2016. Vol.83. No.2
2005 - 2015 JATIT & LLS. All rights reserved.
278
Journal of Theoretical and Applied Information Technology
20th January 2016. Vol.83. No.2
2005 - 2015 JATIT & LLS. All rights reserved.
279
Journal of Theoretical and Applied Information Technology
20th January 2016. Vol.83. No.2
2005 - 2015 JATIT & LLS. All rights reserved.
7. APPLICATIONS
280
Journal of Theoretical and Applied Information Technology
20th January 2016. Vol.83. No.2
2005 - 2015 JATIT & LLS. All rights reserved.
281
Journal of Theoretical and Applied Information Technology
20th January 2016. Vol.83. No.2
2005 - 2015 JATIT & LLS. All rights reserved.
282