
Visual Character Recognition using Artificial Neural Networks

Shashank Araokar*

* Presently Bachelor’s student (Electronics and Telecommunication), University of Mumbai, India
Email: shashank.araokar@ieee.org, shashank.araokar@gmail.com

ABSTRACT

The recognition of optical characters is known to be one of the earliest applications of Artificial Neural Networks, which partially emulate human thinking in the domain of artificial intelligence. In this paper, a simplified neural approach to the recognition of optical or visual characters is portrayed and discussed. The document is expected to serve as a resource for learners and amateur investigators in pattern recognition, neural networking and related disciplines.

[1.] INTRODUCTION:

The recognition of characters from scanned images of documents is a problem that has received much attention in the fields of image processing, pattern recognition and artificial intelligence. Classical methods in pattern recognition do not by themselves suffice for the recognition of visual characters, for the following reasons:

1. The ‘same’ characters differ in size, shape and style from person to person, and even from time to time with the same person.
2. Like any image, visual characters are subject to degradation by noise.
3. There are no hard-and-fast rules that define the appearance of a visual character. Hence rules need to be heuristically deduced from samples.

The human system of vision, by contrast, is excellent in the sense of the following qualities:

1. The human brain is adaptive to minor changes and errors in visual patterns. Thus we are able to read the handwriting of many people despite different styles of writing.
2. The human vision system learns from experience. Hence we are able to grasp newer styles and scripts with amazingly high speed.
3. The human vision system is immune to most variations of size, aspect ratio, color, location and orientation of visual characters.

In contrast to the limitations of classical computing, Artificial Neural Networks (ANNs), which were first developed in the mid-1900s, emulate human thinking in computation to a meager, yet appreciable extent. Of the several fields in which they have been applied, humanoid computing in general and pattern recognition in particular have seen increasing activity. The recognition of visual (optical) characters is a problem of relatively manageable complexity when compared with greater challenges such as the recognition of human faces. ANNs have enjoyed considerable success in this area due to their humanoid qualities such as adapting to changes and learning from prior
experience. The subsequent parts of the paper elucidate this fact in more detail.

The paper is organized as follows: in section [2.], image digitization, which is an essential step prior to neural networking, is described. Section [3.] describes the learning mechanism of the neural network used, and the employed architecture is described in section [4.]. Section [5.] discusses the issues that affect the performance of the proposed method with reference to its accuracy, computational complexity and extensibility.

[2.] IMAGE DIGITIZATION:

When a document is submitted for visual recognition, it is expected to consist of printed (or handwritten) characters pertaining to one or more scripts or fonts. This document, however, may contain information besides optical characters alone. For example, it may contain pictures and colors that do not provide any useful information for the immediate purpose of character recognition. In addition, characters which need to be singly analyzed may exist as word clusters or may be located at various points in the document. Such an image is usually processed for noise reduction and separation of individual characters from the document. It is convenient for comprehension to assume that the submitted image is free of noise and that individual characters have already been located (using, for example, a suitable clustering algorithm). This situation is equivalent to one in which a single noise-free character has been submitted to the system for recognition.

Fig. (1)

The process of digitization is important for the neural network used in the system. In this process, the input image is sampled into a binary window which forms the input to the recognition system. In the above figure, the letter A has been digitized into 6X8=48 digital cells, each having a single color, either black or white. It becomes important for us to encode this information in a form meaningful to a computer. For this, we assign a value +1 to each black pixel and 0 to each white pixel and create the binary image matrix I, which is shown in Fig. (1.c). This much conversion is enough for neural networking, which is described next. Digitization of an image into a binary matrix of specified dimensions makes the input independent of the actual dimensions of the image. Hence an image of whatever size gets transformed into a binary matrix of fixed, pre-determined dimensions. This establishes uniformity in the dimensions of the input and stored patterns as they move through the recognition system.

[3.] LEARNING MECHANISM:

In the employed system, a highly simplified architecture of artificial neural networks is used. For ease of understanding, the learning mechanism of the neural network is described first and its architecture is described next, in section [4.]. In the method used, various characters are taught to the network in a
supervised manner. A character is presented to the system and is assigned a particular label. Several variant patterns of the same character are taught to the network under the same label. Hence the network learns various possible variations of a single pattern and becomes adaptive in nature. During the training process, the input to the neural network is the input matrix M, defined as follows:

$$M(i,j) = \begin{cases} +1 & \text{if } I(i,j) = 1 \\ -1 & \text{if } I(i,j) = 0 \end{cases} \tag{1.1}$$

The input matrix M is now fed as input to the neural network. It is typical for any neural network to learn in a supervised or unsupervised manner by adjusting its weights. In the current method of learning, each candidate character taught to the network possesses a corresponding weight matrix. For the kth character to be taught to the network, the weight matrix is denoted by Wk. As learning of the character progresses, it is this weight matrix that is updated. At the commencement of teaching (supervised training), this matrix is initialized to zero. Whenever a character is to be taught to the network, an input pattern representing that character is submitted to the network. The network is then instructed to identify this pattern as, say, the kth character in a knowledge base of characters. That is, the pattern is assigned a label k. Accordingly, the weight matrix Wk is updated in the following manner:

    for all i = 1 to x
    {
        for all j = 1 to y
        {
            Wk(i, j) = Wk(i, j) + M(i, j)
        }
    }                                        (1.2)

Here x and y are the dimensions of the matrix Wk (and M).

The following figure shows the digitization of three input patterns representing S that are presented to the system for it to learn.

Fig. (2)

Note that the patterns differ slightly from each other, just as handwriting differs from person to person (or from time to time) and as printed characters differ from machine to machine.

Fig. (3) gives the weight matrix, say WS, corresponding to the letter S. The matrix has been updated three times to learn the letter S. It should be noted that this matrix is specific to the letter S alone. Other characters shall each have a corresponding weight matrix.
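As a minimal sketch of the steps described so far, the following Python fragment digitizes an image into a fixed binary grid and then applies equations (1.1) and (1.2). The function names, the majority rule used to decide whether a cell is black, and the 3X3 sample patterns are assumptions made for illustration; they are not taken from the paper's software or from the patterns of Fig. (2).

```python
def digitize(pixels, rows, cols):
    """Sample a binary image (1 = black, 0 = white) into a rows x cols grid.

    Assumption: a cell becomes black when at least half of the source pixels
    it covers are black; the source must be at least rows x cols in size.
    """
    height, width = len(pixels), len(pixels[0])
    grid = []
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        row = []
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            block = [pixels[i][j] for i in range(top, bottom)
                     for j in range(left, right)]
            row.append(1 if 2 * sum(block) >= len(block) else 0)
        grid.append(row)
    return grid


def to_bipolar(image):
    """Equation (1.1): map I (1/0) to the input matrix M (+1/-1)."""
    return [[1 if pixel == 1 else -1 for pixel in row] for row in image]


def teach(weights, image):
    """Equation (1.2): add M to the weight matrix Wk, element by element."""
    m = to_bipolar(image)
    for i, row in enumerate(m):
        for j, value in enumerate(row):
            weights[i][j] += value
    return weights


# A 4x4 image reduced to a 2x2 binary matrix.
small = digitize([[1, 1, 0, 0],
                  [1, 0, 0, 0],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], rows=2, cols=2)

# Three hypothetical 3x3 variants of one character, taught under one label.
variants = [
    [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
    [[1, 1, 1], [0, 1, 0], [1, 1, 0]],
    [[1, 1, 0], [0, 1, 0], [0, 1, 0]],
]
w = [[0] * 3 for _ in range(3)]  # Wk is initialized to zero
for variant in variants:
    teach(w, variant)

print(small)  # [[1, 0], [0, 1]]
print(w)      # [[3, 3, 1], [-3, 3, -3], [-1, 3, -3]]
```

After three updates every weight is one of -3, -1, 1 or 3: a cell that is black in every variant reaches 3, and a cell that is never black reaches -3.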

$$W_S = \begin{bmatrix}
 1 &  3 &  3 &  3 &  3 &  1 \\
 3 &  3 & -3 & -3 & -1 & -1 \\
 3 & -1 & -3 & -3 & -3 & -3 \\
 3 &  3 &  1 & -1 & -1 & -1 \\
-1 &  3 &  3 &  3 &  3 &  3 \\
-3 & -3 & -3 & -3 & -3 &  3 \\
 3 & -3 & -3 & -1 &  1 &  3 \\
 3 &  3 &  3 &  3 &  3 &  1
\end{bmatrix}$$

Fig. (3)

A close observation of the matrix brings the following points to notice:

1. The matrix elements with higher (positive) values are the ones which stand for the most commonly occurring image pixels.
2. The elements with lesser or negative values stand for pixels which appear less frequently in the images.

Neural networks learn through such updating of their weights. Each time, the weights are adjusted in such a manner as to give an output closer to the desired output than before. The weights may represent the importance or priority of a parameter, which in the present case is the occurrence of a particular pixel in a character pattern. It can be seen that the weights of the most frequent pixels are higher and usually positive, and those of the uncommon ones are lower and often negative. The matrix therefore assigns importance to pixels on the basis of their frequency of occurrence in the pattern. In other words, highly probable pixels are assigned higher priority while the less frequent ones are penalized. However, all labeled patterns are treated without bias, so as to include impartial adaptation in the system.

[4.] NETWORK ARCHITECTURE:

The overall architecture of the recognition system is shown in Fig. (4). In this system, the candidate pattern I is the input. The block ‘M’ provides the input matrix M to the weight blocks Wk for each k. There are in total n weight blocks for the n characters to be taught (or already taught) to the system.

Fig. (4)

The recognition of patterns is now done on the basis of certain statistics that shall be defined next.

(4.1) Candidate Score (ψ): This statistic is the sum of the products of corresponding elements of the weight matrix Wk of the kth learnt pattern and an input pattern I as its candidate. It is formulated as follows:

$$\psi(k) = \sum_{i=1}^{x} \sum_{j=1}^{y} W_k(i,j)\, I(i,j) \tag{1.3}$$

It should be noted that unlike in the training process, where M was the processed input matrix, in the recognition process the binary image matrix I is directly fed to the system for recognition.
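To illustrate equation (1.3), the following Python sketch computes the candidate score against the weight matrix transcribed from Fig. (3). The function name and the two candidate patterns are hypothetical, chosen only for illustration; they are not the patterns shown in the paper's figures.

```python
def candidate_score(weights, image):
    """Equation (1.3): sum of Wk(i, j) * I(i, j) over all cells."""
    return sum(w * p
               for w_row, i_row in zip(weights, image)
               for w, p in zip(w_row, i_row))


# Weight matrix WS as transcribed from Fig. (3): 8 rows x 6 columns.
W_S = [
    [ 1,  3,  3,  3,  3,  1],
    [ 3,  3, -3, -3, -1, -1],
    [ 3, -1, -3, -3, -3, -3],
    [ 3,  3,  1, -1, -1, -1],
    [-1,  3,  3,  3,  3,  3],
    [-3, -3, -3, -3, -3,  3],
    [ 3, -3, -3, -1,  1,  3],
    [ 3,  3,  3,  3,  3,  1],
]

# Hypothetical S-like candidate (1 = black, 0 = white).
s_like = [
    [0, 1, 1, 1, 1, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 0],
]

# Hypothetical non-matching candidate: a vertical bar in the third column.
bar = [[0, 0, 1, 0, 0, 0] for _ in range(8)]

print(candidate_score(W_S, s_like))  # 64
print(candidate_score(W_S, bar))     # -2
```

Black pixels that fall on positive weights raise the score and black pixels on negative weights lower it, while white pixels contribute nothing because I(i, j) = 0 there.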

(4.2) Ideal Weight-Model Score (µ): This statistic simply gives the sum total of all the positive elements of the weight matrix of a learnt pattern. It may be formulated as follows (with µ(k) initialized to 0 each time):

    for i = 1 to x
    {
        for j = 1 to y
        {
            if Wk(i, j) > 0 then
            {
                µ(k) = µ(k) + Wk(i, j)
            }
        }
    }                                        (1.4)

(4.3) Recognition Quotient (Q): This statistic gives a measure of how well the recognition system identifies an input pattern as a matching candidate for one of its many learnt patterns. It is simply given by:

$$Q(k) = \frac{\psi(k)}{\mu(k)} \tag{1.5}$$

Because I contains only 0s and 1s, ψ(k) cannot exceed µ(k), so Q(k) never exceeds unity. The greater the value of Q, the more confidence the system places in the input pattern being similar to a pattern already known to it. The classification of input patterns now follows a simple procedure:

1. For an input candidate pattern I, calculate the recognition quotient Q(k) for each learnt pattern k.
2. Determine the value of k for which Q(k) has the maximum value.
3. Too low a maximum value of Q(k) (say, less than 0.5) indicates poor recognition. In such a case:
   • Conclude that the candidate pattern does not exist within the knowledge base
   OR
   • Teach the candidate pattern to the network till a satisfactory value of Q(k) is obtained.
4. Conditionally, identify the input candidate pattern as being akin to the kth learnt pattern OR proceed with the training for better performance.

In Fig. (4), the selector gives an output k by making the best selection as in Step 4 of the aforementioned algorithm. The adaptive performance of the network can easily be tested by an example: we submit two hand-drawn patterns representing S and P respectively to the system that has already learnt only the character S. The recognition quotient yielded by the trained system is mentioned alongside.

Fig. (5)

Note that the pattern in Fig. (5) does not appear exactly like the three patterns of Fig. (2) that were taught to the system. However, being adaptive, the system nevertheless assigns a good quotient Q = 0.68 to the pattern, indicating a match. To improve recognition of this particular pattern, the same pattern can be repeatedly input to the system and taught to it as before under the same label. As a result, the value of Q moves closer to unity each time the pattern is taught. This illustrates learning
from prior experience in neural networks.

Fig. (6)

The system, however, dismisses the candidature of the pattern representing P in Fig. (5) by yielding a low value of Q (= 0.21). It can be observed that, with regular teaching, the system improves its ability to identify a matching pattern and reject non-matching patterns. Thus, regular supervised teaching enhances the performance of the system.

[5.] PERFORMANCE ISSUES:

The neural system has some direct advantages that become apparent at this stage:

1. The method is highly adaptive; recognition is tolerant to minor errors and changes in patterns.
2. The knowledge base of the system can be modified by teaching it newer characters or teaching different variants of earlier characters.
3. The system is highly general and is invariant to size and aspect ratio.
4. The system can be made user-specific: user profiles of characters can be maintained, and the system can be made to recognize them as per the orientation of the user.

The dimensions of the input matrix need to be adjusted for performance. The greater the dimensions, the higher the resolution and the better the recognition. This, however, increases the time complexity of the system, which can be a sensitive issue with slower computers. Typically, 32X32 matrices have been empirically found sufficient for the recognition of English handwritten characters. For intricate scripts, greater resolution of the matrices is required.

As already illustrated in the previous example, efficient supervised teaching is essential for proper performance. Neural expert systems are therefore typically used where human-centered training is preferred over rigid and inflexible system rules.

[6.] CONCLUSION:

A simplistic approach for the recognition of visual characters using artificial neural networks has been described. The advantages of neural computing over classical methods have been outlined. Despite the computational complexity involved, artificial neural networks offer several advantages in pattern recognition and classification in the sense of emulating adaptive human intelligence to a small extent.

[7.] ACKNOWLEDGEMENT:

The author wishes to acknowledge the use of the Open Source VB-based software Recog developed by Neil Fraser for the purpose of character recognition using neural networks. Parts of this paper have been derived from the
same program. It is available for
download at the following web address:
http://neil.fraser.name/software/recog

