The 2018 International Conference on Signals and Systems (ICSigSys)

Printed Arabic Letter Recognition Based On Image


Ainatul Radhiah¹, Carmadi Machbub², Egi Muhammad Idris Hidayat³ and Ary Setijadi Prihatmanto⁴
School of Electrical Engineering and Informatics
Bandung Institute of Technology
Bandung, Indonesia
¹ainawind27@students.itb.ac.id, ²carmadi@lskk.ee.itb.ac.id, ³egi@lskk.ee.itb.ac.id, ⁴asetijadi@lskk.ee.itb.ac.id

978-1-5386-5689-1/18/$31.00 ©2018 IEEE 86

Abstract— Recognizing Arabic letters is a very important part of learning them, and learning is more effective with a system that can recognize Arabic letters both in isolated form and in sentences. Most techniques developed for classifying characters in other languages cannot be used for Arabic characters because of differences in structure: Arabic letters are generally cursive. In this research, we propose an Arabic letter recognition system that can recognize both forms. In the classification stage, we compare an Artificial Neural Network (ANN) with a Hidden Markov Model (HMM). With ANN classification, recognition reaches 100% accuracy for isolated letters and 69% accuracy for letters in sentences. With HMM classification, recognition reaches 71% accuracy for isolated letters and 50% accuracy for letters in sentences. Among the letters incorrectly identified with the ANN method, 5 letters are always misidentified, and the accuracy of the letters correctly identified is 50%. Among the letters incorrectly identified with the HMM method, 9 letters are always misidentified, and the accuracy of the letters correctly identified is 38%.

Keywords—Arabic Letter Recognition, Stentiford Algorithm, Chain Code, Neural Network, Hidden Markov Model.

I. INTRODUCTION
Arabic has 28 base letters and 3 additional letters, which are written from right to left and cursively, both in print and in handwriting. The recognition of Arabic letters in sentences therefore requires a segmentation process. Some Arabic letters have a similar shape and can be distinguished only by the number of dots and the position of the dots. Each Arabic letter also has a different shape depending on its position in the word: isolated, at the beginning, in the middle, or at the end.
Research on the recognition of Arabic letters has grown. Meilani et al. [1] studied the recognition of isolated Arabic letters using Neural Networks with the Backpropagation and Learning Vector Quantisation (LVQ) methods; recognition with Backpropagation achieved 98.81% accuracy, and recognition with LVQ reached 51.19% accuracy. Albakor et al. [2] studied the recognition of Arabic letters using a Backpropagation Neural Network for classification; their work involves a segmentation process and produces an accuracy of 98.7%. Supriana et al. [3] developed a system for recognizing Arabic letters in sentences, classifying with a decision tree generated by the C4.5 algorithm, and achieved 48% accuracy. Izakian et al. [4] developed an isolated Farsi/Arabic letter recognition system using a 1-nearest-neighbor classifier and achieved 97.4% accuracy. Cheung et al. [5] developed an Arabic optical character recognition system using recognition-based segmentation with a string matching approach and achieved 90% accuracy.
Based on these previous studies, the recognition of Arabic letters within sentences has not been widely studied. Therefore, in this research we developed a recognition system for Arabic letters in isolated form and in sentences. We also discuss how the features are processed: how the chain code is normalized, how the number-of-dots feature is obtained, and how the dot-position feature is obtained.
The system has five stages: binarization, segmentation, thinning, feature extraction, and classification. Binarization converts the image into a binary form with values of zero and one. Segmentation uses the Zidouri algorithm [6], which has several parameters, features, and rules for segmenting a word. Thinning uses the Stentiford algorithm [7], which uses four templates, an endpoint test, and the connectivity number to decide whether a pixel is deleted. Feature extraction produces three features: the normalized chain code, the number of dots, and the position of the dots. Classification compares two methods, Artificial Neural Network (ANN) and Hidden Markov Model (HMM).
The purpose of developing the Arabic letter recognition system is to help the process of learning Arabic letters, either in isolated form or in sentences. As for the limitations of the
problem in this research, the letters used are printed Arabic letters in isolated positions and in sentences.

II. METHODOLOGY
This research was conducted in five stages. Each stage is explained below.

Fig.1 System block diagram

A. Binarization
Binarization is the process of converting the image into a binary image with values 0 and 1, so that a grayscale image becomes black and white. Binarization is required for the subsequent steps in recognizing Arabic letters and sentences. It is done by thresholding each color channel with a threshold of 150: a color channel value less than 150 is converted to black, and a value greater than 150 is converted to white.

B. Segmentation
Letter segmentation uses the Zidouri algorithm [6]. The first step is to specify the parameters used as the reference for segmentation. Once the parameters are established, the segmentation stages are performed, and guide bands are selected as the reference for character segmentation. To select the correct guide band, some features are extracted from each guide band.
The algorithm uses four rules when selecting a guide band candidate:
Rule 1: Choose the guide band having the highest relative width (F1) and F4 = 1.
Rule 2: Choose the guide band if F2 > Ls and F4 = 1.
Rule 3: Choose the guide band if F2 <= Ls and F3 > Ls' and the guide band is not the last one.
Rule 4: Choose the guide band if F1 >= Lm and F4 = 1.
Here F1 = width of the guide band; F2 = distance from the 1st predecessor from the right, zero for the 1st guide band; F3 = distance from the 2nd predecessor from the right, zero for the 1st and 2nd guide bands; and F4 = the position of the guide band found, one if above the baseline and zero if below.
For the 1st guide band in the set, even if it fails to qualify under Rules 1-4, it should be selected if the guide band next to it satisfies Rule 2. If all guide bands fail to satisfy any rule, a less constrained rule base is applied, i.e., the F4 condition is removed except in Rule 4 [6].

Fig.2 Results of Zidouri letter segmentation

C. Thinning
One use of thinning is in pattern recognition applications. Thinning operates on a binary image and erodes as many pixels as possible without affecting the general shape, so that the pattern can still be recognized afterwards. The image produced by a thinning algorithm is called the skeleton.
There are several popular thinning algorithms, including Zhang-Suen [8], Stentiford [7], and Hilditch [9]. In this study we compared these three methods, and the Stentiford algorithm was chosen as the best. For Arabic letters, the Zhang-Suen and Hilditch algorithms produce deficient thinning results. Figure 3 shows a comparison of the thinning results for the letter "ث" with the Zhang-Suen, Stentiford, and Hilditch algorithms.

Fig.3 Comparison of thinning algorithms (panels: Zhang Suen, Stentiford, Hilditch)

In Figure 3 we can see that thinning with the Zhang-Suen algorithm removes the right part of the letter, which should not be deleted and which is preserved in the Stentiford result. Thinning with the Hilditch algorithm removes 2 dots of the letter ث, so that the letter has only 1 dot where it should have 3. The thinning result with the Stentiford algorithm contains neither error.

The Stentiford algorithm uses a set of four templates to scan the image, T1, T2, T3, and T4, as shown in Figure 4.

Fig.4 Templates of Stentiford Algorithm

The steps to obtain the skeleton of an image with the Stentiford algorithm [7] are as follows:
1) Locate a pixel (i, j) that matches template T1. Matching with this template moves from left to right and from top to bottom.
2) If the middle pixel is not an endpoint and has a connectivity number of 1, mark the pixel for later deletion.
An endpoint is a pixel at the end of a stroke that is connected to only one other pixel; that is, a black pixel with only one black neighbor among its eight possible neighbors.
The connectivity number is a measure of how many objects are connected to a given pixel. It is calculated as
$$C_n = \sum_{k \in S} \left( N_k - N_k \cdot N_{k+1} \cdot N_{k+2} \right) \qquad (1)$$
where N_k is the value of the k-th of the 8 neighbors around the pixel being analyzed, S = {1, 3, 5, 7}, N_0 is the value of the middle pixel, and N_1 is the value of the pixel to the right of the middle pixel, with the remaining neighbors numbered sequentially in the counterclockwise direction. Indices above 8 wrap around, so that N_9 = N_1.
3) Repeat steps 1 and 2 for all pixels that match template T1.
4) Repeat steps 1-3 for templates T2, T3, and T4. T2 matches pixels on the left side of the object, moving from bottom to top and from left to right. T3 selects pixels along the bottom of the image, moving from right to left and from bottom to top. T4 locates pixels on the right side of the object, moving from top to bottom and from right to left.
5) Set the pixels marked for deletion to white.

Fig.5 Result of Stentiford thinning algorithm

D. Feature Extraction
We use three features in this research: the number of dots, the position of the dots, and the chain code. Each feature is explained below.

1) Normalized Chain Code
In pattern recognition, a chain code is a technique for describing the structure of an object. The chain code is obtained by tracing the pixels of the object boundary along predetermined directions. The result is a sequence of numbers, each indicating a direction, that together represent the boundary of the object. A chain code can only be computed on a binary image.
The chain code of an object in an image is extracted as follows:
a) Scan the image from the top left corner for a black pixel that has only 1 neighbor. If no such pixel is found, take the first black pixel encountered.
b) Iterate over the image:
• Change the current pixel to 0
• Follow the priority of directions 1 to 8
• Move the pixel position
• Append the direction to the chain code
The length of the chain code changes with the shape of the object. To maintain consistency, the chain code length should be normalized.
In this research, the chain code of each object in the letter image is normalized to a length of 10. Steps a) and b) follow the steps developed by Izakian [4], and steps c) and d) were developed in this research. The steps of chain code normalization are as follows:
a) Convert the chain code into a matrix with two rows. The first row holds the chain code values, and the second row holds the frequency of occurrence of each value. For example, the chain code 7777311122222583353333 becomes the 2 x 9 matrix:
731258353
413511214
b) Eliminate all values that have a frequency of only 1:
731258353 -> 7123
413511214 -> 4356
c) Write out the chain code according to the frequency of occurrence:
777711122222333333
d) Map the chain code to 10 chain code values. For
FOC : 777711122222333333
the mapping is given by Equation (2):
[Equation (2), which maps FOC to NC_i, is not legible in the source.]
where:
FOC = Frequency of Chain code
NC = Normalized Chain code
i = Index of Normalized Chain code, with values 0 - 9
The normalized chain code is:
7711222333

Fig.6 Chain codes for some example Arabic characters (chain codes shown: 1865486644, 6666666666, 6666541188, 5556678888)

2) Number of Dots
The number of dots is an important feature of Arabic letters, since some Arabic letters have the same shape and are differentiated only by the number of dots, such as the characters ث, ب, and ت.
The steps to determine the number of dots in an image are as follows.
a) Iterate over each pixel in the image, starting from the top left corner.
b) If a black pixel is found that has not been processed yet, get the chain code starting from this black pixel. Mark each black pixel used during the chain code extraction as processed.
• If the length of the chain code is less than or equal to 7, we consider the current chain code to be a dot and increase the number of dots by one.
• If the length of the chain code is more than 7, we consider the current chain code to be part of the main body.

3) Position of Dots
The position of dots is an important feature of Arabic letters. Some Arabic letters have the same shape and number of dots but are distinguished by the position of the dots. The dot position is obtained from the position of the dot in the letter and the height of the letter.
To determine the position of dots, the image is divided into 5 parts:
a) If the dot lies at less than 2/5 of the image height, the position of dots is above, represented by the number 0.
b) Otherwise, if the dot lies at less than 3/5 of the image height, the position of dots is in the middle, represented by the number 1.
c) If the dot lies at more than 3/5 of the image height, the position of dots is below, represented by the number 2.

After the chain code, dot count, and dot position have been computed, the features are arranged as follows:
F = [DotCount, DotPos, Chain_Code]

E. Classification
The classification stage is performed with a Neural Network and a Hidden Markov Model (HMM).

1) Classification Using Neural Network
An Artificial Neural Network is a computational system whose network structure mimics the human nervous system in order to produce responses and behaviors like those of a biological neural network.
In simplified form, a Neural Network works like a biological neural network in the following ways:
a) Processing of signals or information occurs in neurons.
b) Signals are sent between neurons through links, the dendrites and axons.
c) Each link between neurons has a weight that strengthens or weakens the signal.
d) Each neuron has an activation function that determines the output of the neuron, that is, whether the signal is forwarded to another neuron or not.

Neurons are the principal information processing units of Artificial Neural Networks; they act on the impulses they receive and transmit them to other neurons. A neuron model has the following elements:
1. A group of units connected by paths.
2. A summing unit that sums the input signals after each has been multiplied by its weight.
3. An activation function that determines the output of the neuron, that is, whether the signal from the neuron input is forwarded to another neuron or not.

Fig.7 Neuron models

In this research, the Neural Network is trained using Backpropagation. The network architecture is a multilayer network with 1 hidden layer and 1 output layer, and the learning rate is 0.1. The activation function of the hidden layer is the sigmoid function, and the activation function of the output layer is the softmax function. Each sample has 12 input neurons: the first is the number of dots, the second is the position of the dots, and the remaining 10 hold the normalized chain code.

2) Classification using Hidden Markov Model

Fig.8 Hidden Markov Model

In a Hidden Markov Model (HMM) there are states that cannot be observed directly (they are hidden) and can only be inferred through observations of other variables [10].
An HMM involves three basic problems [10]:
a) Evaluation
Evaluation is the process of calculating the probability of an observation sequence given the HMM. Evaluation uses the forward and backward algorithms.
b) Decoding
Decoding finds the best state sequence for an observation sequence given the HMM, using the Viterbi algorithm.
c) Parameter Estimation (Learning)
The Baum-Welch algorithm performs learning to obtain the HMM.
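As an illustration of the evaluation problem, the following is a minimal Python sketch of the forward algorithm for a discrete HMM. The two-state toy model, its probabilities, and the function name `forward_probability` are illustrative assumptions; they are not the parameters or the code of the system described in this paper.

```python
from typing import Sequence


def forward_probability(
    start: Sequence[float],
    trans: Sequence[Sequence[float]],
    emit: Sequence[Sequence[float]],
    observations: Sequence[int],
) -> float:
    """Return P(observations | model) using the forward algorithm.

    start[i]    : probability of starting in hidden state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability that state i emits observation symbol o
    """
    if not observations:
        return 1.0  # the empty sequence has probability 1
    n_states = len(start)
    # Initialisation: alpha[i] = start[i] * emit[i][first observation]
    alpha = [start[i] * emit[i][observations[0]] for i in range(n_states)]
    # Induction: fold in one observation at a time
    for obs in observations[1:]:
        alpha = [
            emit[j][obs] * sum(alpha[i] * trans[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    # Termination: sum over all possible final states
    return sum(alpha)


# Toy model (assumed values): 2 hidden states, 2 observation symbols.
START = [0.6, 0.4]
TRANS = [[0.7, 0.3], [0.4, 0.6]]
EMIT = [[0.9, 0.1], [0.2, 0.8]]

p = forward_probability(START, TRANS, EMIT, [0, 1])
```

The sketch keeps only one vector of forward values at a time, which is sufficient for evaluation; decoding with the Viterbi algorithm would replace the sum in the induction step with a maximum and keep back-pointers.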

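The chain code normalization of Section D.1 (steps a to d) can be sketched in Python as follows. Steps a) to c) follow the description in the text. For step d), the index rule "nearest index to i * len(FOC) / 10" is an assumption, chosen only because it reproduces the worked example (777711122222333333 becomes 7711222333); it is not taken from Equation (2). The function names are hypothetical.

```python
from itertools import groupby


def frequency_ordered_code(chain: str) -> str:
    """Steps a) to c): run-length encode, drop runs of length 1, expand again."""
    # a) Two-row matrix: (value, frequency) for each run of equal digits.
    runs = [(value, len(list(group))) for value, group in groupby(chain)]
    # b) Eliminate values whose frequency is 1, merging equal values that
    #    become adjacent (e.g. 3 x 2 and 3 x 4 become 3 x 6).
    merged = []
    for value, count in runs:
        if count == 1:
            continue
        if merged and merged[-1][0] == value:
            merged[-1][1] += count
        else:
            merged.append([value, count])
    # c) Write each value out as many times as its frequency.
    return "".join(value * count for value, count in merged)


def normalize_chain_code(chain: str, length: int = 10) -> str:
    """Step d): map the frequency-ordered code (FOC) onto `length` digits."""
    foc = frequency_ordered_code(chain)
    if not foc:
        return ""  # every run had length 1, so nothing is left to map
    result = []
    for i in range(length):
        # Assumed rule: nearest index to i * len(FOC) / length, computed with
        # integer arithmetic (round half up) and clamped to the last index.
        index = (2 * i * len(foc) + length) // (2 * length)
        result.append(foc[min(index, len(foc) - 1)])
    return "".join(result)


example = normalize_chain_code("7777311122222583353333")
```

Using integer arithmetic for the rounding avoids surprises from floating-point ties; a different rounding rule would give a different but equally fixed-length code.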


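The dot-position rule and the feature arrangement F = [DotCount, DotPos, Chain_Code] of Section D can be sketched in Python as follows. The function names, the use of the dot's row coordinate measured from the top of the image, and the treatment of a dot lying exactly on the 3/5 boundary (assigned to "below") are assumptions. How DotPos is encoded for a letter without dots is not stated in the text, so the caller supplies it.

```python
from typing import List


def dot_position(dot_row: int, image_height: int) -> int:
    """Return 0 (above), 1 (middle) or 2 (below) for a dot.

    dot_row is measured from the top of the image (assumption).
    """
    if image_height <= 0:
        raise ValueError("image_height must be positive")
    # Compare dot_row / image_height with 2/5 and 3/5 using integers only.
    if 5 * dot_row < 2 * image_height:
        return 0  # above
    if 5 * dot_row < 3 * image_height:
        return 1  # middle
    return 2  # below (exactly 3/5 is assigned here by assumption)


def feature_vector(dot_count: int, dot_pos: int, chain_code: str) -> List[int]:
    """Arrange the features as F = [DotCount, DotPos, Chain_Code]."""
    if len(chain_code) != 10:
        raise ValueError("the normalized chain code must have 10 digits")
    return [dot_count, dot_pos] + [int(digit) for digit in chain_code]


# Example with made-up measurements: 3 dots at row 5 of a 40-pixel-high image.
features = feature_vector(3, dot_position(5, 40), "7711222333")
```

The resulting list has 12 values: one for the dot count, one for the dot position, and ten for the normalized chain code.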

In this research, the observation sequence has 12 elements, consisting of the number of dots, the position of dots, and the normalized chain code. The hidden state is the label ID of the letter, and there are 31 hidden states in total.

III. EXPERIMENT AND DISCUSSION
We used Java to develop the recognition system. We tested the segmentation stage using 10 sentences created with 3 different fonts, for a total of 30 sentences, and achieved a segmentation accuracy of 93%. Figure 9 shows an example of a sentence that fails to be segmented properly.

Fig. 9 An example of a sentence that fails to be segmented properly

At the classification stage, we used sentences created with three different fonts for training and testing. The fonts used are Arial Unicode Ms, Tahoma, and Times New Roman. For isolated Arabic letter recognition, we used 31 isolated Arabic letters in 3 different fonts as the test data. For Arabic letter recognition in sentences, we used 10 sentences in 3 different fonts as the test data. Table I shows the experimental results with Neural Network classification: the recognition of isolated Arabic letters yields an average accuracy of 100%, and the recognition of Arabic letters in sentences yields an average accuracy of 69%. Table II shows the experimental results with Hidden Markov Model classification: the recognition of isolated Arabic letters yields an average accuracy of 71%, and the recognition of Arabic letters in sentences yields an average accuracy of 50%.

TABLE I. PERFORMANCE OF ARABIC RECOGNITION WITH NEURAL NETWORK CLASSIFICATION

Font               Accuracy of Isolated Arabic Character   Accuracy of Arabic Character in Sentence
Arial Unicode Ms   100%                                    66%
Tahoma             100%                                    66%
Times New Roman    100%                                    75%

TABLE II. PERFORMANCE OF ARABIC RECOGNITION WITH HIDDEN MARKOV MODEL CLASSIFICATION

Font               Accuracy of Isolated Arabic Character   Accuracy of Arabic Character in Sentence
Arial Unicode Ms   74%                                     49%
Tahoma             61%                                     50%
Times New Roman    77%                                     51%

The test data for the recognition of Arabic letters in sentences are the following:

A. االحترام المتبادل بين االديان
B. لقمر تبدو جميلة جدا
C. ف كھة دوريان ا ال ذو اق جيدة
D. تتخلي ابدا في الحياة
E. انتظر اي محا كمة
F. لحفاظ علي صحتك حياة طيبة
G. المعلمين يعلمو ن
H. جدةاالرز المطبو خ
I. يتكلم ببطء مفھومة حتي
J. الحفا ظ علي صحتك حيا ة طيبة

Comparing the fonts, Times New Roman performs as well as or better than Arial Unicode Ms and Tahoma, both in the recognition of isolated Arabic letters and in the recognition of Arabic letters in sentences. For isolated Arabic letters, the Neural Network reaches 100% accuracy for all fonts, whereas the Hidden Markov Model reaches 74% accuracy for Arial Unicode Ms, 61% for Tahoma, and 77% for Times New Roman. For Arabic letters in sentences, the Neural Network gives the same accuracy of 66% for Arial Unicode Ms and Tahoma, lower than the 75% for Times New Roman, while the Hidden Markov Model reaches 49% accuracy for Arial Unicode Ms, 50% for Tahoma, and 51% for Times New Roman. The recognition of Arabic letters in sentences has lower accuracy than the recognition of isolated Arabic letters. This is because letters in sentences go through the segmentation process, which makes the binary image of a segmented letter differ from the binary image of the same letter in the training data, so the chain codes of the training data and the testing data differ. This leads to the decline in recognition accuracy.

IV. CONCLUSION
The chain code based approach to Arabic letter recognition has been a key feature of this research. To improve the recognition accuracy, the number-of-dots and position-of-dots features were added. Together, the three features distinguish the letters from one another, so that the results obtained are quite good. The results show that classification with the Neural Network gives better results than classification with the Hidden Markov Model.

REFERENCES
[1] Meilani, N. A., Amrizal, V., and Hakiem, N. (2016): Comparative analysis of the accuracy of backpropagation and learning vector quantisation for pattern recognition of hijaiyah letters, 6th International Conference on Information and Communication Technology for The Muslim World, 4.
[2] Albakor, M., Saeed, K., and Sukkar, F. (2009): Intelligent system for Arabic character recognition, World Congress on Nature & Biologically Inspired Computing (NaBIC 2009), 1.
[3] Supriana, I., and Nasution, A. (2013): Arabic character recognition system development, The 4th International Conference on Electrical Engineering and Informatics (ICEEI 2013), 1.
[4] Izakian, S. A. M., Tork, L., and Zamanifar, K. (2008): Multi-font Farsi/Arabic isolated character recognition using chain codes, World Academy of Science, Engineering and Technology International Journal
of Computer, Electrical, Automation, Control and Information Engineering, Vol. 2, No. 7, 1, 3.
[5] Cheung, A., Bennamoun, M., and Bergmann, N. W. (2001): An Arabic optical character recognition system using recognition-based segmentation, The Journal of the Pattern Recognition Society, 1.
[6] Zidouri, A. (2010): On multiple typeface Arabic script recognition, Research Journal of Applied Sciences, Engineering and Technology, 3.
[7] Stentiford, F. W. M., and Mortimer, R. G. (1983): Some new heuristics for thinning binary handprinted characters for OCR, IEEE Transactions on Systems, Man, and Cybernetics, Vol. SMC-13, No. 1, 3-4.
[8] Zhang, T. Y., and Suen, C. Y. (1984): A fast parallel algorithm for thinning digital patterns, Communications of the ACM, Volume 27, Number 3, 1-3.
[9] Hilditch, C. J. (1968): An application of graph theory on pattern recognition, in Machine Intell. (B. Meltzer and Michie, Eds.), New York: Amer. Elsevier, 3.
[10] Handaya, D., Fakhruroja, H., Hidayat, E. M. I., and Machbub, C. (2016): Comparison of Indonesian speaker recognition using vector quantization and hidden Markov model for unclear pronunciation problem, IEEE 6th International Conference on System Engineering and Technology (ICSET), 3.
