INTRODUCTION
1.1
1. Introduction
involve reading aid for the blind, library automation, language processing and
multimedia design.
1.2
a touch sensitive surface, which may be integrated with, or adjacent to, an output
display.
a software application which interprets the movements of the stylus across the
writing surface, translating the resulting strokes into digital text.
Online character recognition is sometimes confused with OCR. OCR is an
instance of offline character recognition, where the system recognizes the fixed, static
shape of the character, whereas online character recognition recognizes the
dynamic motion of handwriting. For example, online recognition, such as that used
for gestures on the Tablet PC, can tell whether a horizontal mark was drawn right-to-left
or left-to-right. Online character recognition is also referred to as
dynamic character recognition, real-time character recognition, and Intelligent Character
Recognition (ICR). Online systems for recognizing hand-printed text on the fly have
become well known as commercial products in recent years. Among these are the input
devices for personal digital assistants such as those running Palm OS. The Apple Newton
pioneered this product category. The algorithms used in these devices take advantage of the fact
that the order, speed, and direction of individual line segments at input are known. Also,
the user can be retrained to use only specific letter shapes. These methods cannot be used
in software that scans paper documents, so accurate offline character
recognition is still largely an open problem.
1.3
Character Recognition
Graphics Recognition
Document Understanding
Sketching Interfaces
Historical Documents
Multimedia Systems
WWW Applications
1.4
[Figure: block diagram with the stages Character image, Preprocessing, Feature Extraction, Recognition]
offline handwriting recognition. In general, there are four basic approaches to pattern
recognition: statistical, structural, artificial neural network and soft computing.
In the statistical or decision-theoretic approach, recognition is based on the
decision boundaries established in the feature space by the statistical distributions of the
patterns. A decision is usually made by maximizing the a posteriori probability; the
minimum recognition error achievable with this rule is called the Bayes error.
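The maximum a posteriori decision described above can be illustrated with a minimal sketch. It assumes a single numeric feature and Gaussian class-conditional densities; the class labels, priors, means and standard deviations in `CLASSES` are hypothetical values chosen for illustration and do not come from this work.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def map_classify(x, classes):
    """Return the label with the largest posterior P(class | x).

    `classes` maps label -> (prior, mean, std). The evidence p(x) is the
    same for every class, so comparing prior * likelihood is sufficient.
    """
    scores = {
        label: prior * gaussian_pdf(x, mean, std)
        for label, (prior, mean, std) in classes.items()
    }
    return max(scores, key=scores.get)


# Hypothetical one-feature model for two character classes (assumption).
CLASSES = {
    "A": (0.5, 0.0, 1.0),
    "B": (0.5, 4.0, 1.0),
}

for feature_value in (0.5, 1.9, 2.1, 3.5):
    print(feature_value, map_classify(feature_value, CLASSES))
```

With equal priors and equal spreads, the decision boundary in this toy model lies midway between the two class means; samples from one class that fall on the wrong side of it are the errors that even this optimal rule cannot avoid.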
In structural (syntactic) approaches, each pattern is defined by using structural
descriptions or representations. The recognition is performed according to structural
similarities. This is based upon the fact that structural relationships between features are
also essential to recognize patterns.
Since the mid-1980s, neural networks have become popular in handwriting
recognition because they learn from examples, are robust and insensitive to noise, and have
generalization ability. Among the various architectures, feed-forward neural networks
remain dominant because well-known training methods, back-propagation of errors
and function approximation, are available. Multi-layer perceptron (MLP) classifiers are
able to form complex decision regions bounded by hyperplanes and can classify a large number of
classes. Comparing the three approaches introduced above, there is strong
evidence that structural approaches and neural-network-based approaches may offer
better solutions to the problems in handwriting recognition. Statistical approaches are
more suitable for problems with large random variations in the data, such as speech
recognition. Words and characters are highly structured entities, and many uncertainties
in handwriting are not random by nature or may be difficult to model with traditional
probabilistic techniques. For such uncertainties, statistical approaches may fail. To
develop handwriting systems that can make use of structural information and
handle different types of uncertainty, we need to combine different paradigms.
Recently, researchers have used soft computing approaches to develop hybrid
handwriting recognition systems. Soft computing is a new problem-solving paradigm that combines
emerging techniques and theories such as neural networks, fuzzy logic, genetic
algorithms and other evolutionary methods. Such handwriting recognition systems fall
into two categories. Systems in the first category recognize patterns using a
single classifier formed by combining different soft computing methods; they
are generally multi-stage classifiers. Systems in the second
category make decisions based on the results of several classifiers to achieve higher
recognition rates. The outputs of the multiple classifiers are combined using
rules such as majority vote, sum, max, product and median.
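The combining rules listed above can be sketched as follows. This is an illustrative sketch only: it assumes each classifier reports one score per class (for example an estimated posterior), and the function name `combine`, the class labels and the scores are hypothetical rather than taken from this work.

```python
import math
from collections import Counter
from statistics import median


def combine(outputs, rule):
    """Fuse the outputs of several classifiers into a single decision.

    `outputs` holds one dict per classifier, mapping class label -> score.
    `rule` is one of "majority", "sum", "max", "product" or "median".
    """
    if rule == "majority":
        # Each classifier votes for its own top-scoring class.
        votes = Counter(max(scores, key=scores.get) for scores in outputs)
        return votes.most_common(1)[0][0]

    reducers = {"sum": sum, "max": max, "product": math.prod, "median": median}
    reduce_scores = reducers[rule]
    # Reduce the per-classifier scores of each class to one fused score.
    fused = {
        label: reduce_scores([scores[label] for scores in outputs])
        for label in outputs[0]
    }
    return max(fused, key=fused.get)


# Hypothetical scores from three classifiers for three character classes.
OUTPUTS = [
    {"ka": 0.6, "kha": 0.3, "ga": 0.1},
    {"ka": 0.2, "kha": 0.7, "ga": 0.1},
    {"ka": 0.5, "kha": 0.4, "ga": 0.1},
]

for rule in ("majority", "sum", "max", "product", "median"):
    print(rule, combine(OUTPUTS, rule))
```

On these made-up scores the rules do not all agree, which shows why the choice of combining rule matters when the individual classifiers disagree.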
1.5
Evolution of OCR
The origin of character recognition can be traced to 1870, when Carey invented
the retina scanner, an image transmission system using a mosaic of photocells. Later, in
1890, Nipkow invented the sequential scanner, which was a major breakthrough both for
modern television and for reading machines. Character recognition as an aid to the visually
handicapped was first attempted by the Russian scientist Tyurin in 1900.
OCR technology took a major turn in the mid-1950s with the
development of the digital computer and improved scanning devices. For the first time, OCR
was realized as a data processing approach, with particular applications to the business
world. From that perspective, David Shepard, founder of the Intelligent Machine
Research Co., can be considered a pioneer in the development of commercial OCR
equipment. Currently, PC-based systems are commercially available that read printed
documents in a single font with very high accuracy and documents in multiple fonts with
reasonable accuracy. Most of the available systems work on European scripts, which are
based on the Roman alphabet. Research reports on oriental language scripts are few, except
for Korean, Chinese and Japanese scripts. Depending on their versatility, robustness and
efficiency, commercial OCR systems can be divided into four generations.
First-generation systems are characterized by the constrained letter shapes
that they could read. Such machines appeared at the beginning of the 1960s. The first
widely commercialized OCR of this generation was the IBM 1418, which was designed
to read a special IBM font, 407. The recognition method was logical template matching,
in which the positional relationship was fully utilized.
The second generation is characterized by the ability to recognize a set of
regular machine-printed characters as well as hand-printed characters. In the early stages,
the scope was restricted to numerals only. Such machines appeared from the mid-1960s to the early 1970s. In this
generation, the first and most famous OCR system was the IBM 1287, which was exhibited at the
1965 New York World's Fair. In terms of hardware configuration, the system was a hybrid
one, combining analog and digital technology. The first automatic letter-sorting machine
for postal code numbers, built by Toshiba, was also developed during this period. The methods
were based on the structural analysis approach.
The third generation is characterized by OCR of characters with poor print
quality and of hand-printed characters from large character sets. Commercial
OCR systems with such capabilities appeared roughly during the decade 1975 to 1985.
The fourth generation is characterized by OCR of complex documents that
intermix text, graphics, tables and mathematical symbols, as well as unconstrained
handwritten characters, color documents, and low-quality noisy documents such as photocopies and
faxes. Some work on complex documents has provided good results. Although
much work on unconstrained handwritten characters is available in the
literature, the recognition accuracy hardly exceeds 85%.
Among other commercial products, postal address readers are available on the
market. In the United States, about 60% of hand-printed mail is sorted automatically.
Reading aids for the blind are also available. An integrated OCR system with speech output
for the blind has been marketed by Xerox-Kurzweil for the English language. At present,
more sophisticated optical readers are available for Roman, Chinese, Japanese and
Arabic text. These readers can process a document that has been typewritten or
printed. They can recognize characters in different fonts and sizes as well as different
formats, including intermixed text and graphics. Although a lot of research is carried out
on OCR for these scripts, no OCR systems are available for the recognition of
handwritten Indian scripts. Extensive research has recently been carried out in these languages
on the recognition of handwritten characters and words.
In a multilingual country like India, which has many languages with their own
distinctive scripts and rich literary traditions, it is particularly important to develop
computer systems that allow users to interact with them in Indian languages. There are
14 Indic scripts, and there is huge untapped potential for the Indian population to access
information technology through Indian languages. Because handwriting is a natural interface
to computers, recognition of handwritten Indian documents offers a large area for
research. Despite the widespread use of computers, paper documents will
remain important for a long time, and hence it is necessary to have computer
systems that can seamlessly integrate paper documents with electronically created
ones.
1.6
extensive research over recent years. This pattern recognition task is quite challenging
due to the variability in writing style, the similarity of character shapes, the presence of
modifiers and various other features of Indian scripts. Research is being carried out
extensively on the Bangla and Devanagari scripts, the two most popular scripts in India.
India has twenty-two official languages recognized by the Indian constitution,
namely Assamese, Bangla, Gujarati, Hindi, Konkani, Kannada, Kashmiri, Malayalam,
Marathi, Nepali, Oriya, Punjabi, Bodo, Dogri, Maithili, Manipuri, Santhali, Sindhi,
Sanskrit, Tamil, Telugu and Urdu. Different scripts are used for writing these official
languages. Most Indian scripts originated from ancient Brahmi through various
transformations. Two or more of these languages may be written in one script. For
example, Devanagari is used to write Hindi, Marathi, Rajasthani, Sanskrit and Nepali,
while the Bangla script is used to write the Assamese and Bangla (Bengali) languages.
In the proposed work, we aim to develop a comprehensive system for the
recognition of unconstrained handwritten Marathi characters.
Marathi is a language spoken by the Maharashtrian people of western India. It is
the official language of the state of Maharashtra and is the fourth most spoken language in
India. It is spoken by about 63 million people. It is written in the Devanagari script, and
its alphabet consists of 16 vowels and 36 consonants. Figure 1.2 and Figure 1.3 present the vowels
and the consonants of the Marathi script, respectively. Vowels are combined with consonants
with the help of specific characteristic marks. These marks occur in line with, at the top of, or at
the bottom of a character in a word and are called modifiers. An illustration of how
the vowels combine with the consonants is shown in Figure 1.4.
Marathi is written from left to right. It has no upper- and lower-case characters as
in English; however, the alphabet itself contains more symbols than that of
English. While line segments (strokes) are the predominant features of English characters, most of
the characters in the Devanagari script are formed by curves and holes as well as strokes. Marathi
has conjunct characters, which are formed by joining two or more consonants. Every
character has a horizontal line at the top called the header line. The header line joins
the characters in a word. This script has a two-dimensional composition of symbols: core
characters in the middle strip, and optional modifiers above and/or below the core characters.
Figure 1.5 shows a Marathi word partitioned into three character zones: a core
zone that contains most consonant, half-consonant, vowel and conjunct forms (core
components), an upper zone containing ascenders or upper modifiers, and a lower zone
containing descenders or lower modifiers. The core and upper zones are separated by the
header line.
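As a rough illustration of how the header line can separate the upper zone from the rest of a word, the sketch below uses a horizontal projection profile. It rests on an assumption that is not stated in the text, namely that the header line is the row containing the most ink pixels in a binarized word image. The function names and the tiny test image are hypothetical, and locating the lower zone is deliberately left out.

```python
def find_header_row(image):
    """Return the index of the row with the most ink pixels.

    `image` is a list of rows, each a list of 0 (background) or 1 (ink).
    Assumption: the header line is the densest row of the word image.
    """
    ink_per_row = [sum(row) for row in image]
    return max(range(len(image)), key=ink_per_row.__getitem__)


def split_at_header(image):
    """Split a word image into (upper zone, header line and everything below)."""
    header = find_header_row(image)
    return image[:header], image[header:]


# Hypothetical 5 x 6 binary word image: an upper modifier above a header line.
WORD = [
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 1, 0],
]

upper_zone, header_and_below = split_at_header(WORD)
print("header row:", find_header_row(WORD))
print("rows in upper zone:", len(upper_zone))
```

A real handwritten header line is rarely a single perfectly straight row, so a practical system would need a more tolerant detector than this one-row heuristic.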
1.7
1.8
issues other than this which make the recognition of handwritten characters a challenging
task and affect the recognition rate to a considerable extent. Some examples of such
issues are:
In case of compound characters the strategies for joining two or more consonants
are different. Characters may also get split during pre-processing,
Segmentation of modifiers,
This demands an efficient system which takes care of these issues at all the stages
in the OCR system, from pre-processing to recognition.
1.9
Problem statement
As discussed earlier, there is a need to develop OCR systems for handwritten Indian
languages. A lot of research is still required on handwritten Marathi script before
sophisticated systems can be developed. In the last few years, there has been great interest in applying
artificial neural network (ANN) technology in various fields of conventional computing.
This is because ANNs provide parallelism, learn from examples, have the
capacity to handle classification problems comprising a large number of classes, and
have generalization ability. This motivates the use of a neural network for the
recognition of handwritten Marathi characters.
1.10 Objectives
The following are the objectives of the proposed work:
To analyze the efficiency of these classifiers and find the optimum network
1.11 Scope
The scope of the proposed work is as follows:
A multilayer perceptron neural network is used for recognition along with other
types of recognition methods.