Sie sind auf Seite 1von 14

CHAPTER 1

INTRODUCTION

A Neural Network Based Handwritten Character Recognition for Marathi Script

Pattern recognition is a set of mathematical, statistical and heuristic techniques


used in executing man-like tasks on computers. Pattern recognition plays an important
role in many applications such as document processing, robot vision, optical character
recognition, writer identification and verification etc. Automation of pattern recognition
helps to speed up processing time as well as to automate processes without human
intervention. Computer recognition of characters or word is one of the most successful
applications in computer vision. It is an automated process that uses pattern recognition
and machine learning techniques to recognize characters or words. With the ever
increasing computational speed, memory, technical advances in scanning devices and
maturity of recognition techniques, a tremendous progress is seen in the development of
character recognition. Section 1.1 discusses optical character recognition. Section 1.2
presents the types of handwriting OCR systems. The research in OCR systems is
discussed in Section 1.3. The components of an OCR system are described in Section
1.4. Section 1.5 presents the evolution of OCR. Marathi script and its properties are
discussed in Section 1.6. The Institutes in India doing research in Devanagari are
mentioned in Section 1.7. The challenges in Marathi handwriting recognition are
mentioned in Section 1.8. Section 1.9 puts forth the problem statement, while Section
1.10 and 1.11 discusses the objectives and the scope of the proposed work respectively.
Section 1.12 presents the organization of the thesis and finally Section 1.13 presents the
concluding remarks.

1.1

Optical character recognition


Optical character recognition (OCR) is the recognition of printed or handwritten

text by a computer. This involves photo scanning of the text character-by-character,


analysis of the scanned-in image, and then translation of the character image into
character codes, such as ASCII, commonly used in data processing. In OCR processing,
the scanned-in image or bitmap is analyzed for light and dark areas in order to identify
each alphabetic letter or numeric digit. When a character is recognized, it is converted
into an ASCII code. Special circuit boards and computer chips designed especially for
OCR are used to speed up the recognition process. Research in OCR is popular for its
various application potentials in office automation, cheque verification in banks, postoffices, and a large variety of business and data entry applications. Other applications

1. Introduction

A Neural Network Based Handwritten Character Recognition for Marathi Script

involve reading aid for the blind, library automation, language processing and
multimedia design.

1.2

Types of handwriting OCR systems


There are two major problem domains in handwriting recognition: online and

offline. Online handwriting recognition involves the automatic conversion of text as it is


written on a special digitizer or personal digital assistant (PDA), where a sensor picks up
the pen-tip movements as well as pen-up/pen-down switching. That kind of data is
known as digital ink and can be regarded as a dynamic representation of handwriting.
The obtained signal is converted into letter codes which are usable within computer and
text-processing applications. The elements of an on-line handwriting recognition
interface typically include:

a pen or stylus for the user to write with.

a touch sensitive surface, which may be integrated with, or adjacent to, an output
display.

a software application which interprets the movements of the stylus across the
writing surface, translating the resulting strokes into digital text.
Online character recognition is sometimes confused with OCR. OCR is an

instance of offline character recognition, where the system recognizes the fixed static
shape of the character, while on-line character recognition instead recognizes the
dynamic motion during handwriting. For example, online recognition, such as that used
for gestures in the Tablet PC can tell whether a horizontal mark was drawn right-to-left,
or left-to-right. Online character recognition is also referred to by other terms such as
dynamic character recognition, real-time character recognition, and Intelligent Character
Recognition or ICR. On-line systems for recognizing hand-printed text on the fly have
become well-known as commercial products in recent years. Among these are the input
devices for personal digital assistants such as those running Palm OS. The Apple Newton
pioneered this product. The algorithms used in these devices take advantage of the fact
that the order, speed, and direction of individual lines segments at input are known. Also,
the user can be retrained to use only specific letter shapes. These methods cannot be used
in software that scans paper documents, so accurate recognition of offline character
recognition is still largely an open problem.

1. Introduction

A Neural Network Based Handwritten Character Recognition for Marathi Script

In offline handwriting recognition, digitized spatial information is available to the


recognition system, e.g. the image of the address scanned from an envelope or an amount
shown on a cheque. Off-line handwriting recognition involves the automatic conversion
of text in an image into letter codes which are usable within computer and textprocessing applications. Off-line handwriting recognition is comparatively difficult, as
different people have different handwriting styles and the information available to the
recognition system is limited.
In terms of processing word data, there are two approaches: the analytical
approach which treats the word as an orderly collection of parts. In such approaches, the
word image is segmented into subunits based on some properties or rules, e.g. characterlike properties or shapes that match a part list e.g. alphabet. Before recognition, the word
has to be reconstructed. Segmenting a word image into meaningful parts is itself a
difficult task; there are no techniques that are able to reliably segment a word into
characters. Although excessive research has been done along this direction, it remains an
open question as to whether it is the most promising approach.
Another approach is the holistic approach which considers the word as a single
entity. This avoids the problem of character segmentation, but is an area which has been
less studied than the analytical approach. However, research along this line has recently
gained considerable interest.

1.3

Research in OCR systems


Handwriting Recognition has an active community of academics studying it. The

biggest conferences for handwriting recognition are the International Conference on


Frontiers in Handwriting Recognition (ICFHR), held in even-numbered years, and the
International Conference on Document Analysis and Recognition (ICDAR), held in oddnumbered years. Both the conferences are scrutinized by the IEEE. Active areas of
research in these conferences include:

Character Recognition

Graphics Recognition

Document Image Analysis

Document Understanding

Camera-based Document Processing

1. Introduction

A Neural Network Based Handwritten Character Recognition for Marathi Script

Document Databases and Digital Libraries

Sketching Interfaces

Historical Documents

Handwriting Recognition Techniques

Preprocessing and Segmentation Techniques

Classifiers and their Combinations

Multiple Sources and Multiple Experts

Innovative Approaches in Handwriting Recognition

Soft Computing for Handwriting Processing and Understanding

Error Reduction and Performance Enhancement

Writer Verification and Identification

Multimedia Systems

Color Information Utilization

WWW Applications

The International Journal on Document Analysis and Recognition (IJDAR)


sponsored by the International Association for Pattern Recognition is focused on
publishing articles that cover all areas related to document analysis and recognition. This
includes contributions dealing with computer recognition of characters, symbols, text,
lines, graphics, images, handwriting, signatures, as well as automatic analyses of the
overall physical and logical structures of documents, with the ultimate objective of a
high-level understanding of their semantic content.

1.4

Components of an OCR system


Similar to any pattern recognition system, the character recognition system

contains the basic components as shown in Figure 1.1.

Character
image

Preprocessing

Feature
Extraction

Recognition

Figure 1.1 Components of an OCR

1. Introduction

A Neural Network Based Handwritten Character Recognition for Marathi Script

The character image is captured using an acquisition device such as flatbed


scanner. In pre-processing, the system carries out data cleaning tasks which may include
noise reduction, skew detection and correction, slant detection and correction,
normalization, binarization etc. This is usually followed by a segmentation process to
separate the lines, words and characters in the image.
Feature extraction is of vital importance to character recognition systems,
particularly handwriting recognition systems. It serves two main purposes: extracting the
most representative features that are used by classifiers, and reducing redundancies in
data. In online systems, input is composed of unit width line-segments with dynamic
information so that feature extraction is a relatively easy task. In offline systems, feature
extraction is affected by several factors:

Different background of documents

Non-uniform illumination of the scanner

Noise introduced by electronics and writing tools

Different qualities of paper and types of ink


In order to overcome these problems, many methods have been proposed for

offline handwriting recognition. In general, there are four basic approaches for pattern
recognition, namely, statistical, structural, artificial neural network and soft-computing.
In the statistical or decision-theoretic approach, the recognition is based on the
decision boundaries established in the feature space by statistical distributions of the
patterns. A decision is usually made by maximizing a posteriori probability, where the
recognition error of this approach is called Bayes error.
In structural (syntactic) approaches, each pattern is defined by using structural
descriptions or representations. The recognition is performed according to structural
similarities. This is based upon the fact that structural relationships between features are
also essential to recognize patterns.
Since the mid 1980s, neural networks have become popular in handwriting
recognition as they learn from examples, are robust, insensitive to noise and have
generalization ability. Among various architectures, the feed-forward neural networks
remain dominant because the well-known training methods, back-propagation of errors
and approximation function are available. Multi-layer perceptron (MLP) classifiers are
able to form complex hyper plane decision regions that can classify large number of
classes. In comparison with the three approaches introduced above, there is strong

1. Introduction

A Neural Network Based Handwritten Character Recognition for Marathi Script

evidence that structural approaches and neural-network based approaches may offer
better solution to the problems in handwriting recognition. Statistical approaches are
more suitable to the problems with large random variations in data, such as speech
recognition. Words and characters are highly structured entities and many uncertainties
in handwriting are not random by nature or may be difficult to model by traditional
probabilistic techniques. For such uncertainties statistical approaches may fail. In order
to develop handwriting systems that are able to make use of structural information and
handle different types of uncertainty, we need to combine different paradigms.
Recently, soft computing approaches are used by researchers to develop hybrid
handwriting recognition systems. It is a new problem-solving paradigm that combines
emerging techniques and theories such as neural networks, fuzzy logic, genetic
algorithms and other evolutionary methods. Such handwriting recognition systems can
be classified into two categories: in the first category, they recognize patterns based on a
single classifier that is formed by combining different soft computing methods. They
generally are multi-stage classifiers. The recognition systems belonging to the second
category make decisions based on results of several classifiers to produce higher
recognition rates. The outputs from the multiple classifiers are combined using various
combining rules like: majority vote, sum, max, product and median.

1.5

Evolution of OCR
The origin of character recognition can be found in 1870 when Carey invented

the retina scanner, and image transmission system using a mosaic of photocells. Later in
1890, Nipkow invented the sequential scanner which was a major breakthrough both for
modern television and reading machines. Character recognition as an aid to the visually
handicapped was at first attempted by the Russian scientist Tyurin in1900.
The OCR technology took a major turn in the middle of 1950s with the
development of digital computer and improved scanning devices. For the first time OCR
was realized as a data processing approach, with particular applications to the business
world. From that perspective, David Shepard, founder of the Intelligent Machine
Research Co. can be considered as a pioneer of the development of commercial OCR
equipment. Currently, PC-based systems are commercially available to read printed
documents of single font with very high accuracy and documents of multiple fonts with

1. Introduction

A Neural Network Based Handwritten Character Recognition for Marathi Script

reasonable accuracy. Most of the available systems work on European scripts which are
based on Roman alphabets. Research reports on oriental language scripts are few, except
for Korean, Chinese and Japanese scripts. Depending on versatility, robustness and
efficiency, the commercial OCR systems can be divided into four generations.
The first generation systems can be characterized by the constrained letter shapes
which the OCRs read. Such machines appeared in the beginning of 1960s. The first
widely commercialized OCR of this generation was the IBM 1418, which was designed
to read a special IBM font, 407. The recognition method was logical template matching
where the positional relationship was fully utilized.
The next generation is characterized by the recognition capabilities of a set of
regular machine printed characters as well as hand-printed characters. At the early stages,
the scope was restricted to numerals only. Such machines appeared in early1970s. In this
generation, the first and famous OCR system was IBM 1287, which was exhibited at the
1965 New York world fair. In terms of hardware configuration, the system was a hybrid
one, combining analog and digital technology. The first automatic letter-sorting machine
for postal code numbers of Toshiba was also developed during this period. The methods
were based on the structural analysis approach.
The third generation can be characterized by the OCR of poor print quality
characters, and hand-printed characters for a large category character set. Commercial
OCR systems with such capabilities appeared roughly during the decade 1975 to 1985.
The fourth generation can be characterized by the OCR of complex documents
intermixing with text, graphics, table and mathematical symbols, unconstrained
handwritten characters, color document, low quality noise documents like photocopy and
fax etc. some pieces of work on complex documents provided good results. Although
many pieces of work on unconstrained handwritten character are available in the
literature, the recognition accuracy hardly exceeds 85%.
Among other commercial products, postal address readers are available in the
market. In the United States, about 60% of the hand printed is sorted automatically.
Reading aid for the blind is also available. An integrated OCR with speech output system
for the blind has been marketed by Xerox-Kurxweil for English language. At present,
more sophisticated optical readers are available for Roman, Chinese, Japanese and
Arabic text. These readers can process a document which has been typewritten or
printed. They can recognize characters with different fonts and sizes as well as different
formats including intermixed text and graphics. Although lot of research is carried out
1. Introduction

A Neural Network Based Handwritten Character Recognition for Marathi Script

for the OCR in these scripts, no OCR systems are found for the recognition of
handwritten Indian scripts. Extensive research is being carried out in these languages
recently for the recognition of handwritten characters and words.
In a multi-lingual country like India, which has many languages with their own
distinctive scripts and rich literary traditions, it is particularly important to develop
computer systems that allow users to interact with them in Indian languages. There are
14 Indic scripts and there is a huge untapped potential for Indian population to access
Information Technology through Indian languages. Handwriting being a natural interface
to computers, recognition of handwritten Indian documents offers a huge area for
research. In spite of widespread use of computers, paper documents will continue to
remain important for a long period of time and hence it is necessary to have computer
systems that can seamlessly integrate paper documents with other electronically created
ones.

1.6

Marathi script and its properties


Unconstrained handwritten character recognition for Indian scripts is an area of

extensive research over recent years. This pattern recognition task is quite challenging
due to the variability in the writing style, similarity in the character shapes, presence of
modifiers and various other features of Indian scripts. Research is being carried out
extensively in Bangla and Devanagari scripts, the two most popular scripts in India. In
India, there are twenty two Indian official (Indian constitution accepted) languages,
namely Assamese, Bangla, Gujarati, Hindi, Konkani, Kannada, Kashmiri, Malayalam,
Marathi, Nepali, Oriya, Punjabi, Bodo, Dogri, Maithili, Manipuri, Santhali, Sindhi,
Sanskrit, Tamil, Telugu and Urdu. Different scripts are used for writing these official
languages. Most Indian scripts originated from ancient Brahmi through various
transformations. Two or more of these languages may be written in one script. For
example, Devanagari is used to write Hindi, Marathi, Rajasthani, Sanskrit and Nepali
while Bangla script is used to write Assamese and Bangla (Bengali) languages.
In the proposed work, we aim at developing a comprehensive system for
recognition of unconstrained handwritten Marathi characters.
Marathi is a language spoken by the Maharashtrian people of western India. It is
the official language of the state of Maharashtra and is the 4th most spoken language in

1. Introduction

A Neural Network Based Handwritten Character Recognition for Marathi Script

India. It is spoken by about 63 million people. It is derived from Devanagari script and it
consists of 16 vowels and 36 consonants. Figure 1.2 and Figure 1.3 present the vowels
and the consonants in Marathi script respectively. Vowels are combined with consonants
with the help of specific characteristic marks. These marks occur in line, at the top, or at
the bottom of a character in a word and are called as modifiers. An illustration of how
the vowels combine with the consonants is shown in Figure 1.4.

Figure 1.2 Marathi vowels

Figure 1.3 Marathi consonants

Figure 1.4 Vowels combined with consonants

Marathi is written from left to right. It has no upper and lower case characters as
in English; however the alphabet itself contains more number of symbols than that of
English. While line segments (strokes) are the predominant features for English, most of
the characters in Devanagari script are formed by curves, holes, and also strokes. Marathi
has conjunct characters which are formed by joining two or more consonants. Every
character has a horizontal line at the top called as the header line. The header line joins
the characters in a word. This script has two-dimensional compositions of symbols: core
characters in the middle strip, optional modifiers above and/or below core characters.

1. Introduction

10

A Neural Network Based Handwritten Character Recognition for Marathi Script

Figure 1.5 An example of Marathi word

Figure 1.5 shows a Marathi word partitioned into three character zones: A core
zone that contains most consonant, half consonant, vowel and conjunct forms (core
components), an upper zone containing ascenders or upper modifiers and a lower zone
containing descenders or lower modifiers. The core and upper zones are separated by the
header line.

1.7

Institutes in India doing research in Devanagari


Some of the leading institutes in India doing research in Devanagari OCR are:

Indian Statistical Institute, Kolkata,

International Institute of Information Technology, Hyderabad,

Indian Institute of Science, Bangalore, and

Indian Institute of Technology, New Delhi.

1.8

Challenges in Marathi handwriting recognition


The Marathi script has large number of characters and vowels. There are various

issues other than this which make the recognition of handwritten characters a challenging
task and affect the recognition rate to a considerable extent. Some examples of such
issues are:

Variations in the writing style of the writers,

Variations in the font size, pen width, pen ink,

Shape changes due to pre-processing parameters,

Variations in the character spacing, skew and slant,

In case of compound characters the strategies for joining two or more consonants
are different. Characters may also get split during pre-processing,

1. Introduction

11

A Neural Network Based Handwritten Character Recognition for Marathi Script

Some character pairs are very much similar to each other,

The header line may or may not be drawn by the writer,

Segmentation of modifiers,

Segmentation of touching characters.

This demands an efficient system which takes care of these issues at all the stages
in the OCR system, from pre-processing to recognition.

1.9

Problem statement
As discussed earlier, there is a need of developing OCR for handwritten Indian

languages. A lot of research is required for handwritten Marathi script for development
of sophisticated systems. In the last few years, there has been a great interest in applying
artificial neural network (ANN) technology in various fields of conventional computing.
This is due to the fact that ANN provides parallelism, they learn from examples, has the
capacity of handling a classification problem comprising of large number of classes and
has generalization ability. This motivates the implementation of neural network for the
recognition of handwritten Marathi characters.

Figure 1.6 Characters used in the proposed system

1. Introduction

12

A Neural Network Based Handwritten Character Recognition for Marathi Script

The proposed system aims at recognizing the unconstrained handwritten Marathi


characters using neural networks. The input to the system is an image containing
handwritten characters. The output of the system is the recognized characters displayed
as text in Kiran font. The Marathi characters used in the proposed OCR system are
shown in Figure 1.6. They include vowels, consonants, and compound characters along
with split characters.

1.10 Objectives
The following are the objectives of the proposed work:

To develop a system for handwritten Marathi character recognition in an


authoritative manner outlining all the issues involved.
To select pre-processing algorithms that retains the shape of the character.
To work at feature extraction level and classification level and select combination
that gives best results.
To analyze the performance using various neural network classifiers and their
combination for the robust recognition of the handwritten characters.

To design hybrid systems that combine several classification and/or recognition


techniques along with neural network.

To analyze the efficiency of these classifiers and find the optimum network

1.11 Scope
The scope of the proposed work is as follows:

The proposed work focuses on the recognition of isolated handwritten Marathi


characters without any modifiers.

A multilayer perceptron neural network is used for recognition along with other
types of recognition methods.

Two evaluation parameters are to be used in the comparative analysis namely,


recognition rate and processing time.

1. Introduction

13

A Neural Network Based Handwritten Character Recognition for Marathi Script

1.12 Organization of the thesis


The rest of the thesis is organized as follows. Chapter 2 presents the literature
review. Chapter 3 puts forth the problem identification. The handwritten Marathi
numeral recognition system is discussed in Chapter 4. Chapter 5 describes the
handwritten Marathi character recognition system. Chapter 6 discusses the optimization
techniques for performance improvement. The hybrid models for accuracy improvement
are presented in Chapter 7. Chapter 8 describes the models for handwritten Marathi
compound characters. Finally Chapter 9 puts forth the discussions and conclusions on the
proposed work.

1.13 Concluding remarks


A brief introduction of OCR technology and its evolution is introduced in this
chapter. Marathi script and its features are introduced as well. Further the challenges and
the need for handwritten Marathi character recognition is emphasized.
The stages in the OCR system are discussed in brief. The soft computing
techniques are adopted recently to improve the recognition rate of offline handwritten
character recognition. The problem statement, objectives and the scope of the proposed
work are also discussed. The detailed literature survey related to Devanagari and Marathi
character recognition is presented in the next chapter.

1. Introduction

14

Das könnte Ihnen auch gefallen