The system that we propose for the automatic processing of bankchecks is composed of four main modules: data acquisition, image processing, data recognition, and database, as shown in Figure 3.

[Figure blocks: Real Bankcheck; Optical Scanner; Image Processing; Date; Literal amount; Courtesy amount; Signature]

Fig. 3: System Overview

The data acquisition module includes two devices, an optical scanner and a MICR scanner. In practice, many devices include both an optical and a MICR scanner. The optical scanner provides a digitized image of the bankcheck with 200 dpi spatial resolution and 256 gray levels for the image processing module, while the MICR scanner reads the account number. The image processing module is composed of several algorithms [...]

In spite of intensive research efforts, the degree of automation in processing bankchecks is still very limited. Recent papers have addressed the problem of automatic processing of bankchecks [1],[2],[4]. Currently, two strategies are used for extracting the user-entered data: thresholding techniques and image subtraction. Different thresholding techniques have been suggested to isolate the user-entered data from the bankchecks [2],[3],[7]. These techniques have shown good results only if the bankchecks do not have complex background patterns. If these techniques were applied to bankchecks in which the background pattern has colorful pictures and drawings, it would be very difficult to find a threshold value to segment the background from the other elements. Furthermore, it would be very difficult to segment the printed information from the user-entered data, since they can have similar gray levels. On the other hand, the techniques based on image subtraction have proved more robust in segmenting the user-entered data. Okada and Shridhar (1997) [7] proposed a method for extracting the user-entered data from American bankchecks that have colorful pictures in the background pattern.

In this paper we propose an approach in which the information we are not interested in will be eliminated in a stage before the information extraction. The main idea is to handle only sub-images that contain the user-entered data, namely the courtesy amount, the literal amount, the date, the payee’s name, and the signature. After digitizing, the bankcheck image is position-adjusted and a template is used to extract the areas where the user-entered data is supposed to appear. For each of the five resulting sub-images, the corresponding background pattern, taken from a sample stored in a database, will be subtracted. The baselines present in the sub-images will be detected by using an algorithm based on
the projection profiles. The detected baselines will be erased by substituting the corresponding positions with white pixels. The character strings present below the baseline dedicated to the signature will be eliminated by another morphological subtraction operation between the corresponding sub-image and a generated binary image which contains every make-up character string to be eliminated. In this new approach we propose to include a tracing recovery algorithm that will recover some parts of the user-entered data that can be lost during the baseline erasing. This novel approach assumes that a sample of every background pattern is available in a database.

A. Item Extraction

Due to the standardization of the layout structure of bankchecks, it is reasonable to design a template for extracting the data of interest from any bankcheck, no matter which customer has filled it in or which financial institution has issued it, using only prior knowledge about the domain. As we are only interested in the user-entered data, the template must include every area of the bankcheck where this data is supposed to appear. The other areas can be eliminated without compromising the understanding of the document. Nevertheless, these areas are not located at exactly the same position on every bankcheck. There are small position variations, depending on the financial institution that has issued the bankcheck. Thus, the template must be adapted to these small variations in order to avoid selecting wrong areas, which can cause the loss or the degradation of the data of interest. From a basic template, we propose to construct a database with the possible positions of the data of interest. During processing, the database will be accessed and the appropriate parameters will be selected to adjust the template, according to the financial institution that issued the bankcheck being processed. The basic template is presented in Fig. 4. The information present in the white areas will be kept, while the information that coincides with the black areas will be eliminated. By applying the template we will be able to eliminate the redundant information, that is, the information in which we are not interested, and segment the different user-entered data. Figure 5 shows the resulting image after applying the template operation. The outputs of the item extraction algorithm are sub-images representing the user-entered items: digit amount, worded amount, payee’s name, date, and signature.

Fig. 4: The template

Fig. 5: The extracted areas
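The template step above, together with the baseline erasure described earlier, can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used in this paper: the template is assumed to be a set of named rectangles (top, left, height, width), the per-institution adjustment is assumed to be a simple (dy, dx) shift, ink is assumed to be darker than gray level 128, and a row is treated as a baseline when at least 80% of its pixels are ink. The names `extract_items` and `erase_baselines` and all numeric values are hypothetical.

```python
import numpy as np

WHITE = 255  # assumed gray level of blank paper


def extract_items(check, template, offset=(0, 0)):
    """Cut one sub-image per template area.

    template maps an item name to (top, left, height, width); offset is a
    hypothetical per-institution (dy, dx) correction looked up elsewhere.
    """
    dy, dx = offset
    rows, cols = check.shape
    items = {}
    for name, (top, left, height, width) in template.items():
        # Clamp the shifted corner so it never leaves the image.
        y0 = min(max(top + dy, 0), rows)
        x0 = min(max(left + dx, 0), cols)
        items[name] = check[y0:y0 + height, x0:x0 + width].copy()
    return items


def erase_baselines(sub_image, ink_threshold=128, min_fill=0.8):
    """Detect printed baselines with a horizontal projection profile
    and overwrite them with white pixels."""
    ink = sub_image < ink_threshold
    profile = ink.sum(axis=1)  # number of ink pixels in each row
    baseline_rows = np.flatnonzero(profile >= min_fill * sub_image.shape[1])
    cleaned = sub_image.copy()
    cleaned[baseline_rows, :] = WHITE
    return cleaned, baseline_rows


# Toy 6x8 "check": a full-width line in row 4 and two ink pixels above it.
check = np.full((6, 8), WHITE, dtype=np.uint8)
check[4, :] = 0
check[2, 3:5] = 0
template = {"signature": (1, 0, 5, 8)}  # hypothetical area
items = extract_items(check, template, offset=(0, 0))
cleaned, baseline_rows = erase_baselines(items["signature"])
```

Erasing whole rows also removes handwriting strokes that cross the baseline, which is the loss that the tracing recovery step is meant to repair.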
[Figure blocks: Image of Courtesy Amount; Divide String into Digits; Slant Correction; Size Normalization; Thickness Normalization; Neural Network; Post-Processing]
Neural network based recognition

While template matching, structural analysis, and neural networks have all been popular classification techniques for character recognition, neural networks are now increasingly used in handwriting recognition applications. In our system, the recognition module will be implemented as an array of four neural networks working in parallel. The proposed recognition procedure is shown in Figure 7.
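Size normalization, one of the pre-processing blocks named in the recognition procedure, can be illustrated with a short example. This is a minimal sketch, assuming that each digit image is cropped to its ink bounding box and resampled onto a fixed 16×16 grid by picking the corresponding source pixel (a nearest-neighbour style mapping). The grid size, the ink threshold of 128, and the function name `normalize_size` are illustrative assumptions; the paper does not specify them.

```python
import numpy as np


def normalize_size(digit, out_shape=(16, 16), ink_threshold=128):
    """Crop a digit image to its ink bounding box and resample it to out_shape."""
    ink = digit < ink_threshold
    rows = np.flatnonzero(ink.any(axis=1))
    cols = np.flatnonzero(ink.any(axis=0))
    if rows.size == 0:
        # No ink at all: return a blank (white) image of the target size.
        return np.full(out_shape, 255, dtype=digit.dtype)
    cropped = digit[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
    h, w = cropped.shape
    # Map every output pixel back to a source pixel (integer floor mapping).
    src_r = np.arange(out_shape[0]) * h // out_shape[0]
    src_c = np.arange(out_shape[1]) * w // out_shape[1]
    return cropped[np.ix_(src_r, src_c)]


# Toy digit: a 4x2 block of ink inside a 10x10 white image.
digit = np.full((10, 10), 255, dtype=np.uint8)
digit[2:6, 3:5] = 0
normalized = normalize_size(digit)
```

Every digit then reaches the classifier with the same input dimensions, whatever size it was written at.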
C. Signature verification
Although handwritten signatures are by no means the most reliable means of personal identification, they remain one of the most widely accepted and most commonly used. Signature verification is also non-intrusive and inexpensive. Extensive work has already been done in the signature verification field, and a further benefit is that there is no language barrier: a method suggested for one language can be applied to another. For our bankcheck processing system, we found the approach of “An Off-Line Signature Verification System Using Hidden Markov Model and Cross-Validation” [5] to be the most suitable; its mean error rate is less than 1.5%.
VI. DATABASE
[...] bankchecks regarding the bank agency’s name and address will be included in the agency level and stored as ASCII text. This level also includes a list of the agency’s customers and their account numbers. The account level will contain the personal information about every customer and the data that is printed on the bankchecks, such as the customer’s name, his/her personal information, etc. This data will also be stored as ASCII text. The size of the database is directly related to the number of financial institutions, agencies, and accounts.

VII. DISCUSSION AND CONCLUSION

While extensive efforts have already been devoted to Latin and oriental check processing systems, to the best of our knowledge, no attempts have been made towards a Bengali handwritten check processing system. This is probably due to the lack of the supporting infrastructure required to conduct, develop, and compare such systems in order to advance towards systems operating on real databases. Our system aims at a complete solution for extracting and identifying the information from bankchecks. The other approaches focus mainly on the extraction of the digit and worded amounts, and sometimes the extraction of the date [8], [10], [11]. None of them has proposed a solution for extracting all the items of the bankchecks. By applying our proposed system, an automatic bankcheck processing system for payment might be feasible for practical applications.

REFERENCES

[1] L. Koerich, L. L. Lee, “Automatic Extraction of Filled-in Information from Bankchecks Based on Prior Knowledge About Layout Structure”, First Brazilian Symposium on Document Image Analysis (1997), 322–333.
[2] G. Dimauro, S. Impedovo, G. Pirlo, A. Salzo, “Bankcheck Recognition Systems: Re-Engineering the Design Process”, III Int. Workshop on Frontiers in Handwriting Recognition (1996), 419–426.
[3] G. Dimauro, S. Impedovo, G. Pirlo, A. Salzo, “Automatic Bankcheck Processing: A New Engineered System”, Int. Journal of Pattern Recognition and Artificial Intelligence 11 (1997), 467–504.
[4] J. E. B. Santos, F. Bortolozzi, R. Sabourin, “A Simple Methodology to Bank Cheque Segmentation”, First Brazilian Symposium on Document Image Analysis (1997), 334–343.
[5] E. J. R. Justino, A. El Yacoubi, F. Bortolozzi, R. Sabourin, “An Off-Line Signature Verification System Using Hidden Markov Model and Cross-Validation”, XIII Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI'00), p. 105.
[6] S. Knerr, V. Anisimov, O. Baret, N. Gorski, D. Price, J. C. Simon, “The A2iA Intercheque System: Courtesy Amount and Legal Amount Recognition for French Checks”, Int. Journal of Pattern Recognition and Artificial Intelligence 11 (1997), 505–548.
[7] M. Okada, M. Shridhar, “Extraction of User Entered Components from a Personal Bankcheck Using Morphological Subtraction”, Int. Journal of Pattern Recognition and Artificial Intelligence 11 (1997), 699–715.
[8] A. Bishnu, B. B. Chaudhuri, “Segmentation of Bangla handwritten text into characters by recursive contour following”, Proc. 5th ICDAR (1999), 402–405.
[9] Yi-Kai Chen, Jhing-Fa Wang, “Segmentation of Single- or Multiple-Touching Handwritten Numeral String Using Background and Foreground Analysis”, IEEE PAMI 22 (2000), 1304–1317.
[10] U. Pal, Sagarika Datta, “Segmentation of Bangla Unconstrained Handwritten Text”, ICDAR (2003), 1128–1132.
[11] U. Pal, B. B. Chaudhuri, “Automatic Recognition of Unconstrained Off-Line Bangla Handwritten Numerals”, ICMI (2000), 371–378.