


The problem of vehicle license plate recognition is an interesting one and over the
years has attracted many researchers and computer vision experts. The
applications of such a system are vast and range from parking lot security to traffic
management. There are various approaches to the solution of this problem, such as
texture-based, morphology-based and boundary line-based methods. This dissertation presents
a morphology-based approach for the identification of a license plate in the image of a
vehicle. The recognition process deviates from the conventional approach of using
Optical Character Recognition (OCR) systems and utilizes the concept of color
coherence vectors. Researchers have proposed a variety of solutions for the
problem of license plate identification and recognition in images.

Researchers at Brno University have proposed a technique for the localization of
license plates that is based on applying convolution operations to the image. The
Hough transform has also been applied, with partial success, to reduce the skew of
license plates segmented from an image. The utilization of enhanced edge-detection
techniques combined with others, such as slope and projection evaluation, is
another interesting solution to this problem. To attain faster processing speeds, some
systems decide on a threshold for the size of the license plate and the character regions
within it. Then, using fuzzy logic and neural network algorithms, the character
regions are segmented and the characters within them are identified. A slightly
different approach to the segmentation problem is the mean shift segmentation
method. It identifies several candidate regions within a source image and utilizes
features such as rectangularity, aspect ratio and edge density to determine whether the
identified region is a license plate or not. All of the above research works strive to
maintain a correct balance between the accuracy of the algorithm and its speed. The
morphology-based identification approach is highly accurate and the color coherence
vector approach for recognition is extremely fast.

1.1 Background and Motivation

Massive integration of information technologies into all aspects of modern life has created
a demand for processing vehicles as conceptual resources in information systems.
Because a standalone information system without any data makes no sense, there was
also a need to transfer information about vehicles between reality and
information systems.
This can be achieved by a human agent, or by special intelligent equipment that is
able to recognize vehicles by their number plates in a real environment and reflect
them as conceptual resources. Because of this, various recognition techniques have been
developed. As early as 1976, the Police Scientific Development Branch in the UK
started developing a system that was up and running in 1979. In 1981 the first arrest
due to a stolen car being detected by this system was made. However, since most
previous work has been done by private corporations, much of the underlying theory is
kept secret. From the publicly available articles it can nevertheless be deduced that the
different solutions for automatic license plate recognition generally consist of two parts:
1. Finding license plates in images
2. Reading text from license plates
The second part of the problem, reading text, is really just a subset of the vast field of
optical character recognition (OCR).

Number plate recognition systems are today used in various traffic and security
applications, such as parking, access and border control, or tracking of stolen cars. In
parking, number plates are used to calculate the duration of parking. When a vehicle
enters through an input gate, its number plate is automatically recognized and stored in a database.
When the vehicle later exits the parking area through an output gate, the number plate is
recognized again and paired with the first one stored in the database. The difference in
time is used to calculate the parking fee.
Automatic number plate recognition (ANPR) systems can also be used in access control. For
example, this technology is used in many companies to grant access only to vehicles
of authorized personnel.
In some countries, ANPR systems installed at country borders automatically detect
and monitor border crossings. Each vehicle can be registered in a central database and
compared to a blacklist of stolen vehicles. In traffic control, vehicles can be directed
to different lanes for better congestion control on busy urban roads during
rush hours.

1.2 Problem Statement

The problem is to detect the rectangular area of the number plate and to recognize the
vehicle’s identification number from the number plate area in an original image.
Humans define a number plate in natural language as a “small plastic or metal plate
attached to a vehicle for official identification purposes”, but machines do not
understand this definition, just as they do not understand what a “vehicle”, a “road”, or
anything else is. Because of this, there is a need to find an alternative definition of a
number plate based on descriptors that are comprehensible to machines.
Let us define the number plate as a “rectangular area with increased occurrence of
horizontal and vertical edges”. The high density of horizontal and vertical edges in a
small area is in many cases caused by the contrasting characters of a number plate, but not in
every case. This process can sometimes detect a wrong area that does not correspond
to a number plate. Because of this, we often detect several candidates for the plate with
this algorithm, and then choose the best one by a further heuristic analysis. The
recognition algorithm is then applied to the chosen candidate to read the number from the
number plate.


1.3 Scope of the project

Within the scope of this project, we extract the number plate and recognize
the number from a snapshot acquired by a sensor, a camera, or any other hardware
intended for the purpose. This is performed by first applying certain convolution
operations to the buffered image, i.e. rank filtering and the Sobel operator (edge
detection). A projection of the image is then taken for clipping the band, and a
projection of the band is taken for clipping the plate.

In the case of a skewed image, we apply the Hough transform to
deskew the plate. With the help of this transform we obtain the angle by which
the image is skewed. This angle is then used for deskewing the image.

The next step after the detection of the number plate area is the segmentation of the
plate. Segmentation is one of the most important processes in automatic
number plate recognition, because all further steps rely on it. If the segmentation fails,
a character can be improperly divided into two pieces, or two characters can be
improperly merged together. We use a horizontal projection of the number plate for the
segmentation in the first phase. The second phase of the segmentation is an
enhancement of segments. Besides the character, a segment of a plate also contains
undesirable elements such as dots and stretches, as well as redundant space on the sides
of the character. We eliminate these elements and extract only the character.

To recognize a character from a bitmap representation, there is a need to extract
feature descriptors of that bitmap. As the extraction method significantly affects the
quality of the whole OCR process, it is very important to extract features which are
invariant to varying light conditions, the font type used, and deformations of
characters caused by a skew of the image.


The optical character recognition technique has frequently been used for identifying
characters in the extracted image of a license plate. However, the processing time and
accuracy of this technique are questionable. The algorithm presented in this
dissertation is an extremely fast and accurate method of recognizing license
plates. The algorithm can be termed an illiterate one, in the sense that it does not
extract the characters within the image but recognizes the image as a whole. To
build the initial database, images of the required license plates are preprocessed and
their parameters are stored. During the recognition process these parameters are
simply compared with those of the input image in constant time and the best match is
retrieved. Due to its static complexity it is an extremely fast technique for image


Exhaustive Literature Survey

2.1 Introduction

Automatic number plate detection and recognition (ANPDR) is a surveillance method
that uses image processing and optical character recognition on images to read the
license plates on vehicles. While the technology has been around for a long time,
the accuracy and reliability of ANPDR systems remain the main concern. Some of the
systems developed earlier had very low efficiency, but new ANPDR systems are
becoming more efficient and robust with the use of new technologies.

2.2 ANPDR System

An ANPDR system typically consists of the following processing steps:

1) Number plate Area detection
a) RGB to grayscale conversion
b) Convolution operation
c) Horizontal and vertical rank filtering
d) Image Projection
e) Band and plate clipping
f) Detection and correction of skew
2) Plate segmentation
3) Feature extraction
4) Character recognition


Number plate area detection

It is the process in which the number plate area is detected in a given snapshot. The
number plate cannot be recognized by the computer system directly from the image,
so we need a method to detect the number plate in a form that can be understood by

Figure 2.1: Number plate area in a snapshot

RGB to Grayscale conversion

It is a process in which a color image is converted into a grayscale or black and white
image. A grayscale image carries only the intensity values of an image. Processing a
grayscale image reduces overhead and increases the efficiency of the system. To
convert any color to a grayscale representation of its luminance, one must first obtain
the values of its red, green, and blue (RGB) primaries in linear intensity encoding, by
gamma expansion. Then, add together 30% of the red value, 59% of the green value,
and 11% of the blue value (these weights depend on the exact choice of the RGB
primaries, but are typical). Regardless of the scale employed (0.0 to 1.0, 0 to 255, 0%
to 100%, etc.), the resultant number is the desired linear luminance value; it typically
needs to be gamma compressed to get back to a conventional grayscale representation.
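The conversion described above can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the sRGB transfer curve used for gamma expansion and compression is an assumption (the text does not name a color space), and the function names are hypothetical. Channel values are taken on the 0.0 to 1.0 scale.

```python
def gamma_expand(c):
    """Convert one channel value in [0, 1] to linear intensity (sRGB curve assumed)."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def gamma_compress(c):
    """Inverse of gamma_expand: linear intensity back to an encoded value."""
    return c * 12.92 if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055


def to_gray(r, g, b):
    """Gray level in [0, 1] of a pixel, using the 30% / 59% / 11% weights."""
    luminance = (0.30 * gamma_expand(r)
                 + 0.59 * gamma_expand(g)
                 + 0.11 * gamma_expand(b))
    return gamma_compress(luminance)
```

Because the weights sum to one, a neutral gray input maps to (almost exactly) the same gray level, while pure green comes out brighter than pure blue.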

Convolution matrix

Convolution, a local mathematical operation, is central to modern image processing.
The basic idea is that a window of some finite size and shape (the support) is scanned
across the image. The output pixel value is the weighted sum of the input pixels within
the window, where the weights are the values of the filter assigned to every pixel of the
window itself. The window with its weights is called the convolution kernel.
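A minimal sketch of this windowed weighted sum is given below, assuming the image is a list of rows of numbers and that pixels outside the image count as zero (the border handling is an assumption, as is the function name). The kernel is applied without flipping, exactly as in the weighted-sum description; for symmetric kernels this is identical to convolution in the strict mathematical sense.

```python
def convolve(image, kernel):
    """Slide the kernel over the image and return the weighted sums."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    cy, cx = kh // 2, kw // 2          # position of the kernel centre
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j in range(kh):
                for i in range(kw):
                    sy, sx = y + j - cy, x + i - cx
                    # Pixels outside the image are treated as 0 (assumption).
                    if 0 <= sy < h and 0 <= sx < w:
                        acc += kernel[j][i] * image[sy][sx]
            out[y][x] = acc
    return out
```

For example, a 3x3 kernel with a single 1 in the centre leaves the image unchanged, and a kernel whose nine cells are all 1/9 replaces each pixel with the average of its neighborhood.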

Horizontal and vertical rank filtering

Rank filter is an image processing term. Horizontally and vertically oriented rank
filters are often used to detect clusters of high density of bright edges in the area of the
number plate. The width of the horizontally oriented rank filter matrix is much larger
than the height of the matrix and vice versa for the vertical rank filter. Typically, in
digital filtering, pixels within a window are ranked by intensity values, and the center
pixel is replaced with a new value. The new value is calculated as a function of the
ranked pixels. Only the original pixel values are used in the ranking when determining
the new pixel value. Typical functions used in ranking are the median, mean and mode.
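As an illustration, a rank filter with a rectangular window can be sketched as below. The function name, the default choice of the median, and the clipping of the window at the image border are assumptions made for this example.

```python
from statistics import median


def rank_filter(image, width, height, rank_fn=median):
    """Replace each pixel by rank_fn of the ranked pixels in a width x height window."""
    h, w = len(image), len(image[0])
    rx, ry = width // 2, height // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Rank the original pixel values inside the window (clipped at the border).
            window = sorted(
                image[j][i]
                for j in range(max(0, y - ry), min(h, y + ry + 1))
                for i in range(max(0, x - rx), min(w, x + rx + 1))
            )
            out[y][x] = rank_fn(window)
    return out
```

A horizontally oriented filter is obtained with a wide, flat window, for example `rank_filter(image, 9, 1)`, and a vertically oriented one with `rank_filter(image, 1, 9)`. Passing `rank_fn=max` or `rank_fn=min` selects the highest or lowest ranked pixel instead of the median.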

Image Projection

The vertical projection assigns to each row the sum (or possibly the mean value) of its
pixels, and the horizontal projection is the same operation applied to the columns. It is
often used to identify spatially restricted objects within an image.
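Both projections reduce to row and column sums. A minimal sketch, assuming the image is stored as a list of rows (the function names are hypothetical):

```python
def vertical_projection(image):
    """One value per row y: the sum of all pixels in that row."""
    return [sum(row) for row in image]


def horizontal_projection(image):
    """One value per column x: the sum of all pixels in that column."""
    return [sum(column) for column in zip(*image)]
```

For the 2x3 image `[[1, 2, 3], [4, 5, 6]]` the vertical projection is `[6, 15]` and the horizontal projection is `[5, 7, 9]`.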


Band and plate clipping

Band clipping is an operation used to detect and clip the vertical area of
the number plate (the so-called band) by analysis of the vertical projection of the
snapshot. Plate clipping is the subsequent operation, used to detect and clip
the plate from the band (not from the whole snapshot) by a horizontal analysis of that
band.
Plate segmentation
Segmentation is one of the most important processes in automatic number plate
recognition. Segmentation refers to the process of partitioning a digital image into
multiple segments (sets of pixels). The goal of segmentation is to simplify and/or
change the representation of an image into something that is more meaningful and
easier to analyze. Image segmentation is typically used to locate objects and
boundaries (lines, curves, etc.) in images. More precisely, image segmentation is the
process of assigning a label to every pixel in an image such that pixels with the same
label share certain visual characteristics.

Feature Extraction

When the input data to an algorithm is too large to be processed and is suspected to
be highly redundant (much data, but not much information), the input data
is transformed into a reduced representation set of features (also called a feature
vector). Transforming the input data into the set of features is called feature
extraction. Feature extraction involves reducing the amount of resources required to
describe a large set of data accurately. When performing analysis of complex data, one
of the major problems stems from the number of variables involved. Analysis with a
large number of variables generally requires a large amount of memory and
computation power, or a classification algorithm which overfits the training sample and
generalizes poorly to new samples. Feature extraction is a general term for methods of
constructing combinations of the variables to get around these problems while still
describing the data with sufficient accuracy.

Character recognition

Character recognition is the translation of images into machine-editable text. Character
recognition is a field of research in pattern recognition, artificial intelligence and
machine vision. Optical character recognition (using optical techniques such as
mirrors and lenses) and digital character recognition (using scanners and computer
algorithms) were originally considered separate fields. Because very few applications
survive that use true optical techniques, the OCR term has now been broadened to
include digital image processing as well.

2.3 Conclusion

The ANPDR system includes two parts: number plate area detection and character
recognition. The first part deals with the detection of the plate area and the second part with the
recognition of characters on the identified number plate. Plate area detection includes
a series of operations which are applied to the image to detect the number plate.


Chapter 3
Methods and Approaches

2.1 Detecting number plate Area

The first step in the process of automatic number plate recognition is the detection of
the number plate area. This problem involves algorithms that are able to detect the
rectangular area of the number plate in an original image.
Let us define the number plate as a “rectangular area with increased occurrence of
horizontal and vertical edges”. The high density of horizontal and vertical edges on a
small area is in many cases caused by contrast characters of a number plate, but not in
every case. This process can sometimes detect a wrong area that does not correspond
to a number plate. Because of this, we often detect several candidates for the plate by
this algorithm, and then we choose the best one by a further heuristic analysis.
Let an input snapshot be defined by a function f(x, y), where x and y are spatial
coordinates, and f is the intensity of light at that point. This function is always discrete
on digital computers, such that x ∈ ℕ0, y ∈ ℕ0, where ℕ0 denotes the set of natural
numbers including zero. We define operations such as edge detection or rank filtering
as mathematical transformations of the function f.


Figure 2.1: sample snapshot

2.1.1 Applying Convolution

The detection of a number plate area consists of a series of convolution operations.
The modified snapshot is then projected onto the axes x and y. These projections are used to
determine the area of the number plate.
Each image operation (or filter) is defined by a convolution matrix. The convolution
matrix defines how a specific pixel is affected by neighboring pixels in the process
of convolution. Individual cells in the matrix represent the neighbors related to the
pixel situated in the centre of the matrix. The pixel represented by the cell y in the
destination image is affected by the pixels x0…x8 according to the formula.


Figure 2.1

2.1.2 Horizontal and vertical rank filtering

Horizontally and vertically oriented rank filters are often used to detect clusters of
high density of bright edges in the area of the number plate. The width of the
horizontally oriented rank filter matrix is much larger than the height of the matrix
(w >> h), and vice versa for the vertical rank filter (w << h).
To preserve the global intensity of an image, it is necessary that each pixel be replaced
with the average pixel intensity in the area covered by the rank filter matrix. In general,
the convolution matrix should meet the following condition:

2.1.3 Sobel edge detector


The Sobel edge detector uses a pair of 3x3 convolution matrices. The first is
dedicated to the evaluation of vertical edges, and the second to the evaluation of horizontal
edges.

The magnitude of the affected pixel is then calculated using the formula
$|G| = \sqrt{G_x^2 + G_y^2}$. In practice, it is faster to calculate only an approximate
magnitude as $|G| \approx |G_x| + |G_y|$.
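The following sketch shows both the exact and the approximate magnitude. The two matrices are the commonly used Sobel kernels and are an assumption here, as are the names `GX`, `GY` and `sobel_magnitude`; border pixels are simply left at zero.

```python
import math

# Commonly used Sobel kernels (assumed): GX responds to vertical edges,
# GY responds to horizontal edges.
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image, exact=False):
    """Edge magnitude of every interior pixel; the one-pixel border stays 0."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    p = image[y + j - 1][x + i - 1]
                    gx += GX[j][i] * p
                    gy += GY[j][i] * p
            # Exact: sqrt(gx^2 + gy^2); approximate: |gx| + |gy|.
            out[y][x] = math.hypot(gx, gy) if exact else abs(gx) + abs(gy)
    return out
```

The approximation avoids the square root at the cost of overestimating the magnitude of diagonal edges; for purely vertical or purely horizontal edges both variants give the same value.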

2.1.4 Projection of image

After the series of convolution operations, we can detect the area of the number
plate according to the statistics of the snapshot. There are various methods of statistical
analysis. One of them is the horizontal and vertical projection of an image onto the axes
x and y.

The vertical projection of the image is a graph which represents the overall
magnitude of the image along the axis y. If we compute the vertical projection
of the image after the application of the vertical edge detection filter, the magnitude at a
certain point represents the occurrence of vertical edges at that point. Then, the
vertical projection of the transformed image can be used for the vertical localization of
the number plate. The horizontal projection represents the overall magnitude of the
image mapped to the axis x.

Let an input image be defined by a discrete function f(x, y). Then, the vertical
projection $p_y$ of the function f at a point y is the sum of all pixel magnitudes in the
yth row of the input image. Similarly, the horizontal projection $p_x$ at a point x of that
function is the sum of all magnitudes in the xth column.

We can mathematically define the horizontal and vertical projection as:

$$p_x(x) = \sum_{j=0}^{h-1} f(x, j), \qquad p_y(y) = \sum_{i=0}^{w-1} f(i, y)$$

where w and h are dimensions of the image.

2.1.5 Band Clipping

Band clipping is a vertical selection of the snapshot according to the
analysis of the graph of its vertical projection. If h is the height of the analyzed image, the
corresponding vertical projection contains h values, such that
y ∈ <0; h-1>.
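One possible way to select a band from the h projection values is sketched below. It is only an illustration under stated assumptions: the band is taken around the highest peak of the projection and extended in both directions while the projection stays above a fraction `c` of the peak value. Both this rule and the default `c = 0.55` are hypothetical and are not taken from the text.

```python
def clip_band(p_y, c=0.55):
    """Return (y0, y1), the inclusive row range of the band around the highest peak."""
    peak = max(range(len(p_y)), key=p_y.__getitem__)
    limit = c * p_y[peak]          # hypothetical cut-off relative to the peak
    y0 = peak
    while y0 > 0 and p_y[y0 - 1] >= limit:
        y0 -= 1
    y1 = peak
    while y1 < len(p_y) - 1 and p_y[y1 + 1] >= limit:
        y1 += 1
    return y0, y1
```

To obtain several candidate bands, the rows of a band that has already been found can be zeroed in the projection before the search is repeated.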

2.1.6 Horizontal detection – plate clipping

There is a strong analogy in a principle between the band and plate clipping.
The plate clipping is based on a horizontal projection of band. At first, the band must
be processed by a vertical detection filter. If w is a width of the band (or a width of the
analyzed image), the corresponding horizontal projection

In the second phase of detection, the horizontal position of a number plate is

detected in another way. Due to the skew correction between the first and second
phase of analysis, the wider plate area must be duplicated into a new bitmap. Let

be a corresponding function of such bitmap. This picture has a new


coordinate system, such as [0,0] represents the upper left corner and [w-1,h -1] the
bottom right, where w and h are dimensions of the area

2.1.7 Heuristic analysis and priority selection of number plate


In general, the captured snapshot can contain several number plate candidates.
Because of this, the detection algorithm always clips several bands, and several plates
from each band. There is a predefined maximum number of candidates which
are detected by analysis of projections. By default, this value is equal to nine.

There are several heuristics which are used to determine the cost of selected
candidates according to their properties. These heuristics have been chosen ad hoc
during practical experimentation. The recognition logic sorts candidates
according to their cost from the most suitable to the least suitable. Then, the most
suitable candidate is examined by a deeper heuristic analysis. The deeper analysis
definitively accepts or rejects the candidate. As there is a need to analyze individual
characters, this type of analysis consumes a large amount of processor time.

2.1.8 Priority selection and basic heuristic analysis of bands

The basic analysis is used to evaluate the cost of candidates, and to sort them
according to this cost. There are several independent heuristics, which can be used to
evaluate the cost $\alpha_i$. The heuristics can be used separately, or they can be combined
to compute an overall cost of a candidate by a weighted sum:
$\alpha = 0.15\alpha_1 + 0.25\alpha_2 + 0.4\alpha_3$
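The weighted sum and the subsequent sorting can be sketched as follows. The weights are the ones given above; the function names, the candidate labels in the usage note, and the assumption that a lower cost means a more suitable candidate are illustrative only.

```python
WEIGHTS = (0.15, 0.25, 0.4)   # weights of the three heuristics


def overall_cost(heuristics):
    """Combine the individual heuristic costs (a1, a2, a3) into one value."""
    return sum(w * a for w, a in zip(WEIGHTS, heuristics))


def rank_candidates(candidates):
    """Sort candidate names by overall cost, lowest (assumed most suitable) first.

    candidates maps a candidate name to its tuple of heuristic costs.
    """
    return sorted(candidates, key=lambda name: overall_cost(candidates[name]))
```

For instance, `rank_candidates({"band_a": (1, 1, 1), "band_b": (0, 0, 0)})` places `band_b` first.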


Table 2.1

2.1.9 Deskewing mechanism

The captured rectangular plate can be rotated and skewed in many ways due to
the positioning of the vehicle towards the camera. Since the skew significantly degrades
the recognition abilities, it is important to implement additional mechanisms that
are able to detect and correct skewed plates. The fundamental problem of this
mechanism is to determine the angle under which the plate is skewed. Then, the
deskewing of the plate can be realized by a trivial affine transformation.
It is important to understand the difference between a “sheared” and a
“rotated” rectangular plate. The number plate is an object in three-dimensional space,
which is projected into the two-dimensional snapshot during the capture. The
positioning of the object can sometimes cause the skew of angles and proportions.

If the vertical line of the plate $v_p$ is not identical to the vertical line of the camera
objective $v_c$, the plate may be sheared. If the vertical lines $v_p$ and $v_c$ are identical,
but the axis $a_p$ of the plate is not parallel to the axis of the camera $a_c$, the plate may be
rotated.
Figure 2.2

2.1.10 Detection of skew

The Hough transform is a special operation, which is used to extract features of a
specific shape within a picture. The classical Hough transform is used for the detection
of lines. The Hough transform is widely used for miscellaneous purposes in
machine vision, but I have used it to detect the skew of the captured plate,
and also to compute the angle of skew.

The mathematical representation of a line in the orthogonal coordinate system is
the equation $y = ax + b$, where a is the slope and b is the y-axis intercept of the line.
The line is then the set of all points [x, y] for which this equation is valid. We know
that a line contains an infinite number of points, and that there is an infinite
number of different lines which can cross a certain point. The relation between these
two assertions is the basic idea of the Hough transform. The equation $y = ax + b$ can
also be written as $b = -xa + y$, where x and y are parameters. Then, the equation defines the
set of all lines (a, b) which can cross the point [x, y]. For each point in the “XY”
coordinate system, there is a line in an “AB” coordinate system (the so-called “Hough
space”).
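The voting idea can be sketched as below. Note that this example does not use the slope-intercept (a, b) space described above; it swaps in the angle-distance form, in which a line with inclination φ satisfies y cos φ - x sin φ = ρ, because the angle of skew can then be read off directly. The function name, the search range of ±30 degrees and the one-degree step are assumptions made for the example.

```python
import math
from collections import Counter


def skew_angle(points, max_deg=30, step_deg=1):
    """Estimate the inclination (in degrees) of the dominant line through the points."""
    votes = Counter()
    for deg in range(-max_deg, max_deg + 1, step_deg):
        phi = math.radians(deg)
        s, c = math.sin(phi), math.cos(phi)
        for x, y in points:
            # All points of one line with inclination phi share the same rho.
            rho = round(y * c - x * s)
            votes[(deg, rho)] += 1
    # The accumulator cell with the most votes identifies the dominant line.
    (best_deg, _), _ = votes.most_common(1)[0]
    return best_deg
```

The input points would typically be the coordinates of edge pixels of the clipped plate. Whether a positive angle means a clockwise or counter-clockwise skew depends on the orientation of the y axis in the image.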

2.1.11 Correction of skew

The second step of the deskewing mechanism is a geometric operation over the
image f(x, y). As the skew detection based on the Hough transform does not distinguish
between shear and rotation, it is important to choose the proper deskewing
operation. In practice, plates are sheared more often than rotated. To correct a plate
sheared by the angle $\theta$, we use an affine transformation to shear it by the negative
angle $-\theta$.

For this transformation, we define a transformation matrix A

where $S_x$ and $S_y$ are shear factors. $S_x$ is always zero, because we shear the plate
only in the direction of the Y-axis.

Let P be a vector representing a certain point, such that $P = [x, y, 1]$, where x and y
are the coordinates of that point. The new coordinates $[x_s, y_s, 1]$ of that point after the
shearing can be computed as:
$P_s = P \cdot A$

where A is the corresponding transformation matrix. Let the deskewed number plate be
defined by a function $f_s$. The function $f_s$ can be computed in the following way:

After the substitution of the transformation matrix A :
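A minimal sketch of the point transformation is shown below. The layout of the matrix A, with rows [1, S_y, 0], [S_x, 1, 0] and [0, 0, 1], and the choice S_y = -tan θ are assumptions chosen to be consistent with the row-vector product P · A and with a shear in the direction of the Y-axis only; the function names are hypothetical.

```python
import math


def shear_matrix(theta_deg):
    """Matrix A that shears by the negative angle -theta along the Y-axis (assumed form)."""
    s_x = 0.0                                   # no shear along the X-axis
    s_y = -math.tan(math.radians(theta_deg))    # assumed sign convention
    return [[1.0, s_y, 0.0],
            [s_x, 1.0, 0.0],
            [0.0, 0.0, 1.0]]


def transform(point, a):
    """Compute P_s = P . A for the row vector P = [x, y, 1]."""
    p = (point[0], point[1], 1.0)
    return tuple(sum(p[k] * a[k][col] for k in range(3)) for col in range(3))
```

With this form, x is unchanged and y becomes y + S_y x, so the points of a line that rises with the angle θ are mapped back onto a horizontal line.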

2.2 Segmentation of plate using a horizontal projection

Since the plate is deskewed, we can segment it by detecting spaces
in its horizontal projection. We often apply an adaptive thresholding filter to enhance
the area of the plate before segmentation. Adaptive thresholding is used to separate
the dark foreground from the light background under non-uniform illumination.

After the thresholding, we compute the horizontal projection $p_x(x)$ of the plate
f(x, y). We use this projection to determine horizontal boundaries between segmented
characters. These boundaries correspond to peaks in the graph of the horizontal
projection.

The goal of the segmentation algorithm is to find the peaks which correspond to
the spaces between characters. At first, there is a need to define several important
values in the graph of the horizontal projection $p_x(x)$:


2.2.1 Extraction of characters from horizontal segments

Besides the character, the segment of a plate also contains redundant space and
other undesirable elements. By the term “segment” we understand the part of a
number plate determined by the horizontal segmentation algorithm. Since the segment
has been processed by an adaptive thresholding filter, it contains only black and white
pixels. The neighboring pixels are grouped together into larger pieces, and one of them
is the character. Our goal is to divide the segment into several pieces, and keep only
the one piece representing the regular character.

Piece extraction

Let the segment be defined by a discrete function f(x, y) in the relative
coordinate system, such that [0,0] is the upper left corner of the segment, and [w-1,h-1]
is the bottom right corner, where w and h are dimensions of the segment. The value of
f(x, y) is “1” for the black pixels, and “0” for the white space.
The piece R is a set of all neighboring pixels [x, y] which represents a continuous
element. The pixel [x, y] belongs to the piece R if there is at least one pixel [x’, y’]
from R such that [x, y] and [x’, y’] are neighbors.

The notation means a binary relation “ a is a neighbor of b in a four-

pixel neighborhood”:
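Piece extraction can be sketched as a flood fill over the four-pixel neighborhood. In this illustration two pixels are treated as neighbors when they differ by exactly one step horizontally or vertically; the function name and the representation of a piece as a set of (x, y) tuples are assumptions.

```python
def extract_pieces(segment):
    """Split a binary segment (1 = black, 0 = white) into 4-connected pieces."""
    h, w = len(segment), len(segment[0])
    seen = set()
    pieces = []
    for y in range(h):
        for x in range(w):
            if segment[y][x] != 1 or (x, y) in seen:
                continue
            # Grow a new piece from this unvisited black pixel.
            piece, stack = set(), [(x, y)]
            seen.add((x, y))
            while stack:
                px, py = stack.pop()
                piece.add((px, py))
                for nx, ny in ((px + 1, py), (px - 1, py), (px, py + 1), (px, py - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and segment[ny][nx] == 1 and (nx, ny) not in seen):
                        seen.add((nx, ny))
                        stack.append((nx, ny))
            pieces.append(piece)
    return pieces
```

One simple way to pick the piece representing the character is `max(pieces, key=len)`, which keeps the largest piece; this selection rule is an assumption and a real system may apply further heuristics.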


2.3 Feature extraction and normalization of characters

2.3.1 Normalization of brightness and contrast

The brightness and contrast characteristics of segmented characters vary
due to different light conditions during the capture. Because of this, it is necessary to
normalize them. There are many different ways to do so, but this section describes the two
most used: global and adaptive thresholding.

Through histogram normalization, the intensities of character segments are
redistributed on the histogram to obtain normalized statistics.
Techniques of global and adaptive thresholding are used to obtain monochrome
representations of processed character segments. The monochrome (or black & white)
representation of an image is more appropriate for analysis, because it defines clear
boundaries of the contained characters.

2.3.2 Global Thresholding

Global thresholding is an operation in which the continuous gray scale of an
image is reduced to monochrome black & white colors according to a global
threshold value. Let [0, 1] be the gray scale of such an image. If the value of a certain pixel is
below the threshold t, the new value of the pixel will be zero. Otherwise, the new
value will be one.

Let v be the original value of the pixel, such that $v \in [0, 1]$. The new value $v'$ is
computed as:

$$v' = \begin{cases} 0 & \text{if } v < t \\ 1 & \text{if } v \geq t \end{cases}$$

Since the threshold t is global for the whole image, global thresholding can
sometimes fail. To overcome this drawback, we use adaptive thresholding.

2.3.3 Adaptive thresholding

The number plate can sometimes be partially shadowed or non-uniformly
illuminated. This is the most frequent reason why global thresholding fails.
Adaptive thresholding solves several disadvantages of global thresholding, because
it computes the threshold value for each pixel separately using its local neighborhood.

2.3.4 Local thresholding

The second way of finding the local threshold of a pixel is a statistical
examination of neighboring pixels. Let [x, y] be a pixel for which we compute the
local threshold t. For simplicity we consider a square neighborhood with width 2r + 1,
where [x-r, y+r], [x-r, y-r], [x+r, y+r] and [x+r, y-r] are the corners of the square.
There are several approaches to computing the value of the threshold:
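One such approach can be sketched as follows, assuming the local threshold t is the mean of the square neighborhood; the median or the midpoint between the minimum and maximum are equally plausible choices. The function name, the clipping of the neighborhood at the image border, and the polarity (pixels at or above t become 1, all others 0) are assumptions made for this example.

```python
def adaptive_threshold(image, r):
    """Binarize each pixel against the mean of its (2r + 1) x (2r + 1) neighborhood."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Square neighborhood around [x, y], clipped at the image border.
            window = [
                image[j][i]
                for j in range(max(0, y - r), min(h, y + r + 1))
                for i in range(max(0, x - r), min(w, x + r + 1))
            ]
            t = sum(window) / len(window)      # local threshold: mean (assumed)
            out[y][x] = 1 if image[y][x] >= t else 0
    return out
```

Because t follows the local brightness, a character in a shadowed part of the plate is compared only with its own surroundings rather than with the brightly lit part of the plate.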

2.3.5 Normalization of dimensions and resampling

Before extracting feature descriptors from a bitmap representation of a
character, it is necessary to normalize it into unified dimensions. By the term
“resampling” we understand the process of changing the dimensions of the character. As the
original dimensions of unnormalized characters are usually larger than the normalized
ones, the characters are in most cases downsampled.
When we downsample, we reduce the information contained
in the processed image. There are several methods of resampling, such as
pixel-resize, bilinear interpolation or weighted-average resampling. We cannot
determine which method is the best in general, because the success of a particular
method depends on many factors. For example, using weighted-average
downsampling in combination with a detection of character edges is not
a good solution, because this type of downsampling does not preserve sharp edges.
Because of this, the problem of character resampling is closely associated with the
problem of feature extraction.

We will assume that $m \times n$ are the dimensions of the original image, and $m' \times n'$ are
the dimensions of the image after resampling. The horizontal and vertical aspect ratios are
defined as $r_x = m'/m$ and $r_y = n'/n$, respectively.

2.3.6 Nearest-neighbor downsampling

The principle of nearest-neighbor downsampling is picking the nearest
pixel in the original image that corresponds to a processed pixel in the image after
resampling. Let f(x, y) be a discrete function defining the original image, such that
$0 \leq x < m$ and $0 \leq y < n$. Then, the function $f'(x', y')$ of the image after resampling is
defined as:

$$f'(x', y') = f\left(\left\lfloor \frac{x'}{r_x} \right\rfloor, \left\lfloor \frac{y'}{r_y} \right\rfloor\right)$$

where $0 \leq x' < m'$ and $0 \leq y' < n'$.


If the aspect ratio is lower than one, then each pixel in the resampled
(destination) image corresponds to a group of pixels in the original image, but only
one value from the group of source pixels affects the value of the pixel in the
resampled image. Although nearest-neighbor downsampling significantly reduces the
information contained in the original image by ignoring a large number of pixels, it
preserves sharp edges and the strong bipolarity of black and white pixels. Because of
this, nearest-neighbor downsampling is suitable in combination with the “edge
detection” feature extraction method.
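Nearest-neighbor downsampling can be sketched as below, assuming the image is a list of n rows with m pixels each (the function and parameter names are hypothetical). The source index is computed with integer arithmetic: for r_x = m'/m, the expression x' · m // m' is the floor of x'/r_x without floating-point rounding.

```python
def downsample_nearest(image, new_w, new_h):
    """Resample an m x n image to new_w x new_h by picking one source pixel per target pixel."""
    n, m = len(image), len(image[0])   # n rows (height), m columns (width)
    return [
        # x2 * m // new_w equals floor(x2 / r_x) with r_x = new_w / m.
        [image[y2 * n // new_h][x2 * m // new_w] for x2 in range(new_w)]
        for y2 in range(new_h)
    ]
```

Reducing a 4x4 image to 2x2 this way keeps every second pixel of every second row and discards the rest, which is why hard black-to-white transitions survive unchanged.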
2.3.7 Feature extraction

Information contained in a bitmap representation of an image is not suitable for
processing by computers. Because of this, there is a need to describe a character in
another way. The description of the character should be invariant to the font type
used and to deformations caused by skew. In addition, all instances of the same
character should have a similar description. A description of the character is a vector
of numerical values, so-called “descriptors” or “patterns”:
x = (x_0, …, x_{n-1})
Generally, the description of an image region is based on its internal and
external representation. The internal representation of an image is based on its
regional properties, such as color or texture. The external representation is chosen
when the primary focus is on shape characteristics. The description of normalized
characters is based on their external characteristics, because we deal only with properties
such as character shape. The vector of descriptors then includes characteristics such
as the number of lines, bays and lakes, the number of horizontal, vertical and
diagonal edges, etc. Feature extraction is the process of transforming data
from a bitmap representation into a form of descriptors, which are more suitable for
processing by computers.
If we associate similar instances of the same character into classes, then the
descriptors of characters from the same class should be geometrically close to each
other in the vector space. This is a basic assumption for the success of the pattern
recognition process. This section deals with various methods of feature extraction, and
explains which method is the most suitable for a specific type of character bitmap. For
example, the “edge detection” method should not be used in combination with a
blurred bitmap.

2.3.8 Skeletonization and structural analysis

The feature extraction techniques discussed in the previous two chapters are
based on statistical image processing. These methods do not consider structural
aspects of the analyzed images. A small difference between bitmaps sometimes means a big
difference in the structure of the characters they contain. For example, the digits ‘6’ and ‘8’ have
very similar bitmaps, but there is a substantial difference in their structures.
Structural analysis is based on higher-level concepts than the edge detection
method. It does not deal with terms such as “pixels” or “edges”, but considers more
complex structures (such as junctions, line ends or loops). To analyze these structures, we
must apply a thinning algorithm to obtain a skeleton of the character. This chapter
deals with the principle of skeletonization as well as with the principle of structural
analysis of the skeletonized image.

The concept of skeletonization

Skeletonization is a reduction of the structural shape into a graph. This
reduction is accomplished by obtaining a skeleton of the region via the skeletonization
algorithm. The skeleton of a shape is mathematically defined as a medial axis
transformation. To define the medial axis transformation and the skeletonization
algorithm, we must introduce some elementary prerequisite terms.
Let N be a binary relation between two pixels a = [x, y] and b = [x', y'], such that aNb means
“a is a neighbor of b”. This relation is defined as:


The border B of a character is a set of boundary pixels. The pixel [x, y] is a
boundary pixel if it is black and has at least one white neighbor in the eight-pixel
neighborhood.

The inner region I of a character is a set of black pixels which are not boundary
pixels.
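
The two definitions above can be sketched in Java as follows. This is a minimal sketch under stated assumptions: the bitmap is a boolean matrix in which true means a black pixel, pixels outside the bitmap are treated as white, and the class and method names (BorderSketch, isBoundary, isInner) are hypothetical.

```java
// Hypothetical sketch: black[y][x] == true means the pixel [x, y] is black.
class BorderSketch {

    /** A black pixel is a boundary pixel if at least one of its eight neighbors is white. */
    static boolean isBoundary(boolean[][] black, int x, int y) {
        if (!black[y][x]) {
            return false; // white pixels are never boundary pixels
        }
        for (int dy = -1; dy <= 1; dy++) {
            for (int dx = -1; dx <= 1; dx++) {
                if (dx == 0 && dy == 0) {
                    continue; // skip the pixel itself
                }
                int nx = x + dx;
                int ny = y + dy;
                // Assumption: anything outside the bitmap counts as white.
                boolean inside = ny >= 0 && ny < black.length && nx >= 0 && nx < black[ny].length;
                if (!inside || !black[ny][nx]) {
                    return true;
                }
            }
        }
        return false;
    }

    /** An inner pixel is a black pixel that is not a boundary pixel. */
    static boolean isInner(boolean[][] black, int x, int y) {
        return black[y][x] && !isBoundary(black, x, y);
    }
}
```

For a filled 3 × 3 black square on a white background, only the center pixel belongs to the inner region; the other eight black pixels form the border.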

2.4 Recognition of characters

For the recognition of the characters in the segmented number plate area, OCR
techniques are applied to the plate area. The OCR process is carried out using an OCR
API in Java, which includes various classes that perform the OCR process on a given
input image.
The different classes used are:
1. OCR()
2. a()
3. b()



3.1 Data Flow Diagram


3.2 Class Diagram

3.2 Modules Detailing

3.2.1 Number Plate Area Detection

Sub Processes



SP-1. Edge detection and rank filtering

SP-2. Horizontal and vertical image projection

SP-3. Vertical detection - band clipping

SP-4. Horizontal detection - plate clipping

SP-5. Priority selection and basic heuristic analysis of bands

SP-6. Deskewing mechanism

3.2.2 Plate Segmentation

Sub Processes


SP-1. Segmentation of plate using a horizontal projection

SP-2. Extraction of characters from horizontal segments

3.2.3 Feature extraction and normalization

Sub Processes



SP-1. Normalization of brightness and contrast

SP-2. Normalization of dimensions and resampling

SP-3. Feature extraction

3.2.4 Recognition of characters

Sub Processes


SP-1. OCR processing

3.3 Methods

The programming language used for the project is Java.

The AWT package (java.awt.*) is used extensively, as it allows the image to be
processed pixel by pixel. It provides predefined methods for applying convolution
matrices, along with many other methods that are useful for the purpose of
programming.
Other packages used are java.util.* and *, which are used for
taking an image as input and storing the pixel values in the desired data structure.
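
As an illustration of applying a convolution matrix with AWT, the following sketch uses the standard classes java.awt.image.Kernel and java.awt.image.ConvolveOp. It is only one possible way of doing this and is not necessarily the call used in the project. The class name ConvolutionSketch and the constant SOBEL_X are hypothetical, and the grayscale image type is an assumption.

```java
import java.awt.image.BufferedImage;
import java.awt.image.ConvolveOp;
import java.awt.image.Kernel;

// Hypothetical sketch; not the project's own code.
class ConvolutionSketch {

    /** 3x3 Sobel matrix that responds to vertical edges (row-major order). */
    static final float[] SOBEL_X = {
        -1f, 0f, 1f,
        -2f, 0f, 2f,
        -1f, 0f, 1f
    };

    /** Applies a 3x3 convolution matrix to an image and returns a new image. */
    static BufferedImage convolve(BufferedImage src, float[] matrix3x3) {
        Kernel kernel = new Kernel(3, 3, matrix3x3);
        // EDGE_ZERO_FILL sets the one-pixel border of the result to zero.
        ConvolveOp op = new ConvolveOp(kernel, ConvolveOp.EDGE_ZERO_FILL, null);
        // Passing null lets ConvolveOp allocate a compatible destination image.
        return op.filter(src, null);
    }
}
```

With a byte-based grayscale image, results are clamped to the range 0 to 255, so negative responses of the Sobel matrix are lost. A complete edge detector would therefore have to combine several matrices or work on a float matrix instead.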

Imported Classes and Packages




Method Descriptions:
CLASS: Photo (user-defined class)

loadImage(String filepath): This user-defined method takes, as a string, the file path
from which the image is to be loaded.

saveImage(String filepath, BufferedImage l): This method saves the image at the
destination provided.

imageMatrix(BufferedImage im): This method creates a matrix from the buffered image
received as input.

getBrightness(BufferedImage im, float[][] matrix): This method obtains the brightness
component from the true-color image.

getSaturation(BufferedImage im, float[][] matrix): This method obtains the saturation
component from the true-color image.

getHue(BufferedImage im, float[][] matrix): This method obtains the hue component
from the true-color image.

rankFilter(BufferedImage im, float[][] matrix): A scalable method that removes
salt-and-pepper noise from the image. The rank of the filter can be changed according
to the requirements of the snapshots.

edgeDetector(BufferedImage im): This method uses the Sobel operator to detect the
vertical and horizontal edges in the image.

onePoint(BufferedImage image, BufferedImage output, int thresholdLimit): This method
performs global thresholding on the image; the value of the threshold can be changed.

intTh(float hparray[], BufferedImage thimage): Applies a threshold to hparray so that
only values above the specified level are considered.

vTh(float vparray[], BufferedImage bf): Applies a threshold to vparray so that only
values above the specified level are considered.

plate(): This method extracts the number plate from the snapshot.

segTh(BufferedImage bf): This method is used to segment the number plate extracted from
the snapshot.
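
To illustrate how a rank filter can remove salt-and-pepper noise, the following sketch works on a float matrix of brightness values. It is an assumption about the general technique, not the project's rankFilter method: here each pixel is replaced by the value at a chosen rank within a square window, and a rank fraction of 0.5 gives the median. The names RankFilterSketch, radius and fraction are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch of an order-statistic (rank) filter on a brightness matrix.
class RankFilterSketch {

    /**
     * Replaces each pixel with the value at the given rank fraction
     * (0.0 = minimum, 0.5 = median, 1.0 = maximum) of the window around it.
     * Window cells that fall outside the image are skipped.
     */
    static float[][] rankFilter(float[][] img, int radius, double fraction) {
        int h = img.length;
        int w = img[0].length;
        float[][] out = new float[h][w];
        float[] window = new float[(2 * radius + 1) * (2 * radius + 1)];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int count = 0;
                for (int dy = -radius; dy <= radius; dy++) {
                    for (int dx = -radius; dx <= radius; dx++) {
                        int ny = y + dy;
                        int nx = x + dx;
                        if (ny >= 0 && ny < h && nx >= 0 && nx < w) {
                            window[count++] = img[ny][nx];
                        }
                    }
                }
                // Sort only the filled part of the window and pick the requested rank.
                Arrays.sort(window, 0, count);
                out[y][x] = window[(int) Math.round(fraction * (count - 1))];
            }
        }
        return out;
    }
}
```

A larger radius removes heavier noise but also blurs thin character strokes, so the window size has to be chosen with the snapshot quality in mind.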

CLASS: Anpdr1 (user-defined class)

hpProjection(BufferedImage ims): This method takes the buffered image and adds up the
brightness values along the horizontal axis.

vpProjection(BufferedImage ims): This method takes the buffered image and adds up the
brightness values along the vertical axis.
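
The idea of an image projection can be sketched as follows. This is a hedged illustration rather than the code of hpProjection or vpProjection: the class ProjectionSketch and the methods rowSums and columnSums are hypothetical names, and brightness is taken as the B component returned by the standard java.awt.Color.RGBtoHSB method.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;

// Hypothetical sketch; names are illustrative only.
class ProjectionSketch {

    /** Brightness (HSB model) of a packed RGB pixel, in the range 0.0 to 1.0. */
    static float brightness(int rgb) {
        float[] hsb = Color.RGBtoHSB((rgb >> 16) & 0xFF, (rgb >> 8) & 0xFF, rgb & 0xFF, null);
        return hsb[2];
    }

    /** One value per row: brightness summed over all pixels of that row. */
    static float[] rowSums(BufferedImage im) {
        float[] sums = new float[im.getHeight()];
        for (int y = 0; y < im.getHeight(); y++) {
            for (int x = 0; x < im.getWidth(); x++) {
                sums[y] += brightness(im.getRGB(x, y));
            }
        }
        return sums;
    }

    /** One value per column: brightness summed over all pixels of that column. */
    static float[] columnSums(BufferedImage im) {
        float[] sums = new float[im.getWidth()];
        for (int x = 0; x < im.getWidth(); x++) {
            for (int y = 0; y < im.getHeight(); y++) {
                sums[x] += brightness(im.getRGB(x, y));
            }
        }
        return sums;
    }
}
```

Applied to an edge-detected image, peaks in the row sums indicate candidate bands for the plate, and peaks in the column sums within a band indicate its left and right limits.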


CLASS: Initialproject (user-defined class)

start(): Used to initialise the Swing windows (interface).
actionedPerformed(): Used to add actions to the buttons.



recognizeCharacters(java.awt.image.RenderedImage image): A method of a class that takes an
image as an argument and converts it into text.

Chapter 4


Figure 4.1

A Java Swing interface with a “load image” button, which is used for
loading the image. As soon as the button is pressed, the console asks for the
snapshot stored on the local disk.


Figure 4.2


As soon as the vehicle snapshot is loaded into the console, the “grayscale”
button converts the original image into a grayscale image.

After the image is converted into grayscale, the “rank filter” button applies
various rank filters to the image obtained from the previous task.

The image is then processed further, and its horizontal projection is obtained with
the “horizontal projection” button.

After the horizontal projection is obtained, the vertical projection is obtained with
the “vertical projection” button.

Now the required number plate area is displayed on the console. The “display number”
button performs the OCR operation on the image obtained, and the number is displayed
on the small console on the right side of the main console.

Implementation Results


Extensive testing has been conducted with more than 25 Indian vehicles.
Images have been captured from various distances and viewing angles.
Image size has varied from 64K to 1M pixels. JPEG and PNG image
compression was tried along with raw uncompressed gray-level imagery.
Different daylight conditions were examined, from bright sunlight illumination
to half-darkness. Very frequently the plate zone has been in shadow and the
contrast of the characters has been poor with regard to the plate’s background.
Situations of mixed illumination, where certain portions of the plate were
shadowed while others were brightly illuminated, caused problems and
sometimes led to rejection of the whole plate.
The true license plate zone was correctly located and approved on
more than 85% of the images. The rest of the cases were rejected by one of
the consistency tests. It is important to stress that there have been zero false
positive errors, which explains the relatively high share of rejected plates: the
tests applied while approving plate “candidates” are conservative.

Comparison with existing State of Art Technologies

Comparison categories: vertical segmentation, rotated plate, noisy image.

Vertical segmentation of the number plate is better than in the existing technology,
because a threshold is applied before projection, which helps to eliminate the unwanted
pixels that add pseudo-brightness.

The existing technology detects fewer plates when they are rotated. The same result was
observed in our project: plates with an inclination of more than 20 degrees were
detected with lower precision.

Noise removal is optimal, as increasing the rank of the filter drastically reduces
heavy noise in the snapshot.




Earlier works relevant to the problem of license plate identification and
recognition have been reviewed, and the need for a system that balances accuracy and
speed has been identified.

The project utilizes the algorithmic and mathematical aspects of automatic
number plate recognition systems, covering problems of machine vision, pattern
recognition, OCR and neural networks. The problem has been divided into several
steps, according to the logical sequence of the individual recognition steps. There is
a strong succession of algorithms applied during the recognition process.

The ANPR solution has been tested on static snapshots of vehicles, which
have been divided into several sets according to difficulty. Sets of blurry and skewed
snapshots give worse recognition rates than a set of snapshots that have been
captured clearly. The objective of the tests was not to find a one hundred percent
recognizable set of snapshots, but to test the invariance of the algorithms on random
snapshots systematically classified into sets according to their properties. The
experimental analysis of the license plate recognition algorithm has
resulted in an accuracy of eighty-eight percent and an average processing time of
two seconds per image. Hence, this algorithm has attempted to strike a balance
between the accuracy, robustness and speed that a license plate identification and
recognition system must possess.

