
2nd International Conference on Electrical, Electronics and Civil Engineering (ICEECE'2012) Singapore April 28-29, 2012

Image Segmentation for Text Extraction


Neha Gupta, V. K. Banga

Neha Gupta is pursuing M. Tech. (Electronics and Communication Engineering) in the Department of Electronics & Communication Engineering, Amritsar College of Engineering and Technology, Amritsar, Punjab, India (Email: nehajandialaug05@gmail.com). Dr. Vijay Kumar Banga is a Professor and Head of the Department of Electronics & Communication Engineering, Amritsar College of Engineering and Technology, Amritsar, Punjab, India (Email: vijaykumar.banga@gmail.com).

Abstract- This paper presents a methodology for extracting text from images such as document images, scene images, etc. Text that appears in these images carries important and useful information. Text extraction from images has been used in a large variety of applications such as mobile robot navigation, document retrieval, object identification, vehicle license plate detection, etc. In this paper, we employ the discrete wavelet transform (DWT) to extract text information from complex images. The input image may be a colour image or a grayscale image. If the image is a colour image, then preprocessing is required. To extract text edges, the Sobel edge detector is applied to each sub-image. The resulting edges are used to form an edge map. Morphological operations are applied to the processed edge map, and thresholding is then applied to improve the performance.

Keywords- Image segmentation, Discrete Wavelet Transform, Haar wavelets.

I. INTRODUCTION

Nowadays, with the advancement of digital technology, more and more databases are multimedia in nature. These databases usually contain images and videos in addition to textual information. The textual information is very useful semantic information because it describes the image or video and can be used to fully understand images and videos. There are basically three kinds of images: document images, scene text images and caption text images. Document images may be in the form of scanned book covers, CD covers or video images. Text in images or videos is classified as scene text and caption text. Scene text is also called graphics text; natural images that contain text are called scene text images. Caption text is also known as artificial text and is text that has been inserted or superimposed on the image [7]. The three types of images are shown in figure 1.

Fig. 1 Various Kinds of Images

Two different approaches have been used for text extraction from complex images, namely the region based approach and the texture based approach.

A. Region based methods
This approach uses the properties of the colour or gray scale in a text region, or their differences with respect to the background. It is divided into two sub-categories: edge based and connected component (CC) based methods. The edge based methods mainly exploit the high contrast between text and background. In these methods, text edges are first identified in the image and merged; finally, some heuristic rules are applied to discard non-text regions. Connected component based methods consider text as a set of separate connected components, each having distinct intensity and colour distributions. The edge based methods are robust to low contrast and different text sizes, whereas CC based methods are somewhat simpler to implement, but they fail to localize text in images with complex backgrounds [1].
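As an illustration of the CC based idea only (this is not the method proposed in this paper), a minimal Python sketch using NumPy and SciPy could binarize the image, label its connected components and keep those whose area is plausible for a character; the binarization rule and the area bounds below are assumptions chosen for the example:

    import numpy as np
    from scipy import ndimage

    def cc_text_candidates(gray, min_area=30, max_area=5000):
        # Crude binarization; Otsu or adaptive thresholds work better in practice.
        binary = gray < gray.mean()
        labels, _ = ndimage.label(binary)
        candidates = []
        for slc in ndimage.find_objects(labels):
            h = slc[0].stop - slc[0].start
            w = slc[1].stop - slc[1].start
            if min_area <= h * w <= max_area:
                candidates.append(slc)  # bounding slice of one candidate component
        return candidates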
B. Texture based methods


Text in images has distinct textural properties which can be used to differentiate it from the background or other non-text regions [3]. This approach is based on these textural properties; Fourier transforms, the discrete cosine transform and wavelet decomposition are generally used to extract them. The main drawback of this approach is its high computational complexity, but, on the other hand, it is more robust than the CC based methods in dealing with complex backgrounds.
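As a rough illustration of such texture features (again a sketch under assumptions, not the method of this paper), the energy of the non-DC DCT coefficients can be computed per block; text blocks typically score higher than smooth background. The block size and the energy measure are illustrative choices:

    import numpy as np
    from scipy.fft import dctn

    def block_texture_energy(gray, block=16):
        h, w = gray.shape
        energy = np.zeros((h // block, w // block))
        for i in range(h // block):
            for j in range(w // block):
                patch = gray[i*block:(i+1)*block, j*block:(j+1)*block].astype(float)
                coeffs = dctn(patch, norm='ortho')
                coeffs[0, 0] = 0.0  # discard the DC term
                energy[i, j] = np.sum(coeffs ** 2)  # energy of the AC coefficients
        return energy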

II. PROPOSED ALGORITHM


The block diagram of the proposed text extraction algorithm is shown in figure 2. The input image may be a colour or grayscale image. If the image is a colour image, then a preprocessing operation is applied to it, as shown in the flowchart. In our algorithm, the input is a colour image fed to the system, and the segmented text on a clear background is the output [9].

Fig. 2 Algorithm for text extraction
A. Preprocessing
If the input image is a gray-level image, it proceeds directly to the discrete wavelet transform. If the input image is coloured, its RGB components are combined to give an intensity image. Colour images are normally captured by digital cameras and are usually in the Red-Green-Blue colour space. The intensity image Y is given by:

Y = 0.299R + 0.587G + 0.114B

Image Y is then processed with the 2-D discrete wavelet transform.
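A minimal NumPy sketch of this conversion (the function name is ours; the weights are exactly those of the formula above):

    import numpy as np

    def rgb_to_intensity(rgb):
        # rgb: H x W x 3 array; returns Y = 0.299R + 0.587G + 0.114B
        rgb = rgb.astype(np.float64)
        return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]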
B. Discrete Wavelet Transform
In our proposed algorithm, we use the Haar discrete wavelet transform, which provides a powerful tool for modeling the characteristics of textured images. Most textured images are well characterized by their contained edges. In the field of signal analysis and image processing, the DWT is a very useful tool: it can decompose a signal into different components in the frequency domain [12]. The 1-D DWT decomposes a signal into two components, an average component and a detail component. We use the 2-D DWT, which decomposes the input image into four components or sub-bands: one average component (LL) and three detail components (LH, HL, HH), as shown in figure 3 [9]. The three detail sub-bands are used to detect candidate text edges in the original image. Using the Haar wavelet, the illumination components are transformed into the wavelet domain, resulting in the four LL, HL, LH and HH sub-image coefficients. Traditional edge detection filters can provide similar results, but they cannot detect the three kinds of edges at a time; their processing time is therefore slower than that of the 2-D DWT. We choose the Haar DWT because it is simpler than other wavelets and has the following advantages:
1. Haar wavelets are real, orthogonal and symmetric.
2. Its coefficients are either 1 or -1.
3. It is the only wavelet that allows perfect localization in the transform domain.

Fig. 3 Result of 2-D DWT Decomposition
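A single-level Haar decomposition can be obtained, for example, with the PyWavelets package (the paper does not prescribe a particular implementation; the sub-band naming follows the usual LL/LH/HL/HH convention):

    import pywt

    def haar_subbands(y):
        # dwt2 returns the average sub-band and the (horizontal, vertical, diagonal) details.
        LL, (LH, HL, HH) = pywt.dwt2(y, 'haar')
        return LL, LH, HL, HH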
C. Extracting Text Edges
In this step, the three detail sub-bands are used to detect the dense edges of the text blocks, which are their distinct characteristic. Edges are found in the three sub-images, namely the horizontal, vertical and diagonal sub-images, and the edges contained in each sub-image are then fused; in this way candidate text regions can be found. In this algorithm we use the Sobel edge detector because it is efficient at extracting the strong edges needed in this application. The next step is to form an edge map from the candidate text edges; here we use a weighted OR operator. Thresholding is then applied to obtain a binary edge map, after which we perform a morphological dilation operation on it. The function of the dilation is to fill the gaps inside the obtained text regions.
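A sketch of this step with SciPy follows; a weighted sum of Sobel magnitudes stands in for the weighted OR operator, and the threshold and dilation settings are assumptions made for the example:

    import numpy as np
    from scipy import ndimage

    def text_edge_map(LH, HL, HH, weights=(1.0, 1.0, 1.0), dilate_iter=2):
        def sobel_mag(band):
            gx = ndimage.sobel(band, axis=1)
            gy = ndimage.sobel(band, axis=0)
            return np.hypot(gx, gy)

        # Fuse edge strength from the three detail sub-bands.
        fused = (weights[0] * sobel_mag(LH) +
                 weights[1] * sobel_mag(HL) +
                 weights[2] * sobel_mag(HH))
        # Binarize and dilate to fill gaps inside candidate text regions.
        binary = fused > fused.mean() + fused.std()
        return ndimage.binary_dilation(binary, iterations=dilate_iter)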

D. Removing Non Text Regions
In order to improve the performance of the system, non-text regions are removed using some rules. To do so, we first summarize the common attributes of horizontal text (a small sketch of such rule-based filtering follows the list):
a. Text is bounded in size.
b. Text has a special texture property.
c. Text blocks are wider than they are high.
d. Text always contains edges [12].
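The numeric limits in this sketch are illustrative, not taken from the paper:

    def keep_text_like(boxes, img_shape, min_h=8, max_h_frac=0.5):
        # boxes: list of (top, left, bottom, right) candidate regions.
        H, W = img_shape
        kept = []
        for (top, left, bottom, right) in boxes:
            h, w = bottom - top, right - left
            if h < min_h or h > max_h_frac * H:
                continue  # violates the bounded-size attribute
            if w <= h:
                continue  # horizontal text should be wider than it is high
            kept.append((top, left, bottom, right))
        return kept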
Projection profiles are an efficient way to find such high-density edge areas, and are used to separate text blocks into single text lines. There are two types of projection profile: the horizontal profile and the vertical profile. The horizontal profile is defined as the vector of the sums of pixel intensities over each column, and the vertical profile as the vector of the sums of pixel intensities over each row. The horizontal and vertical projections of the binary edge map are computed, and the average of the maximum and minimum of each projection is taken as its threshold. In the case of the vertical projection, the rows whose sums of pixel intensities are above the threshold are kept; similarly, in the case of the horizontal projection, only the columns whose sums are above the threshold are kept [8]. In this way a proper localization of the text regions in the image is obtained. Finally, a threshold is applied, which results in the segmented text on a black background, as shown in figure 4.

Fig. 4 (a) Original image (b) Extracted text regions
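A sketch of the projection step as described above (profiles of the binary edge map thresholded at the average of their maximum and minimum):

    import numpy as np

    def projection_masks(edge_map):
        edge_map = edge_map.astype(np.float64)
        horiz_profile = edge_map.sum(axis=0)  # sum over each column (horizontal profile)
        vert_profile = edge_map.sum(axis=1)   # sum over each row (vertical profile)
        h_thresh = (horiz_profile.max() + horiz_profile.min()) / 2.0
        v_thresh = (vert_profile.max() + vert_profile.min()) / 2.0
        keep_cols = horiz_profile > h_thresh  # columns retained by the horizontal profile
        keep_rows = vert_profile > v_thresh   # rows retained by the vertical profile
        return keep_rows, keep_cols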
III. CONCLUSION
In this paper, we present a relatively simple and effective algorithm for text detection and extraction. This new text extraction algorithm automatically detects and extracts text from complex background images by applying the DWT to the images [13]. The algorithm is robust with respect to different languages, font sizes, styles, orientations, colours and alignments of text, and can be used in a large variety of application fields such as vehicle license plate detection, mobile robot navigation using text-based landmarks, object identification, etc. Most of the previous methods fail when the characters are not aligned well or when the characters are too small; they also miss characters that have very poor contrast with respect to the background [12]. This new text extraction algorithm, however, is not sensitive to image colour or intensity, uneven illumination or reflection effects, and can handle both scene text images and printed documents. Our main future work involves applying a suitable existing OCR technique to recognize the extracted text; the binary output can be used directly as input to an existing OCR system for character recognition without any preprocessing. Also, the algorithm analyses only text blocks, not single characters as in connected component based methods; therefore, it requires less processing time, which is essential for real-time applications [11].

REFERENCES
[1] Keechul Jung, Kwang In Kim and Anil K. Jain, Text information extraction in images and video: a survey, Pattern Recognition, vol. 37(5), pp. 977-997, 2004.
[2] M. Padmaja and J. Sushma, Text Detection in Color Images, International Conference on Intelligent Agent & Multi-Agent Systems, Chennai, India, 22-24 July 2009, pp. 1-6.
[3] Mohieddin Moradi, Saeed Mozaffari and Ali Asghar Orouji, Farsi/Arabic Text Extraction from Video Images by Corner Detection, 6th Iranian Conference on Machine Vision and Image Processing, Isfahan, Iran, IEEE, 2010.


[4] Chung-Wei Liang and Po-Yueh Chen, DWT Based Text Localization, International Journal of Applied Science and Engineering, pp. 105-116, 2004.
[5] Nikolaos G. Bourbakis, A methodology for document processing: separating text from images, Engineering Applications of Artificial Intelligence, vol. 14, pp. 35-41, 2001.
[6] C. Strouthopoulos, N. Papamarkos and A. E. Atsalakis, Text extraction in complex color documents, Pattern Recognition, vol. 35, pp. 1743-1758, 2002.
[7] Chitrakala Gopalan and D. Manjula, Contourlet Based Approach for Text Identification and Extraction from Heterogeneous Textual Images, International Journal of Electrical and Electronics Engineering, vol. 2(8), pp. 491-500, 2008.
[8] N. Krishnan, C. Nelson Kennedy Babu, S. Ravi and Josphine Thavamani, Segmentation of Text from Compound Images, International Conference on Computational Intelligence and Multimedia Applications, Tamil Nadu, India, vol. 3, pp. 526-528, 2007.
[9] S. Audithan and RM. Chandrasekaran, Document Text Extraction from Document Images Using Haar Discrete Wavelet Transform, European Journal of Scientific Research, ISSN 1450-216X, vol. 36, 2009.
[10] Xiaoqing Liu and Jagath Samarabandu, Multiscale Edge-Based Text Extraction from Complex Images, IEEE International Conference on Multimedia and Expo, Toronto, Canada, pp. 1721-1724, 2006.
[11] Roshanak Farhoodi and Shohreh Kasaei, Text Segmentation from Images with Textured and Colored Background, 13th Iranian Conference on Electrical Engineering.
[12] Xiao-Wei Zhang, Xiong-Bo Zheng and Zhi-Juan Weng, Text Extraction Algorithm Under Background Image Using Wavelet Transforms, Proceedings of the 2008 International Conference on Wavelet Analysis and Pattern Recognition, Hong Kong, 30-31 Aug. 2008.

