
A Rotation Method for Binary Document Images Using DDA Algorithm
Nguyen Duc Thanh
University of Wollongong
Wollongong 2525, Australia
dtn156@uow.edu.au

ABSTRACT
DDA (Digital Differential Analyzer) is a well-known algorithm commonly used in computer graphics to interpolate the integer-coordinate pixels of a straight line. In this paper, we introduce a method of image rotation for binary document images using the DDA algorithm, under the assumption that the true skew angles of the documents have already been computed. The proposed method applies the main idea of the DDA algorithm, with some modifications, to skew scanning lines that run along the inverse direction of the skew angle. In this method the ratios between the lengths of black runs and the whole scan line are preserved. Thus the algorithm can overcome disadvantages of mathematical rotation such as white holes and over-segmentation. Moreover, using the DDA algorithm to approximate integer points helps this method reduce the number of rotation operations.

Categories and Subject Descriptors
I.4.9 [Image Processing and Computer Vision]: Application

General Terms
Algorithms

Keywords
DDA, Rotation Algorithm, Skew Correction

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
DocEng’08, September 16–19, 2008, São Paulo, Brazil.
Copyright 2008 ACM 978-1-60558-081-4/08/09…$5.00.

1. INTRODUCTION
Skew estimation and correction are crucial steps in every OCR and Document Layout Analysis system. After the skew angle has been detected, the image is rotated by that angle. Many approaches have been proposed for implementing the rotation operator. The simplest is to apply the rotation equation [9] directly, which we will call “mathematical rotation”:

$$
\begin{bmatrix} x' \\ y' \end{bmatrix} =
\begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
\tag{1}
$$

where (x, y) is the original coordinate, (x', y') is the rotated coordinate, and θ is the rotation angle. However, because the rotated coordinates are only approximated by integer pixel positions, this approach generates white holes and over-segmented connected components (Fig. 3b).

Other methods use different techniques to overcome this problem. These techniques can be clustered into four groups: pixel based approaches [8], [3], [6], [7], run (line) based approaches [4], [2], [10], a boundary based approach [1], and a block (of black pixels) matching based approach [5].

Paeth [8] proposes a method in which the rotation is replaced by three consecutive shearing transforms. With this method, some white holes can be removed, but the topology of black pixels is not considered. Following a similar idea, after rotating the image with the mathematical rotation of formula (1), Cheng et al [3] fill midpoints between rotated points of distance of 2 or 5. This algorithm needs time to process all rotated pixels in order to compute the distances. Moreover, the problems of black-pixel topology and white holes cannot be solved by only filling these midpoints. Jiang et al [6] decompose the traditional geometric rotation process into a translation and a local rotation, in which a mapping table is used to speed up the algorithm. Like the method proposed by Paeth, this algorithm cannot maintain the topology of image points. Also working on image pixels, Mahata et al [7] introduce a novel approach for image rotation. The main idea is to increase the resolution of tilting in order to reduce the rounding error. The binary input image is converted to grey scale and expanded to a higher resolution. The expanded image is then passed through a low pass filter and decimated in order to obtain an image with the original resolution. Some filters are then applied. Finally, the filtered and decimated outputs, which are real numbers, are thresholded to obtain the final binary image.

In contrast to the above approaches, Chien et al [4] rotate runs of black pixels rather than each individual black point. In this method, the two end points of each black run are extracted and then passed to a line drawing algorithm. White holes are removed by using a breakdown method. Cao et al [2] recommend a method of skew correction using a scanning line model in which each digital straight line with a detected angle is approximated by several joined horizontal line segments whose y-coordinates jump sequentially by one pixel. This approximation is stored in a table in advance. In the rotation process, the rotated coordinates of all pixels are computed using this table. Shima et al [10], on the other hand, use runs instead of whole straight lines. In this approach, the image data is represented as horizontal and vertical runs. The rotation algorithm is then applied to the start and end points of these runs. This algorithm decomposes the rotation matrix into two transformation matrices used for horizontal and vertical runs respectively. Representing the image data as runs and decomposing the rotation matrix let this method reduce both processing time and memory.

In the boundary analysis based approach, Avila et al [1] analyze the edges of the document image by using an edge mask table to find the critical points necessary to rotate the image. These points are then used to construct a graph in which vertices are critical points and edges are vectors connecting two neighboring nodes. The algorithm simply rotates all critical points using mathematical rotation (1). Finally, the rotated vectors are drawn and a scan flood-fill algorithm is applied to the region delimited by them. Although this method can solve the “white hole” problem and smooth uneven edges, it consumes much more processing time than other algorithms.

Figure 1. Skew scanning line model (panels for θ > 0 and θ < 0; the scanning lines run at angle −θ).

Figure 2. (a) Geometric distances are guaranteed, (b) Removing white holes by inserting new points.

Block matching is another approach for binary image rotation. In this method, Chien et al [5] decompose the input image into coarse blocks of 9×9 pixels, then further split non-black blocks into smaller ones of 3×3 pixels, called fine blocks, so that each square overlaps its neighbor by one pixel horizontally and vertically. The predrawn mapping patterns (PMP) of all blocks are computed in advance and stored in memory. After the entire original has been divided into coarse/fine blocks, each block is examined and only one pixel (the upper left corner) of each block is rotated. Finally, the corresponding PMP is fetched from the buffer and masked onto the output image at the rotated position using an OR-like operator. One drawback of this method is that it makes connected components thicker and, in some cases, white holes still remain.
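To make the white-hole problem of the mathematical rotation concrete, the following is a minimal sketch, not code from the paper. It applies formula (1) to every pixel of a solid square and rounds the result to the pixel grid. The function names `rotate_forward` and `enclosed_white_pixels`, the 10×10 test square, and the choice to apply formula (1) directly to (column, row) coordinates are all assumptions made for illustration.

```python
import math

def rotate_forward(pixels, theta):
    """Apply formula (1) to every black pixel and round to the pixel grid."""
    c, s = math.cos(theta), math.sin(theta)
    return {
        (math.floor(x * c + y * s + 0.5), math.floor(-x * s + y * c + 0.5))
        for x, y in pixels
    }

def enclosed_white_pixels(black):
    """Return white pixels whose four direct neighbours are all black."""
    holes = set()
    for x, y in black:
        cx, cy = x + 1, y  # candidate: the pixel to the right of a black pixel
        if (cx, cy) in black:
            continue
        # The left neighbour is (x, y) itself, so check the other three.
        if all(n in black for n in ((cx + 1, cy), (cx, cy - 1), (cx, cy + 1))):
            holes.add((cx, cy))
    return holes

# Hypothetical input: a solid 10x10 square of black pixels, rotated by 30 degrees.
block = {(x, y) for x in range(10) for y in range(10)}
rotated = rotate_forward(block, math.radians(30))
holes = enclosed_white_pixels(rotated)
print(len(block), len(rotated), sorted(holes))
```

In this example the pixel (4, 4) stays white although its four neighbours are black, and two source pixels, (3, 2) and (4, 2), collapse onto the same target pixel (4, 0). Every such collision leaves one target position uncovered, which is how the white holes arise.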

2. DDA ALGORITHM
The main idea of the DDA algorithm is to use the differential equation of a straight line [9]:

$$
c = \frac{dy}{dx}
\tag{2}
$$

where c is a constant that represents the slope of the straight line (its derivative).

Given the start point and end point of a straight line, we can interpolate all points belonging to this line as follows. First, calculate the incremental values Dx and Dy, where Dx and Dy are finite differences and the slope of the line is Dy/Dx. The start point is the first point drawn. At each iteration, the next point is reached by incrementing the x-value and y-value of the current point by Dx and Dy respectively:

$$
\begin{aligned}
x_{n+1} &= x_n + Dx \\
y_{n+1} &= y_n + Dy
\end{aligned}
\tag{3}
$$

Because all pixels have integer coordinates, we must round the real values of all points to integer values. Therefore, each step of the iteration requires two floating point additions and two rounding operations.

3. THE PROPOSED METHOD
The main idea of our proposed algorithm is to use the DDA algorithm to compute all pixels along skew scanning lines (Fig. 1).

Figure 3. (a) Original image, (b) Mathematical rotation, (c) Block matching method, (d) Skew scanning line model.

Starting from the upper-left corner of the image, a skew line in the direction of the negated skew angle is determined. At each black pixel of this line, the Euclidean distance from it to the start point is computed. This distance is stored in a buffer and then used to compute the rotated position. The rotated coordinate of a black pixel is obtained simply by adding the buffered distance between the pixel and the start point (both in original coordinates) to the x-coordinate of the rotated start point, while the y-coordinate of the rotated start point is kept (Fig. 2a). In more detail, the algorithm can be expressed as follows.

Let $p_0(x_0, y_0)$ and $p'_0(x'_0, y'_0)$ be the start point of a skew line and its rotated coordinate respectively. Using the DDA algorithm with the constant $c = \tan(-\theta)$, where θ is the skew angle, we can compute all points of this line. Let $p_i(x_i, y_i)$ be an arbitrary point of this line. Then the rotated coordinate of $p_i$, called $p'_i(x'_i, y'_i)$, can be calculated as:

$$
\begin{cases}
x'_i = x'_0 + \sqrt{(x_i - x_0)^2 + (y_i - y_0)^2} \\
y'_i = y'_0
\end{cases}
\tag{4}
$$

or

$$
\begin{cases}
x'_i = x'_0 + dx \cdot \dfrac{1}{\cos(-\theta)} = x'_0 + dy \cdot \dfrac{1}{\sin(-\theta)} \\
y'_i = y'_0
\end{cases}
\tag{5}
$$

where $dx = x_i - x_0$, $dy = y_i - y_0$, and θ is the skew angle.

By applying formula (4) or (5), the lengths of black runs, as well as of the whole scanning line, are preserved under rotation, i.e. the shapes of rotated objects are not deformed by the algorithm. In order to fill in white holes, all pixels between the two rotated points that correspond to two consecutive points of the skew scanning line are filled in. For instance, let $p_i$ and $p_{i+1}$ be two consecutive points of the skew scanning line. The two rotated points, $p'_i$ and $p'_{i+1}$, may no longer be consecutive. In this case, all horizontal points between $p'_i$ and $p'_{i+1}$ are filled in (Fig. 2b). Our proposed algorithm differs from other methods such as [3] and [5] in that we do not scan the image a second time to fill in midpoints of distance of 2 or 5 to remove white holes; this work is done during the single scanning pass.

Figure 4. The comparison of our method with other methods by using diff metric. (Plot of diff / real number of connected components, on a scale of 0 to 10, against angle in degrees at 15, 30, and 45, for mathematical rotation, block matching, and our method.)

Although the main idea of our method originates from the DDA algorithm, the details are different. In the original version of the DDA algorithm, the inputs are the start point and end point of a straight line, and the slope is then computed using formula (2). In our algorithm, by contrast, the inputs are the start point of a straight line and its slope. The algorithm stops when the right or bottom border of the image is reached.

The same computation is done for all scanning lines whose start points are the pixels on the left border of the image. In order to scan all portions of the image, we also need to apply this algorithm to the top or bottom border of the image (Fig. 1). If the skew angle θ is positive, all pixels of the top border are start points. Otherwise, the bottom border is used.
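The procedure of Sections 2 and 3 can be sketched as follows. This is an illustrative reading of the description, not the author's implementation. The names `dda_line` and `rotate_by_skew_lines`, the representation of the image as a set of black (column, row) pixels, and the restriction to |θ| ≤ 45° are assumptions. The sign convention is also an assumption: formula (1) is applied directly to (column, row) coordinates with rows growing downward, so the scanning lines have slope tan θ in that system, whereas the paper writes c = tan(−θ). With this convention a positive θ uses the left and top borders as start points, as the text states. Filling is only done between consecutive scan-line points that are both black, which is one possible interpretation of the hole-filling rule.

```python
import math

def dda_line(x0, y0, slope, width, height):
    """Yield the integer pixels of a line starting at (x0, y0).

    DDA stepping as in formula (3) with Dx = 1 and Dy = slope, so
    |slope| <= 1 is assumed. The walk stops when it leaves the image.
    """
    x, y = float(x0), float(y0)
    while True:
        px, py = math.floor(x + 0.5), math.floor(y + 0.5)
        if not (0 <= px < width and 0 <= py < height):
            return
        yield px, py
        x += 1.0
        y += slope

def rotate_by_skew_lines(black, width, height, theta):
    """Rotate a set of black pixels along skew scanning lines (sketch)."""
    c, s = math.cos(theta), math.sin(theta)
    slope = math.tan(theta)   # assumed sign convention, see the lead-in
    inv_cos = 1.0 / c         # the single factor used by formula (5)
    # Start points: the left border, plus the top border for theta >= 0
    # or the bottom border otherwise.
    border_row = 0 if theta >= 0 else height - 1
    starts = [(0, y) for y in range(height)]
    starts += [(x, border_row) for x in range(1, width)]
    out = set()
    for x0, y0 in starts:
        # Only the start point goes through the full rotation of formula (1).
        xr0 = x0 * c + y0 * s
        row = math.floor(-x0 * s + y0 * c + 0.5)
        prev = None           # rotated column of the previous black pixel
        for px, py in dda_line(x0, y0, slope, width, height):
            if (px, py) not in black:
                prev = None
                continue
            col = math.floor(xr0 + (px - x0) * inv_cos + 0.5)  # formula (5)
            first = col if prev is None else prev + 1
            # Fill every pixel up to the new position so no hole is left
            # between two consecutive black points of the scanning line.
            out.update((x, row) for x in range(first, col + 1))
            prev = col
    return out

# Hypothetical input: a 4-pixel diagonal in a 4x4 image, skewed by 45 degrees.
diagonal = {(0, 0), (1, 1), (2, 2), (3, 3)}
print(sorted(rotate_by_skew_lines(diagonal, 4, 4, math.pi / 4)))
```

For the diagonal, the four source pixels become a gap-free horizontal run from column 0 to column 4 in row 0, which matches the Euclidean length 3√2 ≈ 4.24 of the original run. Note that lines started on the top or bottom border are horizontal translates of one another, so this sketch does not guarantee that every source pixel is visited exactly once.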

4. EXPERIMENTAL RESULTS
Some experimental results are illustrated in figure 3. In this figure, the original image is rotated by an angle of π/6. We can see that white holes are removed completely by our method, while they still remain with some of the others. Figure 3c is the result of applying the block matching method without filling midpoints of distance of 2 or 5.

By applying formula (5), the proposed algorithm significantly reduces the number of rotation operations. Indeed, formula (5) requires one multiplication and two additions, instead of the four multiplications and two additions of the mathematical rotation in formula (1). Furthermore, because we only need to rotate the start points located on the left and top (or bottom) borders of the image (the remaining pixels are then computed relative to these start points), the number of rotation operations used in this method is just H+W–1, where H and W are the height and width of the input image (the upper-left or lower-left corner is computed only once).

Figure 5. The comparison of our method with other methods by using the metric proposed in [1]. (Plot of degradation, on a scale of 0% to 10%, against angle in degrees at 15, 30, and 45, for mathematical rotation, block matching, and our method.)

Table 1. The comparison on the degradation with other methods using the images in Figure 3.

Algorithm               Total of pixels   Incorrect pixels   Degradation
Mathematical rotation   5396              755                13.99%
Block matching          5396              348                6.45%
Our method              5400              248                4.59%

As far as we are aware, there is no official definition of a measure for the accuracy of rotation algorithms. Some methods use OCR systems to evaluate the correctness of rotation algorithms. However, this is not a general solution. One of the reasons is that most OCR systems have their own preprocessing procedures to enhance the input data before recognition. These procedures can therefore make rotation methods of different accuracy appear to have the same accuracy. In this paper, we recommend a simple measure called diff, which is the difference between the number of 4-connected white components in the rotated image and that in the original image. Thus, the smaller this number is, the better the algorithm performs. In figure 3, there is no white connected component (“white hole”) in the foreground region of the input image (Fig. 3a), while this number is 308 with mathematical rotation (Fig. 3b), 45 with the block matching method (Fig. 3c), and 1 with our proposed method (Fig. 3d). If 8-connected component detection is used instead, there is no white hole with our method. In figure 4, we illustrate the comparison results obtained with the recommended measure; the horizontal axis corresponds to the skew angle and the vertical axis corresponds to the ratio of diff to the real number of connected components in the input image. In this experiment, we test our algorithm on three binary images of A4 size and 300 dpi resolution, with an average of around 20000 white connected components. Each image is rotated by one of three different angles: 15°, 30°, and 45°. The rotation is done in three ways: the mathematical rotation, block matching, and our proposed algorithm. From this illustration, we can see that our method gives the minimum number of white connected components compared with the others.

The measure recommended above only evaluates the ability to solve the “white hole” problem. In order to evaluate the degradation of foreground objects caused by the rotation, we apply the metric proposed by Avila et al [1]. In that paper, the authors suggest that the degradation be computed as the number of pixels that differ between the original image and the twice-rotated image, as a percentage of the total pixels of the twice-rotated image. For this, the given input image is first cropped to remove white margins and then rotated by the angle θ. The temporary result is cropped again and then rotated by −θ. Finally, the result is cropped to create the twice-rotated image. With this measure, we use the same images as in the previous experiment; the comparison result is shown in figure 5. From this illustration, we can also see that the degradation of the proposed method is low and fairly stable, ranging from 4% to 6%. For the images in figure 3, the result is given in table 1.

5. CONCLUSIONS
This paper introduces a rotation algorithm for binary document images using the DDA algorithm. In this method, inherent problems of the mathematical rotation operator, such as white holes, are solved while the geometric relation between pixels (the Euclidean distance in our method) is preserved. In addition, compared with the mathematical rotation, using the DDA algorithm reduces the computational complexity. This paper also recommends a measure for evaluating the accuracy of rotation algorithms. Finally, we have tested and compared the proposed method with other methods using different measures.

6. REFERENCES
[1] Avila, B.T., Lins, R.D., and Oliveira, L. 2005. A New Rotation Algorithm for Monochromatic Images. In Proceedings of ACM DocEng’05 (Nov. 2005), 130-132.
[2] Cao, Y., Wang, S., and Li, H. 2003. Skew Detection and Correction in Document Images Based on Straight-Line Fitting. Pattern Recognition Letters, 24 (Aug. 2003), 1871-1879.
[3] Cheng, H.D., Tang, Y.Y., and Suen, C.Y. 1990. Parallel Image Transformation and Its VLSI Implementation. Pattern Recognition, 23, 1113-1129.
[4] Chien, S.I. and Baek, Y.M. 1998. A Fast Black Run Rotation Algorithm for Binary Images. Pattern Recognition Letters, 19 (Apr. 1998), 455-459.
[5] Chien, S.I. and Baek, Y.M. 2001. Hierarchical Block Matching Method for Fast Rotation of Binary Images. IEEE Transactions on Image Processing, 10 (Mar. 2001), 483-489.
[6] Jiang, H.F., Han, C.C., and Fan, K.C. 1997. A Fast Approach to the Detection and Correction of Skew Documents. Pattern Recognition Letters, 18 (Jul. 1997), 675-686.
[7] Mahata, K. and Ramakrishnan, A.G. 2000. A Novel Scheme for Image Rotation for Document Processing. In Proceedings of the 7th International Conference on Image Processing, 2 (Sep. 2000), 594-596.
[8] Paeth, A.W. 1990. A Fast Algorithm for General Raster Rotation. In Proceedings of Graphics Gems, Academic Press Professional, Inc., San Diego, CA, USA, 179-195.
[9] Pokorny, C. 2002. Computer Graphics: An Object-Oriented Approach. BPB Publications, New Delhi.
[10] Shima, Y. and Ohya, H. 2006. A High Speed Rotation Method for Binary Document Images Based on Coordinate Operation of Run Data. In Proceedings of IS&T/SPIE, 6064 (Feb. 2006), 139-147.

