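The filter coefficients h(k) are chosen to minimize the mean-square error between the original signal f(t) and the filter output,

$$e = E\left[\left| f(t) - \sum_k h(k)\, g(t-k) \right|^2\right].$$

The minimization condition, $\partial e / \partial h(k) = 0$,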
can be written as

$$R_{fg}(i) = \sum_k h(k)\, R_{gg}(i-k). \tag{4.6}$$
Since the spectrum of the signal corresponds to the Fourier transformed correlation, equation (4.6) can be
rewritten as
$$S_{fg}(f) = H(f)\, S_{gg}(f). \tag{4.7}$$
If we can estimate the noise spectrum $S_{nn}(f)$ and assume that the noise and the original signal are uncorrelated, we can estimate $S_{fg}(f)$ by

$$S_{fg}(f) = S_{ff}(f) = S_{gg}(f) - S_{nn}(f). \tag{4.8}$$
Then, h(k) can be solved by

$$H(f) = \frac{S_{fg}(f)}{S_{gg}(f)}. \tag{4.9}$$
The Wiener filter has been noted to perform well for the GPR signal, though the problem of estimating the noise spectrum $S_{nn}(f)$ remains unsolved. This problem may be approached by a statistical diagnosis of the overall signal.
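The relations (4.8) and (4.9) can be illustrated with a minimal sketch. It assumes white noise of known variance and a periodogram estimate of $S_{gg}(f)$; both are simplifications chosen for illustration, since the text leaves the estimation of $S_{nn}(f)$ open. The function names `wiener_transfer` and `wiener_filter` are hypothetical.

```python
import numpy as np

def wiener_transfer(S_gg, S_nn, eps=1e-12):
    """H(f) = S_fg(f) / S_gg(f), with S_fg(f) = S_gg(f) - S_nn(f) per (4.8)-(4.9)."""
    S_fg = np.maximum(S_gg - S_nn, 0.0)      # clip: a power spectrum cannot be negative
    return S_fg / np.maximum(S_gg, eps)      # eps avoids division by zero

def wiener_filter(g, noise_var):
    """Filter a 1-D trace g; white noise of known variance is an assumption."""
    G = np.fft.fft(g)
    S_gg = np.abs(G) ** 2 / g.size           # periodogram estimate of S_gg(f)
    S_nn = np.full_like(S_gg, noise_var)     # flat spectrum of white noise
    return np.real(np.fft.ifft(wiener_transfer(S_gg, S_nn) * G))

# Synthetic example: a sinusoid buried in white Gaussian noise.
rng = np.random.default_rng(0)
t = np.arange(512)
f_true = np.sin(2 * np.pi * 8 * t / 512)
g = f_true + rng.normal(0.0, 0.5, t.size)
f_hat = wiener_filter(g, noise_var=0.25)
print(np.mean((g - f_true) ** 2), np.mean((f_hat - f_true) ** 2))
```

The two printed values are the mean-square errors before and after filtering. The clipping of negative spectral estimates is a practical guard that is not part of equation (4.8).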
4.2 Feature Extraction
The dynamic thermography and the GPR C-scan normally have a large set of image sequences. This section
provides information on how to extract features from multiple data.
4.2.1 The Karhunen and Loeve Transformation (KLT)
KLT is also called the Hotelling transformation or principal component analysis (PCA) [18]. When we
have a large set of image sequences, the number of image samples needs to be reduced for global processing
unless images are selected randomly. KLT is a method to reduce the number of images while minimizing the
representation error.
Assume a gray scale image $x_{nm}$, where n = 1, 2, …, N, m = 1, 2, …, M, N is the number of images in a sequence, and M is the number of pixels in an image. A vector corresponding to one pixel position along an image sequence is called a dixel or a dynamic pixel. A dixel represents the dynamic thermal evolution of a point in an N-dimensional space [13].
The idea of KLT is that dixels originating from the same object tend to form a cluster. The main axes of the
transformed space are the directions, which maximize the distinction between clusters [19].
Chapter 4 Signal and Image Processing
Figure 4-1 A diagram of an image sequence with m pixels and n images
The position of a dixel is given as a vector $y_m = \{y_{1m}, y_{2m}, \ldots, y_{Nm}\}$ in an N-dimensional Euclidean space. The m indicates the m-th pixel in each image. Figure 4-1 shows an example of an image sequence. This sequence has m pixels in each image and n images in a sequence. The arrow going through the circled pixels indicates one dixel. The algorithm follows [13].
Since the gray value of pixels can be distributed along different ranges in each image, it is necessary to
normalize each image by subtracting its average value.
$$x_{nm} = y_{nm} - \bar{y}_n, \tag{4.10}$$

where

$$\bar{y}_n = E[y_{nm}] = \frac{1}{M} \sum_{m=1}^{M} y_{nm}. \tag{4.11}$$

This process makes $x_{nm}$ have zero mean as

$$\bar{x}_n = E[x_{nm}] = \frac{1}{M} \sum_{m=1}^{M} x_{nm} = 0. \tag{4.12}$$
A unit vector u can convert the image vector $x_m$ into a parameter $r_m$ as

$$r_m = u^T x_m, \tag{4.13}$$

where u has a binding condition,

$$\|u\|^2 = u^T u = 1 \quad \text{or} \quad g(u) = \|u\|^2 - 1 = 0. \tag{4.14}$$

The mean value of $r_m$ should be equal to zero by the normalization of the dixel cloud.

$$E[r_m] = 0 \tag{4.15}$$

The goal of this method is to find the optimal u that maximizes the variance of $r_m$.

$$f(u) = \mathrm{var}(r_m) = E[r_m^2] \tag{4.16}$$
The main axis is $r_1$, and the unit vector is $u_1$ in Figure 4-2.
Figure 4-2 A dixel space [11]
The solution can be found by the Lagrange multipliers as

$$\nabla f - \lambda \nabla g = 0. \tag{4.17}$$

The variance of $r_m$ can be rewritten as a function of u as

$$f(u) = E[u^T x_m x_m^T u] = u^T E[x_m x_m^T]\, u = u^T C u, \tag{4.18}$$

where C denotes the covariance matrix as

$$C = E[x_m x_m^T]. \tag{4.19}$$
The Lagrange multiplier equation (4.17) can be interpreted as two parts, $\nabla f$ and $\nabla g$.
$\nabla f$ is interpreted as equation (4.20) from equation (4.18),

$$\nabla f(u) = \sum_{i=1}^{N} e_i \frac{\partial f}{\partial u_i}, \tag{4.20}$$

$$\frac{\partial f}{\partial u_i} = \frac{\partial u^T}{\partial u_i} C u + u^T C \frac{\partial u}{\partial u_i} = e_i^T C u + u^T C e_i, \tag{4.21}$$

where $e_i$ is the i-th unit vector.
Since the covariance matrix C is a symmetric matrix, equation (4.21) can be reduced as

$$\frac{\partial f}{\partial u_i} = 2\, e_i^T C u. \tag{4.22}$$
$\nabla g$ is interpreted as equation (4.23) from equation (4.14),

$$\frac{\partial g}{\partial u_i} = \frac{\partial u^T}{\partial u_i} u + u^T \frac{\partial u}{\partial u_i} = e_i^T u + u^T e_i = 2\, e_i^T u. \tag{4.23}$$
Finally, equation (4.17) is rewritten as

$$\nabla f - \lambda \nabla g = \sum_{i=1}^{N} e_i \left( 2\, e_i^T C u - 2 \lambda\, e_i^T u \right) = 2\,(C u - \lambda u) = 0. \tag{4.24}$$
Equation (4.24) can be solved only if $Cu = \lambda u$. This is an eigenvector problem, where $\lambda$ is the eigenvalue and u is the corresponding eigenvector. $u_i$ represents N possible solutions for u, which result in the same number of $\lambda_i$. From $C u_i = \lambda_i u_i$, equation (4.18) is concluded as

$$f(u_i) = u_i^T C u_i = \lambda_i u_i^T u_i = \lambda_i. \tag{4.25}$$
Equation (4.25) indicates that the eigenvector $u_i$, corresponding to the largest eigenvalue $\lambda_i$, represents the direction for which the quadratic moment is at its maximum [13]. The parameter for the new orthogonal set of axes $r_i$, where i = 1, 2, …, N, can be produced from

$$r_i = u_i^T x_m. \tag{4.26}$$
Going back to Figure 4-2, KLT extracts two dixel axes $r_1$ and $r_2$ from a two-dimensional dixel space, $(x_1, x_2)$. Normally the dimension of the data of an actual mine detection case is greater. The GPR C-scan of Figure 3-3 has 500 images, and the dynamic IR of Figure 3-7 has 94 images.
By projecting dixels on the $r_1$ axis, the feature extraction of an image sequence can be performed. Usually the first orthonormal (orthogonal and normal) axis is considered as the feature direction, but sometimes multiple directions can be considered for the optimal choice.
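The steps (4.10) to (4.26) can be sketched compactly. The sketch below is one possible reading of the algorithm, not the implementation used for the figures; the function name `klt` and the synthetic image sequence are assumptions made for illustration.

```python
import numpy as np

def klt(images):
    """images: array of shape (N, H, W). Returns eigenvalues (descending)
    and the N transformed images r_i, each of shape (H, W)."""
    N = images.shape[0]
    Y = images.reshape(N, -1).astype(float)    # row n = image n, column m = dixel y_m
    X = Y - Y.mean(axis=1, keepdims=True)      # (4.10): subtract each image's mean
    C = X @ X.T / X.shape[1]                   # (4.19): C = E[x_m x_m^T], an N x N matrix
    lam, U = np.linalg.eigh(C)                 # C is symmetric; eigh returns ascending order
    order = np.argsort(lam)[::-1]              # largest eigenvalue first
    lam, U = lam[order], U[:, order]
    R = U.T @ X                                # (4.26): r_i = u_i^T x_m for every dixel
    return lam, R.reshape(images.shape)

# Synthetic sequence: one spatial pattern whose amplitude changes over time, plus noise.
rng = np.random.default_rng(1)
pattern = rng.random((8, 8))
seq = np.stack([a * pattern for a in (1.0, 2.0, 3.0)])
seq = seq + 0.01 * rng.normal(size=seq.shape)
lam, R = klt(seq)
print(lam)
```

In line with equation (4.25), the variance of the first transformed image `R[0]` equals the largest eigenvalue `lam[0]`.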
Figure 4-3 An IR image sequence of a minefield [30]; images taken at (a) noon, (b) afternoon, and (c) evening
Figure 4-3 shows an example of an image sequence. These images were captured at different times with an AGEMA infrared camera operating in the 3–5 µm band, while keeping the same position and angle. (a) was captured at about noon, (b) at 5 PM, and (c) at 10 PM. Images were captured 49 times during a 24-hour period.
Since the character of the transformed data by KLT is not the pixel value of a gray image but the relative
difference between each pixel, contrast enhancement is required in post processing. Contrast enhancement will
be introduced in Section 4.4.
Figure 4-4 Transformed image from Figure 4-3 by KLT; (a) 1st transformed image, (b) 2nd transformed image, (c) 8th transformed image¹
Figure 4-4 is the result of KLT from Figure 4-3. (a) is the first transformed image, (b) is the second, and (c) is
the eighth. Since the first transformed image is expected to have the most discriminative features, (a) shows the
most distinctive contrast. Unless some prior information is given, the first transformed image is used for the
feature data.
4.2.2 The Kittler and Young Transformation (KYT)
Because KLT treats all classes as a single scattergram, KLT chooses the main axes by considering the minimal
representation error rather than the maximum discrimination ability. If the noise component is prominent in the
entire sequence, noise can be considered as an important factor to select the main axes. KYT compensates for
the weak discrimination ability of KLT by normalizing the variance within the classes [13].
In this case, the total covariance matrix C can be split in

$$C = W + B, \tag{4.27}$$

where W is the covariance matrix within the classes and B is the covariance matrix between the class averages. The solution can be achieved by the solution of an eigenvalue and eigenvector problem as

$$W^{-1} B\, u = \lambda u, \tag{4.28}$$
where the eigenvector $u_i$, corresponding to the largest eigenvalue $\lambda_i$, provides the direction for which the distinction between the classes is at its maximum.
¹ Linear stretching method is used to enhance the contrast.
Figure 4-5. Process of KYT [13]
Figure 4-5 shows the process of KYT [13]. When two dixel classes are given as shown in (a), KYT rotates the original dixel classes as shown in (b). Then, it normalizes the variance within the classes as shown in (c), and applies KLT to find the direction of main axes as shown in (d). Finally, it transforms the classes into the original scattergram as shown in (e). The arrows $KY_1$ and $KY_2$ indicate the first and second transformed images by KYT and KLT.
Although some concepts of KYT are similar to KLT in the sense of an eigenvalue problem, some additional
information should be determined to perform this transformation, for example, the class average, variance, and
relative weight. These classes are determined by delimiting the dixel clouds manually.
4.3 Gray Scale Morphology Application
The term morphology has been mostly used for the binary image case in image processing. Basic functions,
dilation and erosion, are performed by the structure elements with various shapes and sizes. The repetition of the
basic functions performs the second level functions, opening and closing. While mixing these operations of the
first and second level, a region-based processing (for example, boundary extraction, region filling, and thinning)
can be done by putting forward or pulling backward the image.
The gray-scale application is somewhat different from the binary image case. Smoothing, extracting gradient,
edge detection, noise removal, contrast enhancement, and finally a region-based segmentation, called the
watershed algorithm, can be done using dilation and erosion.
The structuring element (SE) should be explained prior to the actual morphology operators. SE is a simple
matrix or an image with a relatively smaller size than the object image. SE has two functions. It defines the
neighborhood around the origin, and it adds an offset value through the corresponding region of SE [20].
Figure 4-6. Examples of SE; (a) 5×5 octagonal neighborhood, (b) 5×5 diamond neighborhood, (c) 5×5 pyramid shape offset
Figure 4-6 (a) is an example of an octagonal neighborhood. This SE is frequently used in mine detection applications, because the octagonal shape is somewhat similar to the round shape of mines. The origin is circled. The origin has a 5×5 neighborhood¹ except for the four corners. When an object image meets these four corners, the corresponding points will not be considered as neighborhood pixels. (b) is a diamond shaped neighborhood. It has relatively fewer neighborhood pixels than the octagonal case. (c) is an offset distribution of a pyramid shape. The origin has the highest offset, and the boundary pixels have relatively lower offsets. If every SE pixel has the same offset value, it is called a flattop filter.
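The two roles of an SE, neighborhood and offset, can be made concrete with a small sketch. The boolean masks follow the shapes described for Figure 4-6; the actual offset values of the pyramid in the figure are not given in the text, so the values produced by `pyramid_offset` are an assumption, and all three function names are hypothetical.

```python
import numpy as np

def octagon_5x5():
    """5x5 neighborhood with the four corner pixels excluded."""
    se = np.ones((5, 5), dtype=bool)
    se[[0, 0, 4, 4], [0, 4, 0, 4]] = False
    return se

def diamond(r=2):
    """Diamond neighborhood of radius r: pixels with |x| + |y| <= r."""
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    return (np.abs(x) + np.abs(y)) <= r

def pyramid_offset(r=2):
    """Offsets that are highest at the origin and fall to 0 at the boundary (assumed values)."""
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    return r - np.maximum(np.abs(x), np.abs(y))

print(octagon_5x5().astype(int))
print(diamond(2).astype(int))
print(pyramid_offset(2))
```

The octagonal mask keeps 21 of the 25 pixels, while the diamond keeps 13, which matches the remark that the diamond has fewer neighborhood pixels.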
The basic operators, dilation, erosion, opening, and closing, will be introduced in the following section.
¹ Sometimes n×n is represented as the size of radius r, from n = 2r + 1. For example, 5×5 corresponds to r = 2.
4.3.1 Operators
Dilation and erosion are similar to the discrete two-dimensional convolution [18],

$$f(x, y) * b(x, y) = \frac{1}{MN} \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} f(m, n)\, b(x - m, y - n), \tag{4.29}$$

where x = 0, 1, 2, …, M−1 and y = 0, 1, 2, …, N−1.
Dilation of an image function f by SE b can be derived as

$$\delta(f) = (f \oplus b)(s, t) = \max\{ f(s - x, t - y) + b(x, y) \mid (s - x), (t - y) \in D_f;\ (x, y) \in D_b \}, \tag{4.30}$$

where $D_f$ and $D_b$ are the domains of f and b [18].
The displacement parameter condition $(s-x), (t-y) \in D_f$ means that SE has to be completely contained by the set being dilated. This corresponds to the 2D convolution in equation (4.29) with the max operation replacing the sums of convolution and the addition replacing the products of convolution. f(−x,−y) is the flipped f(x,y) with respect to the origin. As in convolution, the function f(s−x,t−y) means that the flipped f(x,y) is shifted by positive s and t.
Since dilation is based on choosing the maximum value of f + b in a neighborhood defined by the shape of SE, the general effect of dilation on a gray-scale image has two parts. Firstly, if all offset values of SE are positive, the output image tends to be brighter than the input. Secondly, the dark details of the input image are either reduced or eliminated if the size of the dark area is smaller than SE.
Erosion of an image function f by SE b can be derived as¹

$$\varepsilon(f) = (f \,\Theta\, b)(s, t) = \min\{ f(s + x, t + y) - b(x, y) \mid (s + x), (t + y) \in D_f;\ (x, y) \in D_b \}, \tag{4.31}$$

where $D_f$ and $D_b$ are the domains of f and b [18].
In equations (4.30) and (4.31), the functions of dilation and erosion are dual, while the condition is the same. The function f(s+x,t+y) means that f(x,y) is shifted by negative s and t.
Since erosion is based on choosing the minimum value of f − b in a neighborhood defined by the shape of SE, the general effect of performing erosion on a gray-scale image is the opposite of dilation. Firstly, if all the offset values of SE are positive, the output image tends to be darker than the input. Secondly, the bright details of the input image are either reduced or eliminated if the size of the bright area is smaller than SE.
The opening of an image function f by SE b can be described as

$$\gamma(f) = f \circ b = (f \,\Theta\, b) \oplus b, \tag{4.32}$$

which means the erosion of f by b, followed by a dilation by b [18].
The purpose of opening is to remove small bright details, while leaving the overall gray levels and larger bright features relatively undisturbed. The initial erosion removes small bright details, but also darkens the image. The subsequent dilation increases the brightness of the image without reintroducing the bright details removed by the previous erosion.
¹ The sign ⊖ (− inside of ○) may be more common for erosion, but it is presented as Θ in this report.
The closing of an image function f by SE b can be described as

$$\varphi(f) = f \bullet b = (f \oplus b) \,\Theta\, b, \tag{4.33}$$

which means the dilation of f by b, followed by an erosion by b [18].
Closing is the dual function of opening. Its result is the opposite of opening, just as erosion is the opposite of dilation. Closing is generally used to remove small dark details, while leaving the overall gray levels and larger dark features relatively undisturbed. The initial dilation removes small dark details but also brightens the image. The subsequent erosion decreases the brightness of the image without reintroducing the dark details removed by the previous dilation.
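The four operators can be sketched directly from equations (4.30) to (4.33). This is an illustrative sketch, assuming a square SE of odd size given as a boolean mask `se` with an optional array of offsets; the helper `_windows` and the test images are hypothetical. Pixels outside the image domain are ignored, which is one way to honor the domain condition on $D_f$.

```python
import numpy as np

def _windows(f, se, pad_value, sign):
    """Yield (shifted copy of f, index into se) for every pixel that belongs to the SE."""
    r = se.shape[0] // 2                       # assumes a square SE of odd size
    H, W = f.shape
    fp = np.pad(f.astype(float), r, constant_values=pad_value)
    for i in range(se.shape[0]):
        for j in range(se.shape[1]):
            if se[i, j]:
                dy, dx = sign * (i - r), sign * (j - r)
                yield fp[r + dy:r + dy + H, r + dx:r + dx + W], (i, j)

def dilate(f, se, offset=None):
    """(4.30): max of f(s - x, t - y) + b(x, y) over the SE."""
    b = np.zeros(se.shape) if offset is None else offset
    return np.max([w + b[k] for w, k in _windows(f, se, -np.inf, -1)], axis=0)

def erode(f, se, offset=None):
    """(4.31): min of f(s + x, t + y) - b(x, y) over the SE."""
    b = np.zeros(se.shape) if offset is None else offset
    return np.min([w - b[k] for w, k in _windows(f, se, np.inf, +1)], axis=0)

def opening(f, se, offset=None):
    return dilate(erode(f, se, offset), se, offset)    # (4.32)

def closing(f, se, offset=None):
    return erode(dilate(f, se, offset), se, offset)    # (4.33)

se = np.ones((3, 3), dtype=bool)               # flat 3x3 SE (all offsets zero)
f = np.zeros((7, 7)); f[3, 3] = 10.0           # one small bright detail
g = np.full((7, 7), 10.0); g[3, 3] = 0.0       # one small dark detail
print(opening(f, se).max(), closing(g, se).min())
```

In this example the opening removes the bright pixel and the closing removes the dark pixel, as described above for details smaller than the SE.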
Figure 4-7 Examples of morphology functions; (a) original image, (b) dilated image, (c) eroded image, (d) opened
image, (e) closed image
Figure 4-7 shows examples of morphology functions. Table 4-1 profiles the average, minimum, and maximum value of each image. (a) is the original image. (b) is dilated by a 5×5 flat octagonal shaped SE from (a). (c) is eroded, (d) is opened, and (e) is closed by the same SE as (b).
Bright details are enhanced, and dark areas are shrunk in (b) and (e) by removing the dark pixels. This effect markedly increases the average value of (b), but not that of (e). Dark details are enhanced, and bright areas are shrunk in (c) and (d) by removing the bright pixels. This effect markedly drops the average value of (c), but not that of (d).
Image      (a)     (b)     (c)     (d)     (e)
Average    98.7    120.7   78.3    92.2    105.3
Minimum    3       9       3       3       9
Maximum    238     238     213     213     238
Table 4-1 Average, minimum, and maximum value of Figure 4-7¹
¹ With an 8-bit color map, the maximum gray value of a pixel is 255 and the minimum is 0.
4.3.2 Morphological Gradient
The main goal of the morphological gradient transformation is to highlight gray level contours.
When an image function f is continuously differentiable, the morphological gradient is equal to the modulus of the gradient of f,

$$g(f) = \sqrt{\left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2}. \tag{4.34}$$

The simplest way to approximate this modulus is to calculate the difference between the highest and the lowest pixels within a window, centered at each point x [21]. In other words, it is the difference between the dilated function $\delta(f)$ and the eroded function $\varepsilon(f)$ as¹

$$g(f) = \delta(f) - \varepsilon(f). \tag{4.35}$$
Figure 4-8 Morphological gradient; (a) original image, (b) gradient image
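Equation (4.35) with a flat SE reduces to the local maximum minus the local minimum. The following sketch assumes a flat square window and replicated borders, both chosen for brevity; the function name `morph_gradient` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def morph_gradient(f, size=3):
    """(4.35) with a flat size x size window: local maximum minus local minimum."""
    r = size // 2
    fp = np.pad(f.astype(float), r, mode="edge")       # replicate the border pixels
    win = sliding_window_view(fp, (size, size))        # shape (H, W, size, size)
    return win.max(axis=(2, 3)) - win.min(axis=(2, 3))

# A vertical step edge: the gradient responds only next to the step.
f = np.zeros((6, 6))
f[:, 3:] = 100.0
print(morph_gradient(f))
```

For the step image, the gradient is 100 in the two columns adjacent to the edge and 0 elsewhere, which is the contour-highlighting behavior described above.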
4.3.3 Smoothing and Noise Reduction Using the Alternating Sequential Filter
Opening and closing are introduced in Section 4.3.1. The combination of these operators can remove noise and
can smooth the texture in an image. This is called the alternating sequential filter (ASF).
Usually, ASF performs well as a repeated operation rather than a single operation.
Two kinds of ASF are defined here, white ASF and black ASF. The white ASF is defined as
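$$\mathrm{ASF}^{\mathrm{white}}_n(f) = \gamma_1 \varphi_1 \gamma_2 \varphi_2 \gamma_3 \varphi_3 \cdots \gamma_n \varphi_n, \tag{4.36}$$

where the operators are applied from left to right, $\gamma$ denotes the opening of the previous result, $\varphi$ denotes the closing of the previous result, and the number is the corresponding size of SE [11].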
Equation (4.36) can be rewritten as

$$\mathrm{ASF}^{\mathrm{white}}_n(f) = \varphi_n(\gamma_n(\cdots \varphi_2(\gamma_2(\varphi_1(\gamma_1(f)))) \cdots)). \tag{4.37}$$

¹ Some references define the morphological gradient as $g(f) = [\delta(f) - \varepsilon(f)] / 2$.
The white ASF opens the object image with the smallest SE, and closes the previous result with the same SE. It
then opens again the previous result with the larger size of SE, and closes again with the same SE, etc.
The black ASF is the dual operation of the white ASF. Every step is the same as the white ASF except the
black ASF begins with a closing operation instead of an opening [11].
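The black ASF is defined as

$$\mathrm{ASF}^{\mathrm{black}}_n(f) = \varphi_1 \gamma_1 \varphi_2 \gamma_2 \varphi_3 \gamma_3 \cdots \varphi_n \gamma_n, \tag{4.38}$$

which is the same as

$$\mathrm{ASF}^{\mathrm{black}}_n(f) = \gamma_n(\varphi_n(\cdots \gamma_2(\varphi_2(\gamma_1(\varphi_1(f)))) \cdots)). \tag{4.39}$$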
The goal of ASF is to remove noise or to smooth an image without disrupting the major components of the
image. The result relies highly on the maximum size of SE. If a precise result is desired, only small SE will be
applied. Otherwise, filtering steps will be repeated until the desired result is obtained.
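The white and black ASF can be sketched as loops over increasing SE sizes. The sketch assumes flat square SEs of odd size 3, 5, …, n rather than the octagonal SEs used for the figures, and the helper names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def _flat(f, size, reduce_fn, pad_value):
    """Apply reduce_fn (np.min or np.max) over a flat size x size window."""
    r = size // 2
    fp = np.pad(f.astype(float), r, constant_values=pad_value)
    return reduce_fn(sliding_window_view(fp, (size, size)), axis=(2, 3))

def opening(f, size):
    eroded = _flat(f, size, np.min, np.inf)
    return _flat(eroded, size, np.max, -np.inf)

def closing(f, size):
    dilated = _flat(f, size, np.max, -np.inf)
    return _flat(dilated, size, np.min, np.inf)

def white_asf(f, n):
    """(4.37): open then close, with SE sizes 3, 5, ..., n."""
    for size in range(3, n + 1, 2):
        f = closing(opening(f, size), size)
    return f

def black_asf(f, n):
    """(4.39): close then open, with SE sizes 3, 5, ..., n."""
    for size in range(3, n + 1, 2):
        f = opening(closing(f, size), size)
    return f

# A large bright square (kept) plus one bright and one dark speck (removed).
clean = np.full((20, 20), 100.0)
clean[2:11, 2:11] = 200.0
noisy = clean.copy()
noisy[15, 15] = 255.0
noisy[15, 5] = 0.0
print(np.array_equal(white_asf(noisy, 3), clean))
```

In this toy image both specks are smaller than the 3×3 SE, so a single pass removes them while the 9×9 square is left unchanged.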
Figure 4-9 An example of white ASF; (a) original image [30], (b) 7×7 size filtering, (c) 15×15 size filtering, (d) 23×23 size filtering
Figure 4-9 shows the result of the white ASF from Figure 4-4 (a), which is the result of KLT from an IR image sequence of a test minefield [30]. (b) is obtained by equation (4.37) when f is (a) and n is 7. (c) is obtained when n is 15, and (d) when n is 23. Only the odd numbered SE have been applied, for example, 3×3, 5×5, 7×7, and 9×9. The large white circle located in the lower left corner of each image is suspected to be a mine, but the other black and white dots or small circles are negligible. ASF is applied to remove those dots or circles. At the segmentation step, (a) will cause over-segmentation, but (d) can be a reasonable condition for segmentation. The segmented result will be presented in Chapter 6.
Figure 4-10 Intensity value on the black line in Figure 4-9; (a) ~ (d) corresponding to (a) ~ (d) of Figure 4-9
Image      (a)     (b)     (c)     (d)
Average    144.3   143.8   143.0   141.5
Minimum    0       20      57      68
Maximum    255     252     245     239
Table 4-2 Average, minimum, and maximum value of Figure 4-9
Figure 4-10 is the intensity value on the black line in Figure 4-9. The graphs (a), (b), (c), and (d) correspond to
(a), (b), (c), and (d) in Figure 4-9. Table 4-2 profiles the average, minimum, and maximum values of Figure 4-9.
The original image has many small peaks and valleys in Figure 4-10 (a). In each step, ASF smooths an object if it is smaller than SE. In (b), there are still some narrow valleys at x = 0~30, 90~110, 120~130 and peaks at x = 110~120, 130~140, 180~200, which are circled. These valleys and peaks disappear in (c), but the large white circle remains even in (d).
In Table 4-2, the average gray value of the filtered image in Figure 4-9 has not significantly changed, even
though the maximum and minimum value have become closer to each other.
ASF has some advantages that other filters do not have. Firstly, the size of the object to be identified is
selectable by choosing the maximum size of SE. Secondly, ASF does not affect the overall property of the
image.
4.4 Contrast Enhancement
Since the contrast between the background and the mine target is usually not large enough, the raw sensor
image rarely has enough information. The purpose of contrast enhancement is to enhance the difference between
the mine target and the background to distinguish them. Two methods are introduced in this section;
morphological contrast enhancement and histogram equalization.
4.4.1 Morphological Contrast Enhancement
The first step in morphological contrast enhancement is to find peaks and valleys from the original image.
Peaks are light shades of gray tone image, while valleys are dark. Peaks are obtained by subtracting the opening
from the original image, and valleys by subtracting the original image from the closing as
$$p(f) = f - \gamma(f), \tag{4.40}$$

$$v(f) = \varphi(f) - f, \tag{4.41}$$

where p(f) denotes the peaks, v(f) denotes the valleys, $\gamma(f)$ denotes the opening, and $\varphi(f)$ denotes the closing of an image function f.
Preprocessing is necessary to improve the contrast [11]. This is done by multiplying constants with peaks and
valleys as
$$p'(f) = p(f)\, c_1, \tag{4.42}$$

where¹

$$c_1 = \frac{\max(I) - \max(f)}{\max[p(f)]}, \tag{4.43}$$

$$v'(f) = v(f)\, c_2, \tag{4.44}$$

where

$$c_2 = \frac{\min(f) - \min(I)}{\max[v(f)]}. \tag{4.45}$$
The contrast-enhanced image is obtained by the summation of the original image, the peaks, and the negative
valleys [11].
$$f' = f + p'(f) - v'(f) \tag{4.46}$$
An example of the morphological contrast enhancement is shown in Figure 4-11. The histogram equalization
may show a better result, which will be explained in the next section.
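The enhancement chain (4.40) to (4.46) can be sketched as follows. The sketch assumes a flat square SE and reads the constants so that the scaled peaks reach max(I) and the scaled valleys reach min(I); that reading of $c_1$ and $c_2$ is an interpretation, and the function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def _flat(f, size, reduce_fn, pad_value):
    r = size // 2
    fp = np.pad(f.astype(float), r, constant_values=pad_value)
    return reduce_fn(sliding_window_view(fp, (size, size)), axis=(2, 3))

def opening(f, size):
    return _flat(_flat(f, size, np.min, np.inf), size, np.max, -np.inf)

def closing(f, size):
    return _flat(_flat(f, size, np.max, -np.inf), size, np.min, np.inf)

def enhance(f, size=3, lo=0.0, hi=255.0):
    """Morphological contrast enhancement; lo and hi are min(I) and max(I)."""
    f = f.astype(float)
    p = f - opening(f, size)                                  # (4.40): peaks
    v = closing(f, size) - f                                  # (4.41): valleys
    c1 = (hi - f.max()) / p.max() if p.max() > 0 else 0.0     # assumed reading of (4.43)
    c2 = (f.min() - lo) / v.max() if v.max() > 0 else 0.0     # assumed reading of (4.45)
    return f + c1 * p - c2 * v                                # (4.46)

# Low-contrast image: background 170, one bright peak (200), one dark valley (150).
f = np.full((9, 9), 170.0)
f[2, 2] = 200.0
f[6, 6] = 150.0
out = enhance(f)
print(out.min(), out.max())
```

In this example the peak is stretched to the top of the gray range and the valley to the bottom, while the background keeps its value.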
¹ I indicates the entire gray level. The 8-bit gray level varies within [0, 255]. max(I) is 255, and min(I) is 0.
4.4.2 Histogram Equalization
The probability of the k-th gray level in an image f can be described as

$$p_f(f_k) = \frac{n_k}{n}, \tag{4.47}$$

where $k \in [0, L-1]$, L is the number of gray levels in an image, $n_k$ is the number of times the k-th level appears in the image, and n is the total number of pixels in the image.
A plot of $p_f(f_k)$ versus k is called a histogram, and the goal of histogram equalization is to obtain an image with a uniform histogram.
The uniform histogram can be achieved by

$$g_k = T(f_k) = \sum_{j=0}^{k} \frac{n_j}{n} = \sum_{j=0}^{k} p_f(f_j), \tag{4.48}$$

keeping two conditions [18],
(a) $T(f_k)$ is single valued and monotonically increasing in the range $k \in [0, L-1]$.
(b) Also, $T(f_k)$ should satisfy $T(f_k) \in [0, L-1]$ for $k \in [0, L-1]$.
The transformed image g has a uniform gray level probability.

$$p_g(g_k) = \frac{n_k}{n} = c, \tag{4.49}$$

where ideally c is a constant through the entire gray level $k \in [0, L-1]$.
Figure 4-11 presents examples of various contrast enhancement methods. The images on the left are contrast-
enhanced images, the middle graphs are the gray level on the black line in the left hand images, and the right
hand graphs are the overall histogram of the left hand images. The original image in (a) is the same image as
Figure 4-3 (a), which was captured with an IR camera from a test minefield. There is a possible mine target in
the lower left corner of the image.
The gray level of the original image (a) is limited within a range of 150 to 200. Useful information cannot be obtained in this situation. The linear stretched image (b) is better than (a), but most pixels are still distributed in the upper half range of the gray level. Image (c) was enhanced by equation (4.46) using a 7×7 sized octagonal SE, but the gray level was not enhanced enough. Small peaks and valleys can be easily removed by ASF, but eventually a wider range of gray level is desired. The histogram-equalized image shows the best result in (d). The histogram shows almost uniform distribution except for critically high or low levels. ASF can remove small peaks and valleys easily, and only the suspected white circle will remain.
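Equations (4.47) and (4.48) translate into a cumulative histogram used as a lookup table. In the sketch below, the cumulative sum is scaled by L − 1 so that the output spans the gray range required by condition (b); that scaling, the function name `equalize`, and the sample image are assumptions for illustration.

```python
import numpy as np

def equalize(f, L=256):
    """Histogram equalization of an integer image with gray levels in [0, L-1]."""
    n_k = np.bincount(f.ravel(), minlength=L)     # n_k: count of each gray level
    T = np.cumsum(n_k) / f.size                   # (4.48): cumulative sum of p_f(f_j)
    return np.round((L - 1) * T[f]).astype(np.uint8)   # scale to [0, L-1] (assumed)

# Gray levels confined to 150..200, as in the original image of Figure 4-11 (a).
f = np.array([[150, 150, 160],
              [170, 180, 200]], dtype=np.uint8)
g = equalize(f)
print(g)
```

The transformation is monotonic, so the ordering of gray levels is preserved while the occupied range widens.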
Figure 4-11 Examples of contrast enhancement; (a) original image [30], (b) linear stretched case, (c) morphological
octagonal enhanced case, (d) histogram equalized case; (left) transformed images, (middle) gray level on
the black line in the left hand images, (right) histogram of the left hand images
4.5 Segmentation Using Watershed
The watershed algorithm is a region-based segmentation technique. Usually, two properties of an image are
considered to segment it, edge and region. Watershed is used when the edge information is not good enough to
segment the image. The concept of the watershed algorithm originated from geology. It is also introduced in the
context of mathematical morphology.
Image data can be interpreted as a topographic surface where the gray levels represent altitudes. A catchment
basin is defined as the region where all points flow downhill to a common point. The high gradient region,
called watershed lines, corresponds to the high watersheds, and the low gradient region corresponds to the
catchment basins. If we consider a local region where all rainwater flows to a single location, this might not
seem to be applicable to intensity-based images, but it makes sense if the object is a gradient magnitude image.
4.5.1 Basic Concept
There are two basic approaches to the watershed image segmentation.
The first starts with finding a downstream path from each pixel of the image to a regional minimum. The
regional minimum is defined as a point that does not have a descending path in its neighborhood. Assuming two points $s_1$ and $s_2$ of a surface S, a descending path can be defined as a sequence $\{s_i\}$ of points of S as

$$\forall\, s_i(x_i, f(x_i)),\ s_j(x_j, f(x_j)): \quad i \ge j \;\Rightarrow\; f(x_i) \le f(x_j). \tag{4.50}$$

In other words, a point $s \in S$ belongs to a minimum if there is no downstream path starting from s.
A catchment basin is defined as a set of pixels for which their respective downstream paths all end up at the
same altitude minimum. Catchment basins of a topographic surface are homogeneous in the sense that all pixels
in the same catchment basin are connected with the minimum altitude by a simple path of pixels that have
monotonically decreasing altitude. Such catchment basins represent the regions of the segmented image.
However, no rules exist to define the downstream paths uniquely for digital surfaces, while the downstream
paths are easy to determine for continuous altitude surfaces by calculating the local gradients.
The second approach is dual to the first. Instead of identifying the downstream paths, the catchment basins are
filled from the bottom [21, 22]. There is a hole in each local minimum, and the topographic surface is immersed
in water step by step. If two catchment basins merge as a result of further immersion, a dam is built all the way
to the highest surface altitude. The dam represents the watershed line. When the flooding reaches the highest level, only the dams, that is, the watershed lines, remain.
From this point in this report, the watershed algorithm is assumed to be this flooding process. More detail
about the flooding process will be introduced in Section 4.5.3 and Section 4.5.4.
4.5.2 Geodesic Functions
This section introduces some important mathematical operators for the watershed algorithm.
In the framework of digital pictures, a gray tone image can be represented by a function, $f : Z^2 \to Z$. The points of the space $Z^2$ may be the vertices of a square or a hexagonal grid, and f(x) is the gray value of the image at the point x. Every space will be assumed as $Z^2$ from this point, unless other dimensions are mentioned.
A section of f at level i is defined as

$$X_i(f) = \{ x : f(x) \ge i \}, \tag{4.51}$$

and

$$Z_i(f) = \{ x : f(x) \le i \}. \tag{4.52}$$

They have a complementary relation as

$$X_i(f) = Z_{i-1}^{c}(f). \tag{4.53}$$
The distance function of every point y of Y to the nearest point of $Y^c$ is

$$\forall y \in Y, \quad d(y) = \mathrm{dist}(y, Y^c), \tag{4.54}$$

where $Y^c$ is the complementary set of Y.
A section of d at level i is given by

$$X_i(d) = \{ y : d(y) \ge i \} = Y \,\Theta\, B_i, \tag{4.55}$$

where $B_i$ is a disk of radius i, and $\Theta$ means an erosion [22, 23].
Figure 4-12 An example of a distance function; (a) a binary image, (b) the distance function of (a)
Figure 4-12 is an example of the distance function. A set of points Y and the complementary set $Y^c$ are given as the white and black areas in (a). The distance function of every point of Y to $Y^c$ can be presented as Figure 4-12 (b). The brightest area indicates the pixels that have the maximum distance to the complementary set.
Geodesic distance is the distance between two points within the set where the two points belong. The geodesic distance function, $d_X(x, y)$, is defined as the length of the shortest path between x and y, where both points exist in the set X.
Figure 4-13 An example of geodesic distance function; (a) a set of points X and a point x, (b) geodesic distance
function from x within X
Figure 4-13 is an example of geodesic distance function. There is a point x in the set X in (a). The black dot
represents a pixel x, and the white H shape represents a set X. The geodesic distance function from the point x to
an arbitrary point y in the set X is represented as a gray level in (b). The brighter value represents the longer
distance. The dotted line indicates the same Euclidean distance. Since the paths toward the right part of H have
to take a bypass, the distance to the upper left part of H is relatively shorter than to the right part of H, though
the Euclidean distance is the same.
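On a pixel grid, the geodesic distance can be sketched as a breadth-first search that is only allowed to step through pixels of X. The sketch assumes 4-connected steps of unit length, which is a simplification of the path length used in Figure 4-13; the function name and the small H-shaped set are hypothetical.

```python
from collections import deque

def geodesic_distance(mask, start):
    """mask[r][c] is True inside X; returns d_X(start, .) with None where unreachable."""
    h, w = len(mask), len(mask[0])
    dist = [[None] * w for _ in range(h)]
    dist[start[0]][start[1]] = 0
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for rr, cc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= rr < h and 0 <= cc < w
            if inside and mask[rr][cc] and dist[rr][cc] is None:
                dist[rr][cc] = dist[r][c] + 1      # one more step along a path inside X
                queue.append((rr, cc))
    return dist

# A small H-shaped set X; the start point x is the top of the left leg.
rows = ["X.X",
        "XXX",
        "X.X"]
mask = [[ch == "X" for ch in row] for row in rows]
d = geodesic_distance(mask, (0, 0))
for row in d:
    print(row)
```

The bottom of the left leg and the top of the right leg are equally far from the start in a straight line, but the geodesic distances are 2 and 4 because the path to the right leg has to pass through the bar of the H.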
4.5.3 Reconstruction
Letting Y be any set, included in X, the set of all points of X at a finite geodesic distance from Y can be computed as

$$R_X(Y) = \{ x \in X : \exists\, y \in Y,\ d_X(x, y) < \infty \}. \tag{4.56}$$

$R_X(Y)$ is called the X-reconstructed set by the marker set Y [23, 24]. It is made of all the connected components of X, centered at Y.
Two gray image functions f and g are considered in the same way under the condition of $f \le g$. The corresponding sections of these two functions at level i are $X_i(g)$ and $X_i(f)$. Since $f \le g$, $X_i(f)$ is obviously included in $X_i(g)$. For every level i, a new set can be obtained by reconstructing $X_i(g)$ using $X_i(f)$ as a marker. The new sets, $R_{X_i(g)}(X_i(f))$, define a pile of embedded sections of a new function, called the reconstruction of g by f, which is denoted as $R_g(f)$. The dual reconstruction of g by f, under the condition $f \ge g$, is denoted as $R^*_g(f)$ [23, 24]. It is obtained by reconstructing the sections $Z_i(g)$ using $Z_i(f)$ as a marker. $X_i(f)$ and $Z_i(f)$ are complementary to each other as equation (4.53).
This reconstruction and dual reconstruction process extracts the regional maximum and minimum.
Figure 4-14 Finding regional maxima and minima by reconstruction; (a) function f and f − 1, (b) reconstruction $R_f(f-1)$ and regional maxima $k_{M(f)}$, (c) function f, f + 1, and regional minima $k_{m(f)}$, (d) reconstruction $R^*_f(f+1)$
In order to find the regional maximum, the functions f and f − 1 are overlapped. Figure 4-14 (a) is the vertical slice of the overlapped functions. Then, the reconstruction of f using f − 1 as a marker is obtained as $R_f(f-1)$. This is the light gray area in Figure 4-14 (b). Since Figure 4-14 is a slice of a two-dimensional gray level image, the actual shape of $R_f(f-1)$ has a volume. The set of local maxima M(f) can be found by the difference between the function f and $R_f(f-1)$ [23] as

$$M(f) = f - R_f(f - 1). \tag{4.57}$$

M(f) is presented as a set of binary data and as the dark gray area in Figure 4-14 (b).

$$k_{M(f)}(x) = 1, \quad \text{if } x \in M(f) \tag{4.58}$$

$$k_{M(f)}(x) = 0, \quad \text{if } x \notin M(f) \tag{4.59}$$
For the regional minimum case, the functions $f$ and $f + 1$ are overlapped as in Figure 4-14 (c). The dual reconstruction of $f$ using $f + 1$ as a marker is obtained as $R^*_f(f+1)$. This is the light gray area in Figure 4-14 (d). The set of regional minima $m(f)$ can be found by the difference between $R^*_f(f+1)$ and $f$ [23] as
$$m(f) = R^*_f(f+1) - f. \tag{4.60}$$
$m(f)$ is presented as a set of binary data, the same as $M(f)$, in Figure 4-14 (c).
$$k_{m(f)}(x) = 1, \quad \text{if } x \in m(f) \tag{4.61}$$
$$k_{m(f)}(x) = 0, \quad \text{if } x \notin m(f) \tag{4.62}$$
These sets of regional maxima and minima will be used for markers for the marker-based watershed algorithm.
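The extraction of regional maxima and minima by reconstruction, equations (4.57) and (4.60), can be sketched as follows. This is a minimal illustration, not the implementation used in this report: it assumes an integer-valued image stored as a list of rows, 4-connectivity, and a naive iteration of geodesic dilations (or erosions) until stability. The function names are hypothetical.

```python
def reconstruct(marker, mask, dual=False):
    """Reconstruction of `mask` from `marker` (assumed sketch, 4-connectivity).

    dual=False: iterate geodesic dilations, needs marker <= mask.
    dual=True:  iterate geodesic erosions, needs marker >= mask.
    """
    rows, cols = len(mask), len(mask[0])
    pick, clip = (min, max) if dual else (max, min)
    cur = [row[:] for row in marker]
    changed = True
    while changed:
        changed = False
        nxt = [row[:] for row in cur]
        for r in range(rows):
            for c in range(cols):
                neigh = [cur[r][c]]
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        neigh.append(cur[rr][cc])
                # Elementary dilation (or erosion), then clip against the mask.
                value = clip(pick(neigh), mask[r][c])
                if value != nxt[r][c]:
                    nxt[r][c] = value
                    changed = True
        cur = nxt
    return cur


def regional_maxima(f):
    """M(f) = f - R_f(f - 1): 1 on regional maxima, 0 elsewhere."""
    rec = reconstruct([[v - 1 for v in row] for row in f], f)
    return [[a - b for a, b in zip(fr, rr)] for fr, rr in zip(f, rec)]


def regional_minima(f):
    """m(f) = R*_f(f + 1) - f: 1 on regional minima, 0 elsewhere."""
    rec = reconstruct([[v + 1 for v in row] for row in f], f, dual=True)
    return [[b - a for a, b in zip(fr, rr)] for fr, rr in zip(f, rec)]


profile = [[1, 3, 2, 5, 4, 4, 1]]      # one-row image, like a slice in Figure 4-14
print(regional_maxima(profile))        # [[0, 1, 0, 1, 0, 0, 0]]
print(regional_minima(profile))        # [[1, 0, 1, 0, 0, 0, 1]]
```

The plateau of value 4 is neither a maximum nor a minimum, because it has a higher neighbor on one side and a lower neighbor on the other.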
Let $Y$ be composed of $n$ connected components $Y_i$. Then, the geodesic zone of influence of $Y_i$ is the set of points of $X$ that are at a finite geodesic distance from $Y_i$ and are closer to $Y_i$ than to any other $Y_j$. The geodesic zone of influence of $Y_i$ is denoted as $z_X(Y_i)$ [22].
$$z_X(Y_i) = \{x \in X : d_X(x, Y_i) < \infty,\ \forall j \ne i,\ d_X(x, Y_i) < d_X(x, Y_j)\} \tag{4.63}$$
The entire set of the zones of influence of $Y$ in $X$, $IZ_X(Y)$, is defined as
$$IZ_X(Y) = \bigcup_i z_X(Y_i). \tag{4.64}$$
The geodesic skeleton by the zones of influence of $Y$ in $X$ is obtained as the boundaries of $z_X(Y_i)$ in the set $X$, and it is denoted as $SKIZ_X(Y)$ [22]. This is defined as
$$SKIZ_X(Y) = X \setminus IZ_X(Y), \tag{4.65}$$
where $\setminus$ means the set difference.
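Equations (4.63) to (4.65) can be illustrated with a small sketch. It assumes that $X$ and the components $Y_i$ are given as sets of (row, column) pixels, that the geodesic distance is the 4-connected path length inside $X$, and that each $Y_i$ is contained in $X$. The function names are hypothetical.

```python
from collections import deque


def geodesic_distance(X, sources):
    """Breadth-first path length inside X from the pixel set `sources`."""
    dist = {p: 0 for p in sources}
    queue = deque(sources)
    while queue:
        r, c = queue.popleft()
        for q in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if q in X and q not in dist:
                dist[q] = dist[(r, c)] + 1
                queue.append(q)
    return dist


def zones_of_influence(X, components):
    """z_X(Y_i): points strictly closer to Y_i than to any other Y_j (4.63)."""
    dists = [geodesic_distance(X, comp) for comp in components]
    inf = float("inf")
    zones = [set() for _ in components]
    for x in X:
        d = [dm.get(x, inf) for dm in dists]
        best = min(d)
        if best < inf and d.count(best) == 1:   # finite and strictly smallest
            zones[d.index(best)].add(x)
    return zones


def skiz(X, components):
    """SKIZ_X(Y) = X \\ IZ_X(Y), equations (4.64) and (4.65)."""
    zones = zones_of_influence(X, components)
    return X - set().union(*zones)


X = {(0, c) for c in range(5)}            # a one-row corridor of five pixels
Y = [{(0, 0)}, {(0, 4)}]                  # two markers at the two ends
print(zones_of_influence(X, Y))           # zones {(0,0),(0,1)} and {(0,3),(0,4)}
print(skiz(X, Y))                         # {(0, 2)}
```

The middle pixel is equidistant from both markers, so it belongs to neither zone and forms the SKIZ.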
Figure 4-15 Geodesic SKIZ of a set Y included in X
In Figure 4-15, the light gray regions are $z_X(Y_i)$, the zones of influence of $Y$ in $X$. The narrow region that is included in neither $z_X(Y_1)$ nor $z_X(Y_2)$ but lies in the upper set of $X$ is the SKIZ for the upper area, and the region that is included in neither $z_X(Y_3)$ nor $z_X(Y_4)$ but lies in the lower set of $X$ is the SKIZ for the lower area.
The watershed transformation by flooding may be directly transposed into the method using the sections of the
function f. Figure 4-16 is the topological interpretation of Figure 4-15.
Figure 4-16 Watershed construction using a geodesic SKIZ¹
There is a section $Z_i(f)$ of $f$ at the level $i$, and the flood has reached the level $i$ in Figure 4-16 (a). In the next step, the flooding of $Z_{i+1}(f)$ is performed in the zones of influence of the connected components of $Z_i(f)$. The SKIZ, which belongs to $Z_{i+1}(f)$ but to none of the zones of influence of the components of $Z_i(f)$, remains as a result of the flooding, as shown in (b). Some connected components of $Z_{i+1}(f)$, which have not been reached by the flood, are defined as minima at the level $i + 1$. This is the white area in (a). These minima should be added to the flooded area in (c).
The section at the level $i + 1$ of the catchment basins of $f$ is obtained by
$$W_{i+1}(f) = \left[ IZ_{Z_{i+1}(f)}(Z_i(f)) \right] \cup m_{i+1}(f), \tag{4.66}$$
where $m_i(f)$ is the minima of the function at the level $i$ [22]. $IZ_{Z_{i+1}(f)}(Z_i(f))$ for Figure 4-16 is the gray area in (b) excluding the SKIZ.
The minima at level $i + 1$ are given by
$$m_{i+1}(f) = Z_{i+1}(f) \setminus R_{Z_{i+1}(f)}(Z_i(f)), \tag{4.67}$$
where $R_{Z_{i+1}(f)}(Z_i(f))$ is the reconstruction of $Z_{i+1}(f)$ using $Z_i(f)$ as a marker. $W_{i+1}(f)$ for Figure 4-16 is the gray area in (c) excluding the boundary and SKIZ.
This iterative algorithm is initiated with $W_{-1}(f) = \emptyset$. At the end of the process, the watershed line $DL(f)$ is equal to the complementary set of the highest section of the catchment basins [22], and is defined as
$$DL(f) = W_N^c(f), \tag{4.68}$$
where $\max(f) = N$.
The watershed line for Figure 4-16 is the boundary line of (c) including the SKIZ.
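The level-by-level flooding can be sketched on a one-dimensional profile. This is a simplified illustration under stated assumptions, not the procedure used for the experiments: the function takes integer levels, starts from an empty set of basins, and at every level grows the basins found so far (so that each basin keeps its identity across levels) by their geodesic zones of influence inside the current section. Components of the section that no basin can reach are added as new minima. Points that are equidistant from two basins are left unflooded and form the watershed line. All names are hypothetical.

```python
from collections import deque


def _distance(Z, sources):
    """Geodesic (path) distance inside the 1-D set Z from `sources`."""
    dist = {p: 0 for p in sources}
    queue = deque(sources)
    while queue:
        p = queue.popleft()
        for q in (p - 1, p + 1):
            if q in Z and q not in dist:
                dist[q] = dist[p] + 1
                queue.append(q)
    return dist


def flood_watershed_1d(f):
    """Return (catchment basins, watershed points) of an integer profile f."""
    n = len(f)
    basins = []                                   # start from the empty set
    for level in range(min(f), max(f) + 1):
        Z = {x for x in range(n) if f[x] <= level}    # section at this level
        dists = [_distance(Z, b) for b in basins]
        grown = [set() for _ in basins]
        unreached = set()
        for x in Z:
            d = [dm.get(x) for dm in dists]       # None means unreachable
            finite = [v for v in d if v is not None]
            if not finite:
                unreached.add(x)                  # belongs to a new minimum
            elif finite.count(min(finite)) == 1:
                grown[d.index(min(finite))].add(x)    # zone of influence
            # otherwise x is equidistant from two basins: SKIZ, left dry
        minima = []
        for x in sorted(unreached):               # group into connected runs
            if minima and x - 1 in minima[-1]:
                minima[-1].add(x)
            else:
                minima.append({x})
        basins = grown + minima
    flooded = set().union(*basins)
    return basins, sorted(set(range(n)) - flooded)


basins, line = flood_watershed_1d([1, 0, 1, 2, 1, 0, 1])
print(basins)   # [{0, 1, 2}, {4, 5, 6}]
print(line)     # [3]
```

In this discrete setting a watershed point only appears where two floods meet on the same pixel; when they meet between two pixels, the basins simply touch.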
4.5.4 General Watershed vs. Marker-Based Watershed
This section applies the general watershed algorithm and the marker-based algorithm to an image and
compares the results.
¹ We perform flooding only on two levels, from i to i + 1, for convenience.
The gradient image is often used for the flooding object in the watershed transformation, because the main
criterion for segmentation is the homogeneity of gray value.
Figure 4-17 An example of watershed segmentation; (a) original image, (b) morphological gradient, (c) segmented result
Figure 4-17 (a) is a synthesized image consisting of a few bright circles. The inside background has some gray
values, but the outside background has almost zero gray value. The goal is to separate these circles from the
background. The morphological gradient of (a) by equation (4.35) using the octagonal shaped SE sized 7×7 is
obtained as (b). The gradient image has high gray values on the edges between the circles and the background.
These edges will be the watershed lines during the flooding process.
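The gradient step can be sketched as below. The sketch assumes that equation (4.35) is the usual morphological gradient, that is, the dilation minus the erosion of the image, and it uses a square SE of side `2 * radius + 1` instead of the octagonal 7×7 SE for brevity. The function name is hypothetical.

```python
def morphological_gradient(img, radius=1):
    """Dilation minus erosion with a square SE (assumed stand-in for the octagon)."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Pixels of the SE window that fall inside the image.
            window = [img[rr][cc]
                      for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                      for cc in range(max(0, c - radius), min(cols, c + radius + 1))]
            out[r][c] = max(window) - min(window)
    return out


row = [[0, 0, 5, 5, 5, 0, 0]]            # a bright object on a dark background
print(morphological_gradient(row))       # [[0, 5, 5, 0, 5, 5, 0]]
```

The response is high on both sides of each edge and zero inside homogeneous regions, which is why the edges act as dams during the flooding.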
Figure 4-18 Topological view of Figure 4-17; (a) original surface, (b) gradient surface
Figure 4-18 shows a topological view of the previous example. The bright circles in Figure 4-17 (a) can be
interpreted as hills in a topological manner shown in Figure 4-18 (a). The edge lines of Figure 4-17 (a) are
located in a high level in Figure 4-18 (b), the topological view of the gradient image. The plain area in Figure
4-17 (a) is located in sinks in Figure 4-18 (b). So, the edge and plain area of the original image can be
interpreted as local maxima and minima in the gradient image.
The flooding process begins from the local minima in Figure 4-18 (b). Each minimum $m_i(f)$ of the topographic surface is pierced, and the whole surface is plunged into a lake with a constant vertical speed. When floods from different sources are about to merge, a dam is built to keep them apart. At the end of this flooding, only the dams remain. These dams define the watershed of the function $f$ and separate the various catchment basins, each of which contains exactly one minimum. The flooding is not applied to the original image but to the gradient image. In this example, the process produced too many watershed lines, as shown in Figure 4-17 (c). This over-segmentation is an undesired result.
The over-segmentation is a serious drawback of watershed segmentation. The over-segmentation is due to the
fact that every local minimum was considered to be the center of catchment basins. These minima are produced
by small variations or noise, and not all of them are important. The marker-based watershed algorithm is an
adaptive method to overcome this over-segmentation problem [22, 23, 24].
The main difference between the general flooding algorithm and the marker-based algorithm is the flooding
source. The marker-based watershed flooding grows from only the markers, while the general watershed
flooding grows from each local minimum. If enough information about the object image is provided, optimal
markers can be selected manually, otherwise the regional maxima and the SKIZ, obtained by flooding from the
regional maxima, are usually used as markers. Figure 4-19 shows the process of the marker-based watershed
algorithm.
Figure 4-19 Process of the marker-based watershed algorithm; (a) markers with the original image, (b) markers
with the gradient image, (c) segmented result
The first step for this work is to find the inside markers by finding the regional maxima of the original image.
The second step is to find the outside markers by finding the SKIZ of each catchment basin. The SKIZ will
correspond to the borders, originating in the regional maxima. This process is performed by flooding the
inverted original image from the inside markers. Figure 4-19 (a) shows both the inside and outside markers in
the original image. This is the result of the first flooding. The final step is to flood the gradient image from both
the inside and outside markers. (b) shows markers in the gradient image, and (c) shows the segmented result.
The following images are the topological view.
Figure 4-20 Topological view of the marker-based watershed algorithm; (a) markers on the original surface, (b) the final watershed lines created on the gradient surface
Figure 4-20 (a) is the topological surface of Figure 4-19 (a). The peaks on each hill are the inside markers, and
the SKIZ (the borders of each hill region) become the outside markers. Although the outside markers may seem
messy in the center area, the minor SKIZ does not much affect the flooding result as shown in (b).
The marker problems, such as how to find markers and how to decide important markers, have been
determined to be the most important matter. The size of a marker does not affect its performance; only its
existence matters. The size of some of the inside markers is extremely small in Figure 4-19 (a) (b), and Figure
4-20 (a), but they performed the same role as other inside markers for their catchment basins.
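Flooding from markers only can be sketched with a priority queue. This is a common simplified formulation of marker-based flooding, swapped in here for illustration; it is not necessarily the SKIZ-based construction of the previous section. Pixels are processed in increasing order of gradient value, and every unlabeled neighbor inherits the label of the pixel that reaches it first. No explicit dam pixels are produced; the watershed line is the border between labels. The function name and the 4-connectivity are assumptions.

```python
import heapq


def marker_watershed(gradient, markers):
    """Flood `gradient` from `markers` (0 = unlabeled, positive = marker label)."""
    rows, cols = len(gradient), len(gradient[0])
    labels = [row[:] for row in markers]
    heap = []
    order = 0                      # tie-breaker: earlier pixels are served first
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] > 0:
                heapq.heappush(heap, (gradient[r][c], order, r, c))
                order += 1
    while heap:
        _, _, r, c = heapq.heappop(heap)
        for rr, cc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc] == 0:
                labels[rr][cc] = labels[r][c]          # the flood reaches (rr, cc)
                heapq.heappush(heap, (gradient[rr][cc], order, rr, cc))
                order += 1
    return labels


gradient = [[0, 1, 2, 9, 2, 1, 0]]        # one ridge between two flat areas
markers = [[1, 0, 0, 0, 0, 0, 2]]         # e.g. inside marker 1, outside marker 2
print(marker_watershed(gradient, markers))   # [[1, 1, 1, 1, 2, 2, 2]]
```

Because only the two markers act as sources, the result has exactly two regions regardless of how many small minima the gradient contains. A single-pixel marker works as well as a large one, which matches the remark above that only the existence of a marker matters.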
Chapter 5. Proposed Method: The Selection Rule
A method of finding existing targets from a set of candidates, obtained from the segmented features, is
proposed in this chapter. This is quite an experiment-based method rather than a theory-based one. The proposed
method has been considered as a tool for IR data processing.
This chapter explains how to obtain candidates and how to make a selection from the candidates. Two
examples with synthetic data are presented in this chapter, and the experiments with actual mine data will be
presented in Chapter 6.
5.1 Candidates
Prior to any decision, how to make a set of candidates should be explained. The candidates are obtained from
the segmented feature data. According to the concept of dynamic thermal application, the feature data can be
extracted by KLT from an IR image sequence, and the watershed processing can perform the segmentation.
Dynamic thermography was introduced in Section 3.2.2, and KLT in Section 4.2.1.
There are several ways to apply KLT to an IR image sequence.
A simple way is to apply KLT to the entire image sequence as one packet and to extract only the first
transformed image. The first transformed image becomes the feature data, but it is rare that a single image
provides enough information.
The ideal way is to apply KLT in the same way but to extract the second transformed image as well as the first.
In most cases, these two images provide enough information to find one or two targets.
If the first two images cannot provide enough information, we have to extract more transformed images. In this
case, a statistical method to distinguish the mine target from false alarms is required. Also, it should be noted
that the higher order transformed images have less information than the lower order images.
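The extraction of the first transformed images can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in this report: each image is a flat list of pixel values, each image is centered by its own mean, the covariance is taken between images over all dixels, and the leading eigenvectors are found by power iteration with deflation in place of a full eigen-decomposition. The function name and the iteration count are hypothetical.

```python
import math


def klt(images, n_components=2, iterations=200):
    """Return up to n_components transformed images (lists of M pixel values)."""
    n, m = len(images), len(images[0])
    centered = []
    for img in images:
        mean = sum(img) / m
        centered.append([v - mean for v in img])
    # N x N covariance between images, accumulated over the M dixels.
    cov = [[sum(a * b for a, b in zip(centered[i], centered[j])) / m
            for j in range(n)] for i in range(n)]
    features = []
    for _ in range(n_components):
        v = [1.0 / math.sqrt(n)] * n              # assumed starting vector
        for _step in range(iterations):
            w = [sum(cov[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:                      # no variance left to extract
                return features
            v = [x / norm for x in w]
        lam = sum(v[i] * cov[i][j] * v[j] for i in range(n) for j in range(n))
        # Project every dixel onto the eigenvector: one transformed image.
        features.append([sum(v[k] * centered[k][p] for k in range(n))
                         for p in range(m)])
        # Deflation: remove the component just found from the covariance.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(n)]
               for i in range(n)]
    return features


pattern = [1, -1, 1, -1]
sequence = [pattern, [2 * p for p in pattern], [2 * p for p in pattern]]
first = klt(sequence, n_components=1)[0]
print([round(v, 6) for v in first])       # [3.0, -3.0, 3.0, -3.0]
```

The sign of each transformed image is arbitrary, and power iteration can converge slowly when two eigenvalues are close, so a library eigen-solver would normally be preferred for real data.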
After feature extraction, segmentation is performed. The marker-based watershed algorithm will be used for
the experiments in Chapter 6. After segmentation, a set of candidates is selected according to their size or shape
from the segmented image set.
5.2 Selection Rule
The term "selection" is used instead of "detection" because we actually select the target from a set of candidates.
The goal of this selection process is to distinguish mine targets from false alarms. Unfortunately, there is no straightforward rule to decide which candidate is a mine target and which is a false alarm. The idea is that mine targets should be detected in every feature image, while false alarms should appear only once. This assumption is similar to that of KLT in Section 4.2.1. The mine target is selected by the existence of candidates at the approximate location in each feature image.
A simple case is that mine targets are shown in both feature images, while false alarms are shown only in one
of them. In that case, if the candidate appears in both feature images, it is considered as the mine target. The
logical AND function of the set of candidates can be the mathematical description.
Figure 5-1 An example of a simple selection; (a) candidates in the 1st feature image, (b) candidates in the 2nd feature image, (c) selected target
Figure 5-1 shows an example of the simple case, consisting of two feature images and one target. The first
feature image has two candidates in (a) as {A, B}, and the second feature image has three candidates in (b) as
{C, D, E}. Since candidates A and C exist at similar locations in the two images, they are defined as the same
candidate. The exact location or size is not important, because this is feature data as a set of candidates rather
than raw sensor data as a set of image pixels. The important information is only the existence of candidates at
the approximate location. Two sets of candidates, {A, B} and {A, D, E}, are given, and candidate A is selected
as the mine target in (c). The strength of the signal, corresponding to the size of each candidate, is not important.
The only matter considered here is the existence of the candidate. The candidate B has a strong signal in (a), but
is not considered as the mine target because it does not exist in (b). The mathematical expression can be derived as
$$f(\{A, B\}, \{C, D, E\}) = \{A, B\} \cap \{C, D, E\} = \{A\}, \quad \text{if } \{C\} \equiv \{A\}, \tag{5.1}$$
where $f(\cdot)$ is the decision function.
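The simple AND rule of equation (5.1) can be sketched as below. The sketch assumes that each candidate is summarized by a centroid and that two candidates count as the same object when their centroids are within a tolerance `tol`. The coordinates, the tolerance, and the function name are hypothetical.

```python
import math


def select_common(first, second, tol):
    """Names of candidates in `first` that have a match in `second` within tol."""
    selected = []
    for name, (x1, y1) in first.items():
        if any(math.hypot(x1 - x2, y1 - y2) <= tol
               for x2, y2 in second.values()):
            selected.append(name)
    return selected


feature_1 = {"A": (10, 10), "B": (40, 12)}               # hypothetical centroids
feature_2 = {"C": (11, 9), "D": (25, 30), "E": (5, 40)}
print(select_common(feature_1, feature_2, tol=3))        # ['A']
```

Candidate B is rejected however strong its signal is, because no candidate lies near it in the second feature image.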
A complicated case exists when many feature images are provided but mine targets are not shown in all feature
images because the target signal is relatively weak compared to the background. This situation can happen
frequently, because most APM, made of plastic or wood, do not have distinctive characteristics. In this case, the
statistical probability should be considered as a decision method because mine targets cannot be distinguished
from false alarms.
Figure 5-2 An example of a complicated selection; (a) candidates in the 1st feature image, (b) candidates in the 2nd feature image, (c) candidates in the 3rd feature image, (d) two selected mine targets
Figure 5-2 shows an example of the weak signal case. Three sets of candidates, {A, B}, {C, D, E}, {F, G, H},
are provided. If any candidates seem to be the same object by matching the entire set, they are marked as the
same candidate. Then, A ≡ C ≡ F and E ≡ H. The final independent set of candidates can be counted as shown in
Table 5-1.
Candidates A B D E G
Feature 1 (a) O O X X X
Feature 2 (b) O X O O X
Feature 3 (c) O X X O O
Probability 1 1/3 1/3 2/3 1/3
Decision (d) O X X O X
Table 5-1 Statistical probability of candidates in Figure 5-2
Since candidate A is found in all feature images, A is selected as the mine target. Candidate E is found twice, but the other candidates are found only once. Whether twice-found candidates should be considered as mine targets or false alarms should be decided with the help of the statistical probability from previous experience. Regarding the twice-found candidates as mine targets and the once-found candidates as false alarms, two mine targets are selected as shown in Figure 5-2 (d).
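The counting in Table 5-1 can be reproduced with a short sketch. The presence flags are taken from the table; the threshold of 2/3 encodes the choice made above to accept twice-found candidates and is an assumption that would have to be tuned from experience. The function name is hypothetical.

```python
from fractions import Fraction


def select_by_frequency(presence, threshold=Fraction(2, 3)):
    """presence maps a candidate name to one flag per feature image."""
    probabilities = {name: Fraction(sum(flags), len(flags))
                     for name, flags in presence.items()}
    selected = [name for name, p in probabilities.items() if p >= threshold]
    return probabilities, selected


table_5_1 = {
    "A": [True, True, True],
    "B": [True, False, False],
    "D": [False, True, False],
    "E": [False, True, True],
    "G": [False, False, True],
}
probabilities, selected = select_by_frequency(table_5_1)
print(probabilities["E"])   # 2/3
print(selected)             # ['A', 'E']
```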
Sometimes, candidates located on the boundary of feature images are ambiguous. The possibility of false alarms from the boundary candidates is high, because filtering or segmentation cannot be performed beyond the boundary. Thus, boundary candidates are not desired, but multiple sets of raw data from several positions can avoid locating the target on the boundary. It is recommended to disregard the boundary candidates during the selection procedure in order to reduce the number of false alarms.¹
As previously mentioned, there is no straight rule to confirm the existence of mine targets from an ambiguous
signal. An automatic selection function algorithm has not been provided yet. A reasonable selection should be
assisted by optimal choice of the feature image and by proper segmentation, prior to the selection process.
¹ The experiments in Chapter 6 regard the boundary candidates as valid candidates. Therefore, some false alarms are expected.
Chapter 6. Experiments
The object of this chapter is to implement and test the proposed method and the image processing techniques
introduced in this report with actual mine data. Most IR related image processing methods, introduced in Section
3.2 and Chapter 4 are involved. In order to show the result of each image-processing method, each step is
presented in the form of an image, even though this may seem to be a repeating step.
The data source and detailed information of each set of data are described prior to each experiment summary.
Most data for these experiments are downloaded from the signature database managed by the Joint Research Center¹ and the Unexploded Ordnance Center².
In Section 6.1, the entire procedure is briefly described. The actual experiments will be presented in the
following sections as case studies.
6.1 Procedure
Only the IR image sequences are considered as feature data because the targets are buried underground. The target signal is very weak in this situation, so there is no guarantee that the targets will be discovered by a single measurement. The existence of the target appears very ambiguous in the raw data set. Sometimes, even noise is more dominant than the target signal. Thus, the correlation or covariance of pixels in an image sequence is more useful for feature data than the pixel value of a single image.
The proposed application consists of five stages.
At the first stage, the feature images are extracted from the given image sequence by KLT, introduced in
Section 4.2.1. Then, linear stretching or histogram equalization, introduced in Section 4.4.2, enhances the
contrast. Since smoothly varied contrast is desired for segmentation, linear stretching would generally be used
unless a robust feature is necessary.
At the second stage, ASF, introduced in Section 4.3.3, is applied to the result of the first stage to remove small
clutter and noise. Since the filtering intends to squash peaks and valleys at this stage, critical filtering may be
applied to achieve a feature image with a homogeneous gray level variation. This homogeneous variation is
required by the segmentation in the next stage.
At the third stage, the marker-based watershed algorithm, introduced in Section 4.5.4, is applied for the segmentation. The morphological image functions in the second and third stages use the circle shaped SE, introduced in Section 4.3, because mine targets are expected to have a round shape. The expected result is a set of segmented parts in the form of an image.
¹ Joint Research Center: the European Union's scientific and technical research laboratories, located in Belgium, Germany, Italy, the Netherlands, and Spain. They manage a landmine signature database at http://apl-database.jrc.it .
² Unexploded Ordnance Center: mine detection research organization sponsored by the Department of Defense, USA.
At the fourth stage, the segmented sets are labeled for the discrimination process. If a labeled set satisfies the
condition to be a candidate, this set is assigned as a candidate. Although there is not a definite rule, some rule
can be determined from the property of targets. Firstly, the shape of a mine target is expected to be round in
most cases. Secondly, some sets can be excluded by their size if the actual scale of the image is provided.
At the final stage, the candidates are matched with each other and selected as the mine target by the selection
rule, introduced in Section 5.2. The goal of this work is to discriminate the mine targets from false alarms. The
desired result is two distinctive sets, false alarms and mine targets.
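The contrast enhancement of the first stage can be sketched as a linear stretch. The sketch assumes that linear stretching means mapping the minimum and maximum of the feature image onto the ends of the output range, here 0 to 255; the exact definition in Section 4.4.2 may differ. The function name is hypothetical.

```python
def linear_stretch(img, low=0, high=255):
    """Map the gray values of img linearly onto the range [low, high]."""
    flat = [v for row in img for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:                                  # flat image: nothing to stretch
        return [[low for _ in row] for row in img]
    scale = (high - low) / (hi - lo)
    return [[low + (v - lo) * scale for v in row] for row in img]


print(linear_stretch([[10, 20], [30, 50]]))   # [[0.0, 63.75], [127.5, 255.0]]
```

Unlike histogram equalization, this mapping preserves the relative spacing of the gray levels, which is consistent with the preference for smoothly varied contrast stated above.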
6.2 Case Study 1
The Royal Military Academy (RMA) built a test minefield in Meerdael, Belgium. They performed a
measurement campaign with the Belgian Army, and JRC has managed the signature database. Two kinds of IR
sensors were used, AGEMA and TICM2. The AGEMA sensor operates in the 3 ~ 5 μm band, and the TICM2 in the 8 ~ 12 μm band. Various types of soil environments were measured, and two of them, gravel and sand, are tested in
this report. The ground truth data was not provided from the database, but literature published by RMA, [25],
reported the approximate location and the number of mines.
Table 6-1 profiles the site specifications of the data for case studies 1 to 3.
Collector | Minefield Location | Soil Condition | Sensor Type
RMA | Meerdael, Belgium | Gravel, Sand | AGEMA (3-5 μm), TICM2 (8-12 μm)
Table 6-1 Site specifications of Meerdael test minefield [30]
The data of case study 1 were collected with the AGEMA sensor at the gravel field. Figure 6-1 shows sampled
images, and Table 6-2 profiles the data specification. The data set consists of 48 images of size 256×256, taken at intervals of 30 minutes during a 24 hour period. The cell shaped texture comes from the gravel.
Number of Targets | Date and Time | Number of Frames
2 | April 2, 11:50 ~ April 3, 11:30, 1998 | 48 (1 per 30 minutes)
Table 6-2 Data specification acquired with the AGEMA sensor at the gravel field [30], [25]
Figure 6-1 Sampled images from the data set acquired with the AGEMA sensor at the gravel field [30]; images taken at (a) 17:47 (b) 23:47 (c) 05:50 (d) 11:30
Figure 6-2 Procedure for the 1st feature; (a) 1st transformed image by KLT, (b) filtered image by the white ASF with SE sized from 3×3 to 35×35, (c) inside and outside markers, (d) gradient image of (b)¹
Figure 6-3 Procedure for the 2nd feature; (a) 2nd transformed image by KLT, (b) filtered image by the white ASF with SE sized from 3×3 to 35×35, (c) inside and outside markers, (d) gradient image of (b)
Since it is very difficult to extract useful information from this raw data set, feature extraction is required. KLT
was used to extract feature images, and linear stretching was used for post processing to enhance the contrast.
The first and second transformed images are selected as feature data, as shown in Figure 6-2 (a) and Figure 6-3
(a). Then, ASF is applied to the two transformed images to remove noise and squash the gravel texture. The
white ASF with the round shaped SE sized from 3×3 to 35×35 results in two images, Figure 6-2 (b) and Figure
6-3 (b). The double marker-based watershed requires two sets of markers, inside and outside. The inside
markers are found as the regional minima of the filtered image. The outside markers are found by the watershed
process of the filtered image, flooded from the inside marker. The white area indicates the inside markers, and
the lines indicate the outside markers in Figure 6-2 (c) and Figure 6-3 (c). The morphological gradient of the
filtered image is calculated as Figure 6-2 (d) and Figure 6-3 (d).
¹ The contrast of the gradient image is enhanced to show the feature clearly. The actual gradient data used in the experiment is not enhanced. This is also true for the following experiments.
Figure 6-4 Results from the 1st feature; (a) segmented result, (b) labeled set, (c) selected candidates
Figure 6-5 Results from the 2nd feature; (a) segmented result, (b) labeled set, (c) selected candidates
Now, the feature data are ready for segmentation. After flooding the gradient image from the markers, the
object images are segmented as shown in Figure 6-4 (a) and Figure 6-5 (a). The segmented sets are labeled as
Figure 6-4 (b) and Figure 6-5 (b). Although there is no definite rule to make a decision, the candidates can be
discriminated by size and shape from the segmented set. If a potential candidate is too big, too small, or the
shape is too far from a round shape, it is excluded from the final candidates. The final set of candidates for each
feature image is shown in Figure 6-4 (c) and Figure 6-5 (c). Counting from left to right and top to bottom, the
candidates are arranged as two sets, {A, B, C, D, E, F} in Figure 6-4 (c) and {G, H, I, J} in Figure 6-5 (c).
Considering the existence of candidates at approximate locations, we can determine that B ≡ G, D ≡ H, E ≡ J.
Those candidates are selected as the mine targets in Figure 6-6 (a).
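The discrimination by size and shape can be sketched as follows. This is an illustration only: the segmented result is assumed to be a binary image, regions are labeled with 4-connectivity, and roundness is approximated by comparing the region area with the area of a circle whose diameter is the longer side of the bounding box. The area limits, the tolerance, and the function names are hypothetical.

```python
import math
from collections import deque


def label_regions(binary):
    """Return the 4-connected foreground regions as lists of (row, col)."""
    rows, cols = len(binary), len(binary[0])
    seen = set()
    regions = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and (r, c) not in seen:
                region, queue = [], deque([(r, c)])
                seen.add((r, c))
                while queue:
                    y, x = queue.popleft()
                    region.append((y, x))
                    for q in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= q[0] < rows and 0 <= q[1] < cols
                                and binary[q[0]][q[1]] and q not in seen):
                            seen.add(q)
                            queue.append(q)
                regions.append(region)
    return regions


def is_candidate(region, min_area, max_area, tol=0.35):
    """Keep regions of plausible size whose shape is roughly round."""
    area = len(region)
    height = max(y for y, _ in region) - min(y for y, _ in region) + 1
    width = max(x for _, x in region) - min(x for _, x in region) + 1
    # Ratio of the region area to a circle spanning the longer box side.
    roundness = area / (math.pi * (max(height, width) / 2) ** 2)
    return min_area <= area <= max_area and abs(roundness - 1) <= tol


segmented = [
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
]
regions = label_regions(segmented)
print([is_candidate(r, min_area=4, max_area=20) for r in regions])   # [True, False]
```

The compact 3×3 block passes, while the long thin bar is rejected because its shape is too far from round.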
Figure 6-6 The mine targets; (a) selected mine targets, (b) actual mine targets [25], (c) false alarm
Figure 6-6 (a) presents the selected mine targets, and (b) shows the actual mine targets from the literature
published by RMA. Two mines were found successfully, but a false alarm occurred as shown in (c).
6.3 Case Study 2
The data of case study 2 were collected with the same sensor as in case study 1, but the soil condition was sand.
Figure 6-7 shows some sampled images, and Table 6-3 profiles the data specifications. The data set consists of 44 images of size 256×256, taken at intervals of 30 minutes during a 24 hour period. This data set has a relatively smoother texture than the previous gravel case.
Number of Targets | Date and Time | Number of Frames¹
1 | April 1, 13:08 ~ April 2, 11:04, 1998 | 44 (1 per 30 minutes)
Table 6-3 Data specifications acquired with the AGEMA sensor at the sand field [30], [25]
Figure 6-7 Sampled images from the data set acquired with the AGEMA sensor at the sand field [30]; images taken at (a) 17:38, (b) 23:05, (c) 04:35, (d) 11:04
¹ Some poor images were eliminated.
Figure 6-8 Procedure for the 1st feature; (a) 1st transformed image by KLT, (b) filtered image by the white ASF with SE sized from 3×3 to 35×35, (c) inside and outside markers, (d) gradient image of (b)
Figure 6-9 Procedure for the 2nd feature; (a) 2nd transformed image by KLT, (b) filtered image by the white ASF with SE sized from 3×3 to 35×35, (c) inside and outside markers, (d) gradient image of (b)
The application procedure is the same as previously stated. Two feature images were extracted by KLT as
shown in Figure 6-8 (a) and Figure 6-9 (a). The white ASF, with the round shaped SE sized from 3×3 to 35×35,
was applied to the two KLT transformed images in Figure 6-8 (b) and Figure 6-9 (b). The inside and outside
markers were found as shown in Figure 6-8 (c) and Figure 6-9 (c). Figure 6-8 (d) and Figure 6-9 (d) are the
gradient images.
After the watershed segmentation, Figure 6-10 (a) and Figure 6-11 (a) are achieved.
Figure 6-10 Results from the 1st feature; (a) segmented result, (b) labeled set, (c) selected candidates
Figure 6-11 Results from the 2nd feature; (a) segmented result, (b) labeled set, (c) selected candidates
The segmented sets are labeled as shown in Figure 6-10 (b) and Figure 6-11 (b). Figure 6-10 (c) and Figure
6-11 (c) show the selected candidates from the labeled set.
Counting from left to right and top to bottom in Figure 6-10 (c) and Figure 6-11 (c), the candidate set is obtained as {A, B, C, D, E, F} for the first feature and {G, H} for the second feature. Matching the candidates by approximate location, we can determine that D ≡ G. The selected mine target matches the actual mine target from the related literature [25] without any false alarm.
Figure 6-12 The selected mine target without any false alarm
6.4 Case Study 3
The data for case study 3 were collected with the TICM2 sensor at the gravel field. Figure 6-13 shows some
sampled images, and Table 6-4 profiles the data specifications.
Figure 6-13 Sampled images from the data set with the TICM2 sensor at the gravel field [30]; images taken at (a) 17:30 (b) 23:15 (c) 05:00
The data set consists of 94 images of size 520×340, taken at intervals of 15 minutes during a 24 hour period.
Similar texture to case study 1 is noted.
Number of Targets | Date and Time | Number of Frames
2 | April 2, 11:45 ~ April 3, 11:00, 1998 | 94 (1 per 15 minutes)
Table 6-4 Data specifications at the gravel field with the TICM2 sensor [30], [25]
Figure 6-14 Procedure for the 1st feature; (a) 1st transformed image by KLT, (b) filtered image by the white ASF with SE sized from 3×3 to 55×55, (c) inside and outside markers
Figure 6-15 Procedure for the 2nd feature; (a) 2nd transformed image by KLT, (b) filtered image by the white ASF with SE sized from 3×3 to 55×55, (c) inside and outside markers
After KLT application, Figure 6-14 (a) and Figure 6-15 (a) were extracted as the first and second transformed images. Since the texture of gravel is too dominant in the overall image, the feature images need critical filtering. The white ASF, with the round shaped SE sized up to 55×55, filtered the object image to Figure 6-14 (b) and
Figure 6-15 (b). The inside markers of the first feature were obtained by the regional maxima, while those of the
second feature were obtained by the regional minima. This happened because the relative intensity of the target
signal against the background is opposite in the first and second feature images. The markers are presented in
Figure 6-14 (c) and Figure 6-15 (c).
Figure 6-16 Results from the 1st feature; (a) segmented result, (b) labeled set, (c) selected candidates
Figure 6-17 Results from the 2nd feature; (a) segmented result, (b) labeled set, (c) selected candidates
The watershed flooding segmented the object image into Figure 6-16 (a) and Figure 6-17 (a). Figure 6-16 (b)
and Figure 6-17 (b) show the labeled set, and Figure 6-16 (c) and Figure 6-17 (c) show the selected candidates.
Counting from left to right and top to bottom, the candidate set was obtained as two sets, {A, B, C, D, E, F} for
Figure 6-16 and {G, H, I} for Figure 6-17. Matching the two sets by approximate location, we can determine
that C ≡ G and D ≡ I. Thus, these two candidates were selected as the mine targets as shown in Figure 6-18 (a),
but the actual mine targets are H as well as G in (b). Also, candidate I turned out to be a false alarm.
Figure 6-18 The mine targets; (a) selected mine targets, (b) actual mine targets [25], (c) failed mine target
This is an example of a failed case. Figure 6-18 (c) is the failed mine target. This happened because the first transformed image of KLT did not show the existence of the missed mine; only the second transformed image did. This is a good example of the fact that the first transformed KLT image cannot guarantee discovery of mine targets even though the image shows the strongest target signal. Sometimes, the features of the higher order are equal to or even higher in importance than the lower order features.
According to the literature, detected target G was buried recently, but undetected target H had been buried for a long time [25]. The surface effect had disappeared by the time the measurement was performed. The surface effect, introduced in Section 3.2.2, is caused by soil disturbance when the ground is dug to bury the mine.
6.5 Case Study 4
The previous cases were simple cases with one or two mines, but this case is very complicated.
A commercial company, E-OIR, performed a measurement campaign with their sensor at the test minefield in
Fort Belvoir, Virginia.
Table 6-5 profiles the site specifications. The soil condition was earth instead of gravel or sand, and the earth
was covered by light grass. The Amber Galileo LWR IR sensor was used, which operates in the 8 ~ 9 μm band. The images were taken from a far distance to provide a large view. All conditions of this measurement
were quite close to an actual mine detection situation. Some undesirable texture, caused by irregular earth
component and grass, was expected.
Collector Minefield Location Soil Condition Sensor Type
E-OIR Ft. Belvoir, VA USA Earth covered by light grass Amber Galileo LWR (8-9m)
Table 6-5 Site specifications of Ft. Belvoir test minefield [29]
Table 6-6 profiles the data specifications. This image sequence includes seven mine targets. 96 images were
taken at intervals of 15 minutes during a 24 hour period with size 222140.
Number of Targets Date and Time Number of Frames
7 April 13, 15:00 ~ April 14, 14:45, 1998 96 (1 per 15 minutes)
Table 6-6 Data specifications by the Amber LWR sensor at the earth field covered by grass [29]
- 63
Chapter 6 Experiments
Figure 6-19 Sampled images from the data set acquired with the Amber LWR sensor at the earth field in Ft. Belvoir [29]; images taken at (a) 15:00, (b) 00:45, (c) 05:45
Mine Type   M-15     M-19     PGMDM     RAAM      FFV-028   TM-62    VS 1.6
Position    57,30    129,36   181,22    212,22    65,89     121,89   194,92
Condition   Buried   Buried   Surface   Surface   Buried    Buried   Buried
Table 6-7 Ground truth data of targets [29]
Figure 6-19 shows sampled images, and Table 6-7 presents the ground truth data. Except for the RAAM, every target is an ATM. Figure 6-19 (a) presents the locations of the mines as white pixels. The mines are laid on two lines. Four mines are laid on the first line: M-15, M-19, PGMDM, and RAAM, counting from left to right. Three mines are laid on the second line: FFV-028, TM-62, and VS 1.6, counting from left to right. In this order, the mines are named A, B, C, D, E, F, and G. As listed in Table 6-7, targets C and D (the PGMDM and the RAAM) were laid on the surface, while the other five targets were buried underground.
Because the images were taken from a long distance and the targets appear small, the thermal disturbance of the soil around each mine is expected to produce a stronger signal than the mine itself.
Figure 6-20 Procedure for the 1st feature; (a) 1st transformed image by KLT, (b) filtered image by the white ASF with SE sized from 3×3 to 19×19, (c) inside and outside markers
Figure 6-21 Procedure for the 2nd feature; (a) 2nd transformed image by KLT, (b) filtered image by the white ASF with SE sized from 3×3 to 11×11, (c) inside and outside markers
Figure 6-22 Procedure for the 3rd feature; (a) 3rd transformed image by KLT, (b) filtered image by the white ASF with SE sized from 3×3 to 9×9, (c) inside and outside markers
Figure 6-23 Procedure for the 4th feature; (a) 4th transformed image by KLT, (b) filtered image by the white ASF with SE sized from 3×3 to 11×11, (c) inside and outside markers
More feature data are necessary in this case because the target signal is weaker than in the other cases and the number of expected targets is high. Four feature images are extracted from the first four transformed images of KLT. After applying KLT, linear stretching was used in the previous cases, but histogram equalization, introduced in Section 4.4.2, was used in this case to obtain more robust features. Histogram equalization is sometimes useful for enhancing a weak feature against the background. The results of KLT and histogram equalization are shown in panel (a) of Figure 6-20 to Figure 6-23.
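The feature extraction step described here can be sketched as follows. This is a minimal illustration, not the implementation used for the experiments. It assumes that the image sequence is held in a NumPy array of shape (N, H, W), that KLT is computed from the N × N covariance matrix of the dixels, and that the transformed images are quantized to 256 gray levels before equalization. The function names klt_feature_images and equalize_histogram are hypothetical.

```python
import numpy as np


def klt_feature_images(frames, n_features=4):
    """Project every dixel onto the leading principal axes of the sequence.

    frames: array of shape (N, H, W), i.e. N images of H x W pixels.
    Returns an array of shape (n_features, H, W), strongest component first.
    """
    n, h, w = frames.shape
    # Each column is one dixel: the N values of one pixel along the sequence.
    dixels = frames.reshape(n, h * w).astype(np.float64)
    dixels -= dixels.mean(axis=1, keepdims=True)       # subtract the mean dixel
    cov = dixels @ dixels.T / (h * w - 1)              # N x N covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)             # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_features]     # keep the largest ones
    transformed = eigvecs[:, order].T @ dixels         # shape (n_features, H*W)
    return transformed.reshape(n_features, h, w)


def equalize_histogram(image, levels=256):
    """Map gray levels through the normalized cumulative histogram."""
    lo, hi = image.min(), image.max()
    if hi == lo:                                       # flat image: nothing to equalize
        return np.zeros(image.shape, dtype=np.uint8)
    # Quantize to integer gray levels (assumed 8-bit output).
    quantized = np.round((image - lo) / (hi - lo) * (levels - 1)).astype(np.int64)
    hist = np.bincount(quantized.ravel(), minlength=levels)
    cdf = np.cumsum(hist) / quantized.size
    return np.round(cdf[quantized] * (levels - 1)).astype(np.uint8)
```

For a sequence such as the one in this case, stored as an array of shape (96, 140, 222), klt_feature_images(frames, 4) would return the four transformed images, and each of them could then be passed through equalize_histogram.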
The actual shapes of the mine targets can be seen in Figure 6-21 (a). In the area of target A, on the first line, there is a black circle inside a white circle. The black circle indicates the actual size of target A, produced by the volume effect. The white circle is the area affected by the surface effect. Usually, both effects are useful in finding mines, but the surface effect does not occur for targets laid on the surface.
Critical filtering, the white ASF with the SE sized up to 19×19, was applied to the first feature. The five dark areas caused by the surface effect are clearly noticeable in Figure 6-20 (a). Targets C and D do not affect the thermal condition of the soil because they are laid on the ground. In order to extract the relatively small target signal from targets C and D, the white ASF with the SE sized up to 9×9 or 11×11 was applied to the other features. The results are shown in panel (b) of Figure 6-20 to Figure 6-23.
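An alternating sequential filter of this kind can be sketched as below. The sketch rests on several assumptions that are not stated in this section: the SE is a flat square, its side grows in steps of two (3×3, 5×5, and so on), the image border is handled by edge replication, and "white" is taken to mean that each stage applies a closing followed by an opening. The ordering is exposed as the closing_first parameter so that the other convention can be tried. All function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _window_reduce(image, size, reducer):
    """Apply min or max over a size x size window (flat square SE)."""
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")           # replicate border pixels
    windows = sliding_window_view(padded, (size, size))
    return reducer(windows, axis=(-2, -1))


def opening(image, size):
    """Gray-scale opening: erosion followed by dilation."""
    return _window_reduce(_window_reduce(image, size, np.min), size, np.max)


def closing(image, size):
    """Gray-scale closing: dilation followed by erosion."""
    return _window_reduce(_window_reduce(image, size, np.max), size, np.min)


def alternating_sequential_filter(image, max_size, min_size=3, closing_first=True):
    """Apply closing/opening pairs with odd SE sides from min_size to max_size."""
    result = np.asarray(image, dtype=np.float64)
    for size in range(min_size, max_size + 1, 2):      # assumed step of 2
        if closing_first:
            result = opening(closing(result, size), size)
        else:
            result = closing(opening(result, size), size)
    return result
```

Under these assumptions, alternating_sequential_filter(feature, max_size=19) would correspond to the filtering of the first feature, and max_size=9 or 11 to the filtering of the other features. Structures smaller than the current SE are removed at each stage, which is why a smaller maximum size preserves the small signals of targets C and D.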
Depending on whether the expected target signal is positive or negative, either regional maxima or regional minima are extracted as inside markers: regional maxima for the first and fourth features, and regional minima for the second and third features. Outside markers are then extracted from the inside markers by the flooding process. The markers are presented in panel (c) of Figure 6-20 to Figure 6-23.
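The extraction of inside markers can be illustrated with a small sketch. It assumes 8-connectivity and defines a regional maximum as a connected plateau of equal gray level with no higher neighbor; regional minima are obtained by negating the image. The function names are hypothetical, and the flooding step that produces the outside markers is not covered here.

```python
import numpy as np


def regional_maxima(image):
    """Return a boolean mask of plateaus that have no higher 8-neighbor."""
    h, w = image.shape
    visited = np.zeros((h, w), dtype=bool)
    result = np.zeros((h, w), dtype=bool)
    for start in np.ndindex(h, w):
        if visited[start]:
            continue
        level = image[start]
        plateau, stack, is_max = [], [start], True
        visited[start] = True
        while stack:                                   # grow the plateau of equal level
            r, c = stack.pop()
            plateau.append((r, c))
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr or dc) and 0 <= rr < h and 0 <= cc < w:
                        if image[rr, cc] > level:
                            is_max = False             # a higher neighbor exists
                        elif image[rr, cc] == level and not visited[rr, cc]:
                            visited[rr, cc] = True
                            stack.append((rr, cc))
        if is_max:
            for p in plateau:
                result[p] = True
    return result


def regional_minima(image):
    """Regional minima are the regional maxima of the negated image."""
    return regional_maxima(-np.asarray(image, dtype=np.float64))
```

In this sketch, regional_maxima would be applied to a filtered feature whose expected target signal is positive, and regional_minima to one whose expected signal is negative.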
Figure 6-24 Results from the 1st feature; (a) segmented result, (b) labeled set, (c) selected candidates
Figure 6-25 Results from the 2nd feature; (a) segmented result, (b) labeled set, (c) selected candidates
Figure 6-26 Results from the 3rd feature; (a) segmented result, (b) labeled set, (c) selected candidates
Figure 6-27 Results from the 4th feature; (a) segmented result, (b) labeled set, (c) selected candidates
Watershed processing was performed for each feature, and the results are presented in panel (a) of Figure 6-24 to Figure 6-27. The segmented sets were labeled as shown in panel (b), and panel (c) shows the selected sets of candidates. Because of the high number of expected targets and feature images, many candidates were selected, but many of them appear to be false alarms. Four sets of candidates are given: the first set has 6 candidates, the second 8, the third 20, and the fourth 10.
Since there are many candidates in this case, the candidates are not named individually; instead, the results are summarized in a table. Table 6-8 shows the distribution of candidates among the targets and false alarms.
Targets          A     B     C     D     E     F        G     False   Total
1st              O     O     X     X     O     O        O     1       6
2nd              O     O     X     X     O     O (2)¹   O     2       8
3rd              O     O     O     O     X     O        X     15      20
4th              O     O     O     O     O     O        X     4       10
Total            4     4     2     2     3     4        2     22      44
Probability      4/4   4/4   2/4   2/4   3/4   4/4      2/4
Decision (2/4)   O     O     O     O     O     O        O     X
Table 6-8 Distribution of Candidates for Case Study 4
The buried targets A, B, E, and F appeared very frequently, but the surface-laid targets, C and D, appeared only twice. It may sound strange that a visible surface-laid target is harder to notice than a buried one, but that is what happened in this case. The reason is the lack of a surface effect for surface-laid targets. The two captured signals from each surface target are produced by direct IR radiation from the surface of the mine itself.
Figure 6-28 The mine targets; (a) selected mines without false alarm, (b) ground truth [29]
Of the 44 candidates, 22 turned out to be false: they were selected as candidates but are not actual mine targets. Every false candidate occurred only once at a given location. If the decision level of probability is set to 2/4, every mine target is selected and no false alarm remains. The selected mine targets are shown in Figure 6-28 (a). Matching them against the ground truth in Figure 6-28 (b) confirms the successful location of all 7 mine targets, with no false alarms.

¹ In Figure 6-25 (c), the second and third candidates on the second line indicate the soil disturbance around target F. Even though neither candidate presents the mine target directly, their signal was produced by target F, so both candidates are counted as target F.
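The decision rule can be reproduced from Table 6-8 with a short sketch. The target detections and false-alarm counts below are transcribed from the table. The labels given to the false candidates are hypothetical placeholders, one per candidate, reflecting the observation that no false candidate recurs at the same location. The function names are also hypothetical.

```python
# Targets detected in each feature image, transcribed from Table 6-8.
TARGET_HITS = {
    "1st": {"A", "B", "E", "F", "G"},
    "2nd": {"A", "B", "E", "F", "G"},   # the two F candidates count as one location
    "3rd": {"A", "B", "C", "D", "F"},
    "4th": {"A", "B", "C", "D", "E", "F"},
}
# Number of false candidates per feature image, from the same table.
FALSE_ALARMS = {"1st": 1, "2nd": 2, "3rd": 15, "4th": 4}


def build_candidates(target_hits, false_alarms):
    """Combine target hits with uniquely labeled (hypothetical) false locations."""
    candidates = {}
    for feature, targets in target_hits.items():
        labels = set(targets)
        labels.update(f"false-{feature}-{i}" for i in range(false_alarms[feature]))
        candidates[feature] = labels
    return candidates


def vote(candidates, min_votes):
    """Count in how many feature images each location appears and threshold it."""
    counts = {}
    for locations in candidates.values():
        for location in locations:
            counts[location] = counts.get(location, 0) + 1
    accepted = sorted(loc for loc, n in counts.items() if n >= min_votes)
    return counts, accepted


counts, accepted = vote(build_candidates(TARGET_HITS, FALSE_ALARMS), min_votes=2)
print(accepted)
```

With a threshold of two votes out of four, only locations A to G remain, because each of the 22 false locations receives a single vote. A threshold of one vote would accept all 29 distinct locations, and a threshold of three would reject targets C, D, and G.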
Chapter 7. Conclusions
Two areas have been studied in the context of mine detection: sensor technology and image processing. The image processing area has received more attention in this report.
Various image processing methods for mine detection have been studied, including filtering, feature extraction, morphology, contrast enhancement, segmentation, and visualization. A method to find a mine target from multiple candidates has also been proposed. All methods were tested with actual mine data.
The most serious problem in mine detection applications is the ambiguity of the target signal. Most research has tried to overcome this ambiguity in one of two ways. First, methods have been studied that extract multiple signals from a single source or enhance the ambiguous signal to a noticeable level. Among the image processing methods introduced in this report, filtering, feature extraction, contrast enhancement, and visualization are examples. Second, many research groups have developed new detection devices that combine multiple sensors, an approach called sensor fusion. The development of new kinds of sensors is, of course, another possibility.
Since the improvement achievable through image processing is limited unless the sensor provides good information about the target, an improved detection device should be developed before improved image processing methods are implemented. The current trend for the next generation of detection devices is towards an armored vehicle or a portable unit with multiple sensors.
Future work in the image processing area will also involve fusion. A global method that can accept data from multiple sensors and visualize them under a single concept will be the next generation of image processing methods for mine detection applications. For example, one solution would be to accept target signals from IR, MD, and GPR sensors, analyze all of the sensor data, and then visualize the actual shape of the target on the screen. Although some of the image processing methods referred to in this report are closely tied to a particular sensor, most can be used with future-generation sensors, which will appear soon or have already appeared.
Chapter 8. References
1. UN Landmine Database in the UN Mine Action Service; http://www.un.org/Depts/dpko/mine.
2. Hidden Killers 1998: The Global Landmine Crisis, US Department of State, Bureau of Political-Military
Affairs, Office of Humanitarian Demining Programs, Sep. 1998.
3. A. Sieber, Localization and Identification of Anti-Personnel Mines, European Commission Joint Research
Center International Workshop, Nov. 1995.
4. Landmine Database of the Norwegian People's Aid Mine Actions in Angola; http://www.angola.npaid.org/.
5. L. Kempen, Physical Principles for Anti-Personnel Mine Detection: a Survey of Three Sensing Principles,
Technical Report, IRIS-TR-0047, Dept. of Electronics and Information Processing, Vrije Universiteit
Brussel, May 1997.
6. R. Ekstein, Anti-Personal Mine Detection Signal Processing and Detection Principles, Master Thesis,
Dept. of Electronics and Information Processing, Vrije Universiteit Brussel, 1997.
7. L. Kempen and H. Sahli, Ground Penetrating Radar Data Processing: a Selective Survey of the State of the
Art Literature, Technical Report, IRIS-TR-0060, Dept. of Electronics and Information Processing, Vrije
Universiteit Brussel, Jan. 1999.
8. M. Acheroy, M. Piette, Y. Baudoin, and J. Salmon, Belgian Project on Humanitarian Demining (HUDEM)
Sensor Design and Signal Processing Aspects, Jul. 2000.
9. J. Brooks, L. Kempen, and H. Sahli, Ground Penetration Radar Data Processing: Clutter Characterization
and Removal, Technical Report, IRIS-TR-0059, Dept. of Electronics and Information Processing, Vrije
Universiteit Brussel, Jan. 1999.
10. L. Kempen, A. Katarzin, Y. Pizurion, C. Corneli, and H. Sahli, Digital Signal/Image Processing for Mine
Detection, Part 2: Ground based Approach, in Proceedings Euro Conference on Sensor Systems and Signal
Processing Techniques applied to the Detection of Mines and Unexploded Ordnance, pp. 54-59, Oct. 1999.
11. G. Ederra, Mathematical Morphology Techniques Applied to Anti-Personnel Mine Detection, Master
Thesis, Dept. of Electronics and Information Processing, Vrije Universiteit Brussel, 1999.
12. C. Bruschini and B. Gros, A Survey of Current Sensor Technology Research for the Detection of
Landmines, in Proceedings the International Workshop on Sustainable Humanitarian Demining, vol. 6, pp.
18-27, Sep. 1997.
13. L. Kempen, M. Kaczmarec, H. Sahli, and J. Cornelis, Dynamic Infrared Image Sequence Analysis for Anti-Personnel Mine Detection, in Proceedings IEEE Benelux Signal Processing Chapter, Signal Processing
Symposium, pp. 215-218, Mar. 1998.
14. L. Peters Jr., J. Daniels, and J. Young, Ground Penetrating Radar as a Subsurface Environmental Sensing
Tool, Proceedings of the IEEE, vol. 82, no. 12, pp. 1802-1822, Dec. 1994.
15. J. E. McFee and Y. Das, Advances in the Location and Identification of Hidden Explosive Munitions,
Defense Research Establishment Suffield, no. 548, pp. 83, Feb. 1991.
16. K. Russell, J. McFee, and W. Sirovyak, Remote Performance Prediction for Infrared Imaging of Buried
Mines, in Proceedings SPIE Detection and Remediation Technologies for Mines and Minelike Targets II,
vol. 3079, pp. 762-769, 1997.
17. Thermal Neutron Analysis, Ancore Inc.; http://www.ancore.com.
18. R. Gonzalez and R. Woods, Digital Image Processing, Addison Wesley, 1992.
19. S. Theodoridis and K. Koutroumbas, Pattern Recognition, Academic Press, 1998.
20. H. Heijmans, Morphological Image Operators, Academic Press, 1994.
21. S. Beucher and C. Lantuejoul, Use of Watershed in Contour Detection, in Proceedings International
Workshop on Image Processing: Real Time Edge and Motion Detection and Estimation, Sep. 1979.
22. S. Beucher, The Watershed Transformation applied to Image Segmentation, in Proceedings 10th
Conference on Signal and Image Processing in Microscopy and Microanalysis, Sep. 1991.
23. E. Dougherty, Mathematical Morphology in Image Processing, Marcel Dekker, 1992.
24. J. Roerdink and A. Meijster, The Watershed Transform: Definitions, Algorithms, and Parallel Strategies,
Fundamenta Informaticae, vol. 41, pp. 187-228, 2000.
25. P. Verlinde, M. Acheroy, and Y. Baudoin, The Belgian Humanitarian Demining Project (HUDEM) and the
European Research Context, in Proceedings Chiba University Workshop on Humanitarian Demining, Apr.
2001.
26. P. Machler, Detection Technologies for Anti-Personnel Mines, in Proceedings Symposium on
Autonomous Vehicles in Mine Countermeasures, vol. 6, pp. 150-154, Apr. 1995.
27. M. Schachne, L. Kempen, D. Milojevic, H. Sahli, Ph. Ham, M. Acheroy, and J. Cornelis, Mine Detection
by Means of Dynamic Thermography: Simulation and Experiments, in Proceedings IEE 2nd International
Conference on the Detection of Abandoned Landmines, pp. 124-128, Oct. 1998.
28. UWBGPR measurement at the Royal Military Academy, Belgium on May 31, 1999.
29. Ft. Belvoir Minefield in Virginia, USA. Collected by E-OIR on Jan. 13-14, 1998.
30. Meerdaal Test Minefield in Belgium. Collected by the Royal Military Academy on Apr. 1-3, 1998.