MULTI-MODAL DECOMPOSITION
TECHNIQUES FOR CLASSIFYING
DIFFERENT MEDICAL IMAGES
BY
Inguva Ananda Prasad
SRF INTERN
FENGS121
1 INTRODUCTION
The IMF obtained is iteratively passed through the sifting process [ref] until the mean envelope
becomes monotonic. The stopping criterion is met when the resultant mean envelope
converges. The residue is the portion of the image that remains after the stopping criterion is met.
The 2D decomposition of an image by the sifting process provides a representation that is easy
to interpret.
Detection of local extrema means finding the local maxima and minima points in the
given data. A data point (pixel) is considered a local maximum (minimum) if its value is
strictly higher (lower) than all of its neighbors. Let A be an M × N 2D matrix represented
by (2)
$$A = \begin{bmatrix} x_{11} & x_{12} & x_{13} & \cdots & x_{1N} \\ x_{21} & x_{22} & x_{23} & \cdots & x_{2N} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_{M1} & x_{M2} & x_{M3} & \cdots & x_{MN} \end{bmatrix}$$
where A is a matrix of size M × N and A_{MN} is the element in the Mth row and Nth column.
Let the neighborhood window be W × W. Then,
Figure 1 (a) A sample 8 × 8 data matrix; (b) local maxima matrix obtained from (a); and (c)
local minima matrix obtained from (a).
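$$A_{MN} = \begin{cases} \text{local maximum}, & \text{if } A_{MN} > A_{KL} \\ \text{local minimum}, & \text{if } A_{MN} < A_{KL} \end{cases} \tag{2}$$
where
$$K = M - \frac{W-1}{2}, \ldots, M + \frac{W-1}{2} \tag{3}$$
$$L = N - \frac{W-1}{2}, \ldots, N + \frac{W-1}{2} \tag{4}$$
with (K, L) ≠ (M, N).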
Multi-modal decomposition techniques for classifying different medical images 3
Generally, a 3 × 3 window (i.e., W = 3) results in an optimum extrema matrix for given
2D data. Consider the 8 × 8 data matrix given in Figure 1(a) for illustration.
The maxima matrix in Figure 1(b) and the minima matrix in Figure 1(c) are obtained
when a 3 × 3 neighborhood is used for every point in the matrix. When finding extrema
at a boundary or corner, the neighboring points that fall beyond
the image are neglected. The center element of the first matrix is a local maximum, the
center element of the second matrix is a local minimum, while the center element of the
third matrix is neither a local maximum nor a local minimum.
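The neighborhood test described above can be sketched as follows. This is a minimal illustration, assuming NumPy; the function name `local_extrema` and the padding trick (filling out-of-image neighbors with ∓infinity so that they never block an extremum) are choices made for this sketch, not details taken from the cited method.

```python
import numpy as np

def local_extrema(a, w=3):
    """Return boolean maps of strict local maxima and minima of a 2D array.

    A pixel is a maximum (minimum) if it is strictly higher (lower) than every
    neighbour inside its w x w window. Neighbours outside the image are ignored.
    """
    a = np.asarray(a, dtype=float)
    r = (w - 1) // 2
    rows, cols = a.shape
    # Pad with -inf / +inf so that out-of-image neighbours never fail the test.
    pad_max = np.pad(a, r, mode="constant", constant_values=-np.inf)
    pad_min = np.pad(a, r, mode="constant", constant_values=np.inf)
    is_max = np.ones(a.shape, dtype=bool)
    is_min = np.ones(a.shape, dtype=bool)
    for dk in range(-r, r + 1):
        for dl in range(-r, r + 1):
            if dk == 0 and dl == 0:
                continue  # skip the centre pixel itself
            window = (slice(r + dk, r + dk + rows), slice(r + dl, r + dl + cols))
            is_max &= a > pad_max[window]
            is_min &= a < pad_min[window]
    return is_max, is_min
```

For a 3 × 3 patch whose centre holds the largest value, only the centre is flagged as a maximum, while corner pixels can still be flagged as minima because their missing neighbors are ignored.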
When testing for the IMF criteria in the sifting process, two tests must be passed. First, the number
of extrema and the number of zero-crossings must not differ by more than one. Second,
the mean of the upper and lower envelopes must be close to zero according
to some criterion. The criteria that have been considered so far lead, in certain situations, to
over-decomposition. As an improvement, [in bemd paper folder] proposed an
approach to choosing the stopping criteria (4) that guarantees globally small fluctuations
in the mean while taking into account locally large excursions. This is accomplished by
introducing, at each sifting iteration, an amplitude and an evaluation function. They used
two thresholds: one designed to ensure globally small fluctuations of the mean of the cubic
splines from zero, and the second allowing small regions of locally large deviations from
zero.
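A two-threshold stopping test of this kind can be sketched as below. The sketch assumes that the amplitude is half the difference of the envelopes and that the evaluation function is the ratio of the envelope mean to that amplitude; the parameter names `theta1`, `theta2`, `alpha` and their default values are assumptions commonly seen in the EMD literature, not values stated in this report.

```python
import numpy as np

def sifting_should_stop(upper, lower, theta1=0.05, theta2=0.5, alpha=0.05):
    """Two-threshold stopping test on the upper and lower envelopes.

    theta1: bound that must hold almost everywhere (globally small mean).
    theta2: looser bound that must hold everywhere (limits local excursions).
    alpha:  fraction of samples allowed to exceed theta1.
    All three defaults are assumed values for illustration.
    """
    upper = np.asarray(upper, dtype=float)
    lower = np.asarray(lower, dtype=float)
    mean = (upper + lower) / 2.0
    amplitude = (upper - lower) / 2.0
    eps = np.finfo(float).eps
    # Evaluation function: envelope mean relative to the mode amplitude.
    sigma = np.abs(mean) / np.maximum(np.abs(amplitude), eps)
    fraction_large = np.mean(sigma > theta1)
    return bool(fraction_large <= alpha and np.all(sigma < theta2))
```

The same test applies unchanged to 2D envelopes, since it only uses element-wise operations and global statistics.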
However, EMD must be used cautiously when extracting the IMFs. When locating the
extrema at each sifting step, the end points (boundary conditions) must be treated
differently in order to minimize error propagation due to the finite length of the observations.
4 Anand Inguva.
Skull stripping
[Figure: flowchart of the sifting process. Recoverable labels: decision "Is H_k(x, y) an IMF" (YES/NO), decision "if H converges" (YES/NO), and "Stopping criterion".]
In variational mode decomposition (VMD), the input signal is decomposed into a number of sub-
signals called modes, which have specific properties while reconstructing the input.
Here, the property of each mode u_k is chosen to be its bandwidth in the frequency domain. In
other words, each mode k is required to be mostly compact around a center pulsation ω_k,
which is determined along with the decomposition. (3)
1. For each mode u_k, compute the associated analytic signal by means of the Hilbert transform to obtain a
unilateral frequency spectrum.
2. For each mode, shift the mode's frequency spectrum to baseband by mixing with an exponential
tuned to the estimated center frequency.
3. The bandwidth is then estimated through the H1 smoothness of the demodulated signal,
i.e., the squared L2-norm of the gradient. The resulting variational problem is
represented by the following equation (3)
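$$\min_{u_k,\,\omega_k} \left\{ \sum_k \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \tag{5}$$
subject to:
$$\sum_k u_k = f \tag{6}$$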
where f is the signal, u_k is its kth mode, ω_k is the center frequency of that mode, δ is the Dirac distribution, t is
time, k indexes the modes, and * denotes convolution.
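The three steps behind the bandwidth term of equation (5) can be sketched for a sampled 1D mode as follows. This is only an illustration assuming NumPy: the helper names `analytic_signal` and `mode_bandwidth` are hypothetical, the analytic signal is built in the frequency domain, and the time derivative is replaced by a finite difference.

```python
import numpy as np

def analytic_signal(u):
    """Step 1: analytic signal of a real sequence (one-sided spectrum)."""
    u = np.asarray(u, dtype=float)
    n = u.size
    spectrum = np.fft.fft(u)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    # Negative frequencies are zeroed, positive ones doubled.
    return np.fft.ifft(spectrum * gain)

def mode_bandwidth(u, omega, dt=1.0):
    """Steps 2 and 3: demodulate to baseband, then squared L2 norm of the gradient."""
    t = np.arange(len(u)) * dt
    baseband = analytic_signal(u) * np.exp(-1j * omega * t)  # shift by omega
    gradient = np.diff(baseband) / dt  # finite-difference stand-in for d/dt
    return float(np.sum(np.abs(gradient) ** 2) * dt)
```

For a pure cosine the estimate is close to zero when `omega` matches the cosine's frequency and grows when the assumed center frequency is wrong, which is the quantity the minimization in (5) penalizes.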
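The functional to be minimized, stemming from this definition of the 2D analytic signal, is:
$$\min_{u_k,\,\vec{\omega}_k} \left\{ \sum_k \left\| \nabla \left[ u_{AS,k}(\vec{x})\, e^{-j\langle \vec{\omega}_k, \vec{x} \rangle} \right] \right\|_2^2 \right\} \tag{7}$$
(leading terms of equation (8) not recoverable)
$$\cdots - \sum_i \hat{u}_i(\vec{\omega}) + \frac{\hat{\lambda}(\vec{\omega})}{2} \Big\|_2^2 \Big\} \tag{8}$$
which yields the Wiener-filter result:
$$\hat{u}_k^{n+1}(\vec{\omega}) = \left( \hat{f}(\vec{\omega}) - \sum_{i \neq k} \hat{u}_i(\vec{\omega}) + \frac{\hat{\lambda}(\vec{\omega})}{2} \right) \frac{1}{1 + 2\alpha\,|\vec{\omega} - \vec{\omega}_k|^2}, \quad \forall \vec{\omega} \in \Omega_k : \ \Omega_k = \{ \vec{\omega} \mid \vec{\omega} \cdot \vec{\omega}_k \geq 0 \} \tag{9}$$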
Optimizing for ω_k is similar to the 1-dimensional VMD. The difference is that here the
domains are half-planes, so that there will be two components
$$\vec{\omega}_k^{\,n+1} = \arg\min_{\vec{\omega}_k} \left\{ \sum_k \left\| \nabla \left[ u_{AS,k}(\vec{x})\, e^{-j\langle \vec{\omega}_k, \vec{x} \rangle} \right] \right\|_2^2 \right\} \tag{10}$$
The minimization is solved by letting the first variation w.r.t. ω vanish. The resulting solutions
are the first moments of the mode's power spectrum on the half-plane Ω_k.
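One iteration of the two updates described above, the Wiener-filter step of equation (9) and the first-moment update of the center frequency, can be sketched on a discrete Fourier grid. This is a rough sketch assuming NumPy: frequencies are taken in cycles per sample from `np.fft.fftfreq`, the function names are hypothetical, and `others_hat` stands for the sum of the spectra of the other modes.

```python
import numpy as np

def frequency_grid(shape):
    """Frequency coordinates (cycles per sample) matching np.fft.fft2 output."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    return fy, fx

def update_mode(f_hat, others_hat, lam_hat, omega_k, alpha):
    """Eq. (9): Wiener-filter update, kept only on the half-plane Omega_k."""
    fy, fx = frequency_grid(f_hat.shape)
    dist2 = (fy - omega_k[0]) ** 2 + (fx - omega_k[1]) ** 2
    half_plane = (fy * omega_k[0] + fx * omega_k[1]) >= 0
    u_hat = (f_hat - others_hat + lam_hat / 2.0) / (1.0 + 2.0 * alpha * dist2)
    return np.where(half_plane, u_hat, 0.0)

def update_center(u_hat, omega_k):
    """First moment of the mode's power spectrum on the half-plane Omega_k."""
    fy, fx = frequency_grid(u_hat.shape)
    half_plane = (fy * omega_k[0] + fx * omega_k[1]) >= 0
    power = np.abs(u_hat) ** 2 * half_plane
    total = power.sum()
    if total == 0:
        return np.asarray(omega_k, dtype=float)  # nothing to average over
    return np.array([(fy * power).sum() / total, (fx * power).sum() / total])
```

For an image containing a single plane wave, one pass of `update_mode` followed by `update_center` moves a nearby initial guess onto the wave's frequency.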
$$f(x, y) = \sum_{k=0}^{K} IMF(k) + res(x, y) \tag{12}$$
$$res(k) = \sum_{i=1}^{I} res_i(k) \tag{13}$$
where IMF(k) is a two-dimensional domain denoting the kth mode of f(x, y) and
res(k) is the residue.
The discrete wavelet transform (DWT) decomposes the input into a set of orthogonal wavelets. The
discrete wavelet transform for a 1-dimensional signal is given by
$$y_{low}[n] = \left( (x * g) \downarrow 2 \right)[n] = \sum_{k=-\infty}^{\infty} x[k]\, g[2n - k] \tag{14}$$
$$y_{high}[n] = \left( (x * h) \downarrow 2 \right)[n] = \sum_{k=-\infty}^{\infty} x[k]\, h[2n - k] \tag{15}$$
The coefficients y_low are the approximation coefficients obtained from a low-pass filter
g(n), and y_high are the detail coefficients obtained from a high-pass filter h(n). For a given
bi-dimensional image, the row and column decompositions are used to obtain the 2D DWT
image.
The first level of decomposition contains the approximation image [low-low band] and also
contains the image details in the horizontal [low-high], vertical [high-low], and diagonal [high-high]
directions. For the second level of decomposition, the approximation image is further
decomposed to extract the approximation and detail coefficients at the second level. This can
be repeated for the next level of decomposition. (1)
In our work, we have used three mother wavelets for the decomposition.
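The row and column filtering can be illustrated with a one-level Haar decomposition. This is a minimal sketch assuming NumPy and an image with even side lengths; the Haar filter pair is used only because it is the shortest, the sub-band names follow a rows-then-columns order chosen for this sketch, and the down-sampling keeps the odd-indexed samples so that pixels are paired as (x0, x1), (x2, x3), and so on.

```python
import numpy as np

G = np.array([1.0, 1.0]) / np.sqrt(2.0)   # Haar low-pass filter g
H = np.array([1.0, -1.0]) / np.sqrt(2.0)  # Haar high-pass filter h

def analysis_1d(x, filt):
    """Convolve with the filter and keep every second sample (down-sampling by 2)."""
    return np.convolve(x, filt)[1::2]

def dwt2_haar(image):
    """One decomposition level: returns the (LL, LH, HL, HH) sub-bands."""
    image = np.asarray(image, dtype=float)
    # Filter along the rows first.
    low = np.apply_along_axis(analysis_1d, 1, image, G)
    high = np.apply_along_axis(analysis_1d, 1, image, H)
    # Then filter each result along the columns.
    ll = np.apply_along_axis(analysis_1d, 0, low, G)
    lh = np.apply_along_axis(analysis_1d, 0, low, H)
    hl = np.apply_along_axis(analysis_1d, 0, high, G)
    hh = np.apply_along_axis(analysis_1d, 0, high, H)
    return ll, lh, hl, hh
```

A second level is obtained by calling `dwt2_haar` again on the returned LL band.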
6 FRACTAL DESCRIPTORS
where β is the slope and the constant c is the intercept. The parameters c and β are estimated by
least squares. In our work, we have divided the frequency space into 24 directions, so we
get a slope and an intercept in each direction. The averages of the slopes and of the constants
are used as fractal descriptors for classification using the power law.
The triangular prism method was introduced by Clarke in 1986 (ref). Initially it was used
to find the fractal dimension of topographic surfaces. A schematic view of the triangular
prism method is depicted in the figure.
Suppose the corner pixels are A, B, C, D and their heights are h(i, j), h(i, j+1), h(i+1,
j), and h(i+1, j+1), respectively. Figure 5 shows an example of a triangular prism formed for a
particular grid size. Let the center pixel be E and its height be,
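The directional least-squares fit can be sketched as below. This is an illustration under several assumptions not stated in the text: the power spectrum is taken from a 2D FFT, the 24 directions are equal angular sectors over the full circle, and a straight line log(power) = β·log(frequency) + c is fitted in each sector with `np.polyfit`. The function names are hypothetical.

```python
import numpy as np

def fit_power_law(freq, power):
    """Least-squares line log(power) = beta * log(freq) + c; returns (beta, c)."""
    beta, c = np.polyfit(np.log(freq), np.log(power), 1)
    return beta, c

def directional_descriptors(image, n_dirs=24):
    """Average slope and intercept over n_dirs angular sectors of the spectrum."""
    image = np.asarray(image, dtype=float)
    power = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2
    rows, cols = image.shape
    y, x = np.indices(image.shape)
    y = y - rows // 2  # after fftshift the zero frequency sits at (rows//2, cols//2)
    x = x - cols // 2
    radius = np.hypot(x, y)
    angle = np.mod(np.arctan2(y, x), 2 * np.pi)
    sector = np.minimum((angle / (2 * np.pi) * n_dirs).astype(int), n_dirs - 1)
    valid = (radius > 0) & (power > 0)  # drop the DC term and empty bins
    slopes, intercepts = [], []
    for d in range(n_dirs):
        sel = valid & (sector == d)
        if np.count_nonzero(sel) < 2:
            continue  # not enough samples to fit a line in this direction
        beta, c = fit_power_law(radius[sel], power[sel])
        slopes.append(beta)
        intercepts.append(c)
    return float(np.mean(slopes)), float(np.mean(intercepts))
```

The two returned averages correspond to the pair of descriptors described above; how they relate to a fractal dimension depends on the power-law convention used, which is not recoverable from this section.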
From the grayscale image f(x, y) shown in the figure, based on the scaling factor s
(2 < s < M/2, where M is the size of the image), a square grid of size s × s is considered for forming the
triangular prism. From the square grid, the corner pixels (11, 8, 9, 4) are marked as points P, Q, R, S
on the fracture surface, respectively. Their values are f(i, j), f(i, j+s), f(i+s, j), f(i+s, j+s),
and the center pixel is denoted by f(i+s/2, j+s/2). The height at the center
point is determined by
$$h_0 = \frac{1}{4} \left( f(i, j) + f(i, j+s) + f(i+s, j) + f(i+s, j+s) \right) \tag{19}$$
Then the area of the triangle t1, consisting of the points P, T, S, is expressed using
$$A_{PTS} = \sqrt{l_1 (l_1 - a_1)(l_1 - b_1)(l_1 - c_1)} \tag{20}$$
where l_1 = (a_1 + b_1 + c_1)/2 and a_1, b_1, c_1 are calculated using the equations
$$a_1 = \sqrt{[f(i, j+s) - f(i+s, j+s)]^2 + s^2} \tag{21}$$
$$b_1 = \sqrt{[f(i+s, j+s) - h_0]^2 + \frac{s^2}{2}} \tag{22}$$
$$c_1 = \sqrt{[f(i, j+s) - h_0]^2 + \frac{s^2}{2}} \tag{23}$$
The areas of the other triangles t2, t3, t4 are calculated correspondingly. The approximate area
of the fractal structure is then calculated by the equation
The fractal structure area of the entire image, for the scaling factor s varying from 2 to M/2,
is denoted by
$$A(s) = \sum_{i,j=1}^{N(s)} A_{i,j} \tag{25}$$
where s is the scaling factor, (i, j) is the pixel index pair, and N(s) is the number of
boxes of size s × s needed to cover the entire image.
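Equations (19) to (25) can be sketched in plain Python as follows. The sketch assumes that the area of one prism is the sum of its four triangle areas (the corresponding equation is missing from this section) and that grid cells are placed every s pixels without overlap; the function names are hypothetical.

```python
import math

def heron(a, b, c):
    """Triangle area from its three side lengths, as in Eq. (20)."""
    l = (a + b + c) / 2.0
    return math.sqrt(max(l * (l - a) * (l - b) * (l - c), 0.0))

def prism_area(f, i, j, s):
    """Top-surface area of the triangular prism on the s x s cell at (i, j)."""
    p, q, r, t = f[i][j], f[i][j + s], f[i + s][j], f[i + s][j + s]
    h0 = (p + q + r + t) / 4.0  # centre height, Eq. (19)

    def side(u, v):
        # Edge between two adjacent corners, Eq. (21)
        return math.sqrt((u - v) ** 2 + s * s)

    def to_centre(u):
        # Edge from a corner to the centre point, Eqs. (22)-(23)
        return math.sqrt((u - h0) ** 2 + s * s / 2.0)

    # Each triangle uses two adjacent corners and the centre point.
    adjacent = [(p, q), (q, t), (t, r), (r, p)]
    return sum(heron(side(u, v), to_centre(u), to_centre(v)) for u, v in adjacent)

def surface_area(f, s):
    """A(s) of Eq. (25): sum of prism areas over all cells of size s."""
    rows, cols = len(f), len(f[0])
    return sum(prism_area(f, i, j, s)
               for i in range(0, rows - s, s)
               for j in range(0, cols - s, s))
```

On a perfectly flat image each cell contributes exactly s², so the total equals the covered area; rougher intensity surfaces give larger values of A(s).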
$$n_r(i, j) = l - k + 1 \tag{26}$$
Therefore, the number of boxes covering the gray-scale values over all the grids is
$$N_r = \sum_{i,j} n_r(i, j) \tag{27}$$
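The counting in equations (26) and (27) can be sketched as below. The definitions of l and k are not preserved in this section, so the sketch assumes the usual differential box-counting convention: each s × s grid cell carries a column of boxes of height h, k is the index of the box containing the cell's minimum gray level, and l is the index of the box containing its maximum. The box height `h = s * gray_levels / rows` is likewise an assumption.

```python
def box_count(image, s, gray_levels=256):
    """N_r of Eq. (27) for grid size s (box indexing convention assumed)."""
    rows, cols = len(image), len(image[0])
    h = s * gray_levels / rows  # assumed box height along the intensity axis
    total = 0
    for i in range(0, rows - s + 1, s):
        for j in range(0, cols - s + 1, s):
            block = [image[y][x]
                     for y in range(i, i + s)
                     for x in range(j, j + s)]
            k = int(min(block) // h) + 1  # box holding the minimum gray level
            l = int(max(block) // h) + 1  # box holding the maximum gray level
            total += l - k + 1            # Eq. (26)
    return total
```

A constant image needs one box per cell, while cells spanning a wide intensity range need more, which is what makes N_r sensitive to texture roughness.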
In addition to the above fractals, we have also taken the mean (10) of the zero-crossing
distance (e1), the mean (e2) (10) and standard deviation (e3) (10) of the instantaneous amplitude,
and the average of e1, e2, and e3 as the feature vector for each IMF. They are defined as
1. Mean of the zero-crossing distance is given by
$$e_1 = \frac{max + min}{M \times N} \tag{28}$$
where max and min denote the total number of extrema points of a given IMF, and M and
N are the numbers of rows and columns, respectively.
2. Mean is given by
$$e_2 = \frac{1}{M \times N} \sum_{i=1}^{M} \sum_{j=1}^{N} A(i, j) \tag{29}$$
4. Average value
$$e_4 = \frac{e_1 + e_2 + e_3}{3} \tag{31}$$
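The four features can be collected with a few lines of NumPy. This sketch assumes that the extrema counts are supplied by a separate extrema detector and that e3 is the population standard deviation of the amplitude array, since the defining equation for e3 is missing from this section; the function name is hypothetical.

```python
import numpy as np

def imf_features(amplitude, n_max, n_min):
    """Return (e1, e2, e3, e4) for one IMF.

    amplitude: 2D array A(i, j) of the IMF's instantaneous amplitude.
    n_max, n_min: numbers of local maxima and minima of the IMF.
    """
    amplitude = np.asarray(amplitude, dtype=float)
    m, n = amplitude.shape
    e1 = (n_max + n_min) / (m * n)   # Eq. (28)
    e2 = float(amplitude.mean())     # Eq. (29)
    e3 = float(amplitude.std())      # population standard deviation (assumed form)
    e4 = (e1 + e2 + e3) / 3.0        # Eq. (31)
    return e1, e2, e3, e4
```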
$$LBP = \sum_{p=0}^{7} s(g_p - g_c)\, 2^p \tag{32}$$
where
$$s(x) = \begin{cases} 1, & x \geq 0 \\ 0, & x < 0 \end{cases} \tag{33}$$
where g_c is the gray-level value of the central pixel, g_p is the gray value of its neighbors
around g_c, and P is the number of neighbors. A pattern number is computed by comparing
the g_c value with those of its neighborhood. The figure illustrates the basic LBP operator.
Conventional LBP requires 256 bins to store all possible patterns. The concept of uniform
patterns is introduced to reduce the number of possible bins. It effectively captures the
fundamental information of textures. A uniformity measure U is defined as
$$U(LBP) = |s(g_{P-1} - g_c) - s(g_0 - g_c)| + \sum_{p=1}^{P-1} |s(g_p - g_c) - s(g_{p-1} - g_c)| \tag{34}$$
A binary pattern with U ≤ 2 is designated a uniform pattern, and the uniform patterns are assigned nine
separate bins. Patterns with U ≥ 3 are termed non-uniform patterns and are
represented in a single bin. Hence, LBP requires ten bins in total to store all the binary
patterns.
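Equations (32) to (34) and the ten-bin mapping can be sketched for a single 3 × 3 patch. The clockwise neighbor ordering starting at the top-left pixel is an assumption made for this sketch, and the ten-bin label (number of set bits for uniform patterns, P + 1 otherwise) is the common rotation-invariant uniform mapping, which matches the bin counts stated above but is not spelled out in the text.

```python
def lbp_bits(patch):
    """Thresholded neighbours s(g_p - g_c) of a 3 x 3 patch, p = 0..7."""
    gc = patch[1][1]
    # Clockwise from the top-left neighbour (ordering assumed for illustration).
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    return [1 if patch[r][c] >= gc else 0 for r, c in order]

def lbp_code(bits):
    """Eq. (32): weighted sum of the bits, giving a value in 0..255."""
    return sum(b << p for p, b in enumerate(bits))

def uniformity(bits):
    """Eq. (34): number of 0/1 transitions around the circular pattern."""
    # bits[p - 1] with p = 0 wraps to the last bit, which gives the first term.
    return sum(abs(bits[p] - bits[p - 1]) for p in range(len(bits)))

def lbp_uniform_bin(bits):
    """Ten-bin label: 0..8 for uniform patterns, 9 for all non-uniform ones."""
    return sum(bits) if uniformity(bits) <= 2 else len(bits) + 1
```

For example, the bit pattern 1, 0, 0, 0, 1, 1, 1, 1 has two transitions and falls in bin 5, whereas 1, 0, 1, 0, 0, 0, 0, 0 has four transitions and falls in the shared non-uniform bin.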
Figure 6 LBP
$$g_p = \frac{1}{q} \sum_{r=1}^{q} g_{p,r} \quad (p = 0, 1, 2, \ldots, P - 1) \tag{35}$$
Figure 7 Schematic view of LGS calculation. [Figure residue: a 5 × 5 gray-level patch with its graph edge labels; resulting code (11010110)_2 = 214.]
Figure 8 Schematic view of ELGS calculation. [Figure residue: the same 5 × 5 gray-level patch with its graph edge labels; resulting code (11010010)_2 = 210.]
where g_{p,r} is the gray value of the rth sampled pixel along the radial direction of the
given neighbor pixel g_p, q denotes the number of pixels being averaged, and P denotes the
number of neighbor pixels. P = 8 is used in the diamond sampling structure throughout.
According to our experiments, the best classification performance is achieved when
q = 3.
Uniform patterns are more likely to occur than nonuniform patterns in natural
images because of high pixel correlations. In addition, noise may change a uniform pattern
into an unstable nonuniform pattern, whereas the texture information in nonuniform patterns
is ignored or weakened by using only the uniform patterns and grouping all nonuniform
patterns into a single bin, P + 1, when forming the LBP feature histogram. Therefore,
we need a mechanism to recover the noise-corrupted nonuniform patterns back
to possible uniform patterns so as to make more use of the discriminative information in
nonuniform patterns. It is widely accepted that the gray level of the central pixel g_c is similar to
that of its adjacent neighbor pixels g_p (p = 0, 1, . . . , 7) in a local image patch. In view of
this, we propose to find a local adaptive quantization threshold t to replace the original fixed
g_c so as to restore the noise-corrupted nonuniform patterns. More precisely, the threshold t
is constrained to either the original central pixel g_c or the mean value g_m of only two adjacent
neighbor pixels. Let φ(t) denote the set of all candidates for the quantization
threshold; φ(t) can then be formally defined as
As in the original LBP, we first keep the original quantization threshold unchanged,
t = g_c. If the extracted LBP pattern is itself a uniform pattern, all the following steps stay
the same as in the original LBP. If the extracted LBP pattern is a nonuniform pattern, we
replace the original quantization threshold g_c by g_m (m = 1, 2, . . . , 8) so as to restore
possible uniform patterns. If no uniform pattern can be extracted in this case, a
nonuniform pattern is unavoidable and we select t = g_c.
The goal of the PCA (7) technique is to find a lower-dimensional space, the PCA space (W),
that is used to transform the data (X = {x_1, x_2, ..., x_N}) from a higher-dimensional space
(R^M) to a lower-dimensional space (R^k), where N represents the total number of samples
or observations and x_i represents the ith sample, pattern, or observation. All samples have the
same dimension. In other words, each sample is represented by M variables, i.e., each sample
is represented as a point in M-dimensional space. The direction of the PCA space represents
the direction of the maximum variance of the given data, as shown in the figure. As shown in
the figure, the PCA space consists of a number of PCs. Each principal component has a
different robustness according to the amount of variance in its direction.
where λ_l is the eigenvalue of the l-th component. The value of a contribution is between
0 and 1 and, for a given component, the sum of the contributions of all observations is
equal to 1. The larger the value of the contribution, the more the observation contributes
to the component. A useful heuristic is to base the interpretation of a component on the
observations whose contribution is larger than the average contribution (i.e., observations
whose contribution is larger than 1/I). The observations with high contributions and different
signs can then be opposed to help interpret the component, because these observations
represent the two endpoints of this component.
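The projection and the contributions can be sketched with a singular value decomposition. This is a minimal illustration assuming NumPy, with samples stored as rows; it takes the eigenvalue of a component to be the sum of its squared factor scores, which is the convention under which the contributions of all observations sum to 1 as stated above. The function names are hypothetical.

```python
import numpy as np

def pca(X, k):
    """Project N samples (rows of X, M variables each) onto k principal components.

    Returns the factor scores (N x k), the PCA space W (M x k) and the
    eigenvalues, taken here as the sum of squared scores per component.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    W = vt[:k].T                       # directions of maximum variance
    scores = centered @ W
    eigenvalues = singular_values[:k] ** 2
    return scores, W, eigenvalues

def contributions(scores, eigenvalues):
    """Contribution of each observation to each component (columns sum to 1)."""
    return scores ** 2 / eigenvalues
```

Observations whose contribution to a component exceeds 1/I, with I the number of observations, are the ones the heuristic above singles out for interpreting that component.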
Brain magnetic resonance images were obtained from the Open Access Series of Imaging
Studies (OASIS) database. Our sample includes 62 MRI
images, of which half are normal and the remaining half are affected by Alzheimer's disease (AD) based on clinical
dementia rating (CDR) scores. In this study, the DWT is performed using the Daubechies-4,
Haar, and Symlet-4 wavelet functions at the first and second decomposition levels. The number
of bidimensional intrinsic mode functions and the number of variational modes (k) are arbitrarily
set to three for two reasons. First, the computational time is lower. Second, whatever k is
chosen, the features are extracted from the high-frequency components, which are the first
two sub-images. The modes of BEMD and VMD are shown in the figures below. Figures 9 and
8 show the EMD and VMD modes of normal images. Figures 10 and 12
show the EMD and VMD modes of normal images. In addition, we also
plot the 3D plots of the respective images for illustration. Along with these multi-resolution
techniques, we have performed binary image analysis using local binary patterns, local
graph structures, and the diamond sampling structure, and calculated the features from the histograms
of the resulting output images.
From the above results, VMD shows the highest accuracy in classifying normal
images, whereas the time-domain approach, i.e., fractals applied directly to the input images, shows
the highest specificity and was therefore better able to identify the diseased images. VMD-based
fractal descriptors achieved the highest accuracy. The VMD-based approach is followed by
EMD and DWT, and the lowest overall accuracy was obtained by BEMD.
[Figures: four pairs of panels labelled BEMD and VMD; images not recoverable.]
Finally, in this study we compared fractals derived from the VMD space against those estimated
in the DWT, EMD, and time domains. We found that fractal descriptors estimated from the
first and second components of the multi-modal decomposition techniques characterize the images better than
those obtained directly from the non-decomposed image. This could be extended to other
image classification problems in order to draw general conclusions regarding the ability of
fractal descriptors obtained from the DWT, EMD, and VMD components to capture
the structures of different types of images.
References
Bhuiyan S, Adhami RR, Khan JF. Fast and adaptive bidimensional empirical mode decomposition using order-statistics filter based envelope estimation. EURASIP Journal on Advances in Signal Processing. 2008 Jan 1;2008:164.
Niang O, Deléchelle É, Lemoine J. A spectral approach for sifting process in empirical mode decomposition. IEEE Transactions on Signal Processing. 2010 Nov;58(11):5612-23.
Antal B, Hajdu A. An ensemble-based system for microaneurysm detection and diabetic retinopathy grading. IEEE Transactions on Biomedical Engineering. 2012;59(6):1720-1726.
Bashier HK, Hoe LS, Hui LT, Azli MF, Han PY, Kwee WK, Sayeed MS. Texture classification via extended local graph structure. Optik-International Journal for Light and Electron Optics. 2016 Jan 31;127(2):638-43.
11 ACKNOWLEDGEMENT
It gives me immense pleasure to be associated with this project. The duration of the internship
was a joyful and illuminative learning process.
First of all, I would like to thank Indian Academy of Sciences for giving me the
opportunity to work as a Summer Research Fellow at CMERI, Durgapur. I consider myself
privileged to have got the opportunity to do my project in the prestigious institution.
I would like to express my gratitude to the HR Department, CMERI for providing
accommodation, food and other basic facilities and for making my stay in the institution
easier.
I am highly indebted to my guide Mr. Srinivasan Aruchamy, Scientist, ROBOTICS AND
AUTOMATION group, CSIR-CMERI, for being a source of immense knowledge and for
his valuable guidance, keen interest, and constant supervision at various stages of my
internship.
A special word of thanks to my parents for believing in me and for their unparalleled
love and support.