
Analysis of Phonocardiogram Signals using Variational

Mode Decomposition

End Term B. Tech project report submitted in partial fulfilment of the requirements for the B.Tech
degree in Instrumentation Engineering

in the

Department of Electrical Engineering

by

Sanmitra Banerjee
Roll No: 14IE10026

Under the supervision of

Dr. Anirban Mukherjee

Department of Electrical Engineering


Indian Institute of Technology Kharagpur, Kharagpur, India
Spring-2018
CERTIFICATE

This is to certify that the Report entitled “Analysis of Phonocardiogram Signals using Variational
Mode Decomposition”, submitted by Sanmitra Banerjee to the Indian Institute of Technology
Kharagpur, is a record of bona fide research work carried out under my supervision and is worthy of
consideration for the award of the degree of Bachelor of Technology in Instrumentation Engineering.

Date: Signature

Abstract

According to the World Health Organization (WHO), cardiovascular diseases (CVDs) are
the number one cause of death globally, and more people die annually from CVDs than
from any other cause. The prevalence of CVDs in India is a growing concern. In
addition, deaths due to lung diseases in India are on the rise, accounting for
11 per cent of total deaths. As many as 142.09 in every one lakh died of one form
of lung disease or another, giving India the dubious distinction of ranking first in the
world in lung disease deaths, as reported by WHO and published in The
Hindu on July 01, 2015. The rural population of developing countries like India is
unable to access quality health care facilities due to a shortage of medical professionals
and resources. Traditionally, abnormal heart and lung sounds are detected using stethoscopes,
a subjective method that requires expertise on the part of the listener for accurate detection
of abnormalities. Computerized methods of heart and lung sound analysis
provide a deeper way to analyze these sounds for diagnostic purposes. Efficient
cardiopulmonary diagnostic analysis demands an accurate separation of heart and lung
sounds from combined heart and lung sound recordings, which is itself a challenging
problem due to their spectral overlap and the presence of background noise. In this report,
a novel noise-robust technique for solving this problem is proposed. Once the
heart sounds free of lung sounds are obtained, they are segmented into cardiac cycles using
Variational Mode Decomposition (VMD) and Shannon Energy (SE). The segmented
cardiac cycles are further tested for the presence of murmurs in noisy scenarios. In total,
40 sets of cardiac cycles containing S3 are obtained from publicly available databases
and are used to evaluate the performance of the proposed method in noiseless conditions.
It is able to detect S3 correctly even when the normalized amplitude of S3
is 14.1%, whereas the existing method based on the Hilbert Vibration Decomposition
(HVD) requires the normalized amplitude of S3 to be at least 16.13% of the
normalized amplitude of the highest peak present in the cardiac cycle. In addition,
the results show that the proposed method can detect S3 in noisy cases at SNR
levels as low as −5 dB.
Contents
1 Introduction
1.1 Motivation
1.2 Literature Review

2 Materials and Methods
2.1 Separation of Heart and Lung Sounds
2.1.1 Mathematical Background
2.1.2 Proposed method for Heart and Lung Sound Separation
2.1.3 Preprocessing
2.1.4 Mode Decomposition
2.1.5 Combination of slowest mode
2.2 Segmentation and Detection of S1 and S2 using Variational Mode Decomposition
2.2.1 Proposed method for S1 and S2 Segmentation and Detection
2.2.2 Preprocessing
2.2.3 VMD and Selection of Mode
2.2.4 Normalization and Low Pass Filtering
2.2.5 Shannon Energy and Thresholding
2.2.6 Post processing for location of S1 and S2
2.3 Detection of the Third Heart Sound, S3 using VMD
2.3.1 Smoothed Pseudo Wigner-Ville Distribution
2.3.2 Proposed Method for S3 Detection
2.4 Detection of Cardiac Murmurs using Peak Envelope Bandwidth
2.4.1 Preprocessing
2.4.2 Cardiac Cycle Segmentation
2.4.3 Feature Extraction for Murmur Detection
2.4.4 Murmur Diagnosis using Thresholding

3 Results and Discussion
3.1 Separation of Heart and Lung sounds
3.1.1 Simulation Results
3.1.2 Lung Sound Separation Error Analysis
3.1.3 Comparison of LS Separation Results with Existing Schemes
3.2 S1 and S2 Detection results
3.2.1 Average Absolute Error (E1)
3.2.2 Average Detection Time Error (E2)
3.3 S3 Detection Results
3.4 Detection of Cardiac Murmurs
3.4.1 Simulation Results
3.4.2 Peak Envelope Bandwidth Analysis
3.4.3 Probability Distribution of Peak Envelope Bandwidth

4 Conclusion

5 Acknowledgment
1 Introduction
1.1 Motivation
Cardio-vascular diseases (CVDs) are one of the most prevalent causes of mortality globally. Any
medical technique which can help to detect signs of heart and lung disease could therefore have a
significant impact on world health. This serves as a motivation for exploring novel techniques that
utilize signal processing towards this end. Specifically, we are interested in creating the first level
of screening of cardiac pathologies that may assist medical personnel. Many resource-poor regions
of the world, especially in developing nations, are struggling to provide basic health care for people
due to a number of reasons. Principal reasons behind this include a dearth of doctors, poor access
to advanced devices, and a supply-chain infrastructure that is unable to provide sufficient consumables,
calibration, and maintenance for medical equipment. The lack of proper medical attention for
patients in developing nations has fatal consequences. As noted by WHO in the recent decade, the
major causes of deaths worldwide are shifting from infectious diseases to more chronic ones, like
chronic cardiac diseases (CCD). Timely diagnosis will help the patients be aware of the necessary
steps to control these diseases.
The third HS, S3, is a normal finding in children and pregnant women, but in adults, especially
those above 40 years, its presence may be associated with heart failure [1]. The detection of S3
with a stethoscope is a challenging task because of its low amplitude, which demands
well-practised auscultatory skills from experienced physicians [2]. Even more alarming are the statistics for
CVDs, which are the leading cause of death globally as per the WHO fact sheet reviewed
in September 2016. Analyzing heart sound signals using digital signal processing can help diagnose
problems with large cost-saving implications by equipping the most basic level of healthcare professionals
with the facility to monitor patients' vital signs, hence improving public health in general.
To untrained ears, heart murmurs are noise-like auditory vibrations caused by the turbulence
of blood flowing within the cardiovascular system as a result of various heart diseases. The cause
of turbulent flow can be attributed to many possible anatomical alterations such as blood flowing
through a narrowed orifice (aortic stenosis), blood flowing into a chamber of larger size from a
smaller diameter (aortic aneurysm), rapid and voluminous blood flow, or a combination of other
anatomical alterations. In an era where modern diagnostic tools are quickly emerging, cardiac
auscultation of the vibrations caused by turbulence can reveal a unique aspect of
cardiovascular information not obtainable by other diagnostic devices. Therefore, although it is a
traditional tool that has been in use for hundreds of years, cardiac auscultation remains a valuable
bedside investigative process for physicians. Our goal is to make it more effective and accurate.

1.2 Literature Review


The knowledge of various signal processing techniques is applied to the PCG signals for each step of
the analysis, information extraction, and processing, needed for the automatic diagnosis of the heart
diseases. It is important to segment the heart sounds into clinically meaningful segments such as
the first heart sound S1 and the second heart sound S2 for developing an automatic heart disorder
diagnosis tool based on PCG signals. The diagnostic features can be subsequently extracted for
each type of sound once the correct identification of these sound components is complete. However,
it is reported that S1 and S2 heart sound detection is one of the major and most difficult
tasks in heart sound analysis [3].
A literature survey reveals that the available Heart Sound Segmentation (HSS) techniques
may be classified into two major categories: algorithms that are based on an ECG reference
and others that are independent of it, which are further classified into supervised and unsupervised
methods [3]. The HSS algorithms based on an ECG reference use the QRS complex and the T-wave
to locate the precise positions of the heart sounds S1 and S2, respectively. Though this class of
techniques is robust, it requires additional hardware circuitry besides imposing some constraint to
patient’s comfort. Therefore, many researchers prefer to segment the heart sounds S1 and S2 using
the second technique that does not require ECG synchronization. It overcomes the limitation of
using additional hardware and wiring arrangement, besides keeping the patient comfortable [3, 4].
Liang et al. [5] have presented an envelogram-based approach for S1 and S2 segmentation
using the normalized average Shannon energy, which attenuates the effect of low-level noise and allows
low-intensity sounds to be detected easily. The Matching pursuit of Mallat and Zhang [6], a
decomposition analysis based on the Wavelet Transform, has been used for the time frequency
analysis of the PCG and for the independent segmentations of S1 and S2 into their components. It
decomposes a signal into a series of time-frequency atoms by an iterative process based on selecting
the largest inner product of the signal [6, 7]. D. Kumar et al. [4] have proposed an algorithm for
the segmentation and the classification of S1 and S2 heart sounds without ECG reference. The
fundamental heart sound lobes are identified using a fast wavelet transform and the Shannon energy
in the first stage, followed by classification based on Mel-frequency coefficients and a neural
network. The advantage of this technique is its invariance to the recording location and its independence
of particular sound characteristics. Barabaa et al. [8] have reported a robust method
where a modified music beat tracking algorithm is used for the detection of the heart sounds S1 and
S2 , without ECG reference. This algorithm provides good results even in the presence of murmurs
or noisy signals. An HSS method that uses energy-based and simplicity-based features computed
from multi-level wavelet decomposition coefficients is proposed in [9]. This technique exploits the
timing information of S1 and S2 , based on the biomedical domain knowledge.
The Hidden Markov Model (HMM), a probabilistic approach, has been used for the detection
of S1 and S2 in [10]. In this method, the detection of the first (S1) and the second (S2) heart sound
is performed using a network of two HMMs with grammar constraints to parse the sequence of
systolic and diastolic intervals. Ana Castrol et al. [11] have developed a segmentation algorithm for
the extraction of the heart sound components (S1 and S2) based on their time and frequency characteristics.
This method exploits the timing information of the cardiac cycle (systolic and diastolic
periods) and the spectral components using wavelet analysis. However, this technique does not address
the detection of low-energy events or the misclassification of some spurious events as
heart sounds. An unsupervised and low complexity algorithm for the detection of the heart sounds
S1 and S2 using Empirical Mode Decomposition (EMD) has also been studied. In this technique,
the PCG signal is decomposed into some specific functions called the Intrinsic Mode Functions
(IMF) that extract essential characteristics of the signal in the time domain. The suitably selected
IMF for further analysis has been found to be effective in the segmentation of S1 and S2 . Chrysa
D. Papadaniil et al. [12] have presented an efficient method for the detection of S1 and S2 , using
Ensemble Empirical Mode Decomposition (EEMD) combined with kurtosis features. EEMD has
been proposed to overcome the mode mixing problem that occurs in EMD [12]. The
aim of finding an alternative, adaptive, and data-driven signal decomposition technique
is fulfilled by applying EMD and EEMD, which adapt more closely to the waveform characteristics [12].
The advantages of applying EMD using IMFs for heart sound analysis have been reported in [13].
It highlights the merits of EMD in handling nonstationary and nonlinear signals as an adaptive
time-varying filter, because of which it has been widely applied to biomedical signals, for example in the
analysis of gastroesophageal information, in interference reduction in electrogastrograms, in the study of
the dynamics of cerebral autoregulation in stroke and hypertension, and in the analysis of heart sounds.
EEMD is a noise-assisted data analysis method that evolved from EMD [14]. However, EEMD is an iterative
method and hence may place a computational burden on biomedical signal analysis.
Removing HS signals from respiratory signals has been studied in many research works so far. The
easiest way to cancel HSs is to high-pass filter the respiratory signals. However, due to the spectral
overlap of the HSs and LSs, parts of the signal information may be lost. Therefore, an LS extraction
scheme requires a non-stationary approach such as Empirical Mode Decomposition (EMD). Different
methods based on adaptive filtering [15, 16], time-frequency filtering [17], wavelet denoising [18],
and modulation filtering [19] have been proposed to overcome this problem. In [20], blind source
separation has been used to separate HSs and LSs from multichannel recordings. Initially, least
mean square (LMS) based adaptive filters were used for lung sound separation [15]. Later, a better
LS extraction algorithm based on the integration of HS localization and recursive least squares
was proposed [21]. Prior knowledge about the heart sounds is required for the adaptive
filter technique, which renders it impractical in many cases. To tackle this problem, Kompis and
Russi reuse the processed signal as the reference input for LMS adaptation [22]. Still, this method
cannot guarantee a distortion-free processed output. On the other hand, another non-stationary LS
extraction algorithm, proposed by Mondal et al. [23], uses empirical mode decomposition (EMD)
and high-pass filtering to separate LS from BS recordings. However, standard EMD is
reported to suffer from mode mixing, which is defined as a single intrinsic mode function
(IMF) containing signals of widely disparate scales, or a signal of similar scale residing in
different IMF components. As a consequence, another LS extraction algorithm based on extensions
of EMD, including ensemble EMD (EEMD), multivariate EMD (M-EMD), and noise-assisted
M-EMD (NAM-EMD), was investigated in [24].

Barma et al. mentioned, “S3 exhibits a narrow frequency band of nearly 30 − 90 Hz with
low amplitude, appears for approximately 70 ± 15 ms during diastole just 100 − 150 ms after of the
occurrence of S2 (Dub) resulting from the blood vibrations inside the ventricles or their walls” [25].
The time-frequency representation (TFR) [2, 25] may be used to detect the third HS, S3. TFR
methods like the Short-Time Fourier Transform (STFT) and the Wigner-Ville Distribution (WVD)
provide both time and frequency resolution [26–28]. The Stockwell Transform (S-transform), a better
approach than the STFT, has been reported for heart sound analysis in [29, 30]. The
major demerit of the STFT is the resolution trade-off between the temporal and spectral domains [31].
Though better resolution can be obtained using the WVD, its inherent bilinear
character leads to undesirable cross-term interference [32]. An application of a non-linear
decomposition technique, the Hilbert-Huang Transform (HHT), has been reported for the detection of
S3 [1]. However, it is difficult to separate a few of the components having low energy from the
HLS [25]. The application of non-linear decomposition techniques like the HHT and the Hilbert Vibration
Decomposition (HVD) results in the generation of subcomponents that are also non-linear and
non-stationary in nature [33]. As a result, non-linear TFR techniques like the WVD and the SPWVD
are preferred for time-frequency localization of non-linear and non-stationary signals [34]. The
SPWVD has an advantage over the WVD, as it suppresses the cross-term effects besides
offsetting the biased estimation present due to TFR analysis [35].

The main challenge in the context of heart murmur classification is feature extraction [36]. In
the past few years many researchers have attempted to solve this problem, having achieved sig-
nificant results. Namely, a feedforward neural network for murmur detection and two Learning
Vector Quantization (LVQ) networks were employed for murmur classification in [37]. The murmur
classification system is designed to categorize the detected murmur into 3 common types: Mitral
Stenosis, Aortic insufficiency and Mitral insufficiency. Two LVQ networks were trained and tested
using the level 4 and level 5 discrete wavelet transform approximations of the heart signals. The
accuracy rates of the detection and classification subsystems were 73.12% and 67.05%, respectively.
In [38], a binary hierarchical tree classifier sorts murmurs into atrial flutter, aortic regurgitation, aortic
stenosis and mitral stenosis. The author used the morphological characteristics of the power spectral
distribution of the heart sound in the frequency domain as input features. The reported murmur
detection accuracy was 86.88% for atrial fibrillation sounds and 89.98% for sounds exhibiting aortic
valvular disorders. In [39], a feed-forward neural network classifier was employed. Here, input
features were extracted from the spectrogram of averaged heart cycles. The system attained 85%
accuracy in the discrimination between normal sounds, sounds with aortic stenosis and sounds with
aortic regurgitation. The main limitation of the previous approaches is that they were evaluated
on individual rather than benchmark databases, making it difficult to perform a fair comparison
between different methods.

2 Materials and Methods
2.1 Separation of Heart and Lung Sounds
Heart sounds interfere with lung sounds in a manner that hampers the potential of respiratory
sound analysis for the diagnosis of respiratory disease [24]. Although modern stethoscopes can
assist in hearing the lung sounds (LSs) more clearly, heart sounds (HSs) still interfere with the respiratory
sounds. LSs exhibit a wideband power spectrum; however, most of the energy is concentrated
in frequencies below 200 Hz. On the other hand, the main frequency components of the HSs are
found in the range of 20−150 Hz, and thus overlap spectrally with the LSs. Peak frequencies
of HSs are shown to be lower than those of LSs [40].

2.1.1 Mathematical Background


Here, we cite the definition of VMD given in [41]. The purpose of the VMD technique is to
decompose an input composite signal into a discrete, user-defined number of sub-signals (modes)
that have specific sparsity properties while reproducing the input. The sparsity prior of each mode
is chosen to be its bandwidth in the spectral domain, i.e., each mode k is assumed to be mostly compact
around a center frequency ωk. Dragomiretskiy et al. [41] propose the following scheme to
assess the bandwidth of a mode:
1. Analytic signal having unilateral frequency spectrum corresponding to each mode µk is ob-
tained using Hilbert transform.

2. The frequency spectrum of each mode is shifted to baseband, by mixing with an exponential
tuned to the respective estimated center frequency.

3. The H1 Gaussian smoothness of the demodulated signal, (i.e. the squared L2 -norm of the
gradient) is used to estimate the bandwidth.
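The three steps above can be sketched numerically as follows. This is only an illustrative sketch, not the implementation used in this work: the function name `mode_bandwidth`, the sampling rate `fs`, and the use of a finite-difference gradient in place of the continuous derivative are all assumptions made for the example.

```python
import numpy as np
from scipy.signal import hilbert

def mode_bandwidth(mode, omega_k, fs):
    """Estimate the bandwidth of a real-valued mode around omega_k (rad/s).

    Hypothetical helper; fs is the sampling rate in Hz.
    """
    t = np.arange(len(mode)) / fs
    # Step 1: analytic signal with a one-sided spectrum (Hilbert transform).
    analytic = hilbert(mode)
    # Step 2: shift the spectrum to baseband by mixing with exp(-j*omega_k*t).
    baseband = analytic * np.exp(-1j * omega_k * t)
    # Step 3: squared L2 norm of the (finite-difference) time derivative.
    derivative = np.gradient(baseband, 1.0 / fs)
    return np.sum(np.abs(derivative) ** 2) / fs
```

A mode that is tightly concentrated around ωk demodulates to a nearly constant signal and therefore yields a small value, whereas a poorly matched center frequency leaves a residual oscillation and a larger value.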
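The resulting constrained variational problem for the signal f is the following:
$$
\min_{\{\mu_k\},\{\omega_k\}} \left\{ \sum_k \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * \mu_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_k \mu_k = f \tag{1}
$$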

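The reconstruction constraint is addressed by introducing a quadratic penalty term and Lagrangian
multipliers in order to render the problem unconstrained. The augmented Lagrangian
L is therefore defined as follows:
$$
\mathcal{L}\left(\{\mu_k\},\{\omega_k\},\lambda\right) = \alpha \sum_k \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * \mu_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f - \sum_k \mu_k \right\|_2^2 + \left\langle \lambda,\ f - \sum_k \mu_k \right\rangle
$$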

Here α denotes a constant equal to the variance of the noise present. The solution of the original
minimization problem (1) is obtained as the saddle point of the augmented Lagrangian L through a
sequence of iterative sub-optimizations known as the alternate direction method of multipliers (ADMM).
ADMM solves convex optimization problems by breaking them into smaller pieces, each of which is
easier to handle. The ADMM optimization in the context of VMD is outlined in Algorithm 1. The
solution of the respective subproblems is explained next.
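A. Minimization w.r.t. µk
To update the modes µk, the subproblem (2) of ADMM can be written as the following equivalent
minimization problem:
$$
\mu_k^{n+1} = \arg\min_{\mu_k \in \mathbb{R}} \left\{ \alpha \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * \mu_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f - \sum_i \mu_i + \frac{\lambda}{2} \right\|_2^2 \right\} \tag{6}
$$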

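Algorithm 1 ADMM optimization concept for VMD
Initialize $\{\mu_k^1\}$, $\{\omega_k^1\}$, $\lambda^1$, $n \leftarrow 0$
repeat
    $n \leftarrow n + 1$
    for k = 1 : K do
        Update µk:
        $$
        \mu_k^{n+1} \leftarrow \arg\min_{\mu_k} \mathcal{L}\left( \mu_1^{n+1}, \ldots, \mu_{k-1}^{n+1}, \mu_k, \mu_{k+1}^{n}, \ldots, \mu_K^{n}, \omega_1^{n}, \ldots, \omega_K^{n}, \lambda^{n} \right) \tag{2}
        $$
    end for
    for k = 1 : K do
        Update ωk:
        $$
        \omega_k^{n+1} \leftarrow \arg\min_{\omega_k} \mathcal{L}\left( \mu_1^{n+1}, \ldots, \mu_K^{n+1}, \omega_1^{n+1}, \ldots, \omega_{k-1}^{n+1}, \omega_k, \omega_{k+1}^{n}, \ldots, \omega_K^{n}, \lambda^{n} \right) \tag{3}
        $$
    end for
    Dual ascent:
        $$
        \lambda^{n+1} \leftarrow \lambda^{n} + \tau \left( f - \sum_k \mu_k^{n+1} \right) \tag{4}
        $$
until convergence:
    $$
    \sum_k \left\| \mu_k^{n+1} - \mu_k^{n} \right\|_2^2 \Big/ \left\| \mu_k^{n} \right\|_2^2 < \varepsilon \tag{5}
    $$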

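Using Parseval's Fourier isometry under the L2 norm, this problem can be solved in the spectral domain:
$$
\hat{\mu}_k^{n+1} = \arg\min_{\hat{\mu}_k,\ \hat{\mu}_k = \hat{\mu}_k^{*}} \left\{ \alpha \left\| j\omega \left[ \left( 1 + \operatorname{sgn}(\omega + \omega_k) \right) \hat{\mu}_k(\omega + \omega_k) \right] \right\|_2^2 + \left\| \hat{f} - \sum_i \hat{\mu}_i + \frac{\hat{\lambda}}{2} \right\|_2^2 \right\} \tag{7}
$$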

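By performing the change of variables ω → ω − ωk in the first term:
$$
\hat{\mu}_k^{n+1} = \arg\min_{\hat{\mu}_k,\ \hat{\mu}_k = \hat{\mu}_k^{*}} \left\{ \alpha \left\| j(\omega - \omega_k) \left[ \left( 1 + \operatorname{sgn}(\omega) \right) \hat{\mu}_k(\omega) \right] \right\|_2^2 + \left\| \hat{f} - \sum_i \hat{\mu}_i + \frac{\hat{\lambda}}{2} \right\|_2^2 \right\} \tag{8}
$$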

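Exploiting the Hermitian symmetry of the real signals in the reconstruction fidelity term, both
terms of the equation can be written as half-space integrals over the non-negative frequencies:
$$
\hat{\mu}_k^{n+1} = \arg\min_{\hat{\mu}_k,\ \hat{\mu}_k = \hat{\mu}_k^{*}} \left\{ \int_0^{\infty} 4\alpha (\omega - \omega_k)^2 \left| \hat{\mu}_k(\omega) \right|^2 + 2 \left| \hat{f} - \sum_i \hat{\mu}_i + \frac{\hat{\lambda}}{2} \right|^2 d\omega \right\} \tag{9}
$$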

The solution of this quadratic optimization problem is readily found by letting the first variation
vanish for the positive frequencies:
$$
\hat{\mu}_k^{n+1} = \left( \hat{f} - \sum_{i \neq k} \hat{\mu}_i + \frac{\hat{\lambda}}{2} \right) \frac{1}{1 + 2\alpha (\omega - \omega_k)^2} \tag{10}
$$
which is clearly identified as a Wiener filtering of the current residual with signal prior 1/(ω − ωk)².
The full spectrum of the real mode is then obtained by Hermitian symmetric completion. In turn,
the mode in the time domain is obtained as the real part of the inverse Fourier transform of
this filtered signal.

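B. Minimization w.r.t. ωk
The center frequencies ωk do not appear in the reconstruction fidelity term, but only in the bandwidth
prior. The relevant problem thus reads:
$$
\omega_k^{n+1} = \arg\min_{\omega_k} \left\{ \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * \mu_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \tag{11}
$$
As before, the optimization can take place in the Fourier domain, and we end up with:
$$
\omega_k^{n+1} = \arg\min_{\omega_k} \left\{ \int_0^{\infty} 4\alpha (\omega - \omega_k)^2 \left| \hat{\mu}_k(\omega) \right|^2 d\omega \right\} \tag{12}
$$
The quadratic problem is solved as:
$$
\omega_k^{n+1} \leftarrow \frac{\int_0^{\infty} \omega \left| \hat{\mu}_k^{n+1}(\omega) \right|^2 d\omega}{\int_0^{\infty} \left| \hat{\mu}_k^{n+1}(\omega) \right|^2 d\omega} \tag{13}
$$
which puts the new ωk at the center of gravity of the corresponding mode’s power spectrum.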
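Plugging the solutions of the sub-optimizations into the ADMM Algorithm 1, and directly optimizing
in the Fourier domain where appropriate, we obtain the complete algorithm for variational mode
decomposition, summarized in Algorithm 2.

Algorithm 2 Complete optimization of VMD
Initialize $\{\hat{\mu}_k^1\}$, $\{\omega_k^1\}$, $\hat{\lambda}^1$, $n \leftarrow 0$
repeat
    $n \leftarrow n + 1$
    for k = 1 : K do
        Update µ̂k for all ω ≥ 0:
        $$
        \hat{\mu}_k^{n+1} \leftarrow \frac{\hat{f} - \sum_{i<k} \hat{\mu}_i^{n+1} - \sum_{i>k} \hat{\mu}_i^{n} + \frac{\hat{\lambda}^{n}}{2}}{1 + 2\alpha (\omega - \omega_k^{n})^2} \tag{14}
        $$
        Update ωk:
        $$
        \omega_k^{n+1} \leftarrow \frac{\int_0^{\infty} \omega \left| \hat{\mu}_k^{n+1}(\omega) \right|^2 d\omega}{\int_0^{\infty} \left| \hat{\mu}_k^{n+1}(\omega) \right|^2 d\omega} \tag{15}
        $$
    end for
    Dual ascent for all ω ≥ 0:
        $$
        \hat{\lambda}^{n+1} \leftarrow \hat{\lambda}^{n} + \tau \left( \hat{f} - \sum_k \hat{\mu}_k^{n+1} \right) \tag{16}
        $$
until convergence:
    $$
    \sum_k \left\| \hat{\mu}_k^{n+1} - \hat{\mu}_k^{n} \right\|_2^2 \Big/ \left\| \hat{\mu}_k^{n} \right\|_2^2 < \varepsilon \tag{17}
    $$

The EMD and related algorithms are briefly reviewed in this section, as these algorithms are the
basis for comparing our proposed approach for LS extraction.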
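For reference, the VMD iteration of Algorithm 2 (equations (14) to (17)) can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions and not the code used to produce the results in this report: the function name `vmd_sketch`, the default values of `alpha`, `tau`, `tol` and `max_iter`, the uniform initialization of the center frequencies, and the use of normalized frequency (cycles per sample) are choices made for the example, and the signal mirroring that practical implementations use to reduce boundary effects is omitted.

```python
import numpy as np

def vmd_sketch(f, K, alpha=2000.0, tau=0.0, tol=1e-7, max_iter=500):
    """Decompose a real 1-D signal into K modes (simplified Algorithm 2)."""
    f = np.asarray(f, dtype=float)
    N = len(f)
    f_hat = np.fft.rfft(f)               # spectrum on omega >= 0 only
    freqs = np.fft.rfftfreq(N)           # cycles/sample, in [0, 0.5]
    u_hat = np.zeros((K, f_hat.size), dtype=complex)
    omega = (np.arange(K) + 0.5) * 0.5 / K   # assumed uniform initialization
    lam_hat = np.zeros(f_hat.size, dtype=complex)
    tiny = np.finfo(float).eps           # guards against division by zero

    for _ in range(max_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            # Sum of the other modes: updated ones for i < k, old ones for i > k.
            others = u_hat.sum(axis=0) - u_hat[k]
            # Eq. (14): Wiener-type filter centered at omega_k.
            u_hat[k] = (f_hat - others + lam_hat / 2) / (
                1 + 2 * alpha * (freqs - omega[k]) ** 2)
            # Eq. (15): center of gravity of the mode's power spectrum.
            power = np.abs(u_hat[k]) ** 2
            omega[k] = np.dot(freqs, power) / (power.sum() + tiny)
        # Eq. (16): dual ascent (tau = 0 leaves the multiplier unchanged).
        lam_hat = lam_hat + tau * (f_hat - u_hat.sum(axis=0))
        # Eq. (17): summed relative change of the modes.
        change = sum(
            np.sum(np.abs(u_hat[k] - u_prev[k]) ** 2)
            / (np.sum(np.abs(u_prev[k]) ** 2) + tiny)
            for k in range(K))
        if change < tol:
            break

    # Hermitian completion and inverse transform back to the time domain.
    modes = np.fft.irfft(u_hat, n=N, axis=1)
    return modes, omega
```

Working on the one-sided spectrum returned by `rfft` mirrors the "for all ω ≥ 0" updates in Algorithm 2, and `irfft` performs the Hermitian symmetric completion implicitly. The returned modes are not sorted by center frequency.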
A. Empirical Mode Decomposition
Empirical mode decomposition (EMD) is an efficient approach for non-stationary signal processing.
EMD decomposes signals into components, called intrinsic mode functions (IMFs), which are
governed by two properties [42] as follows:

1. The extrema and zero crossing counts can differ by at most one.

2. The maximum and minimum envelopes should be symmetric around the mean envelope.

Following these rules, a number of IMFs are generated from a composite input signal. Consider
a real and stable sequence s(t), which may be divided into fine-scale details xi(t) and a residual r(t):
$$
s(t) = \sum_{i=1}^{N} x_i(t) + r(t) \tag{18}
$$

An iterative sifting process is used to compute the modes from a signal s(t). The sifting procedure
to obtain the i-th IMF of signal s0 (t) can be outlined as follows:
1. The local maxima and minima of s0 (t) are computed.

2. The local maxima and minima are interpolated and the maximum envelope emax (t) and min-
imum envelope emin (t) are obtained.

3. The local mean is computed: m(t) ← [emax (t) + emin (t)]/2

4. Extract the detail: d(t) = s0 (t) − m(t)

5. If d(t) satisfies the stopping criterion, it is defined as an IMF, xi(t) ← d(t); otherwise set
s0(t) ← d(t) and repeat the process from step 1.
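Steps 1 to 4 of the sifting procedure can be illustrated with the short sketch below. It is only a minimal example under assumptions: the helper name `sift_once` is hypothetical, cubic splines are assumed for the envelope interpolation, the envelopes are simply extrapolated at the signal ends, and the stopping criterion of step 5 is left to the caller.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def sift_once(s, t):
    """One sifting pass: returns the detail d(t) and the local mean m(t)."""
    # Step 1: indices of the strict local maxima and minima.
    max_idx = argrelextrema(s, np.greater)[0]
    min_idx = argrelextrema(s, np.less)[0]
    if len(max_idx) < 2 or len(min_idx) < 2:
        raise ValueError("too few extrema to build the envelopes")
    # Step 2: upper and lower envelopes by cubic-spline interpolation.
    e_max = CubicSpline(t[max_idx], s[max_idx])(t)
    e_min = CubicSpline(t[min_idx], s[min_idx])(t)
    # Step 3: local mean of the two envelopes.
    m = (e_max + e_min) / 2
    # Step 4: the detail is what remains after removing the local mean.
    return s - m, m
```

Repeating `sift_once` on the returned detail until a chosen stopping criterion is met yields one IMF; subtracting that IMF from the signal and sifting again yields the next.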

B. Ensemble EMD

To remove the mode mixing problem in the standard EMD, the ensemble EMD (EEMD) has been
proposed by Wu and Huang [43]. The ensemble empirical mode decomposition is specified as:
1. Input signal is corrupted by AWGN.

2. The AWGN corrupted input signal is decomposed into Intrinsic Mode Functions.

3. Steps 1 and 2 are repeated with different AWGN series.

4. The ensemble means of the modes are computed as the final result.
The iteration is terminated when the number of trials in the ensemble reaches a given bound J:
$$
\bar{s}_i(t) = \frac{1}{J} \sum_{j=1}^{J} \left[ s_i(t) + \sigma_n r_j(t) \right] \tag{19}
$$
where si(t) + σn rj(t) is the j-th trial of the i-th IMF in the noise-corrupted signal, σn is the standard
deviation of the added noise, and rk(t) is the residual after extracting the first k IMF components.
As outlined above, the noise component incorporated in each trial is different; however, the
noise is averaged out by taking the ensemble mean over all trials. For accurate results, the bound J
should be sufficiently large.
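The ensemble averaging of steps 1 to 4 can be sketched as below. The sketch is deliberately generic and rests on assumptions: `ensemble_mean` is a hypothetical helper, and `decompose` stands for any decomposition routine that returns an array of shape (number of modes, signal length) with the same number of modes on every trial, which a practical EMD does not guarantee.

```python
import numpy as np

def ensemble_mean(signal, decompose, n_trials, noise_std, rng):
    """Average the modes of n_trials noise-corrupted copies of the signal."""
    total = None
    for _ in range(n_trials):
        # Step 1: corrupt the input with a fresh AWGN realization.
        noisy = signal + noise_std * rng.standard_normal(signal.shape)
        # Step 2: decompose the corrupted signal into modes.
        modes = np.asarray(decompose(noisy))
        # Step 3: accumulate over the different noise series.
        total = modes if total is None else total + modes
    # Step 4: the ensemble mean of the modes is the final result.
    return total / n_trials
```

Because the added noise is independent across trials, its contribution to the mean shrinks roughly as 1/√J, which is why a sufficiently large ensemble is needed.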

2.1.2 Proposed method for Heart and Lung Sound Separation


A schematic representation of the proposed scheme is presented in Fig. 1, which is subsequently
explained.

2.1.3 Preprocessing
The combined HLS signal sampled at 44.1 kHz is treated as input to our proposed method. This
combined signal is decimated by a factor of 50 reducing its sampling rate to 882 Hz. This decimated
signal is then normalized.
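A possible realization of this preprocessing step is sketched below. The function name `preprocess_hls`, the two-stage decimation, and peak normalization are assumptions; the report specifies only the overall factor of 50 and that the signal is normalized.

```python
import numpy as np
from scipy.signal import decimate

def preprocess_hls(x):
    """Decimate a 44.1 kHz recording by 50 (to 882 Hz) and peak-normalize it."""
    # Two stages (5 x 10): SciPy recommends repeated calls for factors above 13.
    y = decimate(x, 5)
    y = decimate(y, 10)
    # Assumed normalization: scale so that the largest magnitude equals 1.
    return y / np.max(np.abs(y))
```

`decimate` applies an anti-aliasing low-pass filter before downsampling, so frequency content above the new Nyquist frequency of 441 Hz is suppressed rather than folded back.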

Figure 1: Schematic of the proposed method for Heart and Lung Sound Separation.

2.1.4 Mode Decomposition


The preprocessed composite heart lung signal (HLS) is then decomposed into 7 intrinsic mode
functions (IMFs) using Variational Mode Decomposition (VMD).

2.1.5 Combination of slowest mode


The three most slowly varying IMFs are combined to obtain the HS, and the combination of the
next three faster modes yields the LS. Further, if any portion of the HSs remains outside the
LS cycle, it is removed using thresholding to obtain the final separated lung sounds.
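The grouping of modes described above can be written compactly as follows. The sketch assumes that "slowly varying" is judged by the center frequency of each mode; the helper name `combine_modes` is hypothetical, and the final thresholding step is not shown.

```python
import numpy as np

def combine_modes(modes, center_freqs):
    """Sum the three slowest modes into HS and the next three into LS."""
    # Order the modes from lowest to highest center frequency.
    order = np.argsort(center_freqs)
    sorted_modes = np.asarray(modes)[order]
    heart = sorted_modes[:3].sum(axis=0)    # three slowest modes
    lung = sorted_modes[3:6].sum(axis=0)    # next three faster modes
    return heart, lung
```

With 7 modes, the fastest mode is left out of both sums in this sketch.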

We have combined 10 HSs and 5 LSs with all possible combinations that resulted in 50 mixed
signals. The proposed method is applied on these 50 mixed signals after decimation which yielded
a total of 100 separated HSs and LSs, 50 each.

2.2 Segmentation and Detection of S1 and S2 using Variational Mode Decomposition


The Phonocardiogram (PCG) is analyzed by employing Variational Mode Decomposition (VMD)
combined with Shannon Energy (SE) feature to locate the presence of heart sounds, and detect them
from the processed data, forming the proposed Heart Sound Segmentation (HSS) scheme, namely
HSS-VMD/SE. The performance of the algorithm has been evaluated using 53 cardiac periods
obtained from 10 randomly selected phonocardiograph records. The simulation results have shown
nearly 90 percent correct result (for less than 5 ms error), paving the way for further investigation
of the diagnostic aspects of cardiac sounds in the daily clinical practice.

2.2.1 Proposed method for S1 and S2 Segmentation and Detection


A schematic representation of the proposed scheme is presented in Fig. 2, which is subsequently
explained.

Figure 2: Schematic of the proposed method.

2.2.2 Preprocessing
A median filter followed by a low-pass filter with a cut-off frequency of 800 Hz was applied, since
the frequency content of heart sound signals lies predominantly below 800 Hz.
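This preprocessing chain could be sketched as below. Only the 800 Hz cut-off comes from the text; the median kernel size, the Butterworth design, its order, and the zero-phase filtering are assumptions chosen for illustration, and `preprocess_pcg` is a hypothetical name.

```python
import numpy as np
from scipy.signal import butter, medfilt, sosfiltfilt

def preprocess_pcg(x, fs, cutoff_hz=800.0, kernel=3, order=4):
    """Median-filter a PCG signal, then low-pass it at cutoff_hz."""
    # Median filter removes isolated spikes (kernel size is an assumption).
    despiked = medfilt(x, kernel_size=kernel)
    # Low-pass Butterworth filter in second-order sections for stability.
    sos = butter(order, cutoff_hz, btype="low", fs=fs, output="sos")
    # Zero-phase filtering keeps the timing of the heart sounds unchanged.
    return sosfiltfilt(sos, despiked)
```

The sampling rate `fs` must exceed 1600 Hz for an 800 Hz cut-off to be valid.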

2.2.3 VMD and Selection of Mode
The preprocessed PCG signal was then decomposed into 7 modes using VMD. Each of the 7 modes
obtained was analyzed for all 10 PCG signals, and the average normalized S1 detection error
was calculated. It was observed that the 3rd mode exhibited the minimum error among all 7
modes. Consequently, we propose the sole use of the 3rd mode, µ3, for further analysis in order
to detect S1 and S2.

2.2.4 Normalization and Low Pass Filtering


The absolute value of the selected 3rd mode was normalized and then passed through a tenth-order
Butterworth low-pass filter with a cut-off frequency of 200 Hz, since the first heart sound (S1) of
a cardiac cycle has frequency content below 200 Hz. The resulting signal was used for the further
analysis.

2.2.5 Shannon Energy and Thresholding


The Shannon Energy of the filtered signal is computed, and a threshold is applied to the resulting
Shannon Energy plot in order to find the location of the first heart sound, S1. The threshold is set
based on the fact that the processed signal has higher energy at S1 locations than at S2 locations
and other components such as murmurs. Fig. 3c and Fig. 3d show the Shannon Energy of mode 3
before and after thresholding, respectively.
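The thresholding step can be illustrated with a minimal sketch. It assumes the commonly used sample-wise Shannon energy, -x² log(x²), applied to an envelope normalized to [0, 1]; the report does not state its exact formula or threshold value, so the function names, the toy envelope and the half-of-maximum threshold below are illustrative assumptions only.

```python
import math


def shannon_energy(x):
    """Sample-wise Shannon energy, -x^2 * log(x^2), taking 0 * log(0) as 0."""
    out = []
    for v in x:
        p = v * v
        out.append(-p * math.log(p) if p > 0.0 else 0.0)
    return out


def apply_threshold(energy, threshold):
    """Keep samples at or above the threshold and zero the rest."""
    return [e if e >= threshold else 0.0 for e in energy]


# Toy normalized envelope: one strong lobe (S1-like) and one weak lobe (S2-like).
envelope = [0.0, 0.05, 0.6, 0.05, 0.0, 0.02, 0.15, 0.02, 0.0]
energy = shannon_energy(envelope)

# Hypothetical rule: keep only samples above half of the maximum energy.
kept = apply_threshold(energy, 0.5 * max(energy))
surviving = [i for i, e in enumerate(kept) if e > 0.0]
print(surviving)
```

In this toy case only the strong lobe survives the threshold, which mirrors the idea that S1 carries more energy than S2 in the processed signal.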

Figure 3: Plots for Shannon Energy before and after thresholding. (a) Error variation across
different modes, µk. (b) Plot of |µ3| after normalization and filtering. (c) Shannon Energy plot of
|µ3|. (d) Shannon Energy plot of |µ3| after thresholding.

2.2.6 Post processing for location of S1 and S2


A number of peaks are observed in the plot of the thresholded Shannon Energy of the 3rd mode,
µ3, of the PCG signal. To predict the S1 locations, we discard the erroneous peaks using the fact
that the time between two consecutive S1 sounds must be at least 300 ms. The prominent peak
between two consecutive S1 sounds, together with the information that the systolic period (time
from S1 to S2) is shorter than the diastolic period (time from S2 to the next S1), is then used to
locate S2.
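The two timing rules above can be sketched as follows. This is only one possible reading of the post-processing: the greedy strongest-first pruning, the function names and the toy candidate lists are assumptions made for illustration, not the procedure actually used in this work.

```python
def prune_s1_candidates(candidates, fs, min_gap_s=0.300):
    """Keep candidates that are at least min_gap_s apart, strongest first.

    candidates is a list of (sample_index, energy) pairs.
    """
    min_gap = round(min_gap_s * fs)
    kept = []
    for idx, _ in sorted(candidates, key=lambda c: c[1], reverse=True):
        if all(abs(idx - k) >= min_gap for k in kept):
            kept.append(idx)
    return sorted(kept)


def pick_s2(s1_a, s1_b, candidates):
    """Pick the strongest candidate in the first half of the cycle.

    Systole shorter than diastole means S2 lies before the midpoint
    between two consecutive S1 locations.
    """
    midpoint = (s1_a + s1_b) / 2
    inside = [(i, e) for i, e in candidates if s1_a < i < midpoint]
    return max(inside, key=lambda c: c[1])[0] if inside else None


# Toy data at a hypothetical 1 kHz rate: three true S1 peaks plus two spurious ones.
s1_candidates = [(100, 0.9), (180, 0.3), (950, 0.85), (1020, 0.2), (1800, 0.8)]
s1_locations = prune_s1_candidates(s1_candidates, fs=1000)

# Lower-energy peaks found between the first two S1 locations.
s2_location = pick_s2(100, 950, [(400, 0.5), (700, 0.1)])
print(s1_locations, s2_location)
```

Visiting the candidates from strongest to weakest means that, when two peaks fall within 300 ms of each other, the weaker one is the one discarded.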

2.3 Detection of the Third Heart Sound, S3 using VMD


2.3.1 Smoothed Pseudo Wigner-Ville Distribution
In this section, a brief explanation of the Smoothed Pseudo Wigner-Ville Distribution (SPWVD) is
provided, following [44], to build a background for time-frequency analysis of non-stationary
signals. Cohen showed that various types of time-frequency distribution (TFD) can be derived from
the Wigner-Ville distribution (WVD), and gave the general formulation of this class of TFDs as
follows:
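\[
M(t, f) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}
p\left(v + \frac{\tau}{2}\right) p^{*}\left(v - \frac{\tau}{2}\right) \psi(\tau, w)\,
e^{-j2\pi(tw + \tau f - vw)} \, dv \, dw \, d\tau \tag{20}
\]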

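which can be rewritten in the form:

\[
M(t, f) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}
\Lambda(m - t, \beta - f)\, W_p(m, \beta) \, dm \, d\beta \tag{21}
\]

where

\[
\Lambda(t, f) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}
\psi(\tau, w)\, e^{-j2\pi(tw + \tau f)} \, d\tau \, dw \tag{22}
\]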

As a special case, the Cohen's class distribution reduces to the WVD if the kernel function
ψ(τ, w) = 1, as shown below:

\[
W_p(t, f) = \int_{-\infty}^{\infty}
p\left(t + \frac{v}{2}\right) p^{*}\left(t - \frac{v}{2}\right) e^{-j2\pi v f} \, dv \tag{23}
\]

The function Λ can be viewed as a smoothing function, so that a Cohen's class distribution can be
regarded as a smoothed Wigner-Ville distribution. For example, the short-time Fourier spectrogram
is related to the WVD by:

\[
S(t, f) = \left| STFT_p(t, f) \right|^{2}
= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}
W_y(m - t, \beta - f)\, W_p(m, \beta) \, dm \, d\beta \tag{24}
\]

where W_y is the WVD of the analysis window y.

The resolution trade-off between the temporal and spectral domains of the spectrogram follows
directly from the above relation, since a single window determines the smoothing in both domains.
However, if Λ is defined as the product of separate windows in the time and frequency domains:

\[
\Lambda(t, f) = r(t)\, Y(-f) \tag{25}
\]

then the definition of the W-V distribution changes to:

\[
SPWV_z(t, f) = \int_{-\infty}^{\infty} y(\tau)
\left[ \int_{-\infty}^{\infty} r(m - t)\, z\left(m + \frac{\tau}{2}\right)
z^{*}\left(m - \frac{\tau}{2}\right) dm \right] e^{-j2\pi f \tau} \, d\tau
\]

which is the definition of the SPWVD, capable of suppressing the cross-term artifacts present in
the W-V distribution.

2.3.2 Proposed Method for S3 Detection


This work extends to the detection of the third HS, S3. The detection of S3 is more challenging
than the detection of the first and second HSs, S1 and S2, because of its relatively low amplitude.
Fig. 4 illustrates the flow diagram of the proposed signal processing for detection of S3. First,
the abnormal HSs that contain S3 are obtained from different publicly available HS databases
[45–47] and are preprocessed by extracting one cardiac cycle and then decimating it by a factor of
10. Decimation reduces the sampling rate from 44.1 kHz to 4.41 kHz for the Michigan [47] and
Washington [45] databases containing S3, and from 11.025 kHz to 1.1025 kHz for the Littman
database containing S3 [46]. The heart cycles are extracted from the abnormal HS signals based on
the consecutive locations of S1. Then, VMD is used to split the preprocessed cardiac cycle signal
into 7 modes. The first 3 slow modes are combined to obtain a reconstructed cardiac cycle, suitable
for localization of the lower frequency components corresponding to S3. After that, time-frequency
analysis is carried out in order to localize S1, S2 and S3: the SPWVD is computed, followed by the
normalized column summation of the obtained TFR. If $A \in \mathbb{R}^{m \times n}$ is the TFR
matrix obtained using SPWVD, then the column sum vector $\mathbf{s}$ is defined as
$\mathbf{s} = A^{T}\mathbf{1}$, which is further analyzed for the S3 detection. The SPWVD, with an
independent separable kernel function for the time and frequency smoothing, has been implemented
using the Time-Frequency Toolbox in MATLAB. The locations of S1, S2 and S3 in the time domain
are found from the peaks obtained. Finally, S3 is detected based on a threshold value relative to
the higher peaks corresponding to S1 and S2, and by exploiting the time difference between S2 and
S3, which is usually more than 110 ms. A combined HLS signal containing S3 is chosen to
demonstrate, in Fig. 5, the output at each stage of the proposed flow diagram.

Figure 4: Flow diagram of the proposed method for detection of S3.

Figure 5: (a) A reconstructed HS signal obtained by summing the first 3 slower modes, (b) an
extracted cardiac cycle, (c) TFR of the cardiac cycle, (d) plot of the normalized column sum.
Labels: “NA” refers to normalized amplitude; “NR” stands for number of rows containing
frequency components obtained from TFR; “NT” refers to the plot obtained by the sum of
normalized columns of TFR. The horizontal axes show the number of samples.
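The column summation of the TFR and the S2-to-S3 timing rule described in this section can be sketched in a few lines. The sketch assumes that the TFR is available as a list of rows (frequency bins) by columns (time samples) and that "normalized" means scaling by the maximum value; both the function names and that normalization are assumptions, since the original implementation relies on the MATLAB Time-Frequency Toolbox.

```python
def column_sum(tfr):
    """s = A^T 1: add all frequency rows for each time column."""
    return [sum(col) for col in zip(*tfr)]


def normalise(values):
    """Scale by the maximum (assumed normalization) so the largest peak is 1."""
    peak = max(values)
    return [v / peak for v in values] if peak > 0 else list(values)


def is_plausible_s3(s2_index, candidate_index, fs, min_delay_s=0.110):
    """True if the candidate peak follows S2 by more than min_delay_s seconds."""
    return (candidate_index - s2_index) / fs > min_delay_s


# Toy 2 x 3 matrix standing in for a TFR (rows: frequency, columns: time).
toy_tfr = [[1, 0, 2],
           [3, 0, 1]]
s = normalise(column_sum(toy_tfr))
print(s)

# At the decimated rate of 4410 Hz, 110 ms corresponds to about 485 samples.
print(is_plausible_s3(1000, 1600, fs=4410), is_plausible_s3(1000, 1300, fs=4410))
```

A candidate peak that is both well below the S1 and S2 peaks in the normalized column sum and late enough after S2 would then be retained as S3.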

2.4 Detection of Cardiac Murmurs using Peak Envelope Bandwidth
This work extends to the detection of murmurs in phonocardiogram signals. The diagnosis of
murmurs is a challenging task, especially when the heart sound signals are corrupted by lung
sounds and background noise. Fig. 6 illustrates the flow diagram of the proposed noise-robust
technique for murmur detection.

Figure 6: Flow diagram of the proposed method for detection of cardiac murmurs.

2.4.1 Preprocessing
The input PCG signals are obtained from different publicly available databases, namely the
Washington [45], Littman [46] and Michigan [47] databases, and are preprocessed by cardiac cycle
segmentation followed by low-pass filtering with a cutoff frequency of 800 Hz. The cutoff frequency
was chosen based on the observation that heart sound signals predominantly contain frequencies
below 800 Hz.

2.4.2 Cardiac Cycle Segmentation


Each of the preprocessed PCG signals was then decomposed into 7 intrinsic modes with
characteristic central frequencies using Variational Mode Decomposition (VMD). The peaks
corresponding to the first (S1) and second (S2) heart sounds are localized using Shannon energy
thresholding, and the signals are segmented into cardiac cycles using these peak locations and
timing information. The threshold is set based on the fact that the processed signal has higher
energy at S1 locations than at S2 locations and other components such as murmurs. A number of
peaks are observed in the plot of the thresholded Shannon Energy of the 3rd mode, µ3, of the PCG
signal. To predict the S1 locations, we discard the erroneous peaks using the fact that the time
between two consecutive S1 sounds must be at least 300 ms. The prominent peak between two
consecutive S1 sounds, together with the information that the systolic period (time from S1 to S2)
is shorter than the diastolic period (time from S2 to the next S1), is then used to locate S2.

2.4.3 Feature Extraction for Murmur Detection


The intermediate signal component between consecutive S1 and S2 peaks is then decomposed into
intrinsic mode functions using Variational Mode Decomposition (VMD), and the fastest varying
mode is used for further analysis. Once the peak envelope of the fastest varying mode is obtained,
it is observed that the noise envelope oscillates faster than the murmur envelope, and that its
frequency spectrum is spread over a larger frequency band. Based on this observation, the
bandwidth of the peak envelope is calculated and used as a feature for classification.

2.4.4 Murmur Diagnosis using Thresholding


The average peak envelope bandwidth for murmur is lower than that of noise, so murmur and noise
can be separated by applying a threshold to the peak envelope bandwidth. Experimentally, the
bandwidth of the murmur envelope has been observed to lie below 0.1 rad/sample. Thus, if the peak
envelope bandwidth of an unknown signal is below 0.1 rad/sample, we classify it as murmur;
otherwise it is classified as background noise.
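The decision rule above amounts to a single comparison. A minimal sketch follows, assuming the peak envelope bandwidth has already been computed in rad/sample; the function name and the two example bandwidth values are illustrative.

```python
MURMUR_BW_THRESHOLD = 0.1  # rad/sample, the experimentally observed limit quoted above


def classify_segment(peak_envelope_bw):
    """Label a segment from its peak envelope bandwidth (rad/sample)."""
    return "murmur" if peak_envelope_bw < MURMUR_BW_THRESHOLD else "noise"


# Illustrative bandwidth values, one on each side of the threshold.
labels = [classify_segment(bw) for bw in (0.045, 0.265)]
print(labels)
```

Because the classifier needs only one feature and one comparison per segment, its computational cost is negligible compared with the decomposition step that precedes it.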

3 Results and Discussion


3.1 Separation of Heart and Lung sounds
3.1.1 Simulation Results
Fig. 7 shows a randomly selected mixed signal, and the separated heart and lung sound signals are
shown in Fig. 8 and Fig. 9 respectively. In each plot, the x-axis represents the number of samples
and the y-axis denotes the normalized amplitude of the signal.

Figure 7: Mixed signal.
Figure 8: Separated HS.
Figure 9: Separated LS.

The reason for decomposing the input composite signal into exactly 7 modes is given by the
results plotted in Fig. 10a to Fig. 10d. The mean square error (MSE) between the original signal
and the reconstructed signal, obtained by summing the modes, reduces as the number of modes
increases, but at the cost of CPU execution time, as illustrated in Fig. 10a. Likewise, the
correlation coefficient between the original signal and the reconstructed signal increases with the
number of modes used for the reconstruction, as shown in Fig. 10b. However, both the CPU
execution time and the memory requirement grow with the number of decomposed modes, as
depicted in Fig. 10c and Fig. 10d respectively. Considering these parameters, we have chosen 7
modes to decompose the combined HLS signal. Although 7 modes give a marginally lower
correlation coefficient and a slightly higher MSE than a larger number of modes, they require less
computational time and memory, which is valuable for real-time applications. Using fewer than 7
modes was rejected because of the comparatively larger MSE and smaller correlation coefficient.
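The two reconstruction-quality measures used in this comparison can be computed as in the sketch below. It does not perform VMD: three synthetic sinusoids stand in for decomposed modes, and all names and values are assumptions chosen only to show how MSE and the correlation coefficient change as more modes are summed.

```python
import math


def reconstruct(modes):
    """Sum the given modes sample by sample."""
    return [sum(samples) for samples in zip(*modes)]


def mse(x, y):
    """Mean square error between two equal-length signals."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def corrcoef(x, y):
    """Pearson correlation coefficient between two equal-length signals."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


N = 200
# Synthetic stand-ins for modes (NOT produced by VMD): (amplitude, cycles per window).
specs = [(1.0, 2), (0.5, 7), (0.25, 19)]
modes = [[a * math.sin(2 * math.pi * k * n / N) for n in range(N)] for a, k in specs]
signal = reconstruct(modes)

errors, corrs = [], []
for count in range(1, len(modes) + 1):
    approx = reconstruct(modes[:count])
    errors.append(mse(signal, approx))
    corrs.append(corrcoef(signal, approx))
    print(count, round(errors[-1], 5), round(corrs[-1], 4))
```

In this toy setting the MSE falls and the correlation rises with every added mode, which is the same qualitative trend reported for Fig. 10a and Fig. 10b.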

3.1.2 Lung Sound Separation Error Analysis


The performance of the proposed algorithm was evaluated statistically, in terms of the mean µ(ε)
and standard deviation σ(ε) of the Power Spectral Density (PSD) error ε(f) over the range 0 to
500 Hz, expressed in decibels. The power spectral densities of the original LS, $P_l(f)$, and of
the extracted LS, $\hat{P}_l(f)$, are calculated using the following equation:

\[
P(f) = 10\log_{10}\left(A^{2}(f)\right) \tag{26}
\]

where A(f) denotes the magnitude response at frequency f. The power spectral error ε(f) is then
calculated as:

\[
\varepsilon(f) = P_l(f) - \hat{P}_l(f) \tag{27}
\]

Figure 10: (a) MSE vs CPU execution time (CPU ET, in s) with variation in the number of modes,
(b) correlation coefficient vs CPU ET with variation in the number of modes, (c) CPU execution
time vs number of modes and (d) memory requirement (MB) vs number of modes. Here M denotes
the number of decomposed modes.
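Eqs. (26) and (27) translate directly into code. The sketch below assumes that the magnitude responses of the original and extracted LS are already available on a common frequency grid; the function names, the toy magnitudes and the use of the population standard deviation are assumptions, since the report does not specify these details.

```python
import math
from statistics import mean, pstdev


def psd_db(magnitude):
    """Eq. (26): P(f) = 10 log10(A^2(f)); magnitudes must be strictly positive."""
    return [10.0 * math.log10(a * a) for a in magnitude]


def psd_error_stats(freqs_hz, mag_original, mag_extracted, f_max=500.0):
    """Mean and standard deviation of Eq. (27) over 0 Hz to f_max."""
    p_l = psd_db(mag_original)
    p_hat = psd_db(mag_extracted)
    err = [a - b for f, a, b in zip(freqs_hz, p_l, p_hat) if 0.0 <= f <= f_max]
    return mean(err), pstdev(err)


# Toy magnitude responses on a coarse grid; the 600 Hz bin falls outside the band.
freqs = [0.0, 100.0, 200.0, 300.0, 400.0, 500.0, 600.0]
original = [2.0] * 7
extracted = [2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0]

mu, sigma = psd_error_stats(freqs, original, extracted)
print(round(mu, 4), round(sigma, 4))
```

Halving the magnitude in a bin changes the PSD there by about 6.02 dB, so the toy error alternates between 0 dB and 6.02 dB inside the evaluated band.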

3.1.3 Comparison of LS Separation Results with Existing Schemes


The numerical results in Table 1, for a total of 50 mixed signals, show that VMD gives the lowest
PSD error in separating LS from mixed signals containing LS and HS, followed by NAM-EMD
(Noise-Assisted M-EMD), M-EMD (Multivariate EMD), EEMD (Ensemble Empirical Mode
Decomposition), EMD (Empirical Mode Decomposition) and the RLS (Recursive Least Squares)
algorithm. LS extraction based on our proposed method using VMD achieves the lowest mean
(1.63 dB) and standard deviation (3.29 dB) of PSD error, owing to its robustness against non-linear
and non-stationary signals. Validation by medical experts on the database of 100 separated LS and
HS shows that the average score for LS clarity is around 51 percent and that some HS interference
remains in the separated LS. However, the separated LSs were easily identified and their
clinically meaningful information remains intact.

Table 1: Numerical analysis of PSD error in terms of mean and standard deviation, in dB.
Methods         RLS           EMD           EEMD          M-EMD         NAM-EMD       Proposed
LS Separation   5.58 ± 5.76   2.91 ± 5.77   2.73 ± 5.06   2.41 ± 4.97   1.89 ± 3.68   1.63 ± 3.29

3.2 S1 and S2 Detection results


To evaluate the performance of the proposed method, it was applied to 10 randomly selected
PCG signals from the Peter J. Bentley Heart Sound Challenge Database. For each of the signals,
the Shannon Energy of the proposed 3rd mode, µ3, is calculated and then thresholded. The
post-processing analysis, based on the higher energy associated with S1, is then used to predict
the locations of S1.
As a demonstration, the results of applying the proposed scheme to the PCG signal of subject 2
are presented. Fig. 11a, with the number of samples on the X axis and the normalized amplitude
on the Y axis, shows the original PCG signal of this subject. Fig. 11b, Fig. 11c and Fig. 11d show
the actual and the predicted locations of the first, second and third S1 sounds of the PCG signal.
The red dotted lines in Fig. 11e represent the actual locations of all the S1 sounds in the PCG
signal of subject 2, and the green dotted lines in Fig. 11f depict the predicted locations of all the
S1 sounds of the same subject.

Figure 11: (a) Phonocardiogram signal of a randomly chosen subject (subject 2). (b) Comparison
between estimated and actual location of the first S1. (c) Comparison between estimated and
actual location of the second S1. (d) Comparison between estimated and actual location of the
third S1. (e) Actual locations of S1 sounds in the PCG signal. (f) Predicted locations of S1 sounds
in the PCG signal. Red dotted lines represent actual S1 locations, while green dotted lines
represent estimated ones.

To quantify the performance, we define the following two error measures:

3.2.1 Average Absolute Error (E1 )


The average absolute error (E1) is defined as

\[
E_1 = \frac{1}{N}\sum_{n=1}^{N} \left| p(n) - q(n) \right| \tag{28}
\]

where n = 1, 2, ..., N; N denotes the total number of occurrences of S1 in the signal; and p(n)
and q(n) denote the actual and the estimated locations (sample numbers) of the n-th S1 in the
signal, respectively. Thus, E1 is a measure of the average number of samples by which the
estimated locations of S1 differ from the actual ones.

3.2.2 Average Detection Time Error (E2 )


The average detection time error (E2) is defined as:

\[
E_2 = \frac{E_1}{F_s} \tag{29}
\]
where E1 represents the average absolute error and Fs is the sampling frequency of the signal,
which is 44100 Hz. Thus, E2 is a measure of the average error between the actual and the predicted
S1 locations in the time domain. The above errors were calculated for all the 10 signals and are
presented in Table 2. The number of S1 occurrences in each of the 10 chosen signals and their
distribution across different detection time error ranges are given in Table 3. It is evident from
Table 3 that the proposed scheme locates the first heart sound, S1, within a maximum error range
of 40 ms, with almost 90 percent of predictions having less than 5 ms of error in the time domain.
Tables 4 and 5 report the time error analysis of S2 detection for two randomly chosen PCG signals
(subjects 2 and 8, respectively). The accuracy of detection of heart sounds increases as the time
error threshold increases. The achieved accuracy is over 95 percent for a time error tolerance of
35 ms.

Table 2: Error calculation in terms of number of samples and corresponding time.
Signal   Avg. abs. error (no. of samples)   Avg. detection error (ms)
S1       18.25                              0.4
S2       7.2                                0.16
S3       24                                 0.54
S4       95.4                               2.16
S5       9154                               207
S6       8.4                                0.19
S7       13.33                              0.3
S8       530.66                             12.03
S9       15.83                              0.359
S10      43.88                              0.994

Table 3: Distribution of S1 predictions across different time error ranges.
Signal   S1 occurrences   0-1 ms   1-5 ms   5-10 ms   10-40 ms
S1       4                4        0        0         0
S2       5                5        0        0         0
S3       5                4        1        0         0
S4       5                2        2        1         0
S5       6                2        0        0         4
S6       5                5        0        0         0
S7       6                5        1        0         0
S8       6                2        2        0         2
S9       6                6        0        0         0
S10      6                4        2        0         0
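The two error measures of Eqs. (28) and (29) can be computed as in the following sketch. The S1 locations are hypothetical values, chosen so that E1 equals 18.25 samples (the value listed for the first signal in Table 2), which makes the conversion to roughly 0.41 ms at 44100 Hz easy to check; the function names are likewise illustrative.

```python
def average_absolute_error(actual, estimated):
    """Eq. (28): mean absolute difference between S1 locations, in samples."""
    if len(actual) != len(estimated):
        raise ValueError("actual and estimated must have the same length")
    return sum(abs(p - q) for p, q in zip(actual, estimated)) / len(actual)


def average_detection_time_error(e1_samples, fs_hz):
    """Eq. (29): E2 = E1 / Fs, in seconds."""
    return e1_samples / fs_hz


FS = 44100  # sampling frequency in Hz, as stated in the text

# Hypothetical actual and estimated S1 locations (sample numbers).
actual = [1000, 36000, 71000, 106000]
estimated = [1010, 35980, 71025, 106018]

e1 = average_absolute_error(actual, estimated)
e2_ms = 1000.0 * average_detection_time_error(e1, FS)
print(e1, round(e2_ms, 3))
```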

3.3 S3 Detection Results


A typical heart cycle is selected to illustrate the proposed technique for detection of S3 in both
noiseless and noisy situations, as shown in Fig. 13a and Fig. 13b respectively. The clean signal
containing S3, shown in Fig. 13a, is corrupted by adding white Gaussian noise to obtain the noisy
cardiac cycle of Fig. 13b at an SNR level of 0 dB. As a result, S3 is not visually clear in the noisy
signal. In both Fig. 13a and Fig. 13b, subplot (a) shows a preprocessed cardiac cycle containing
S3, (b) the cardiac cycle obtained by combining the first 3 slower modes from the VMD technique,
(c) the TFR of the reconstructed cardiac cycle obtained using the SPWVD, in terms of columns and
rows, and (d) the peaks generated from the sum of the normalized columns of the TFR. The
presence of the lowest peak among the other three higher peaks in subplot (d) of both Fig. 13a and
Fig. 13b confirms the presence of S3, based on the threshold and timing information of the HS
related to the peaks. The proposed method is able to detect S3 when its average normalized
amplitude is approximately 14.1% of the normalized amplitude of the highest peak, for all the 40
cardiac cycles. The proposed method is compared with other existing methods on the same signals.
The proposed scheme is better than the existing scheme based on HVD [25], which requires on
average 16.13% of normalized amplitude of S3 relative to the normalized amplitude of the peak
value present in the cardiac cycle, followed by ST, STFT and HHT, as shown in Table 6 for
noiseless conditions. Table 6 also shows that the amplitude threshold level |S3|/|S1| required to
detect S3 becomes slightly higher in the noisy situation (SNR = 10 dB) for all these methods. The
SNR level of 10 dB is chosen for the noisy comparison because all these methods are able to detect
S3 clearly for all the 40 cardiac cycles at this minimum SNR or above. The proposed method also
performs better in detecting S3 at a fixed threshold level (|S3|/|S1| = 0.05) in noisy situations,
as depicted in Table 7 for 40 cardiac cycles. At SNR = 0 dB, none of the existing methods detects
S3 in all 40 cardiac cycles, whereas the proposed method using the VMD detects S3 clearly for all
the signals, showing its robustness to noise. A cardiac signal containing S3, corrupted by white
Gaussian noise (i.e. a noisy signal at SNR = −5 dB), is chosen to demonstrate the performance of
the different methods visually, namely STFT, ST, HHT, HVD and the VMD, as shown in Fig. 14.
It is clearly observed that the proposed VMD-based method performs better in noisy conditions.

S2 occurrences Detection time error
S2 occurrences Detection time error
First S2 1.2 ms
First S2 0.137 ms
Second S2 2.65 ms
Second S2 1.179 ms
Third S2 2.22 ms
Third S2 0.181 ms
Fourth S2 3.08 ms
Fourth S2 5.83 ms
Fifth S2 0.589 ms
Table 4: Error analysis of S2 detection of PCG (subject 2).
Table 5: Error analysis of S2 detection of PCG (subject 8).

Figure 12: Error Analysis.

Figure 13: S3 detection in noiseless and noisy HS signal. Sub-figure (a) shows the steps of the S3
detection in the noiseless case and sub-figure (b) in the noisy case (0 dB). In each: (a) a
preprocessed cardiac cycle containing S3, (b) reconstruction by the first 3 slower modes, (c) TFR
of the reconstructed cardiac cycle, (d) plot of the normalized column sum. Labels: “NA” refers to
normalized amplitude; “NR” stands for the number of rows containing frequency components
obtained from TFR; “NT” refers to the plot obtained by the sum of normalized columns of TFR.
The horizontal axes show the number of samples.

Table 6: Detection of S3 in terms of |S3|/|S1| in noiseless and noisy conditions (SNR = 10 dB) for
40 cardiac cycles.
|S3|/|S1|    STFT     ST       HHT      HVD      VMD
Noiseless    0.3514   0.3428   0.4257   0.1613   0.1410
Noisy        0.4738   0.4133   0.5276   0.1924   0.1842

Table 7: S3 detection in noisy signal at |S3|/|S1| = 0.05.
SNR (dB)   STFT    ST      HHT     HVD     VMD
-10        0/40    0/40    0/40    1/40    10/40
-5         0/40    6/40    6/40    9/40    34/40
0          5/40    14/40   15/40   32/40   40/40
5          17/40   32/40   34/40   40/40   40/40
10         40/40   40/40   40/40   40/40   40/40
15         40/40   40/40   40/40   40/40   40/40
≥ 20       40/40   40/40   40/40   40/40   40/40
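The noisy test signals used in this comparison are obtained by adding white Gaussian noise at a prescribed SNR. A minimal sketch of that corruption step is given below; the function names, the seeded random generator and the sinusoidal stand-in for a cardiac cycle are assumptions, and the noise power is derived from the measured power of the clean signal.

```python
import math
import random


def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise so that the expected SNR equals snr_db."""
    signal_power = sum(v * v for v in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]


def measured_snr_db(clean, noisy):
    """SNR actually realized, computed from the clean signal and the added noise."""
    signal_power = sum(v * v for v in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


rng = random.Random(0)  # fixed seed so the example is repeatable
# Sinusoid used as a stand-in for a clean cardiac cycle.
clean = [math.sin(2 * math.pi * 5 * n / 20000) for n in range(20000)]
noisy_0db = add_awgn(clean, 0.0, rng)
print(round(measured_snr_db(clean, noisy_0db), 2))
```

At 0 dB the noise carries as much power as the signal itself, which is why S3 is no longer visible by eye in the noisy cycle of Fig. 13b.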

3.4 Detection of Cardiac Murmurs


3.4.1 Simulation Results
Fig. 15 shows the simulation results when the proposed murmur detection technique is applied to
a randomly selected murmur signal and an AWGN-corrupted murmur-free PCG signal. As can be
seen from Fig. 15b and Fig. 15e, there is a stark difference in the randomness of the fastest
varying modes of the murmur and the noise signals. This observation is substantiated by Fig. 15c
and Fig. 15f, where the noise peak envelope oscillates noticeably faster than the murmur envelope.

3.4.2 Peak Envelope Bandwidth Analysis


To evaluate the performance of the proposed method, it was applied to 270 PCG signals, of
which 90 had murmur. The 180 murmur-free PCG signals were corrupted with Gaussian and
babble noise at different SNR levels (0 dB, 10 dB, 20 dB, 30 dB, 40 dB and 50 dB) to observe
the performance of the technique in different noisy scenarios. For each of the signals, the peak
envelope of the fastest varying mode was obtained and its bandwidth was calculated. Fig. 16(b)
shows the computed envelope bandwidths for these input signals. The murmur bandwidth is
generally limited to below 0.1 rad/sample, while that of the noise-corrupted PCG signals starts
from around 0.2 rad/sample and extends up to 0.5 rad/sample, with most values lying around
0.3 rad/sample. There is no overlap between the bandwidth values of murmur and noise, which
allows efficient classification using simple thresholding alone.

Figure 14: S3 detection from the noisy signal (SNR = −5 dB). (a) Noisy cardiac signal containing
S3. Subplots (b) STFT, (c) ST, (d) HHT, (e) HVD and (f) VMD-based normalized column sum of
SPWVD spectrum. “NA” refers to the normalized amplitude of the processed signal for heart
sound components. The horizontal axes show the number of samples.

Figure 15: Simulation results for murmur and noise detection. (a) Extracted murmurs, (b) fastest
mode of murmurs, (c) murmur peak envelope, (d) PCG segment corrupted with AWGN, (e) fastest
mode of noise, and (f) noise peak envelope.

3.4.3 Probability Distribution of Peak Envelope Bandwidth


Fig. 16(a) shows the probability distribution of the computed peak envelope bandwidths of the 270
PCG signals. The distribution is bimodal, with two peaks at around 0.045 and 0.265. As we
observe from Fig. 16(b), the probability of occurrence of bandwidth values between these two
peaks is low, with the 0.1-0.2 range having almost zero probability. The bimodal probability
distribution of the envelope bandwidths substantiates our claim that accurate classification can be
achieved by setting an appropriate threshold.

Figure 16: (a) Probability distribution of the peak envelope bandwidth (PEBW) of the PCG signals.
(b) Peak envelope bandwidth of murmurs and noise.

4 Conclusion
In this work, a novel technique for the separation of heart sounds from composite heart-lung sound
recordings has been proposed. The method employs Variational Mode Decomposition (VMD) to
segregate heart and lung sounds utilizing their spectral information, without any significant loss
of information due to spectral overlap. This is possible because VMD, in contrast to other spectral
separation methods, is applicable even to non-stationary signals. Experimental results show
that the proposed method is superior to the other methods in heart sound separation. Thus, HS
extraction based on our proposed method using VMD is a promising algorithm owing to its
robustness in non-stationary and non-linear signal processing.

This work is further extended to the detection of murmur and its differentiation from clinical
background noise. While the murmur diagnosis problem has been explored for some time now, dif-
ferentiating murmur from noise is a relatively novel problem. In this work, a technique for achieving
this using a single feature has been proposed. The computed values for the peak envelope band-
width follow a bimodal distribution, and as such, accurate classification can be performed using
thresholding alone, which in turn, ensures low computational cost.

At the outset of this project, our objective was to develop a low cost and computationally efficient
cardiac monitoring system that does not require expert medical supervision, extensive hardware
circuitry or ECG synchronization. Throughout the course of our research, accurate techniques for
cardiac cycle segmentation, separation of heart sounds from composite cardiopulmonary recordings,
identification of the first and second heart sounds and murmur detection in presence of background
noise have been proposed. All these methods have been extensively tested on phonocardiogram data
available in online databases both in noisy and noise free scenarios. As part of future work, we are
planning to propose a novel feature extraction methodology combining variational mode decompo-
sition (VMD) and higher order spectral analysis (HOSA), named VMDHOSA, for characterization
of fundamental heart sounds and classification thereof.

References
[1] Y.-L. Tseng, P.-Y. Ko, and F.-S. Jaw, “Detection of the third and fourth heart sounds using
Hilbert-Huang transform,” Biomed. Eng. Online, vol. 11, no. 1, pp. 1–13, 2012.

[2] P. Hult, T. Fjällbrant, K. Hildén, U. Dahlström, B. Wranne, and P. Ask, “Detection
of the third heart sound using a tailored wavelet approach: Method verification,”
J. Med. Biol. Eng. Comput., vol. 43, no. 2, pp. 212–217, 2005. [Online]. Available:
http://dx.doi.org/10.1007/BF02345957

[3] D. Kumar, P. Carvalho, M. Antunes, J. Henriques, L. Eugenio, R. Schmidt, and J. Habetha,
“Detection of S1 and S2 heart sounds by high frequency signatures,” in Engineering in Medicine
and Biology Society, 2006. EMBS’06. 28th Annual Int. Conf. of the IEEE. IEEE, 2006, pp.
1410–1416.

[4] D. Kumar, P. Carvalho, M. Antunes, P. Gil, J. Henriques, and L. Eugenio, “A new algorithm
for detection of S1 and S2 heart sounds,” in 2006 IEEE Int. Conf. on Acoustics Speech and
Signal Processing Proceedings, vol. 2. IEEE, 2006, pp. 1180–1183.

[5] H. Liang, S. Lukkarinen, and I. Hartimo, “Heart sound segmentation algorithm based on heart
sound envelogram,” in Computers in Cardiology 1997. IEEE, 1997, pp. 105–108.

[6] X. Zhang, L. Durand, L. Senhadji, H. C. Lee, and J. L. Coatrieux, “Analysis-synthesis of
the phonocardiogram based on the matching pursuit method,” IEEE Trans. on Biomedical
Engineering, vol. 45, no. 8, pp. 962–971, Aug 1998.

[7] F. Hedayioglu, M. G. Jafari, S. S. Mattos, M. D. Plumbley, and M. T. Coimbra, “Denoising
and segmentation of the second heart sound using matching pursuit,” in 2012 Annual Int.
Conf. of the IEEE Engineering in Medicine and Biology Society, Aug 2012, pp. 3440–3443.

[8] C. Barabaşa, M. Jafari, and M. D. Plumbley, “A robust method for S1/S2 heart sounds detec-
tion without ECG reference based on music beat tracking,” in Electronics and Telecommunica-
tions (ISETC), 2012 10th Int. Symp. on. IEEE, 2012, pp. 307–310.

[9] J. Vepa, P. Tolay, and A. Jain, “Segmentation of heart sounds using simplicity features and
timing information,” in 2008 IEEE Int. Conf. on Acoustics, Speech and Signal Processing,
March 2008, pp. 469–472.

[10] L. G. Gamero and R. Watrous, “Detection of the first and second heart sound using proba-
bilistic models,” in Engineering in Medicine and Biology Society, 2003. Proceedings of the 25th
Annual Int. Conf. of the IEEE, vol. 3, Sept 2003, pp. 2877–2880.

[11] A. Castro, T. T. V. Vinhoza, S. S. Mattos, and M. T. Coimbra, “Heart sound segmentation
of pediatric auscultations using wavelet analysis,” in 2013 35th Annual Int. Conf. of the IEEE
Engineering in Medicine and Biology Society (EMBC), July 2013, pp. 3909–3912.

[12] C. D. Papadaniil and L. J. Hadjileontiadis, “Efficient heart sound segmentation and extrac-
tion using ensemble empirical mode decomposition and kurtosis features,” IEEE Journal of
Biomedical and Health Informatics, vol. 18, no. 4, pp. 1138–1152, July 2014.

[13] S. Charleston-Villalobos, A. T. Aljama-Corrales, and R. Gonzalez-Camarena, “Analysis of
simulated heart sounds by intrinsic mode functions,” in Engineering in Medicine and Biology
Society, 2006. EMBS ’06. 28th Annual Int. Conf. of the IEEE, Aug 2006, pp. 2848–2851.

[14] N. E. Huang, Z. Shen, S. R. Long, M. C. Wu, H. H. Shih, Q. Zheng, N.-C. Yen, C. C. Tung,
and H. H. Liu, “The empirical mode decomposition and the Hilbert spectrum for nonlinear
and non-stationary time series analysis,” in Proceedings of the Royal Society of London A:
Mathematical, Physical and Engineering Sciences, vol. 454, no. 1971. The Royal Society,
1998, pp. 903–995.

[15] V. K. Iyer, P. A. Ramamoorthy, H. Fan, and Y. Ploysongsang, “Reduction of heart sounds
from lung sounds by adaptive filtering,” IEEE Trans. Biomed. Eng., vol. BME-33, no. 12, pp.
1141–1148, 1986.

[16] L. J. Hadjileontiadis and S. M. Panas, “Adaptive reduction of heart sounds from lung sounds
using fourth-order statistics,” IEEE Trans. Biomedical Engineering, vol. 44, no. 7, pp. 642–648,
1997.

[17] M. T. Pourazad, Z. Moussavi, and G. Thomas, “Heart sound cancellation from lung sound
recordings using time-frequency filtering,” Medical and Biological Engineering and Computing,
vol. 44, no. 3, pp. 216–225, 2006.

[18] L. J. Hadjileontiadis and S. M. Panas, “Separation of discontinuous adventitious sounds from
vesicular sounds using a wavelet-based filter,” IEEE Trans. Biomedical Engineering, vol. 44,
no. 12, pp. 1269–1281, 1997.

[19] T. H. Falk and W.-Y. Chan, “Modulation filtering for heart and lung sound separation from
breath sound recordings,” in 2008 30th Annual International Conference of the IEEE Engi-
neering in Medicine and Biology Society, Aug 2008, pp. 1859–1862.

[20] F. Ghaderi, S. Sanei, B. Makkiabadi, V. Abolghasemi, and J. G. McWhirter, “Heart and
lung sound separation using periodic source extraction method,” in 2009 16th International
Conference on Digital Signal Processing, July 2009, pp. 1–6.

[21] J. Gnitecki and Z. Moussavi, “The fractality of lung sounds: A comparison of three waveform
fractal dimension algorithms,” Chaos, Solitons and Fractals: the interdisciplinary journal of
Nonlinear Science, and Nonequilibrium and Complex Phenomena, vol. 26, no. 4, pp. 1065–1072,
2005.

[22] M. Kompis and E. Russi, “Adaptive heart-noise reduction of lung sounds recorded by a single
microphone,” in 1992 14th Annu. Int. Conf. of the IEEE Eng. in Med. and Biol. Soc., vol. 2,
1992, pp. 691–692.

[23] A. Mondal, P. Bhattacharya, and G. Saha, “Reduction of heart sound interference from lung
sound signals using empirical mode decomposition technique,” J. Med. Eng. & Technol., vol. 35,
no. 6-7, pp. 344–353, 2011.

[24] C. Lin, W. A. Tanumihardja, and H. Shih, “Lung-heart sound separation using noise assisted
multivariate empirical mode decomposition,” in 2013 Int. Symp. on Intelligent Signal Process.
and Commun. Sys., 2013, pp. 726–730.

[25] S. Barma, B. W. Chen, W. Ji, S. Rho, C. H. Chou, and J. F. Wang, “Detection of the third
heart sound based on nonlinear signal decomposition and time frequency localization,” IEEE
Trans. Biomed. Eng., vol. 63, no. 8, pp. 1718–1727, 2016.

[26] G. Andria, M. Savino, and A. Trotta, “Application of Wigner-Ville distribution to measurements
on transient signals,” IEEE Trans. Instrum. Meas., vol. 43, no. 2, pp. 187–193, 1994.

[27] S. Barma, B.-W. Chen, K. L. Man, and J.-F. Wang, “Quantitative measurement of split of
the second heart sound (S2),” IEEE/ACM Trans. Comput. Biol. and Bioinformatics, vol. 12,
no. 4, pp. 851–860, 2015.

[28] P. S. Wright, “Short-time Fourier transforms and Wigner-Ville distributions applied to the
calibration of power frequency harmonic analyzers,” IEEE Trans. Instrum. Meas., vol. 48,
no. 2, pp. 475–478, 1999.

[29] G. Livanos, N. Ranganathan, and J. Jiang, “Heart sound analysis using the S-transform,” in
Computers in Cardiology 2000. Vol.27 (Cat. 00CH37163), 2000, pp. 587–590.

[30] E. Sejdic and J. Jiang, “Comparative study of three time-frequency representations with ap-
plications to a novel correlation method,” in 2004 IEEE Int. Conf. on Acoust., Speech, and
Signal Process., vol. 2, 2004, pp. 633–636 vol.2.

[31] A. Soualhi, K. Medjaher, and N. Zerhouni, “Bearing health monitoring based on Hilbert Huang
transform, support vector machine, and regression,” IEEE Trans. Instrum. Meas., vol. 64,
no. 1, pp. 52–62, 2015.

[32] S. Barma, B.-W. Chen, W. Ji, F. Jiang, and J.-F. Wang, “Measurement of duration, energy
of instantaneous frequencies, and splits of subcomponents of the second heart sound,” IEEE
Trans. Instrum. Meas., vol. 64, no. 7, pp. 1958–1967, 2015.

[33] M. Feldman, Hilbert Transform Applications in Mechanical Vibration. USA: John Wiley &
Sons, 2011.

[34] G. Andria and M. Savino, “Interpolated smoothed pseudo Wigner-Ville distribution for accu-
rate spectrum analysis,” IEEE Trans. Instrum. Meas., vol. 45, no. 4, pp. 818–823, 1996.

[35] A. Djebbari and F. Bereksi-Reguig, “Detection of the valvular split within the second heart
sound using the reassigned smoothed pseudo Wigner-Ville distribution,” J. BioMed. Eng. OnLine,
vol. 12, no. 1, p. 37, 2013.

[36] D. Kumar, P. Carvalho, M. Antunes, R. P. Paiva, and J. Henriques, “Heart murmur classifica-
tion with feature selection,” in 2010 Annual International Conference of the IEEE Engineering
in Medicine and Biology, Aug 2010, pp. 4566–4569.

[37] F. Rios-Gutierrez, R. Alba-Flores, K. Ejaz, G. Nordehn, N. Andrisevic, and S. Burns,
“Classification of four types of common murmurs using wavelets and a learning vector quantization
network,” in The 2006 IEEE International Joint Conference on Neural Network Proceedings,
2006, pp. 2206–2213.

[38] Z. Jiang, S. Choi, and H. Wang, “A new approach on heart murmurs classification with SVM
technique,” in 2007 International Symposium on Information Technology Convergence (ISITC
2007), Nov 2007, pp. 240–244.

[39] S. L. Strunic, F. Rios-Gutierrez, R. Alba-Flores, G. Nordehn, and S. Burns, “Detection and
classification of cardiac murmurs using segmentation techniques and artificial neural networks,”
in 2007 IEEE Symposium on Computational Intelligence and Data Mining, March 2007, pp.
397–404.

[40] F. Ghaderi, H. R. Mohseni, and S. Sanei, “Localizing heart sounds in respiratory signals using
singular spectrum analysis,” IEEE Trans. Biomed. Eng., vol. 58, no. 12, pp. 3360–3367, 2011.

[41] K. Dragomiretskiy and D. Zosso, “Variational mode decomposition,” IEEE Trans. Signal Pro-
cessing, vol. 62, no. 3, pp. 531–544, 2014.

[42] C. Lin, W. A. Tanumihardja, and H. Shih, “Lung-heart sound separation using noise assisted
multivariate empirical mode decomposition,” in 2013 International Symposium on Intelligent
Signal Processing and Communication Systems, Nov 2013, pp. 726–730.

[43] Z. Wu and N. E. Huang, “Ensemble empirical mode decomposition: a noise-assisted data
analysis method,” Advances in Adaptive Data Analysis, vol. 1, no. 1, pp. 1–41, 2009.

[44] L. Yang, H. Hao, C. Jiang, and L. Li, “Preliminary study on processing local field potential
with smoothed pseudo Wigner-Ville distribution for epileptic seizure detection,” in 2010 4th
International Conference on Bioinformatics and Biomedical Engineering, June 2010, pp. 1–4.

[45] “Heart sounds and murmurs,” https://depts.washington.edu/physdx/heart/demo.html.

[46] “50 heart and lung sounds library,” http://solutions.3mae.ae/wps/portal/3M/en_AE/3M-Littmann-EMEA/stethoscope/littmann-learning-institute/heart-lung-sounds/heart-lung-sound-library/.

[47] “Heart sound and murmur library,” http://www.med.umich.edu/lrc/psb_open/html/repo/primer_heartsound/primer_heartsound.html.

5 Acknowledgment
This research work is supported by IoTimize LLC, USA, under the project Classification and
Progression Modelling of Cardiovascular and Pulmonary Diseases using Advanced Data Analytics and
Machine Learning techniques at the Indian Institute of Technology Kharagpur, India.

List of Publications

1. S. Banerjee, M. Mishra, and A. Mukherjee, “Segmentation and detection of first and second
heart sounds (S1 and S2) using variational mode decomposition,” IEEE EMBS Conf. on
Biomed. Eng. and Sci. (IECBES), Kuala Lumpur, pp. 565–570.

2. M. Mishra, S. Banerjee, D. C. Thomas, and A. Mukherjee, “Detection of Third Heart Sound
using Variational Mode Decomposition,” IEEE Trans. Instrum. Meas., vol. PP, no. 99, pp.
19, 2018 (early access).

3. M. Mishra, S. Pratiher, S. Banerjee, and A. Mukherjee, “Grading heart sounds through
variational mode decomposition and higher order spectral features,” in IEEE Instrum. and
Meas. Technol. Conf. (I2MTC), Houston, USA, 2018 (accepted).
