
CHAPTER 1

INTRODUCTION
An electrocardiogram (abbreviated ECG or EKG) is a medical test that
detects cardiac abnormalities. The ECG is a non-invasive technique that measures the
electrical activity of the heartbeat: the electrocardiograph records the electrical
activity of the heart muscle and displays it as a trace on a screen or on paper.
The ECG is recorded with the help of electrodes placed on the skin.
The purpose of the ECG is to obtain information about the functioning of
the heart. An abnormal ECG shape indicates that the heart is not functioning
properly. Early detection of heart disease can improve the quality of life
and prolong life. The ECG signal has various morphological features that play
a vital role in its analysis, but physicians cannot analyze such a long signal in a short
time, and the morphological changes cannot be discerned by the naked eye.
For better diagnosis, the ECG pattern should be analyzed effectively. Because
vast amounts of data are recorded, physicians need considerable time to study the signal,
and the possibility of missing crucial information is high. For early
detection of heart abnormalities, a powerful system called computer-aided
diagnosis is required. By automatically classifying and detecting patterns
based on extracted features, the signal can be divided into two classes. Such systems
face many problems, as there are large variations in morphological and temporal
characteristics. To extract features, the signal must be transformed. The
transformed signal gives information about the frequency spectrum present in the
signal, but not about its temporal characteristics. Analyzing the ECG signal over
time is important because the morphology changes and the frequency content
varies with time. This motivates the use of time-frequency analysis.
The major aim is to process the ECG signal and extract significant information
from it for clinical purposes, and to detect heartbeats automatically using
signal processing and pattern recognition algorithms.
1.1. HUMAN HEART ANATOMY:
The heart is a muscular organ that serves two functions. One is to collect the
deoxygenated blood from the tissues of the body and pump this blood to the lungs to
pick up oxygen and release carbon dioxide. The other is to collect oxygen-rich blood
from the lungs and pump this blood to all of the tissues of the body.

Figure 1.1: Heart Anatomy


The heart is situated in the protective thoracic cage medial to the lungs,
posterior to the sternum and costal cartilages, and rests on the superior surface of the
diaphragm. The human heart has special pacemaker cells that trigger
the electrical sequence of depolarization and repolarization
of the heart muscle. The sinoatrial (SA) node is the natural pacemaker, which initiates
the electrical impulse and the mechanical cycle of the heart. The atrioventricular (AV)
node controls and slows down the electrical current sent by the sinoatrial
node before it reaches the ventricles. The heart has four chambers: the left atrium,
right atrium, left ventricle and right ventricle. The two atria are the
upper chambers of the heart, whereas the two ventricles are the
lower chambers. The atria and the ventricles are separated
by a wall of muscle called the septum. The left ventricle is the strongest and largest
chamber of the heart. The heart has four valves that direct blood
flow within it and prevent blood from flowing in the wrong direction.
The four valves are the mitral, tricuspid, aortic and pulmonary
valves. The mitral and tricuspid valves are the two atrioventricular (AV) valves, whereas
the aortic and pulmonary valves are the two semilunar valves. The mitral and
tricuspid valves are situated between the atria and the ventricles, while
the aortic and pulmonary valves are situated at the outlets of
the ventricles. Blood enters the heart through the two large veins, the inferior and
superior venae cavae, and flows into the right atrium. When the right atrium contracts,
blood flows from the right atrium into the right ventricle through the open tricuspid
valve. When the ventricle is full, the tricuspid valve closes. As the right
ventricle contracts, blood leaves the heart through the pulmonary valve into the
pulmonary artery. From the pulmonary artery the blood flows into the tiny capillary
vessels of the lungs. Here, oxygen travels from the tiny air sacs in the
lungs, through the walls of the capillaries, into the blood, while
carbon dioxide passes from the blood into the air sacs. Once the blood is oxygenated,
it is received from the lungs by the left atrium. The atrium then
contracts and blood flows from the left atrium into the left ventricle through the
open mitral valve. When the ventricle is full, the mitral valve closes. The
left ventricle then contracts and the oxygen-rich blood is pumped to the rest of the body
through the aortic valve.
The process of blood circulation through the heart is divided into two stages,
systole and diastole. Both the atria and the ventricles undergo systole and diastole, and it
is essential that these phases be carefully regulated and coordinated to ensure
that blood is pumped efficiently to the rest of the body. The period of
contraction during which the heart pumps blood into circulation is called
systole, and the period of relaxation during which the chambers fill with blood is called
diastole. The period that begins with contraction of the atria and ends with
ventricular relaxation is known as the cardiac cycle.
1.2. ECG SIGNAL:
The ECG signal reflects the net electrical activity of the heart at each moment. The
rhythm of the beating heart is expressed in beats per minute (bpm). The ECG signal has
three main components, namely the P wave, the QRS complex and the T wave. The P wave
represents the depolarization of the atria, the QRS complex represents the depolarization
of the ventricles, and the T wave represents the repolarization of the ventricles. The
U wave, when present, follows the T wave, and the point O marks the origin of the trace.

Figure 1.2: Schematic representation of standard ECG signal
P wave: The P wave indicates the depolarization of the atria; atrial repolarization is
not visible. The amplitude of the P wave is low, about 0.1-0.2 mV, and the duration of
the P wave is about 60-80 ms.
QRS Complex: The QRS complex indicates the depolarization of the left and right
ventricles, which triggers the main pumping contraction of the heart. The Q wave is a
downward deflection immediately preceding the ventricular contraction, the R wave is the
peak indicating the ventricular contraction, and the S wave is the downward deflection
appearing immediately after it. The QRS complex is a high-amplitude wave of about 1 mV,
with a duration of about 60-100 ms. It plays a vital role in diagnosing cardiac
arrhythmias.
T wave: The T wave indicates the repolarization of the left and right ventricles. Its
amplitude lies between 0.1 and 0.3 mV and its duration is about 120-160 ms.
PR interval: It indicates the time from atrial depolarization to ventricular
depolarization. The duration of the PR interval is about 120-200 ms.
RR interval: It denotes the time between successive ventricular depolarizations.
Its duration should be less than 3 s.
ST segment: It indicates the time during early ventricular repolarization. The ST
segment starts at the J point, which lies between the QRS complex and the ST segment.
The duration of the ST segment should be less than 20 ms.
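The interval relations above translate directly into code. The following sketch (the R-peak times are hypothetical sample values, not from a real recording) derives R-R intervals from R-peak times and converts them to a heart rate:

```python
def rr_intervals(r_peak_times):
    """Successive differences of R-peak times, in seconds."""
    return [t2 - t1 for t1, t2 in zip(r_peak_times, r_peak_times[1:])]

def heart_rate_bpm(rr_seconds):
    """Heart rate in beats per minute from one R-R interval in seconds."""
    return 60.0 / rr_seconds

# Hypothetical R-peak times (seconds); a normal R-R interval is 0.6-1.2 s.
peaks = [0.00, 0.80, 1.62, 2.41]
rrs = rr_intervals(peaks)
mean_rr = sum(rrs) / len(rrs)
print(round(heart_rate_bpm(mean_rr), 1))
```

For example, an R-R interval of 0.8 s corresponds to 60/0.8 = 75 bpm, inside the normal 60-100 bpm range.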
1.2.1. ECG Signal Acquisition:
The ECG traces the electrical activity of the heart cells, which is achieved using
electrodes. An electrode acts as a transducer that converts the ionic potentials
generated within the body into electrical potentials; it measures the ionic potential
difference between its points of application on the body surface. The electric
potential across a cell membrane results from the different ionic concentrations that
exist inside and outside the cell.
Cells are the basic building blocks of tissue and consist of ionic conductors
separated by a semipermeable membrane. Each cell is surrounded by fluid and
contains ions such as sodium (Na+), potassium (K+) and chloride (Cl-). Because these
ions are positively or negatively charged, a separation of charge exists across the
cell membrane. Sodium ions have a high concentration outside the cell and a low
concentration inside it, while potassium ions have a high concentration inside the
cell and a low concentration outside, so they tend to diffuse out of it. Complete
charge balance is never reached; instead, an equilibrium is established with a
potential difference across the membrane, positive outside and negative inside. This
is known as the resting potential, which varies from -60 to -100 mV. A cell at its
resting potential is said to be polarized.
If the cell is excited by an external stimulus, the membrane changes its
characteristics. When an ionic current flows, sodium ions rush into the cell. This
leads to an avalanche effect, while at the same time potassium ions move out of the
cell. A new equilibrium is reached with a potential difference across the membrane
that is negative outside the cell and positive inside. This is known as the action
potential, which is approximately 20 mV. The change from the resting potential to the
action potential is called depolarization, and the change from the action potential
back to the resting potential is called repolarization. These tiny potential changes
are detected using electrodes. The ECG signal can be acquired with either a
3 Lead ECG or a 12 Lead ECG.
1.2.1.1. Three Lead ECG:
The 3 lead ECG is simple to use and easy to perform. Three
electrodes are placed on the human body: RA (right arm), LA (left
arm) and LL (left leg). Data are collected from leads I and II, and the difference of
these two channels equals lead III. In the 3-lead configuration, the electrodes can be
placed in two ways. One way is to place the electrodes equidistantly on the chest
wall; this placement gives better results.

Figure 1.3: Placement of 3 Lead ECG on the chest wall
Another way of placing the electrodes uses an extra electrode, RL
(right leg), which acts as the ground or reference electrode. The placement
of the electrodes can then be done in three ways.
• Placing leads on RA, LA and RL.
• Placing leads on RA, LL and RL.
• Placing leads on LA, LL and RL.

Figure 1.4: Normal placement of 3 Lead ECG


1.2.1.2. 12 Lead ECG:
The 12 lead ECG has become the standard diagnostic tool for tracing the activity of
the heart; it uses 10 electrodes. These 10 electrodes provide 12 perspectives of the
heart's activity, viewed from different angles across two electrical planes
(horizontal and vertical). The ECG signal is obtained from the six chest (precordial)
leads and from limb electrodes placed at different positions on the human body. The
chest leads are V1, V2, V3, V4, V5 and V6. Leads I, II and III are formed from the
left arm, right arm and left leg electrodes, and the right leg (RL) electrode is used
as the reference. The combination of the right arm, left arm and left leg forms a
reference known as Wilson's central terminal, which is utilized as a zero
equivalent. There are three augmented limb leads, known as augmented vector left
(aVL), augmented vector right (aVR) and augmented vector foot (aVF). Leads I, II and
III are bipolar leads, whereas the augmented leads are unipolar. Einthoven's
triangle, shown in fig 1.5, is the equivalent triangle that explains the six frontal
leads.

Figure 1.5: Einthoven’s triangle
The six leads in fig 1.5 measure the projection of the three-dimensional cardiac
electric vector onto their axes, and the heart's electrical activity is viewed and
analyzed through these projections. The placement of the chest electrodes is shown in
fig 1.6. Lead V1 is placed at the fourth intercostal space to the right of the
sternum and lead V2 at the fourth intercostal space to the left of the sternum. V3 is
placed midway between V2 and V4. V4 is placed at the fifth intercostal space on the
midclavicular line. V5 is placed on the anterior axillary line at the same horizontal
level as V4, and V6 is positioned on the mid-axillary line at the same horizontal
level as V4 and V5. These six leads record the electrical activity from different
directions in the cross-sectional plane. V1 and V2 reflect the activity of the right
half of the heart, leads V3 and V4 describe the septal activity, and V5 and V6 detect
the activity of the left ventricle. The electrode placement should be exact in order
to obtain a good signal from the heart; even a slight misplacement of the electrodes
may lead to false signal acquisition.
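The lead definitions above can be written out directly. The following sketch (the electrode voltages are hypothetical values in mV) computes the six frontal-plane leads and Wilson's central terminal from the three limb-electrode potentials:

```python
def frontal_leads(ra, la, ll):
    """Bipolar leads I-III and augmented leads from limb potentials (mV)."""
    return {
        "I":   la - ra,             # left arm minus right arm
        "II":  ll - ra,             # left leg minus right arm
        "III": ll - la,             # left leg minus left arm
        "aVR": ra - (la + ll) / 2,  # augmented vector right
        "aVL": la - (ra + ll) / 2,  # augmented vector left
        "aVF": ll - (ra + la) / 2,  # augmented vector foot
    }

def wilson_central_terminal(ra, la, ll):
    """Average of the three limb potentials, used as the zero reference
    for the chest (precordial) leads."""
    return (ra + la + ll) / 3

# Hypothetical instantaneous electrode voltages (mV).
leads = frontal_leads(ra=0.25, la=0.5, ll=1.0)
# Einthoven's law: lead I + lead III = lead II.
print(leads["I"] + leads["III"] == leads["II"])  # True
```

Note how the augmented leads compare one electrode against the mean of the other two, which is why they are unipolar.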

Figure 1.6: Placement of the chest leads in 12 lead ECG

1.2.2. Characteristics of ECG Noises:
The ECG demands the exact determination of the characteristic points P, QRS and T.
Noise is the unwanted component in any signal, and in practice the ECG signal is
corrupted with noise. Various kinds of artifacts get added to ECG signals and change
the characteristics of the original signal. When the signal is distorted by noise, it
is very difficult to analyze; the vital features and morphological details are
disturbed, and the presence of noise may also lead to wrong interpretation of the
signal. The different types of noise are
1. Power line interference noise.
2. Baseline wander noise and abrupt drift noise.
3. Electromyogram noise.
4. Electrode pop-up noise.
5. Instrumentation noise.
1. Power line Interference Noise:
Power line interference (PLI) is a persistent noise that is correlated with the
signal. PLI couples into the signal-carrying cables, which are subject to
electromagnetic interference (EMI) at 50 Hz or 60 Hz from the pervasive supply
lines. ECG recordings are strongly influenced by PLI noise. Because power line
signals fall in the same frequency range as the ECG signal, it is very important to
reduce this type of noise. It is a significant source of noise during bio-potential
measurements. The electromagnetic interference degrades the quality of the signal
and distorts the tiny features that play a major role in monitoring and diagnosis.
PLI noise contaminates ECG recordings owing to differences in electrode impedance
and to stray currents through the patient, the cables or the instruments.

Figure 1.7: Signal contaminated with power line interference noise
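A common remedy for PLI, sketched below with SciPy, is a narrow notch filter centred on the mains frequency. The test signal is synthetic, and the 360 Hz sampling rate is an illustrative assumption (borrowed from common arrhythmia databases), not a requirement:

```python
import numpy as np
from scipy.signal import iirnotch, filtfilt

fs = 360.0                                          # assumed sampling rate (Hz)
t = np.arange(0, 5.0, 1 / fs)
clean = np.sin(2 * np.pi * 1.0 * t)                 # stand-in for low-frequency ECG content
noisy = clean + 0.5 * np.sin(2 * np.pi * 50.0 * t)  # add 50 Hz mains interference

# Narrow stop band centred at 50 Hz; Q controls how narrow the notch is.
b, a = iirnotch(w0=50.0, Q=30.0, fs=fs)
# Zero-phase filtering preserves wave timing; generous padding
# suppresses the start-up transient of the narrow notch.
filtered = filtfilt(b, a, noisy, padlen=int(fs))

rms = lambda x: np.sqrt(np.mean(x ** 2))
# Most of the interference power should be gone.
print(rms(filtered - clean) < 0.2 * rms(noisy - clean))
```

Because the notch is narrow, the ECG content away from 50 Hz passes through almost unchanged, which is why notch filtering is preferred over a broad band-stop here.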


2. Baseline Wander and Abrupt Drift Noise:

Baseline wander (BW) is a low-frequency artifact in electrocardiogram (ECG)
recordings. The spectrum of the baseline wander lies between 0.5 Hz and 1 Hz. BW
removal is an important step in the processing of ECG signals because BW leads to
wrong interpretation of the recordings. Baseline wander is mainly caused by changes
in the electrode-to-skin polarization voltage, by movement of the electrodes, by
respiration or by body movement. Under baseline wander, the isoelectric line changes
position. During respiration-induced baseline wander, the amplitude varies by about
15 percent of the peak-to-peak amplitude at frequencies between 0.15 Hz and 0.3 Hz.
Motion artifacts are transient baseline changes caused by motion of the
electrode. Vibrations, movement and respiration of the subject contribute to motion
artifacts. The peak amplitude and duration of the artifact depend on various unknown
parameters such as the electrode properties, electrolyte properties, skin impedance
and the movement of the patient. In the ECG signal, baseline drift occurs at a very
low frequency, approximately 0.014 Hz, and most likely results from very slow
changes in the skin-electrode impedance.
The magnitude of the baseline wander can exceed that of the QRS complex
several times over. Heavy baseline wander and motion artifacts can distort the
low-frequency components and ST segments of the ECG signal.

Figure 1.8: Signal contaminated with baseline wander noise
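A common first-line remedy, sketched below with SciPy, is a high-pass filter with a cut-off just above the baseline-wander band. The drifting signal is synthetic, and the cut-off and sampling rate are illustrative choices:

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 360.0                                 # assumed sampling rate (Hz)
t = np.arange(0, 10.0, 1 / fs)
ecg_like = np.sin(2 * np.pi * 5.0 * t)     # stand-in for ECG content above the drift band
drift = 0.8 * np.sin(2 * np.pi * 0.2 * t)  # slow respiratory baseline wander
noisy = ecg_like + drift

# Baseline wander lives below roughly 0.5-1 Hz, so high-pass at 0.5 Hz.
b, a = butter(N=2, Wn=0.5, btype="highpass", fs=fs)
# Zero-phase filtering; generous padding because the filter settles slowly.
filtered = filtfilt(b, a, noisy, padlen=3 * int(fs))

rms = lambda x: np.sqrt(np.mean(x ** 2))
# The drift should be largely removed while the faster content survives.
print(rms(filtered - ecg_like) < 0.2 * rms(drift))
```

A fixed high-pass works on this clean example; the EMD approach discussed later in the chapter is aimed at the harder case where the drift overlaps the low-frequency ECG content.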


3. Electromyogram Noise:
Electromyogram noise is also called muscle artifact. Electromyogram (EMG)
noise is generated by the electrical activity of muscles: when muscles in the
neighbourhood of an electrode contract, the ECG picks up the waves generated by the
nearby muscle contraction. Generally, the amplitude of the EMG is about 10 percent
of that of the ECG signal, with a bandwidth of roughly 20 Hz to 1000 Hz. EMG noise
is random in nature and follows a Gaussian distribution; its mean is assumed to be
zero and its variance depends on the measurement conditions. It is necessary to
reduce EMG noise, as it can overwhelm the desired signal.

Figure 1.9: Signal contaminated with electromyogram noise


4. Electrode Pop-up or Contact Noise:
Electrode contact noise is caused by the position of the electrodes and by changes
in the propagation medium between the electrodes and the heart. It leads to
low-frequency baseline shifts and sudden changes in amplitude. Poor conductivity
between the skin and the electrodes distorts the ECG signal. The larger the
electrode-skin impedance, the smaller the relative impedance change required to
cause a major shift in the baseline of the ECG signal. If the skin impedance is
significantly high, it may be impossible to detect the signal features reliably in
the presence of body movement.

Figure 1.10: Signal contaminated with electrode contact noise


5. Instrumentation Noise:
The electrical equipment used in ECG measurement also contributes noise.
Electrode probes, cables, the signal processor/amplifier and the analog-to-digital
converter are the major sources of this form of noise. Although instrumentation
noise cannot be eliminated, it can be reduced through higher-quality equipment and
careful circuit design.

Figure 1.11: Signal contaminated with instrumentation noise


1.2.3. ECG Signal Normal and Abnormal Ranges:

The ECG signal plays a major role in diagnosing several diseases of the heart.
The condition of the patient can be determined by comparing the recorded signal with
a reference standard ECG wave. Among the different parameters of the ECG signal, the
QRS complex and the R-R interval play a vital role. The ECG signal consists of
different components such as intervals and segments. A regular heart rhythm is
called normal, whereas an irregular rhythm is called abnormal. Abnormal means that
the pumping of blood by the heart is not proper; it may be either too fast or too
slow, in which case the QRS complexes widen or move closer together. The normal
duration of the QRS complex is 60 to 100 ms; a QRS complex longer than 100 ms is in
the abnormal range. The R-R interval is the time between consecutive beats, and the
heart rate can be found from it. The normal heart rate is 60 to 100 bpm; a heart
rate above or below this range indicates an abnormality. The normal R-R interval is
0.6 to 1.2 s. The normal and abnormal ranges of the different intervals are given in
the table below:
Table 1.1: Normal and abnormal ranges of ECG
Segment Normal Abnormal
QRS Complex 60-100ms >100ms
R-R interval 0.6-1.2sec >2sec
P wave 60-80msec <50msec
PR interval 120-200msec >200msec
QT interval 350-440msec >540msec
Let us consider three signals, a standard (reference) signal, a normal signal and an
abnormal signal, together with their QRS complex, R-R interval and heart rate
values. The heart rate (in bpm) can be calculated as 60 divided by the R-R interval
in seconds.
Table 1.2: Example of finding the heart condition from different ranges
Signal QRS complex RR interval (s) Heart rate (HR) Condition
Normal signal 0.94 0.791 75.84 Standard
Signal A 0.88 0.877 65.83 Normal
Signal B 0.238 1.059 56.65 Abnormal
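The screening rules implied by Table 1.1 can be expressed directly in code. This sketch takes its thresholds from the table above; the sample beat values are hypothetical:

```python
def heart_rate(rr_s):
    """Heart rate in bpm from the R-R interval in seconds."""
    return 60.0 / rr_s

def classify_beat(qrs_ms, rr_s):
    """'normal' only if the QRS duration (60-100 ms) and the
    R-R interval (0.6-1.2 s) both fall inside their normal ranges."""
    qrs_ok = 60 <= qrs_ms <= 100
    rr_ok = 0.6 <= rr_s <= 1.2
    return "normal" if (qrs_ok and rr_ok) else "abnormal"

# Hypothetical beat: QRS of 88 ms, R-R interval of 0.877 s.
print(classify_beat(qrs_ms=88, rr_s=0.877), round(heart_rate(0.877), 1))
```

A real classifier combines many more features, but these range checks already catch gross abnormalities such as a widened QRS or a missed beat.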
1.3. DIFFERENT TYPES OF ARRHYTHMIAS:
Heart rhythms can be regular or irregular. An irregular rhythm of the heart is known
as an arrhythmia. The normal heart rate lies between 60 and 100 beats per minute,
and the R-R interval may vary along with the breathing cycle. Some arrhythmias may
cause the heart to skip or add a beat now and again but have no effect on general
health or the ability to lead a normal life. Normal activity of the heart, without
any variation in the morphology of the ECG signal, is called normal sinus rhythm.
Arrhythmia may also occur due to improper blood circulation. The different types of
arrhythmia are
• Sinus Arrest
• Sinus Arrhythmia
• Sinus Bradycardia
• Sinus Tachycardia
1.3.1. Sinus Arrest:
Sinus arrest is also known as sinoatrial arrest or sinus pause. Sinus arrest
reflects a failure of atrial activation. It may be due to a problem in the
generation of the impulse in the sinus node or to a failure of conduction of the
impulse to the atrium. It pauses the conduction of the nervous stimulus for several
seconds, followed by the resumption of a regular rhythm or by an idioventricular
escape rhythm. It indicates a drop in blood pressure, and a long pause indicates a
larger drop. The heart rate in this case is about 40 bpm.

Figure 1.12: Sinus Arrest


1.3.2. Sinus Arrhythmia:
An irregular heartbeat is called an arrhythmia. A sinus arrhythmia is an
irregularity in the heart rhythm originating at the sinus node.

Figure 1.13: Sinus Arrhythmia


The heartbeat can be either too fast or too slow. Damage to the sinus node can
prevent the electrical signals from leaving the node and producing a normal
heartbeat.

1.3.3. Sinus Tachycardia:
In medical terminology, tachycardia refers to a heart rate that is faster than
normal. Sinus tachycardia is also known as sinus tach or sinus tachy. It occurs when
electrical impulses form at a rate exceeding 100 bpm. It can be a significant
problem when the heart beats so fast that it cannot pump blood to the rest of the
body effectively.

Figure 1.14: Sinus Tachycardia


1.3.4. Sinus Bradycardia:
In medical terminology, bradycardia refers to a heart rate that is slower than
normal. It can be a significant problem when the heart rate is so low that the heart
cannot pump enough blood to the rest of the body. In sinus bradycardia the heart
rate is less than 50 bpm.

Figure 1.15: Sinus Bradycardia


1.4. A BRIEF REVIEW OF ECG SQA ANALYSIS:
The ECG signal carries information about the electrical activity of the heart
through the shapes of its P, QRS and T waves. Various kinds of noise contaminate the
ECG signal and make morphological analysis of it difficult. Automatic assessment of
the ECG signal is therefore used to reduce the false alarms caused by unacceptable
levels of noise. Signal quality assessment methods generally consist of three major
stages: a pre-processing stage, a feature extraction stage and a classification
stage. The pre-processing stage removes noise using different types of digital
filters and decomposition techniques. The feature extraction stage extracts the
features of the signal or of the noise components. The classification stage
classifies the signal into two or more classes. Physicians need considerable time to
analyze the data, since the relevant changes cannot be seen by the naked eye, and
manual analysis is prone to misinterpretation of the data. With computerized
analysis, the ECG signal can be analyzed much more accurately than by a human.
Automated ECG beat classification can easily classify the data into two classes,
based on the various features extracted from the ECG signal. ECG feature extraction
methods can be divided into two types: the direct method and the transformation
method.
The direct method reveals information about the amplitude and time interval
of the QRS complex, the R-R interval and the other waves, but this information is
not sufficient for feature extraction. In the transformation method, the time-domain
signal is transformed into another domain, such as the frequency domain, where
information not visible in the raw ECG data becomes available. The literature
suggests two promising methods for feature extraction: the Fourier transform and the
wavelet transform. The Fourier transform shows the range of the spectrum contained
in the signal. Its drawback is that it provides only the spectral components and
gives no information about temporal relationships. As the signal varies with time,
its morphological characteristics must also be analyzed, which is what gave rise to
time-frequency representations. The Fast Fourier Transform (FFT) provides a fast
implementation, but because of its inherent limitations it fails to express the
exact position of a frequency component in time.
To overcome this, the short-time Fourier transform (STFT) can be used to
compute the energy distribution of the ECG signal, and features extracted from this
energy can be used for classification. The technique is localized in both the time
domain and the frequency domain, but the STFT is unable to track minute changes over
time. To minimize this effect, the length of the time window can be made as short as
possible, but this degrades the frequency resolution in the time-frequency plane.
Hence there is a trade-off between time resolution and frequency resolution for the
STFT, and it is not effective in providing the features. The other method for
feature extraction is the wavelet transform. Its time-frequency localization
properties make it well suited to the analysis of non-stationary data. It can
decompose the signal and thus separate the relevant morphological content of the ECG
from the noise. Although it is superior to the Fourier transform, it also has
limitations: the wavelet transform needs prior information in order to decompose the
signal, and it is applicable only to non-stationary data, not to nonlinear data.
This justifies the use of Empirical Mode Decomposition (EMD), which is adaptive in
nature and can be applied to both nonlinear and non-stationary data. Noise can be
removed much more efficiently using EMD. The features of the ECG signal can be
extracted by detecting the QRS complex, which plays a major role in the analysis of
the signal. There are various algorithms for QRS detection, such as
Moriet-Mahoudeaux, Fraden-Neuman, Gustafson, the Hilbert transform, the novel
dual-slope method, the Hamilton-Tompkins method, the Pan-Tompkins method and the
phasor transform algorithm. Among these, we use the Pan-Tompkins algorithm, as it is
the best-suited algorithm for QRS detection, with an accuracy of 99.3%, a
sensitivity of 99.7% and a positive predictivity of 99.84%.
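The stages of the Pan-Tompkins pipeline (band-pass filtering, differentiation, squaring, moving-window integration, thresholding) can be sketched as follows. This is a simplified illustration only: the adaptive dual-threshold logic of the published algorithm is replaced by a fixed threshold, and the input is a synthetic pulse train rather than a real ECG:

```python
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

fs = 360.0
t = np.arange(0, 5.0, 1 / fs)
ecg = np.zeros_like(t)
beat_times = np.arange(0.5, 5.0, 0.8)  # six beats, roughly 75 bpm
for bt in beat_times:
    ecg += np.exp(-((t - bt) ** 2) / (2 * 0.01 ** 2))  # narrow "QRS" pulses

# 1. Band-pass 5-15 Hz to emphasise QRS energy.
b, a = butter(2, [5, 15], btype="bandpass", fs=fs)
x = filtfilt(b, a, ecg)
# 2. Differentiate to accentuate steep slopes.
x = np.diff(x, prepend=x[0])
# 3. Square to make all values positive and emphasise large slopes.
x = x ** 2
# 4. Moving-window integration over ~150 ms.
win = int(0.15 * fs)
x = np.convolve(x, np.ones(win) / win, mode="same")
# 5. Threshold and enforce a ~200 ms refractory period between peaks.
peaks, _ = find_peaks(x, height=0.5 * x.max(), distance=int(0.2 * fs))

print(len(peaks), len(beat_times))
```

The integration window widens each QRS event into a single bump, so one detection per beat survives the thresholding; the refractory distance mirrors the physiological limit on how soon the ventricles can fire again.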
In the past, a number of researchers have reported different automatic quality
assessment techniques. Wang et al. assessed signal quality based on the area
difference between successive QRS complexes. B. E. Moody explored a number of
heuristic rules based on threshold values; this method achieved a score of 0.913 on
the training sets and 0.896 on the test sets. Liu et al. proposed a method based on
the R-R interval using recordings of the PICC database, with a sensitivity of 90.67%
and a specificity of 89.78%. Hayn et al. introduced a new method depending on the
QRS complex. Quesnel et al. developed a method based on the PQRST complexes, with a
PCC of 0.89. C. Orphanidou et al. proposed quality assessment based on the heart
rate, R-R intervals and template matching, using records from the PICC database,
with a specificity of 97% and a sensitivity of 93%. Q. Li and G. D. Clifford
proposed a method for false alarm (FA) reduction in ICU monitors, based on features
such as the ECG amplitude, heart rate, and systolic, diastolic, mean and pulse blood
pressure; it achieved a maximum false alarm reduction rate of 30.5% with a true
alarm suppression rate below 1% on the MIMIC II database. G. D. Clifford et al.
studied SQIs and data fusion for determining the clinical acceptability of ECGs,
using six SQIs (72 features for twelve leads) and four classifiers, namely linear
discriminant analysis (LDA), Naive Bayes (NB), the support vector machine (SVM) and
the multi-layer perceptron (MLP) neural network, to classify an ECG signal as
acceptable or unacceptable. H. Naseri and M. R. Homaeinezhad proposed a correlation
and neural network based method, with a sensitivity of 100% for detecting
high-energy noise and of 92.36% for recognizing any other kind of disturbance on the
PICC database. G. D. Clifford et al. presented signal quality indices and data
fusion for determining the acceptability of ECGs collected in noisy ambulatory
environments; six SQIs (iSQI, bSQI, fSQI, sSQI, kSQI and pSQI) and several
classifiers (NB, SVM, MLP and ANN) were used to assess the signal quality, and the
method, evaluated on three subsets of the PICC database, achieved accuracies of 99%
(Set-a), 95% (Set-b) and 92.6% (test data). C. J. Behar et al. presented a method to
assess the quality of the ECG signal for normal and abnormal rhythms, for false
arrhythmia alarm reduction in ICU monitors; it employed seven SQIs introduced in
previously published work together with an SVM classifier with a Gaussian kernel.
The method was evaluated on three databases, the PICC database, the MITBIHA
database and the MIMIC II database, and achieved a classification accuracy of 99%
for ECGs with normal sinus rhythm and up to 95% for ECGs with arrhythmia.
The above methods have drawbacks: some have been tested only on limited databases,
and some are limited by poor performance. It is difficult to classify the signal
using any one technique alone. Therefore, we implemented an SQA method based on QRS
detection (the Pan-Tompkins algorithm), feasibility rules based on the heart rate
and R-R intervals, and adaptive template matching. The proposed method is verified
mathematically using a support vector machine classifier.
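The rule-based part of such an assessment can be illustrated as follows. This sketch combines simple physiological feasibility rules on the heart rate and R-R intervals with template matching by correlation; the thresholds (40-180 bpm, 3 s, correlation 0.66) are illustrative values rather than the exact ones used in the proposed method, and the beats are synthetic:

```python
import numpy as np

def feasibility_rules(r_peak_times):
    """Reject recordings whose heart rate or R-R statistics are implausible."""
    rr = np.diff(r_peak_times)
    hr = 60.0 / rr.mean()
    return (40 <= hr <= 180) and rr.max() < 3.0

def template_quality(beats):
    """Average correlation of each beat with the mean-beat template."""
    template = beats.mean(axis=0)
    corrs = [np.corrcoef(b, template)[0, 1] for b in beats]
    return float(np.mean(corrs))

def acceptable(r_peak_times, beats, corr_threshold=0.66):
    return feasibility_rules(r_peak_times) and template_quality(beats) >= corr_threshold

# Synthetic example: a regular rhythm with nearly identical beat shapes.
t = np.linspace(0, 1, 100)
beat = np.exp(-((t - 0.5) ** 2) / 0.005)
beats = np.stack([beat + 0.01 * np.sin(7 * t + k) for k in range(5)])
peaks = np.arange(0.5, 5.0, 0.8)
print(acceptable(peaks, beats))
```

The feasibility rules cheaply discard physiologically impossible detections, so the more expensive template correlation only has to separate clean from noisy morphology.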
1.5. MOTIVATION AND CONTRIBUTION:

1.5.1. Motivation:
Electrocardiograms (ECGs) are widely used for the diagnosis of cardiovascular
disease. The ECG conveys information about the activity of the heart. In general,
ECG signals are contaminated with various types of noise and artifacts. It is very
difficult to diagnose from a noise-contaminated signal, and the noise may also lead
to misinterpretation of the signal. ECG signals are severely distorted during
physical activity, which is the main cause of baseline wander noise. Baseline wander
degrades the ECG signal quality and also affects the PQRST complexes. This motivated
the use of a signal quality assessment method in order to reduce false alarms. We
therefore implemented the Empirical Mode Decomposition technique for the removal of
baseline wander and motion artifact noise; classification of the signal is done
using feasibility rules and adaptive template matching, and the result is justified
with the help of a Support Vector Machine classifier.
1.5.2. Contribution:
The contributions of the proposed method are as follows:
The pre-processing stage is based on the Empirical Mode Decomposition (EMD)
technique. The Discrete Wavelet Transform (DWT) is generally regarded as a strong
technique for signal denoising, as it operates in the time-frequency domain. Its major
drawback is that it is not adaptive: in order to decompose the data, it needs prior
information (the choice of wavelet and decomposition levels). If the signal enters a
new domain, its characteristics change and the previously chosen parameters no
longer apply; even if the signal is retrieved accurately, it cannot be effectively
analyzed in the new domain without this information. EMD is therefore used to
overcome these drawbacks. EMD is a time-frequency analysis technique whose
decomposition is derived from the data itself; it is adaptive in nature and can
decompose nonlinear and non-stationary signals. By using EMD, we can reduce
baseline wander noise more efficiently than with the DWT technique.
The Pan-Tompkins algorithm is adopted to detect the QRS complex. This mainly
deals with the detection of R peaks, because with the R peaks as a reference we can
detect the other waves. Although there are many QRS detection algorithms, such as
those of Morizet-Mahoudeaux, Fraden-Neuman and Gustafson, Pan-Tompkins is the
best-suited technique for QRS detection owing to its advantages: it automatically
adjusts its threshold values and parameters periodically to adapt to changes in the
QRS morphology. It achieves about 96 percent sensitivity and 92 percent accuracy.
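The stages of the algorithm (band-pass filtering, differentiation, squaring, moving-window integration and thresholding) can be sketched as follows. This is a simplified illustration: the filter band, the fixed threshold fraction and the refractory period are assumptions for the sketch, whereas the real Pan-Tompkins detector adapts its thresholds continuously.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def pan_tompkins_peaks(ecg, fs):
    """Simplified Pan-Tompkins QRS detector (illustrative sketch only)."""
    # 1. Band-pass filter (about 5-15 Hz) to emphasise the QRS complex
    b, a = butter(2, [5 / (fs / 2), 15 / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, ecg)
    # 2. Differentiate to highlight the steep QRS slopes
    diff = np.diff(filtered)
    # 3. Square to make all values positive and amplify large slopes
    squared = diff ** 2
    # 4. Moving-window integration over roughly one QRS width (~150 ms)
    win = int(0.15 * fs)
    mwi = np.convolve(squared, np.ones(win) / win, mode="same")
    # 5. Fixed-fraction threshold with a 200 ms refractory period
    #    (the published algorithm uses adaptive thresholds instead)
    threshold = 0.5 * mwi.max()
    refractory = int(0.2 * fs)
    peaks, last = [], -refractory
    for i in range(1, len(mwi) - 1):
        if mwi[i] > threshold and mwi[i] >= mwi[i - 1] and mwi[i] > mwi[i + 1]:
            if i - last >= refractory:
                peaks.append(i)
                last = i
    return peaks
```

On a synthetic record with known beat positions, the sketch recovers one detection per beat; tuning the band edges and threshold is what the adaptive stages of the full algorithm automate.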
The classification stage automatically labels the signal as "good" or "bad". Although
many SQA methods exist, the proposed method gives high accuracy, and template
matching is well suited for comparing the morphologies of different QRS complexes.
A Support Vector Machine (SVM) is used to classify the data accurately. Other
machine learning techniques, such as neural networks and deep learning, can also
produce high accuracy, but they need large amounts of data, require the selection of
hidden nodes, and are time-consuming compared to the SVM. The SVM needs less
data, is not confined to local minima, and works faster than the other techniques.
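As a minimal illustration of the margin-based idea behind the SVM, the sketch below trains a plain linear SVM by sub-gradient descent on the hinge loss. This is purely illustrative: the classifier used in this work is a standard SVM (typically with a Gaussian kernel), and the learning rate, epochs and data here are assumptions.

```python
import numpy as np

def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=200):
    """Linear SVM via sub-gradient descent on the regularised hinge loss.
    X: (n, d) feature matrix; y: labels in {-1, +1}."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        for i in range(n):
            margin = y[i] * (X[i] @ w + b)
            if margin < 1:                 # point inside the margin: hinge active
                w += lr * (C * y[i] * X[i] - w / n)
                b += lr * C * y[i]
            else:                          # only the regulariser pulls on w
                w -= lr * w / n
    return w, b

def predict(X, w, b):
    """Classify points by the sign of the decision function."""
    return np.where(X @ w + b >= 0, 1, -1)
```

On well-separated clusters this converges to a separating hyperplane with a margin; kernelised SVMs generalise the same objective to non-linear boundaries.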
1.6. PROPOSED WORK:
The proposed method is mainly used to obtain a good-quality ECG signal. Here, the
ECG signal is acquired both from human subjects and from the available databases.
The signal is acquired from the human body using the AD8232 module, a heart-rate
monitor that uses 3 lead electrodes. An electrode acts as a transducer, converting the
ionic potentials generated within the body into bioelectric potentials. The electrodes
are placed on the right arm (RA), left arm (LA) and right leg (RL).

Figure 1.16: Proposed block diagram of ECG signal acquisition and SQA. The
AD8232 front end (powered at 3.3 V, its output on A0, LO+/LO- on pins 10 and 11)
feeds the Arduino, which forwards the samples over the COM port; the processing
chain comprises (1) pre-processing, (2) feature extraction, (3) feasibility rules and
adaptive template matching, and (4) an SVM classifier, labelling the signal (from the
AD8232 module or from Physionet) as acceptable or unacceptable.

The interfacing of the AD8232 and the Arduino is done as follows:
1. The ECG signal is acquired by interfacing the AD8232 module with the Arduino
   board. The Arduino supplies the voltage (3.3 V) required by the AD8232, and the
   grounds of both boards are connected.
2. The output pin of the AD8232 is connected to analog pin A0 of the Arduino.
3. LO+ and LO- are connected to digital input/output pins 10 and 11.
4. The shutdown pin is left unconnected.
When the supply voltage is applied, the signal is acquired by the AD8232. The
acquired signal contains noise, which is initially suppressed by the filters built into
the AD8232. The output is analog and is fed to the analog pin of the Arduino board.
The ATmega328 microcontroller on the Arduino reads this output and converts the
analog data into digital data. The Arduino is connected to a Personal Computer (PC)
through a Universal Serial Bus (USB) port, a computer port used to connect
equipment to a computer; the same cable supplies the voltage needed to acquire the
ECG signal and carries the output to the PC. The ECG signal is plotted by the
Arduino IDE and by MATLAB. The Arduino board and the Arduino IDE
communicate over this cable. The program outlined below is flashed to the
ATmega328 in order to read data from the AD8232 and transfer it to the PC via the
USB cable.

Figure 1.17: Algorithm for the Arduino sketch (start; set the baud rate to 9600 for
serial communication; read the analog data; write the data to the serial port; stop).

To set up the code, the baud rate must be specified; the output data are then read
from the analog pin. Although serial transmission is accurate, it incurs a propagation
delay, since the receiver must wait for each successive bit before writing the data and
moving to the next position. The Arduino and MATLAB are then interfaced, and the
data are read through serial communication, i.e. the process of sending data one bit
at a time. For serial communication between the Arduino and MATLAB, the baud
rate is set to 9600, meaning the serial port can transfer at most 9600 bits per second.
The baud rate is the rate at which information is transferred; the transferred bits
include the start bit, the data bits, the parity bit and the stop bit. We must make sure
that the COM port number is the one to which the Arduino is connected and that the
same baud rate is set in both the Arduino and MATLAB.
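The effective byte throughput at 9600 baud can be checked with a small calculation, assuming the common 8-N-1 framing (1 start bit, 8 data bits, no parity, 1 stop bit), so that each byte costs 10 bits on the wire:

```python
# Serial throughput at 9600 baud with an assumed 8-N-1 frame.
baud_rate = 9600              # signalling units (bits) per second
bits_per_frame = 1 + 8 + 1    # start bit + 8 data bits + stop bit
bytes_per_second = baud_rate / bits_per_frame
print(bytes_per_second)       # 960.0 payload bytes per second
byte_time_ms = 1000 / bytes_per_second
print(round(byte_time_ms, 3))  # about 1.042 ms to transmit one byte
```

This is why a high sampling rate or verbose ASCII formatting can saturate the link: at 960 bytes/s, even a 4-character sample plus newline limits the stream to under 200 samples per second.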

Figure 1.18: Algorithm to plot the signal in MATLAB (initialize time = 0, data = 0,
count = 0, delay = 0.001 s, scroll width = 5 s, min = 1 mV, max = 1024; open the
serial port COM4 at 9600 baud and start the timer; while plotting, read the float data
serially and, if the data are non-empty floats, increment the count, extract the elapsed
time and data elements, set the axes to move with the scroll-width value and plot the
data against time; finally close the serial port and terminate the session).

The experimental results are shown below.
Figure 1.19: Signal acquired from the human body with the electrodes
The data from the signal are extracted and saved in an Excel sheet for further
processing. First, the MATLAB figure that was saved earlier is opened. As the data
are not uniformly spaced, the graphics object is located at different intervals and its
values are read along the axes; all properties and property values of the identified
graphics object are then returned.

Figure 1.20: Signal values stored in excel spreadsheet


The values in the Excel spreadsheet can be read from the Excel file. This is converted
into a .mat file by saving the workspace, which is then loaded into MATLAB using
the load command.
Figure 1.21: Signal loaded from the .mat file

Figure 1.22: Framework based on EMD, QRS detection, feasibility rules with
adaptive template matching, and the SVM classifier. The ECG is acquired either from
the Physionet/MIT-BIH databases or from the AD8232 ECG module, denoised using
EMD, and the QRS complexes are detected with the Pan-Tompkins algorithm in the
feature extraction stage. The feasibility rules are then applied in sequence: heart rate
between 40 and 180 bpm, all R-R intervals no longer than 3 s, and max/min R-R
ratio below 2.2. Signals passing all rules proceed to adaptive template matching and
are classified (normal/abnormal); a failure at any stage marks the signal
unacceptable, and the result is verified with the SVM and confusion matrices.

The ECG signal obtained from the hardware and the signals acquired from the
databases contain noise and artifacts. Our main aim is to denoise the signal, evaluate
its quality, and justify the proposed method with the help of the Support Vector
Machine (SVM) classifier. Pre-processing is done with Empirical Mode
Decomposition (EMD), which is adaptive in nature. This technique is well suited for
removing both high-frequency noise (such as PLI and EMG noise) and low-frequency
noise (such as baseline wander and motion artifacts). The obtained results are
compared with the Discrete Wavelet Transform (DWT) in terms of correlation and
Signal-to-Noise Ratio (SNR). The denoised signal is given to the feature extraction
stage, where the QRS complexes and R peaks are detected using the Pan-Tompkins
QRS detection algorithm. The quality of the signal is then evaluated with feasibility
rules and adaptive template matching. The feasibility rules are human-made rules
based upon the R-R intervals and the heart rate: if the signal satisfies all the rules, it
is passed on to adaptive template matching; otherwise it is rejected as a bad-quality
signal. Adaptive template matching is used to detect morphological changes in the
QRS complex. The signal is then classified as normal or abnormal based on the
standard ranges of the ECG signal. The proposed method is justified with a machine
learning classifier, the Support Vector Machine (SVM), and the performance is
expressed in terms of sensitivity (Se), specificity (Sp) and accuracy (Ac).
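The feasibility rules and the template-matching comparison described above can be sketched as follows. The thresholds (heart rate between 40 and 180 bpm, all R-R intervals at most 3 s, max/min R-R ratio below 2.2) follow the rules stated in this work; the function names and the correlation cut-off value are illustrative assumptions.

```python
import numpy as np

def passes_feasibility_rules(r_peaks, fs):
    """Apply the human-made feasibility rules to R-peak sample indices."""
    rr = np.diff(r_peaks) / fs                # R-R intervals in seconds
    if len(rr) == 0:
        return False
    hr = 60.0 / rr.mean()                     # mean heart rate in bpm
    if not (40 <= hr <= 180):                 # rule 1: plausible heart rate
        return False
    if np.any(rr > 3.0):                      # rule 2: no R-R interval above 3 s
        return False
    if rr.max() / rr.min() >= 2.2:            # rule 3: max/min R-R ratio below 2.2
        return False
    return True

def template_match(beats, threshold=0.9):
    """Template matching: correlate each beat with the mean (template) beat.
    `threshold` is an illustrative assumption, not a value from this work."""
    template = np.mean(beats, axis=0)
    corrs = [np.corrcoef(b, template)[0, 1] for b in beats]
    return np.mean(corrs) >= threshold, corrs
```

A signal must clear all three rules before its beats are compared against the template; an "adaptive" implementation would additionally update the template as new clean beats arrive.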
1.7.1. Hardware:
1.7.1.1. Arduino Uno:
The Arduino Uno is an open-source microcontroller board based on the Microchip
ATmega328. It has 14 digital input/output pins (of which 6 can be used as PWM
outputs), 6 analog inputs, a 16 MHz ceramic resonator, a USB connection, a power
jack, an ICSP header, and a reset button, and is programmable with the Arduino IDE
(Integrated Development Environment).

Figure 1.23: Arduino UNO board
Specifications of the Arduino Uno:
Microcontroller - ATmega328
Operating Voltage - 5 V
Digital I/O Pins - 14 (of which 6 provide PWM output)
Analog Input Pins - 6
DC Current per I/O Pin - 40 mA
DC Current for 3.3V Pin - 50 mA
Flash Memory - 32 KB (ATmega328)
SRAM - 2 KB (ATmega328)
EEPROM - 1 KB (ATmega328)
Clock Speed - 16 MHz
The pin description of the Arduino is:
1. VIN: The input voltage, supplied either through the USB cable (5 V) or the power
   jack (7-12 V).
2. 5V: This pin is used to supply 5 V to other connected boards.
3. 3V3: A 3.3 V supply generated on the board.
4. GND: These pins are used to ground the device.
5. Serial, 0 (RX) and 1 (TX): These pins are used to receive and transmit serial data.
6. External Interrupts (2 and 3): These pins can be configured to trigger an
   interrupt on a low value, a rising or falling edge, or a change in value.
7. PWM (3, 5, 6, 9, 10, and 11): These pins provide pulse-width-modulated output.
8. SPI, 10 (SS), 11 (MOSI), 12 (MISO), 13 (SCK): These pins support SPI
   communication using the SPI library.
9. AREF: The reference voltage for the analog pins.
10. Reset: Used to reset the microcontroller.
11. Analog pins: There are 6 analog input pins, used to read analog signals.
1.7.1.2. AD8232 ECG Module:
The AD8232 is a heart monitoring device: an integrated signal conditioning block for
ECG and other biopotential measurement applications. It is designed to extract,
amplify, and filter small biopotential signals in the presence of noisy conditions, such
as those created by motion or remote electrode placement. This design allows an
ultra-low-power analog-to-digital converter (ADC) or an embedded microcontroller
to acquire the output signal easily.

Figure 1.24: AD8232 ECG module


Pin Description of the AD8232:
1. 3.3V: The supply voltage required for the AD8232 module to operate.
2. GND: Ground of the power supply.
3. LO-: Leads-off comparator output, used to detect whether the electrode leads are
attached.
4. LO+: Leads-off comparator output; this pin is used to make sure that every lead
is connected properly.
5. SDN: Shutdown control pin; driving it low puts the module into a low-power
mode.
Specifications of the AD8232:
1. The operating voltage is 3.3 V.
2. It has a high gain of 100 with DC-blocking capability.
3. The supply current is low, at 170 µA.
4. It has a two-pole adjustable high-pass filter and a three-pole adjustable low-pass
filter.
1.7.2. Software Required:
MATLAB and the Arduino IDE are the two software packages used to acquire the
ECG signal.
1.7.2.1. Arduino IDE Software:
The Arduino Integrated Development Environment (IDE), introduced by Arduino.cc,
is used for writing, compiling, and uploading code to Arduino devices. The Arduino
IDE is open-source, cross-platform software.
1.7.2.1.1. Applications:
The Arduino IDE is mainly used in IoT applications, real-time applications, wearable
technology, communication and embedded environments. It is open-source software
that communicates with the different types of Arduino boards without the help of
external devices.
1.7.2.1.2. Advantages:
1. It is inexpensive.
2. The Arduino IDE is easy to use and open source.
3. The Arduino IDE offers a flexible and clear programming environment.
1.7.2.1.3. Features:
1. The Arduino software can run on any common operating system.
2. The sketch created on the IDE platform ultimately generates a hex file, which is
then transferred and uploaded to the controller on the board.
3. The IDE environment contains two basic components, an editor and a compiler:
the former is used for writing the required code, and the latter for compiling and
uploading the code to the given Arduino module.
4. It has more than 700 libraries.
5. The default serial baud rate of the Arduino is 9600 bps.
1.7.2.2. MATLAB Software:
Among the various programming languages, MATLAB is one of the best for
industrial automation and problem solving. It provides an interactive environment
that makes it easy to develop algorithms, visualize data, and perform numerical
computations. Technical computing problems can be solved quickly and easily, in
less time than with traditional programming languages such as C, C++ and
FORTRAN.
1.7.2.2.1. Applications:
MATLAB can be used quite extensively. Some of its applications include signal and
image processing, communications, control design, financial modelling and analysis,
computational biology, and test and measurement. MATLAB also provides add-on
toolboxes, which are collections of special-purpose functions for solving problems in
a particular class of applications.
1.7.2.2.2. Features:
1. Interactive tools for designing and solving problems.
2. A simple programming environment, including control structures such as loops
and selection, and operations such as products, inverses and sums.
3. Mathematical functions for linear algebra, statistics, filtering and optimization.
4. An open component system based on toolboxes, which can be created, modified,
customized and shared by users.
1.7.2.2.3. Advantages:
1. Easy to understand; programs can be developed with little training.
2. Recent versions of the MATLAB compiler can generate C, C++ and binary code,
allowing the use of different optimization options for high-speed executables.
3. The open architecture allows very rapid extension of MATLAB's functionality
through the development and sharing of new toolboxes.
1.7.2.2.4. Test ECG Databases and Performance Metrics:
Databases are used for the analysis of ECG signals: they provide various kinds of
ECG signals obtained from different patients and support the study of the ECG and
the detection of abnormalities. Several databases are readily available from different
sites, including Physionet, the MIT-BIH Arrhythmia (MIT-BIH A) database, the
PICS database, the MIT-BIH ST Change (MIT-BIHSTC) database, the Fantasia
database, and the MIMIC II database.
1. PICS Database:
The Preterm Infant Cardio-respiratory Signals (PICS) database contains
simultaneous ECG and respiration recordings of ten preterm infants, collected in the
Neonatal Intensive Care Unit (NICU) of the University of Massachusetts Memorial
Healthcare. It mainly targets the estimation of heart rates used to predict
bradycardia. The recordings were taken when the infants were between 29 weeks 3
days and 31 weeks 1 day of age, weighing 843 to 2100 grams, and free from
infection. The recordings were made with a 3-lead electrocardiograph for 20 to 70
hours per infant at a sampling frequency of 500 Hz; for infants 2 and 5, compound
ECG signals were recorded at 250 Hz.
2. MIT-BIH A Database:
Research on the MIT-BIH Arrhythmia Database has been conducted at the
laboratory of Boston's Beth Israel Hospital since 1975, with the Massachusetts
Institute of Technology (MIT) supporting the work on arrhythmia analysis. The
database was completed in 1980, when its distribution began. It was the first
generally available set of standard test material for the evaluation of arrhythmia
detectors, and it is also used for basic research on cardiac dynamics; it is available at
more than 500 sites worldwide. The database consists of 48 half-hour recordings
obtained from 47 subjects by the BIH Arrhythmia Laboratory between 1975 and
1979.
Twenty-three recordings were chosen at random from a set of 4000 24-hour
ambulatory ECG recordings collected from a mixed population of inpatients (about
60%) and outpatients (about 40%) at Boston's Beth Israel Hospital; the remaining 25
recordings were selected from the same set to include less common but clinically
significant arrhythmias that would not be well represented in a small random
sample. In 1984, a beat-by-beat comparison method was proposed, and it was
developed under the auspices of the Association for the Advancement of Medical
Instrumentation (AAMI) between 1984 and 1987. These recordings are used for
arrhythmia detection, basic cardiac study and beat detection.
3. Physionet Database:
PhysioBank is a large and growing archive of digital recordings of physiologic
signals and related data for use by the biomedical research community. Physionet
provides many types of ECG signals covering different abnormalities. Several
challenges have been conducted, and various recordings have been taken by the Sana
project; these records are provided freely via Physionet. The ECG signals are
recorded for 10 seconds to 1 minute from patients of different ages and weights using
a 12-lead ECG. PhysioBank currently includes databases of multi-parameter
cardiopulmonary, neural, and other biomedical signals from healthy subjects and
from patients with a variety of conditions with major public health implications,
including sudden cardiac death, congestive heart failure, epilepsy, gait disorders,
sleep apnea, and aging. At present, PhysioBank contains over 75 databases.
4. MIT-BIHSTC Database:
The MIT-BIH ST change database includes 28 ECG recordings of varying lengths,
most of which were recorded during exercise stress tests and which exhibit transient
ST depression. The last five records (323 through 327) are excerpts of long-term ECG
recordings and exhibit ST elevation.
5. Fantasia Database:
The Fantasia database consists of twenty young (21-34 years old) and twenty elderly
(68-85 years old) rigorously screened healthy subjects, each of whom underwent 120
minutes of continuous supine resting while continuous electrocardiographic (ECG)
and respiration signals were collected. The continuous ECG, respiration, and (where
available) blood pressure signals were digitized at 250 Hz.
6. MIMIC II Database:
The data comprising MIMIC-II was collected at the Beth Israel Deaconess Medical
Center in Boston, MA, USA from patients who were admitted from 2001 to 2008.
There is a second component of MIMIC-II consisting of high resolution waveform
recordings of electrocardiograms, blood pressures, and other monitored signals that
were archived from bedside monitors for a subset of the patients. The MIMIC-II
relational database (version 2.6) contains records from over 32,000 subjects, including
over 7,000 neonatal patients.
Performance Metrics:
The evaluation of the SQA is performed using five metrics: sensitivity (Se), positive
predictivity (Pp), negative predictivity (Np), specificity (Sp) and accuracy (Ac). They
are defined as
Se = TP / (TP + FN)
Pp = TP / (TP + FP)
Np = TN / (TN + FN)
Sp = TN / (TN + FP)
Ac = (TP + TN) / (TP + TN + FP + FN)
where the true positives (TP) denote the number of correctly identified unacceptable
signals, the false negatives (FN) denote the number of unacceptable signals identified
as acceptable, the true negatives (TN) denote the number of correctly identified
acceptable signals, and the false positives (FP) denote the number of acceptable
signals identified as unacceptable.
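The five metrics follow directly from the confusion-matrix counts, as in this small helper (the function name is illustrative):

```python
def sqa_metrics(tp, fn, tn, fp):
    """Sensitivity, positive/negative predictivity, specificity and accuracy
    from confusion-matrix counts (positive class = unacceptable signal)."""
    se = tp / (tp + fn)                    # sensitivity
    pp = tp / (tp + fp)                    # positive predictivity
    np_ = tn / (tn + fn)                   # negative predictivity
    sp = tn / (tn + fp)                    # specificity
    ac = (tp + tn) / (tp + tn + fp + fn)   # accuracy
    return se, pp, np_, sp, ac
```

For example, with TP = 90, FN = 10, TN = 80 and FP = 20, the sensitivity is 0.90, the specificity 0.80 and the accuracy 0.85.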
1.8. ORGANIZATION OF BOOK:
The organization of the book is as follows:
Chapter 1 presents the introduction to the ECG, including the motivation for
the work, a review of the ECG literature, heart anatomy, ECG acquisition, the
characteristics of ECG noise and the normal and abnormal ranges, the different
types of arrhythmia, a brief review of ECG SQA, an overview of the ECG SQA
system, and the databases and their performance metrics.
Chapter 2 discusses the pre-processing stage, in which the signal is denoised,
and proposes a denoising technique called Empirical Mode Decomposition.
Chapter 3 presents the feature extraction stage and describes a real-time QRS
detection algorithm, the Pan-Tompkins algorithm, which is used for detecting and
recognizing the QRS complex.
Chapter 4 presents the quality assessment of the ECG signal based upon the
feasibility rules and adaptive template matching, which are used to classify the signal
as normal or abnormal.
Chapter 5 explains ECG signal classification using the Support Vector
Machine (SVM), including the theoretical background of the SVM and the multi-class
SVM.

CHAPTER 2
ECG DENOISING/PRE-PROCESSING
Electrocardiography is a technique used to check the electrical activity of the heart,
working directly from the electrical signals produced by cardiac activity; it is the
most common way to monitor heart activity, and various heart disorders can be
detected by analyzing abnormalities in the electrocardiogram (ECG) signal.
Nowadays, a large share of disease burden is due to cardiovascular disease. The
main objective of denoising here is to reduce the alarm fatigue caused by baseline
wander noise, using EMD, so that the diagnosis of the ECG signal can be made more
reliably.
Pre-processing:
Various pre-processing steps can be applied before the main work begins in order to
convert the signal into a noise-free, good-quality signal; these include denoising,
deblurring and feature extraction. ECG pre-processing is a major challenge: its
function is to filter out noise and produce a signal from which various heart-rate
disorders are easier to diagnose. The aim here is to characterize the different types of
noise and then develop methods to remove them. The main sources of noise are
baseline wander, power-line (frequency) noise, muscle noise and impulsive noise.
2.1. INTRODUCTION TO FREQUENCY AND TIME-FREQUENCY
ANALYSIS:
Frequency domain analysis is important mainly in signal processing and is widely
used in areas such as communications, image processing and remote sensing. In the
time domain a signal changes over time, whereas in the frequency domain the signal
is distributed over a certain range of frequencies. A signal is converted between time
and frequency with a pair of operations called a transform, and many such
transforms exist for time-frequency domain signals. Time-frequency analysis is used
to analyze signals whose energy distribution varies in both time and frequency. For a
valid time-frequency representation, the frequency and energy functions should vary
with time and should have instantaneous values.
2.1.1. Literature Survey for EMD:
The survey of EMD mainly covers three concepts; EMD came into existence after
three earlier transforms. The initial technique for ECG signal analysis was the time
domain method, but this is not sufficient to analyze all the features of ECG signals,
so a frequency representation of the ECG signal was needed. To examine the signal
in the frequency domain, one has to calculate the spectral function; the most widely
used form of this calculation is the Fast Fourier Transform (FFT), a fast algorithm
that implements the Discrete Fourier Transform (DFT). Frequency analysis allows us
to determine which frequencies are present in our signal. The unavoidable limitation
of the FFT, however, is that it fails to express information about the exact position of
the frequency components in time; this drawback led to the use of the STFT.
The Short-Time Fourier Transform (STFT), or windowed Fourier transform, yields a
representation of sequences of any length by breaking them into shorter sections and
applying the Fourier transform to each section. Simply put, the STFT is a
modification of the Fourier transform involving a sliding window (mask function). It
is a time-frequency localization technique in that it computes the frequencies
associated with small segments of the signal: for each section, the STFT calculates its
Fourier transform. The STFT belongs to a group of joint time-frequency
representations derived from two-dimensional analyzing functions and is localized
in both the time and frequency domains. However, the STFT cannot track very
sudden changes in the time direction; to minimize this problem it is necessary to keep
the time window as short as possible, which in turn reduces the frequency resolution
in the time-frequency plane. Hence there is a trade-off between time and frequency
resolution for this technique, and the extracted features are limited in accuracy. This
drawback motivates the use of the DWT: the Discrete Wavelet Transform (DWT) is
applied to the standard ECG signal, and the denoised signal obtained is referred to
as the reference signal.
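The windowing idea can be demonstrated with scipy's STFT routine; the sampling rate, window length and the two-tone test signal below are illustrative choices, not values from this work:

```python
import numpy as np
from scipy.signal import stft

fs = 500                                    # assumed sampling rate (Hz)
t = np.arange(0, 2, 1 / fs)
# Signal whose frequency content changes half-way: 10 Hz, then 50 Hz
x = np.where(t < 1, np.sin(2 * np.pi * 10 * t), np.sin(2 * np.pi * 50 * t))
# 128-sample analysis window: ~3.9 Hz frequency resolution at fs = 500 Hz
f, tau, Z = stft(x, fs=fs, nperseg=128)
# Dominant frequency in the early and the late part of the record
early = f[np.argmax(np.abs(Z[:, tau < 0.5]).mean(axis=1))]
late = f[np.argmax(np.abs(Z[:, tau > 1.5]).mean(axis=1))]
print(early, late)
```

Unlike a single FFT of the whole record, the STFT localizes the change of frequency at about t = 1 s; shortening `nperseg` sharpens that localization at the cost of coarser frequency bins, which is exactly the trade-off discussed above.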
The wavelet transform is an efficient tool for analyzing non-stationary ECG signals
due to its time-frequency localization property. In the wavelet transform, good time
resolution and poor frequency resolution are found at high frequencies, while good
frequency resolution and poor time resolution are obtained at low frequencies. This
transform can analyze the ECG signal more accurately than the STFT in some cases.
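A single-level Haar decomposition illustrates the wavelet idea in plain NumPy. This is only a sketch: the multilevel DWT with a suitable mother wavelet is what the comparison in this work uses, and the soft-threshold value below is an assumption.

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar DWT: approximation and detail coefficients."""
    x = x[: len(x) // 2 * 2]                # ensure even length
    a = (x[0::2] + x[1::2]) / np.sqrt(2)    # low-pass (approximation)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)    # high-pass (detail)
    return a, d

def haar_idwt(a, d):
    """Inverse one-level Haar DWT (perfect reconstruction)."""
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

def denoise(x, thr):
    """Soft-threshold the detail coefficients, keep the approximation."""
    a, d = haar_dwt(x)
    d = np.sign(d) * np.maximum(np.abs(d) - thr, 0.0)
    return haar_idwt(a, d)
```

High-frequency noise concentrates in the detail coefficients, so shrinking them while leaving the approximation untouched removes noise with little damage to a slowly varying component; a full denoiser would repeat this over several decomposition levels.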
2.2. THEORETICAL BACKGROUND:
Although there are many denoising techniques, EMD is used here because of its
various advantages: it operates in the time domain without converting the signal into
the frequency domain, and it is well suited to extracting the fine-grained information
we need. The main objective of using EMD is to eliminate baseline wander, which
causes problems in peak detection and signal analysis.
2.2.1. Empirical Mode Decomposition (EMD):
EMD is a decomposition derived from the data itself and is useful for analyzing
nonlinear and non-stationary time series signals. Empirical Mode Decomposition is
well suited to removing baseline wander, as well as other noise, by decomposing the
signal into intrinsic modes; the components of the decomposition are called intrinsic
mode functions (IMFs).
An IMF is a function that satisfies two conditions:
1. The number of extrema and the number of zero crossings in the data set must
either be equal or differ at most by one.
2. The mean of the envelope defined by the local maxima and the envelope defined
by the local minima should be zero at every point.
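The two conditions can be checked numerically. The helper below counts extrema and zero crossings; as a simplification, the envelope-mean condition is approximated by the signal mean rather than by the actual spline envelopes, and the tolerance is an assumption.

```python
import numpy as np

def check_imf_conditions(x, tol=0.1):
    """Approximate check of the two IMF conditions for a sampled signal x."""
    interior = x[1:-1]
    maxima = np.sum((interior > x[:-2]) & (interior > x[2:]))
    minima = np.sum((interior < x[:-2]) & (interior < x[2:]))
    extrema = maxima + minima
    zero_crossings = np.sum(np.diff(np.sign(x)) != 0)
    # Condition 1: extrema and zero-crossing counts differ by at most one
    cond1 = abs(extrema - zero_crossings) <= 1
    # Condition 2 (crude stand-in): mean small relative to the amplitude
    cond2 = abs(x.mean()) <= tol * np.abs(x).max()
    return cond1, cond2
```

A pure sinusoid passes both checks, whereas the same sinusoid with a DC offset fails them, which is why EMD keeps subtracting the local mean until the residual oscillation behaves like an IMF.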
2.2.1.1. Spline Algorithm:
Let us consider a set of coordinates (x_0, y_0), (x_1, y_1), ..., (x_n, y_n) of a
function y = y(x), where the values of x are in ascending order. Using cubic
functions S_i, i = 0, ..., n-1, our aim is to bridge the gap between adjacent points
(x_i, y_i) and (x_{i+1}, y_{i+1}) with continuous first and second derivatives. The
resulting curve is called a cubic spline, after the thin strip of flexible material used
for drawing curves; the points on the curve are termed knots (nodes).
The function S_i can be expressed as
S_i(x) = a_i + b_i (x - x_i) + c_i (x - x_i)^2 + d_i (x - x_i)^3 (1)
where x ranges from x_i to x_{i+1}.
From equation (1), the first derivative can be written as
S_i'(x) = b_i + 2 c_i (x - x_i) + 3 d_i (x - x_i)^2 (2)
and the second derivative can be written as
S_i''(x) = 2 c_i + 6 d_i (x - x_i) (3)
At each interior point (x_i, y_i), the functions S_{i-1} and S_i, for i = 1, ..., n-1,
should meet, which can be expressed as
S_{i-1}(x_i) = S_i(x_i) = y_i (4)
Writing h_i = x_{i+1} - x_i, continuity of the first derivative at x_{i+1} can be
expressed as
S_i'(x_{i+1}) = S_{i+1}'(x_{i+1}), i.e. b_i + 2 c_i h_i + 3 d_i h_i^2 = b_{i+1} (5)
and continuity of the second derivative can be expressed as
S_i''(x_{i+1}) = S_{i+1}''(x_{i+1}), i.e. c_i + 3 d_i h_i = c_{i+1} (6)
Now we need to specify the conditions at the end points (x_0, y_0) and (x_n, y_n).
We can set the first derivatives of the cubic functions at these points to the values of
the corresponding derivatives of y = y(x); thus,
S_0'(x_0) = y'(x_0) and S_{n-1}'(x_n) = y'(x_n) (7)
This is described as clamping the spline. By this, we introduce additional
information about the function y = y(x) and can therefore expect a better
approximation. If we leave the ends free, the conditions are written as
S_0''(x_0) = 0 and S_{n-1}''(x_n) = 0 (8)
which implies that the spline is linear as it passes through the end points (the natural
spline). In this case we need only the values x_i and y_i from the data.
Consider the following four conditions relating to the i-th segment:
(i) S_i(x_i) = y_i, (ii) S_i(x_{i+1}) = y_{i+1},
(iii) S_i''(x_i) = 2 c_i, (iv) S_i''(x_{i+1}) = 2 c_{i+1} (9)
If c_i and c_{i+1} were known in advance, as in the case n = 1, these conditions
would suffice to specify uniquely the four parameters of S_i. In the case n > 1, the
conditions of first-order continuity provide the necessary link between the segments
which enables us to determine all the parameters simultaneously. The first condition
specifies that
a_i = y_i (10)
The second condition specifies that S_i(x_{i+1}) = y_{i+1}, from which we get
b_i = (y_{i+1} - y_i)/h_i - c_i h_i - d_i h_i^2 (11)
The third condition holds as an identity, and the fourth condition specifies that
S_i''(x_{i+1}) = 2 c_{i+1}, which gives
d_i = (c_{i+1} - c_i) / (3 h_i) (12)
Substituting this value of d_i in equation (11) gives
b_i = (y_{i+1} - y_i)/h_i - (2 c_i + c_{i+1}) h_i / 3 (13)
and thus we have expressed the parameters of the i-th segment in terms of the
second-order parameters c_i, c_{i+1} and the data values y_i, y_{i+1}.
The condition of first-order continuity, equation (5), can now be rewritten with the
help of equations (12) and (13) to give
h_{i-1} c_{i-1} + 2 (h_{i-1} + h_i) c_i + h_i c_{i+1}
= 3 [(y_{i+1} - y_i)/h_i - (y_i - y_{i-1})/h_{i-1}] (14)
Letting i run from 1 to n-1 in this equation, and taking account of the end conditions
c_0 = c_n = 0, we obtain a tri-diagonal system of n-1 equations of the form
2 (h_0 + h_1) c_1 + h_1 c_2 = q_1,
h_{i-1} c_{i-1} + 2 (h_{i-1} + h_i) c_i + h_i c_{i+1} = q_i, i = 2, ..., n-2,
h_{n-2} c_{n-2} + 2 (h_{n-2} + h_{n-1}) c_{n-1} = q_{n-1} (15)
where q_i = 3 [(y_{i+1} - y_i)/h_i - (y_i - y_{i-1})/h_{i-1}].
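In practice, the envelope interpolation can rely on scipy's CubicSpline, which solves exactly this kind of tridiagonal construction; the knot values below are illustrative:

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Knots: e.g. extrema of a hypothetical signal (illustrative values)
x = np.array([0.0, 1.0, 2.5, 4.0, 5.0])
y = np.array([0.0, 1.2, -0.5, 0.8, 0.1])

# Free ends: second derivative forced to zero at both end points
natural = CubicSpline(x, y, bc_type="natural")
# Clamped ends: first derivative forced to zero at both end points
clamped = CubicSpline(x, y, bc_type="clamped")

print(natural(x))        # the spline reproduces y at the knots
print(natural(x[0], 2))  # second derivative at the left end, close to 0
```

Either boundary choice interpolates the knots with continuous first and second derivatives; they differ only in the behaviour near the ends, which matters for EMD because envelope errors at the record boundaries propagate into the extracted IMFs.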

2.2.2. Methodology:
Empirical mode decomposition (EMD) is a powerful tool for time-frequency analysis and has become popular in various fields dealing with nonlinear and non-stationary signals, such as mechanics and acoustics. The main aim of the EMD technique is to decompose the signal into components called intrinsic mode functions (IMFs). The process of calculating the IMFs is known as the sifting process and is described as follows.
Sifting Process:
• In order to implement the sifting process, we first have to calculate the turning points.
• The turning points comprise the maxima, minima and inflection points.
• The turning points can be calculated by using partial differential equations (PDEs).
Some of the assumptions made for decomposition are:
(1) The signal has at least two extrema: one maximum and one minimum
(2) The characteristic time scale is defined by the time lapse between the extrema.
(3) If the signal has no extrema but has inflection points, then the signal can be
differentiated one or more times to find the extrema
The basic principle of this method is to identify the intrinsic oscillatory modes by their characteristic time scales in the data empirically and then decompose the data accordingly. A systematic way to extract the IMFs, called the sifting process, is described below.
1. Identify the maxima and minima of the data X(t), and generate the upper and lower envelopes by connecting the maxima and the minima, respectively, with cubic splines.
2. By averaging the two envelopes, determine the local mean, m10(t).
3. Now subtract the mean from the data i.e.
h10(t) = X(t) – m10(t)
4. Now, considering h10(t) as the input, repeat the above steps to get
h11(t) = h10(t) – m11(t);
Again repeat the above steps by considering h11(t) as the input
h12(t) = h11(t) – m12(t);
And repeat as necessary until we get
h1k(t) = h1(k-1)(t) – m1k(t)
where h1k(t) is an IMF and m1k(t) is the local mean of h1(k-1)(t).
Once an IMF is obtained from the sifting process, it is designated c1 = h1k; this is the first component and contains the finest temporal scale in the signal. To obtain the series of IMFs, the residue r1 is generated by subtracting c1 from the data X(t):
X(t) – c1(t) = r1(t).
The residue r1(t) still contains information about the longer-period components, so r1(t) is treated as the new input data and the above steps are repeated. This procedure is applied to all the subsequent residues rj(t):
r1(t) – c2(t) = r2(t), …, rn-1(t) – cn(t) = rn(t)
The process is repeated until the residue rn(t) is either a constant, a monotonic slope, or a function with only one extremum.
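The sifting steps above can be sketched as follows (Python with NumPy/SciPy as an illustration of the procedure; the original work used MATLAB, and the function names and the SD limit of 0.3 used here are our own choices):

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def sift_once(x, t):
    """One sifting pass: subtract the mean of the two cubic-spline
    envelopes from the signal, as in steps 1-3 above."""
    maxima = argrelextrema(x, np.greater)[0]
    minima = argrelextrema(x, np.less)[0]
    if len(maxima) < 4 or len(minima) < 4:
        return None                      # too few extrema to build envelopes
    upper = CubicSpline(t[maxima], x[maxima])(t)   # upper envelope
    lower = CubicSpline(t[minima], x[minima])(t)   # lower envelope
    return x - (upper + lower) / 2.0

def extract_imf(x, t, sd_limit=0.3, max_iter=50):
    """Repeat sifting (step 4) until the SD stopping criterion is met."""
    h = x.copy()
    for _ in range(max_iter):
        h_new = sift_once(h, t)
        if h_new is None:
            break
        sd = np.sum((h - h_new) ** 2) / np.sum(h ** 2)
        h = h_new
        if sd < sd_limit:
            break
    return h
```

Applied to a mixture of a fast and a slow sine, the first IMF extracted this way closely tracks the fast component away from the boundaries.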

The original signal can then be reconstructed by superposition of all the components, and it is represented as:
X(t) = Σ (i = 1 to n) ci(t) + rn(t)
Since the residue rn(t) can be regarded as the last IMF cn+1(t), the above equation is modified as:
X(t) = Σ (i = 1 to n+1) ci(t)

The sifting process was applied to an ECG signal to obtain the various IMFs. The EMD method is a powerful tool for analyzing ECG signals, and it is very reliable because the basis functions depend on the signal itself. EMD is highly adaptive and avoids diffusion and leakage of the signal. The stopping criterion for the process is based on the standard deviation (SD) between two consecutive sifting results, in a manner similar to Cauchy sequence convergence: sifting is stopped when the SD becomes smaller than a predefined value. Here, the process is terminated when the standard deviation is less than 50.
2.2.3. Performance Metrics:
We have used the MIT-BIH, PICS (infant) and PhysioNet databases to validate the efficiency of our proposed method. The simulations were carried out in the MATLAB environment. We added noise to the ECG signals to obtain a collection of noisy ECG signals, and the denoising was evaluated in terms of correlation and SNR.
Correlation:
Correlation is used to compare two signals, i.e., the original signal and the signal obtained after applying EMD. The correlation coefficient (r) between signals x and y is given as
r = Σ (xi − x̄)(yi − ȳ) / sqrt[ Σ (xi − x̄)² · Σ (yi − ȳ)² ]

Signal to Noise Ratio (SNR):
SNR measures the level of the desired signal relative to the level of the unwanted signal, i.e., noise. It is defined as the ratio of signal power to noise power and is expressed in decibels. SNR can be measured using
SNR = 10 log10 ( Psignal / Pnoise ) dB
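Both metrics are straightforward to compute. A sketch in Python/NumPy (illustrative function names; the original evaluation was done in MATLAB):

```python
import numpy as np

def correlation(x, y):
    """Pearson correlation coefficient r between two equal-length signals."""
    xm, ym = x - x.mean(), y - y.mean()
    return np.sum(xm * ym) / np.sqrt(np.sum(xm ** 2) * np.sum(ym ** 2))

def snr_db(clean, denoised):
    """SNR in decibels: signal power over residual-noise power."""
    noise = clean - denoised
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))
```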

2.3. EXPERIMENTAL RESULTS:
The denoising performance of EMD is evaluated using correlation and SNR. The experimental results are given below.
Table 2.1: Results of EMD for different signals of different databases

Database    Type of noise   Record no    Correlation   Signal-to-noise ratio (SNR)
MIT-BIH     BW              111 (10s)    0.99043792    27.1252
MIT-BIH     BW              116 (10s)    0.98169959    26.8662
MIT-BIH     BW              232 (10s)    0.99322419    26.8171
            Average:                     0.988         26.93
MIT-BIH     BW              111 (1m)     0.97384158    21.3360
MIT-BIH     BW              116 (1m)     0.98281122    25.4579
MIT-BIH     BW              232 (1m)     0.99407535    25.8075
            Average:                     0.98          24.20
PHYSIONET   BW              M1           0.92548164    15.0941
PHYSIONET   BW              M2           0.99203217    24.2131
PHYSIONET   BW              M3           0.98622081    28.5575
PICS        BW              100(0)       0.86976693    6.6850
PICS        BW              100(1)       0.85359654    6.6850
PICS        BW              100(2)       0.88915864    7.0252
PICS        BW              100(3)       0.86222581    6.5493
PICS        BW              100(4)       0.85244975    6.8487
PICS        BW              100(5)       0.88966521    7.3363
PICS        BW              100(6)       0.88966521    7.3363
PICS        BW              100(7)       0.90618978    7.1062
PICS        BW              100(8)       0.86208359    6.5227
PICS        BW              100(9)       0.88699226    6.7141

Signal 234m.mat:
Database: MIT-BIH
Description: This signal is acquired from a female patient aged 56 years.

Figure 2.1: Raw ECG Signal of 234m

Figure 2.2: Result for filtered EMD signal of 234m


Signal M1:

Database: Reference signal M1
Description: The signal below is a reference signal.

Figure 2.3: Raw ECG Signal of M1

Figure 2.4: Result for filtered EMD signal of M1


Signal 116m.mat:
Database: MIT-BIH Arrhythmia
Description: This signal is acquired from a male patient aged 68 years.

Figure 2.5: Raw ECG Signal of 116m

Figure 2.6: Result for filtered EMD signal of 116m


Signal 100m.mat:
Database: MIT-BIH Arrhythmia
Description: This signal is acquired from a male patient aged 69 years.

Figure 2.7: Raw ECG Signal of 100m

Figure 2.8: Result for filtered EMD signal of 100m

Infant1_ecgm:
Database: PICS
Description: This signal is acquired from an infant aged between 1 and 2 years.

Figure 2.9: Raw ECG Signal of infant1

Figure 2.10: Result for filtered EMD signal of infant1

AD 8232 Acquired signal:

Figure 2.11: Raw ECG Signal of AD8232 signal

Figure 2.12: Result for filtered EMD signal for AD8232 signal

2.4. CONCLUSION:
In this chapter, denoising was performed using the empirical mode decomposition (EMD) technique. Performance metrics such as correlation and signal-to-noise ratio showed effective noise removal, confirming that the noise contained in the raw ECG signal is reduced.

CHAPTER 3
QRS PEAKS DETECTION USING PAN-TOMPKINS
ALGORITHM
INTRODUCTION:
ECG is the process of recording the electrical activity of the heart over a period of time using electrodes placed on the skin. These electrodes detect the tiny electrical changes on the skin that arise from the heart muscle's electrophysiological pattern of depolarization and repolarization during each heartbeat. It is commonly performed to detect cardiac problems. In this chapter, we discuss the detection of the QRS wave from the input signal. Although many techniques and algorithms exist for detecting the QRS complex, we adopted the Pan-Tompkins algorithm, proposed by J. Pan and W. J. Tompkins in 1985, which is considered one of the most effective and efficient algorithms for QRS detection. The Pan-Tompkins algorithm identifies QRS complexes based on digital analyses of the slope, amplitude and width of the ECG data, and it can reduce false detections caused by the various types of interference present in the ECG signal.
3.1. LITERATURE SURVEY:
In recent years, many methods have been proposed for the detection of the QRS complex, such as those of Moriet-Mahoudeaux, Fraden and Neumann, and Gustafson, as well as the Hilbert transform and novel dual-shape methods, the waveform method, the dual-slope method, the level-crossing method and higher-order statistics. We chose the Pan-Tompkins algorithm because of its advantages: it is justified as an efficient method, achieving 99.3% detection accuracy for QRS complexes on the standard MIT-BIH arrhythmia database.
3.2. PAN-TOMPKINS ALGORITHM:
3.2.1. Methodology:
In general, software QRS detectors typically include one or more of three different types of processing steps: linear digital filtering, nonlinear transformation, and decision-rule algorithms. In this algorithm, all three processing steps are used. The linear processes include a band-pass filter, a derivative, and a moving-window integrator. The nonlinear transformation used is signal amplitude squaring. Adaptive thresholds and T-wave discrimination techniques provide part of the decision-rule algorithm. The various steps in the Pan-Tompkins algorithm are:

Band pass filter

Derivative

Squaring

Moving window integration

Adaptive threshold

Figure 3.1: Pan Tompkins Algorithm.

1. Band Pass filters.


2. Derivative.
3. Squaring function.
4. Moving window integral.
5. Fiducial mark.
6. Thresholds.
The signal processing proceeds as follows. First, in order to attenuate noise, the signal passes through a digital band-pass filter composed of cascaded high-pass and low-pass filters. The next process is differentiation, which provides information about the slope of the QRS. Then squaring is performed, which intensifies the slope of the frequency response curve of the derivative and helps restrict false positives caused by T waves with higher-than-usual spectral energies. Next, moving-window integration produces a signal that includes information about both the slope and the width of the QRS complex. Finally, adaptive thresholding yields an output stream of pulses marking the locations of the QRS complexes.
3.2.1.1. Band Pass Filter:
The band-pass filter reduces the influence of muscle noise, 60 Hz interference, baseline wander, and T-wave interference. The desired passband for maximizing the QRS energy is approximately 5-15 Hz. This approach results in a filter design with integer coefficients. Since only integer arithmetic is necessary, a real-time filter can be implemented on a simple microprocessor with computing power still available for the QRS recognition task. For our chosen sample rate, we could not design a band-pass filter directly for the desired 5-15 Hz passband using this specialized design technique. Therefore, we cascaded the low-pass and high-pass filters described below to achieve a 3 dB passband from about 5-12 Hz, reasonably close to the design goal. A study of the power spectra of the ECG signal, the QRS complex and the various types of noise also revealed that the maximum SNR is obtained for a band-pass filter with a centre frequency of 17 Hz and a Q of 3.
Low-Pass Filter:
The transfer function of the second-order low-pass filter is
H(z) = (1 − z^-6)^2 / (1 − z^-1)^2
where T is the sampling period. The difference equation of the filter is
y(nT) = 2y(nT − T) − y(nT − 2T) + x(nT) − 2x(nT − 6T) + x(nT − 12T)
The cutoff frequency is about 11 Hz and the gain is 36. The filter processing delay is six samples.
High-Pass Filter:
The design of the high-pass filter is based upon subtracting the output of a first-order low-pass filter from an all-pass filter. The transfer function of such a high-pass filter is
H(z) = (−1 + 32z^-16 + z^-32) / (1 + z^-1)
The difference equation is
y(nT) = 32x(nT − 16T) − [y(nT − T) + x(nT) − x(nT − 32T)]
The low cutoff frequency of this filter is 5 Hz, the gain is 32, and the delay is 16 samples.
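The coefficients of both difference equations can be read off directly and applied with a standard filter routine. A sketch using `scipy.signal.lfilter` (our own illustration; the original work used MATLAB):

```python
import numpy as np
from scipy.signal import lfilter

# Low-pass: y(nT) = 2y(nT-T) - y(nT-2T) + x(nT) - 2x(nT-6T) + x(nT-12T)
b_lp = np.zeros(13)
b_lp[[0, 6, 12]] = [1.0, -2.0, 1.0]
a_lp = np.array([1.0, -2.0, 1.0])

# High-pass: y(nT) = 32x(nT-16T) - [y(nT-T) + x(nT) - x(nT-32T)]
b_hp = np.zeros(33)
b_hp[[0, 16, 32]] = [-1.0, 32.0, 1.0]
a_hp = np.array([1.0, 1.0])

def bandpass(x):
    """Cascade of the low-pass and high-pass stages (approx. 5-12 Hz)."""
    return lfilter(b_hp, a_hp, lfilter(b_lp, a_lp, x))
```

The low-pass stage is effectively an 11-tap triangular FIR filter (the poles cancel the numerator zeros), which is why its DC gain is exactly 36.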
3.2.1.2. Derivative:
After filtering, the signal is differentiated to provide QRS-complex slope information. We use a five-point derivative with the transfer function
H(z) = (1/8T)(−z^-2 − 2z^-1 + 2z + z^2)
The difference equation is
y(nT) = (1/8T)[−x(nT − 2T) − 2x(nT − T) + 2x(nT + T) + x(nT + 2T)]
The frequency response of this derivative is nearly linear. Its delay is two samples.
3.2.1.3. Squaring Function:
After the differentiation process, the signal is squared point by point. The equation of this operation is
y(nT) = [x(nT)]^2
This makes all data points positive and performs nonlinear amplification of the derivative output, emphasizing the higher frequencies.
3.2.1.4. Moving Window Integration:
The purpose of moving-window integration is to obtain waveform feature information in addition to the slope of the R wave. It is calculated from
y(nT) = (1/N)[x(nT − (N − 1)T) + x(nT − (N − 2)T) + … + x(nT)]
where N is the number of samples in the width of the integration window.
In general, the width of the window should be approximately the same as the widest possible QRS complex. If the window is too wide, the integration waveform will merge the QRS and T complexes; if it is too narrow, some QRS complexes will produce several peaks in the integration waveform, which can cause difficulty in subsequent QRS detection processes. The width of the window is determined adaptively.
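The derivative, squaring and integration stages can be sketched together as follows. This is our own illustration: it uses a causal form of the five-point derivative (delayed by two samples), omits the 1/T scaling since only relative amplitudes matter for thresholding, and assumes N = 30 (a 150 ms window at a 200 Hz sampling rate):

```python
import numpy as np
from scipy.signal import lfilter

N = 30  # integration window: 150 ms at an assumed 200 Hz sampling rate

# Causal five-point derivative (1/T scaling omitted):
# y(n) = (1/8)[x(n) + 2x(n-1) - 2x(n-3) - x(n-4)]
b_deriv = np.array([1.0, 2.0, 0.0, -2.0, -1.0]) / 8.0

def slope_features(x):
    """Derivative -> point-by-point squaring -> moving-window integration."""
    d = lfilter(b_deriv, [1.0], x)
    s = d ** 2                                   # y(nT) = [x(nT)]^2
    return lfilter(np.ones(N) / N, [1.0], s)     # width-N moving average
```

For a unit-slope ramp, the derivative stage settles to 1, and the integrated output of its square also settles to 1, which is a convenient sanity check.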
3.2.1.5. Fiducial Mark:
The QRS complex corresponds to the rising edge of the integration waveform, and the time duration of the rising edge is equal to the width of the QRS complex. A fiducial mark for the temporal location of the QRS complex can be determined from this rising edge according to the desired waveform feature to be marked, such as the maximal slope or the peak of the R wave.
3.2.1.6. Thresholding:
After finding the fiducial marks, we need to find the R peak, which plays a vital role in disease identification. As the Q and S waves are lower in amplitude than the R wave, a threshold of 45% of the maximum amplitude of the signal is applied, so that the R peaks are identified among the different peaks.
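As a sketch, the 45% rule can be applied with `scipy.signal.find_peaks`. The 200 ms minimum peak spacing is our own added assumption, reflecting the physiological refractory period between beats; the function name is illustrative:

```python
import numpy as np
from scipy.signal import find_peaks

def detect_r_peaks(ecg, fs, thresh_frac=0.45):
    """Locate R peaks as local maxima above 45% of the global maximum,
    at least 200 ms apart (assumed refractory period)."""
    height = thresh_frac * np.max(ecg)
    distance = int(0.2 * fs)
    peaks, _ = find_peaks(ecg, height=height, distance=distance)
    return peaks
```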
3.3. EXPERIMENTAL RESULT:
Running the Pan-Tompkins code in MATLAB with raw ECG signals as inputs, we obtain the peak points and the QRS waveforms as output.

Table 3.1: RR intervals obtained for different signals of different databases

Database    Record no    RR Interval
MIT-BIH     111 (10s)    4.6958
MIT-BIH     116 (10s)    4.5038
MIT-BIH     232 (10s)    4.7667
MIT-BIH     116 (1m)     0.7629
MIT-BIH     232 (1m)     1.0268
MIT-BIH     111 (1m)     0.9874
Physionet   M1           0.81127
Physionet   M2           0.17964
Physionet   M3           4.8375
PICS        100(0)       4.6056
PICS        100(1)       4.7985
PICS        100(2)       4.3718
PICS        100(3)       4.4987
PICS        100(4)       4.5569
PICS        100(5)       4.7333
PICS        100(6)       4.7333
PICS        100(7)       4.7667
PICS        100(8)       4.8121
PICS        100(9)       4.753

Results for the Input 234m.mat:
Database: MIT-BIH
Description: This signal is acquired from a female patient aged 56 years.
R-R Interval = 3.9631

Figure 3.2: Raw signal of input 234m

Figure 3.3: Results of R peak detection for input 234m

Figure 3.4: Results of QRS detection for input 234m

Results for the Input 121m.mat:
R-R Interval = 5.9556
Database: MIT-BIH Arrhythmia
Description: This signal is acquired from a male patient aged 68 years.

Figure 3.5: Raw signal of input 121m

Figure 3.6: Results of R peak detection for input 121m

Figure 3.7: Results of QRS detection for input 121m

Results for the Input M1:
Database: Reference signal M1
Description: This signal is acquired from a male patient aged 68 years.
R-R Interval = 0.81225

Figure 3.8: Raw signal of input signal M1

Figure 3.9: Results of R peak detection for input M1

Figure 3.10: Results of QRS detection for input M1

Results for the Input 116m.mat:
Database: MIT-BIH Arrhythmia
Description: This signal is acquired from a male patient aged 68 years.
R-R Interval = 4.5038

Figure 3.11: Raw signal of input 116m

Figure 3.12: Results of R peak detection for input 116m

Figure 3.13: Results of QRS detection for input 116m

Results for the Input 232m10s.mat:
R-R Interval = 4.7667
Database: MIT-BIH
Description: This signal is acquired from a female patient aged 56 years.

Figure 3.14: Raw signal of input 232m10s

Figure 3.15: Results of R peak detection for input 232m10s

Figure 3.16: Results of QRS detection for input 232m10s

Results for the Input Infant 1:
Database: PICS
Description: This signal is acquired from an infant aged 1 to 2 years.
R-R Interval = 4.318

Figure 3.17: Raw signal of infant signal 1

Figure 3.18: Result of R peak detection for input infant 1

Figure 3.19: Results of QRS peak detection for input infant 1

AD8232 acquired signal:
R-R interval=2.1399

Figure 3.20: Raw signal acquired from AD8232

Figure 3.21: Results of peak detection acquired from AD8232

Figure 3.22: Result of QRS peak acquired

3.4. CONCLUSION:
In this chapter, we implemented a real-time QRS detection algorithm that reliably detects QRS complexes using slope, amplitude and width information. The band-pass filter pre-processes the signal to reduce interference, permitting the use of low amplitude thresholds and thereby increasing detection sensitivity. The algorithm uses a dual-threshold technique with search-back for missed beats. We conducted experiments in MATLAB on raw ECG signals and detected the peak points and the final QRS waveforms.

CHAPTER 4
RULES AND ADAPTIVE TEMPLATE MATCHING
BASED ASSESSMENT
INTRODUCTION
Signal quality assessment (SQA) plays a vital role in significantly improving the diagnostic accuracy and reliability of ECG signals. SQA is used to determine the quality of ECG signals: individual heartbeats are detected in a segment of the original signal, and the segment is classified as good or bad. The quality of the signal is assessed using a set of rules together with template matching. If quality assessment is not applied before the signal is analysed, the analysis may lead to misinterpretation. This motivates the automated detection of low-quality segments, perhaps caused by motion artifact or poor electrode contact, which can then be eliminated from the analysis.
In this chapter, we discuss the signal quality assessment of the ECG signal processed by the Pan-Tompkins QRS detection algorithm, which was explained in the previous chapter. The output describes whether the signal is good or bad. The systematic procedure of signal quality assessment is explained in detail in this chapter.
4.1 HEART RATE VARIABILITY:
Heart rate variability (HRV) is the physiological phenomenon of variation in the time intervals between heartbeats; its clinical relevance was first appreciated in 1965. The variations in heart rate may be evaluated by a number of methods. HRV is a measure that indicates how much variation there is in the heartbeats within a specific period, and its unit of measurement is milliseconds.

HRV can be calculated by a time-domain method called RMSSD, the root mean square of successive differences between adjacent RR intervals:
RMSSD = sqrt[ (1/(N−1)) Σ (RR(i+1) − RR(i))² ]
SDNN is calculated as the standard deviation of all of the RR intervals (an RR interval is the distance between successive heartbeats, i.e., between the R peaks of the QRS complexes):
SDNN = sqrt[ (1/(N−1)) Σ (RR(i) − mean(RR))² ]
NN50 gives the number of pairs of successive RR intervals that differ by more than 50 ms.
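The three time-domain measures above can be sketched in a few lines (Python/NumPy illustration with a hypothetical function name; RR intervals are assumed to be in milliseconds):

```python
import numpy as np

def hrv_metrics(rr_ms):
    """Time-domain HRV measures from RR intervals in milliseconds."""
    rr = np.asarray(rr_ms, dtype=float)
    diffs = np.diff(rr)
    sdnn = np.std(rr, ddof=1)                  # SD of all RR intervals
    rmssd = np.sqrt(np.mean(diffs ** 2))       # RMS of successive differences
    nn50 = int(np.sum(np.abs(diffs) > 50.0))   # successive diffs > 50 ms
    return sdnn, rmssd, nn50
```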
A normal resting heart rate for adults ranges from 60 to 100 beats per minute. Generally, a lower heart rate at rest implies more efficient heart function and better cardiovascular fitness, and a resting heart rate of more than 100 beats per minute is considered too fast for an adult. Heart rate variability represents one of the most promising such markers, and the apparently easy derivation of this measure has popularized its use. However, the significance and meaning of the many different measures of HRV are more complex than generally appreciated, and there is potential for incorrect conclusions and for excessive or unfounded extrapolation. The phenomenon of interest is the oscillation in the intervals between consecutive heartbeats. HRV has become the conventionally accepted term to describe the variation of both the heart rate and the RR intervals.
4.2 FEATURE EXTRACTION AND RULES:
4.2.1. Feature Extraction:
The ECG feature extraction system provides fundamental features (amplitudes and intervals) to be used in automatic analysis. In recent times, a number of techniques have been proposed to detect these features. Previously proposed methods of ECG signal analysis were based on time-domain methods; however, these are not always adequate to study all the features of ECG signals, so a frequency representation of the signal is also required. Deviations from the normal electrical patterns indicate various cardiac disorders, and the ECG is essential for patient monitoring and diagnosis. The extracted features of the ECG signal play a vital role in diagnosing cardiac disease, so the development of an accurate and fast method for automatic feature extraction is of major importance, and it is necessary that the feature extraction system performs accurately. The purpose of feature extraction is to find as few properties as possible within the ECG signal that allow successful abnormality detection and efficient prognosis.

4.2.2. RULES:
4.2.2.1. Feasibility rules: The first step of the SQA algorithm is to perform R-peak detection on a sample and to compare the output of the detector with a set of physiologically relevant rules. The following three rules are applied sequentially; if any rule is not satisfied, the sample is classified as "bad".
Rule 1: The heart rate value extrapolated from the 10-second sample must lie within the range of 40 to 180 beats per minute (bpm).
Rule 2: The maximum acceptable gap between successive R peaks is less than or equal to 3 seconds.
Rule 3: The ratio of the maximum beat-to-beat interval to the minimum beat-to-beat interval within a sample must be less than 2.2.
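The three rules translate directly into checks on the detected peak positions. A sketch (illustrative helper; extrapolating the heart rate from the peak count over the sample duration is our simplification):

```python
import numpy as np

def feasibility_check(r_peaks, fs, duration_s=10.0):
    """Return True only if the R-peak train passes Rules 1-3."""
    if len(r_peaks) < 2:
        return False
    rr = np.diff(r_peaks) / fs                # beat-to-beat intervals (s)
    hr = 60.0 * len(r_peaks) / duration_s     # extrapolated heart rate (bpm)
    if not (40.0 <= hr <= 180.0):             # Rule 1
        return False
    if np.max(rr) > 3.0:                      # Rule 2
        return False
    if np.max(rr) / np.min(rr) >= 2.2:        # Rule 3
        return False
    return True
```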
4.3 ADAPTIVE TEMPLATE MATCHING:
Template-matching approaches have been used in the past for identifying ventricular ectopic beats and heartbeats in the ECG, and for signal quality assessment of the PPG. Regardless of the actual morphology of the QRS complexes in a given ECG, template matching searches for regularity in a segment, which is an indicator of reliability. Our overall approach for the identification of the signal is given by:

1. Using all the detected R peaks (or PPG pulse peaks) of each record, the median beat-to-beat interval is calculated.
2. Individual QRS complexes (or pulse waves) are then extracted by taking a window, the width of which is the median beat-to-beat interval, centred on each detected R peak.
3. The average QRS template is then obtained by taking the mean of all QRS complexes in the sample.
4. The correlation coefficient of each individual QRS complex with the template is computed, and the average correlation coefficient is obtained by averaging these coefficients over the whole ECG sample. If the average correlation coefficient is greater than or equal to a threshold of 0.5, the sample is accepted as good.
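Steps 1-4 can be sketched as follows (illustrative names; windows that would run past the signal ends are simply skipped, which is our own simplification):

```python
import numpy as np

def template_match_quality(ecg, r_peaks):
    """Average correlation of each QRS window with the mean template."""
    width = int(np.median(np.diff(r_peaks)))            # step 1
    half = width // 2
    beats = np.array([ecg[p - half:p + half] for p in r_peaks
                      if p - half >= 0 and p + half <= len(ecg)])  # step 2
    template = beats.mean(axis=0)                       # step 3
    rs = [np.corrcoef(b, template)[0, 1] for b in beats]
    return float(np.mean(rs))                           # step 4

def is_good(ecg, r_peaks, threshold=0.5):
    """Accept the sample when the average correlation reaches 0.5."""
    return template_match_quality(ecg, r_peaks) >= threshold
```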
4.4. ECG SIGNAL CLASSIFICATION:
In general, important information about the status of disease and the condition of the patient can be obtained by studying the ECG signals generated by the heart. The electrocardiograph plays a vital role in the diagnosis and treatment of several heart-related diseases. We classify the signals with the help of the heart rate and the duration of the QRS complex.
Table 4.1: Classification of disease using heart rate and QRS complex

S. No   Beats per minute (BPM)   QRS interval (s)     Comment
1       60 ≤ BPM ≤ 100           < 0.1                Normal Resting HR
2       40 ≤ BPM < 60            < 0.1                Sinus bradycardia
3       100 < BPM ≤ 150          < 0.1                Sinus tachycardia
4       150 < BPM ≤ 250          > 0.1                Ventricular tachycardia
5       20 ≤ BPM ≤ 40            > 0.1                Idioventricular rhythm
6       250 < BPM ≤ 350          0.06 < QRS < 0.1     Atrial flutter
7       Others                   Others               Other arrhythmia
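The classification rules of Table 4.1 can be sketched as a simple rule-based function (our own illustrative helper; the boundary reading of the bradycardia row as 40-60 bpm is our interpretation of the table):

```python
def classify_rhythm(bpm, qrs_s):
    """Map heart rate (bpm) and QRS duration (s) to a rhythm label."""
    if 60 <= bpm <= 100 and qrs_s < 0.1:
        return "Normal Resting HR"
    if 40 <= bpm < 60 and qrs_s < 0.1:
        return "Sinus bradycardia"
    if 100 < bpm <= 150 and qrs_s < 0.1:
        return "Sinus tachycardia"
    if 150 < bpm <= 250 and qrs_s > 0.1:
        return "Ventricular tachycardia"
    if 20 <= bpm <= 40 and qrs_s > 0.1:
        return "Idioventricular rhythm"
    if 250 < bpm <= 350 and 0.06 < qrs_s < 0.1:
        return "Atrial flutter"
    return "Other arrhythmia"
```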

4.5. EXPERIMENTAL RESULTS:

Table 4.2: Parameters obtained for different signals of different databases.

Database    Record no    Mean HRV    SDNN       RMSSD      NN50   Classification
MIT-BIH     111 (10s)    33.8675     1.4215     2.0423     10     Idioventricular rhythm
MIT-BIH     116 (10s)    13.3244     0.064246   0.084163   5      Other arrhythmia
MIT-BIH     232 (10s)    50.8477     3.8283     5.3849     8      Other arrhythmia
MIT-BIH     111 (1m)     120.9171    0.31834    0.42379    18     Sinus tachycardia
MIT-BIH     116 (1m)     78.7335     0.01311    0.05029    0      Other arrhythmia
MIT-BIH     232 (1m)     77.4121     0.61444    0.84353    37     Normal Resting HR
Physionet   M1           74.2094     0.048071   0.0709     5      Normal Resting HR
Physionet   M2           3310.4116   0.26057    0.37512    226    Other arrhythmia
Physionet   M3           12.5026     0.45766    0.75224    8      Normal Resting HR
PICS        100(0)       13.0424     0.16242    0.14762    9      Other arrhythmia
PICS        100(1)       12.5247     0.20487    0.259      7      Other arrhythmia
PICS        100(2)       13.7404     0.15779    0.14313    8      Other arrhythmia
PICS        100(3)       13.3529     0.15978    0.14767    10     Other arrhythmia
PICS        100(4)       13.1782     0.13953    0.13707    10     Other arrhythmia
PICS        100(5)       12.6899     0.16469    0.14767    7      Other arrhythmia
PICS        100(6)       12.6899     0.16469    0.17574    7      Other arrhythmia
PICS        100(7)       12.5967     0.13688    0.17487    10     Other arrhythmia
PICS        100(8)       12.4743     0.10778    0.10554    6      Other arrhythmia
PICS        100(9)       12.642      0.19146    0.21731    8      Other arrhythmia

Results for the Input 234m.mat:


Database: MIT-BIH
Description: This signal is acquired from a female patient aged 56 years.
Mean HRV= 15.1473
SDNN = 0.092722
RMSSD = 0.11351
NN50 = 10

Figure 4.1: Raw signal of input 234m

Figure 4.2: Result for HRV for input 234m


Results for the Input M1:
Database: Reference Signal M1
Description: The signal below is a reference signal.
Mean HRV= 74.0252
SDNN = 0.037756
RMSSD = 0.055632
NN50 = 7
Classification: Normal Resting HR

Figure 4.3: Raw signal of HRV for input M1

Figure 4.4: Result for input M1


Results for the Input 116m.mat:
Database: MIT-BIH Arrhythmia
Description: This signal is acquired from a male patient aged 68 years.
Mean HRV= 13.3244
SDNN = 0.064246
RMSSD = 0.084163
NN50 = 5

Figure 4.5: Raw signal of input 116m

Figure 4.6: Result for HRV for input 116m
Results for Input 100m.mat:
Database: MIT-BIH Arrhythmia
Description: This signal is acquired from a male patient aged 69 years.
Mean HRV= 74.0252
SDNN = 0.037756
RMSSD = 0.055632
NN50 = 7
Classification: Normal Resting HR

Figure 4.7: Raw signal of HRV for input 100m

Figure 4.8: Result for HRV for input 100m

Results for the Input infant 1:
Database: PICS
Description: This signal is acquired from an infant aged between 1 and 2 years.

Figure 4.9: Raw ECG signal for infant1

Figure 4.10: Result for HRV for input infant1


AD8232 acquired signal:

Figure 4.11: Raw ECG signal for AD8232

Figure 4.12: Result of HRV for AD8232

4.6. CONCLUSION:

In this chapter, we presented an SQA for the ECG signal, used to provide a real-time assessment of whether an ECG segment is suitable for deriving a reliable HR value, and we discussed the signal quality assessment of the QRS complexes extracted from the given raw ECG signal. The proposed SQA is intrinsically linked to the peak detector used; using other peak detectors would change the performance of the SQA. We identified whether the signal is good or bad by using heart rate variability, feature extraction and rules, and adaptive template matching, and finally classified the ECG signal as indicating a normal or abnormal heartbeat. The proposed approach is based on the assumption that a reliable HR measurement every 5 minutes is sufficient. Our proposed SQA shows promising results in differentiating between "good" and "bad" segments of data obtained from ambulatory hospital patients using a range of wearable monitors.

CHAPTER 5
CLASSIFICATION OF ECG SIGNAL USING SVM
CLASSIFIER
5.1. INTRODUCTION:
Artificial Intelligence (AI) is the field in which many researchers are interested in understanding how machines can learn from data. There are various approaches to artificial intelligence, such as Neural Networks (NN), Machine Learning (ML) and Deep Learning (DL). Among these, machine learning came to be recognized as a separate field in the 1990s.

Artificial intelligence
Machine learning
Deep learning

Figure 5.1: Branches of artificial intelligence


The main aim of these approaches is to allow the computers to learn automatically
without the human assistance and adjusts accordingly.
5.2. NEURAL NETWORKS AND DEEP LEARNING:
5.2.1. Neural Networks:
A neural network (NN), or artificial neural network (ANN), is a set of algorithms used in machine learning to model data using graphs of neurons. Neural networks are inspired by the structure of the brain and contain highly interconnected entities called units or nodes. They are used for regression and classification. The theoretical success of this method came from Cybenko in 1989 and Hornik in 1991. LeCun, in 1986, proposed an efficient way to compute the gradient of a neural network, called back-propagation of the gradient, which is used to obtain an accurate training criterion. Back-propagation works by increasing or decreasing the weights in order to reduce the error. There are about 100 billion neurons in the human brain, and each neuron has between 1,000 and 100,000 connection points.

67
Figure 5.2: Simple neural network
The advantages of neural networks are:
1. Information is stored in the entire network instead of in a database.
2. They can work with incomplete knowledge.
3. Corruption of one or more cells does not prevent the network from generating output.
4. Their memory is distributed.
5. They enable machine learning.
6. They have parallel processing capability.
The disadvantages of neural networks are:
1. They require processors with parallel processing power, in accordance with their structure; hence they are dependent on the hardware.
2. Determining the proper network structure is difficult.
3. It is difficult to present problems to the network.
4. The training duration of the network is unknown.
5.2.2. Deep Learning:
Deep learning, also known as deep structured learning or hierarchical learning, is a subfield of machine learning concerned with algorithms inspired by the structure and function of the brain, i.e., artificial neural networks. The important aspect of deep learning is that its neural networks are formed as deep neural networks; "deep" refers to the number of layers typically used. A deep network has an input layer, an output layer and hidden layers, where the hidden part contains a number of sub-layers used for the computation. As in an ANN, deep learning also uses back-propagation to obtain proper gradients. The difference is that deep learning is preferred when dealing with large amounts of data, whereas an ordinary neural network deals with smaller amounts of data.

These techniques are applied in sound- and image-processing fields, including facial recognition, speech recognition, computer vision, automated language processing and text classification.

Figure 5.3: Deep Learning


The advantages of deep learning are:
1. Its architecture can adapt to new problems easily.
2. It reduces the need for feature engineering, which is the most time-consuming part of machine learning practice.
The disadvantages of deep learning are:
1. It requires large amounts of data.
2. It is extremely expensive to train.
3. It does not have a strong theoretical foundation.
5.3. MACHINE LEARNING:
The term machine learning was introduced by Arthur Samuel in 1959. Machine learning is a subfield of artificial intelligence that enables a system to learn automatically from its experience without being explicitly programmed. It is a set of algorithms that parse data, learn from the parsed data, and use what is learned to discover patterns of interest. Machine learning is a combination of computer science, engineering and statistics, and it is used for analysis, regression, pattern recognition, classification, and so on.
Types of Machine Learning:
These machine learning algorithms are classified into four types. They are
1. Supervised learning
2. Unsupervised learning

69
3. Semi Supervised Learning
4. Reinforcement learning
5.3.1. Supervised learning:
In supervised learning, the algorithm is trained with inputs and their corresponding outputs. The output for a given input is known in advance, and the algorithm learns to map the inputs to the outputs so that it can predict output values for new data based on the relationship learned from the previous data. Types of supervised learning algorithms are
1. Nearest Neighbour
2. Naïve Bayes
3. Decision Tree
4. Linear Regression
5. Support Vector Machines
6. Neural Networks
5.3.2. Unsupervised learning:
Unsupervised learning algorithms are used when the information used to train the model is neither classified nor labeled. During training, only inputs are provided, with no output labels, and the computer may be able to learn new things after it discovers patterns in the data. This type of algorithm is used in cases where a human does not know what to look for in the data. Types of unsupervised learning algorithms are
1. k-means Clustering
2. Association Rules
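A minimal sketch of k-means clustering on unlabelled one-dimensional data (the data values are invented for illustration): the algorithm alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points.

```python
import random

def kmeans_1d(points, k, iters=20, seed=0):
    """Plain k-means on 1-D data without any labels."""
    random.seed(seed)
    centroids = random.sample(points, k)
    for _ in range(iters):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# Unlabelled data drawn from two groups, around 1 and around 10.
data = [0.9, 1.1, 1.0, 9.8, 10.2, 10.0]
print(kmeans_1d(data, k=2))  # centroids converge near 1.0 and 10.0
```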
5.3.3. Semi Supervised Learning:
Semi-supervised learning algorithms fall between the supervised and unsupervised
learning algorithms, as they use both labelled and unlabelled data for training. This is
mainly done to improve training accuracy. The method exploits the idea that even
though the group memberships of the unlabelled data are unknown, this data still
carries important information about the group parameters.
5.3.4. Reinforcement learning:
The machine is exposed to an environment where it gets trained by trial and error.
The machine learns from past experience and tries to capture the best possible
knowledge to make accurate decisions based on the feedback received. In this
process, the agent learns from its experiences of the environment until it explores the

70
full range of possible states. Types of reinforcement learning algorithms are Q-Learning
and Deep Adversarial Networks.
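A minimal Q-learning sketch illustrating the trial-and-error idea; the four-state chain environment, its reward, and the hyperparameters below are hypothetical choices made only for illustration:

```python
import random

# Hypothetical chain environment: states 0..3, actions 0 = left, 1 = right;
# reaching state 3 yields reward 1 and ends the episode.
def step(state, action):
    nxt = max(0, state - 1) if action == 0 else min(3, state + 1)
    return nxt, (1.0 if nxt == 3 else 0.0), nxt == 3

random.seed(0)
Q = [[0.0, 0.0] for _ in range(4)]   # Q[state][action]
alpha, gamma, eps = 0.5, 0.9, 0.2    # learning rate, discount, exploration rate
for _ in range(500):                 # trial-and-error episodes
    s, done = 0, False
    while not done:
        # epsilon-greedy: mostly exploit the current estimate, sometimes explore.
        a = random.randrange(2) if random.random() < eps else (0 if Q[s][0] >= Q[s][1] else 1)
        s2, r, done = step(s, a)
        # Q-learning update: nudge Q[s][a] toward reward + discounted best future value.
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

policy = [0 if Q[s][0] >= Q[s][1] else 1 for s in range(3)]
print(policy)  # [1, 1, 1]: the learned policy moves right, toward the reward
```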
5.4. THEORETICAL BACKGROUND:
The original SVM algorithm was invented by Vladimir N. Vapnik and Alexey Ya.
Chervonenkis in 1963. After this invention, Bernhard E. Boser, Isabelle M. Guyon
and Vladimir N. Vapnik suggested a way to create nonlinear classifiers by applying
the kernel trick to maximum-margin hyperplanes. The soft-margin standard in
current use was proposed by Corinna Cortes and Vapnik in 1993 and published in
1995.
        In machine learning, classifying data is a common task. Suppose we are given
some data points, each belonging to one of two classes, and our goal is to decide
which class a new data point will be in. In the case of SVM, a data point is viewed as
a P-dimensional vector, and we want to know whether the classes can be separated
by a (P-1)-dimensional hyperplane; a classifier that does so is known as a linear
classifier. There are many hyperplanes that might classify the data. The best
hyperplane is the one that represents the largest margin, or separation, between the
two classes. So we choose the hyperplane whose distance from the nearest point on
each side is maximized; such a hyperplane is called a maximum-margin hyperplane,
and the corresponding linear classifier is known as a maximum-margin classifier.
5.4.1. Support Vector Machine:
Support Vector Machine (SVM) is a supervised machine learning approach
developed by Vapnik. SVM can be used for pattern classification and regression.
SVM is often preferred for the classification of data as it does not get trapped in
local minima and needs less training input. SVM works much faster than an
Artificial Neural Network (ANN) or a Multilayer Perceptron Neural Network (MLP-
NN). It is based on the structural risk minimization principle and overcomes the
problem of selecting an appropriate VC dimension. The VC dimension is the
maximum number of points that can be separated in all possible ways. The drawback
of relying on the VC dimension is that the separation is perfect only for small
amounts of data. In this work we use SVM as a classifier to verify that the proposed
method is good at classifying the data as good or bad. The method projects the data
from the lower-dimensional input space to a higher-dimensional feature space where
the different classes are linearly separable, in order to obtain the optimal separating
hyperplane.

71
5.4.2. Math behind the SVM:
The SVM concept originated from statistical learning theory. SVM is a linear vector
machine whose design is greatly influenced by the positions of the support vectors. It
provides better generalization ability compared to traditional methods like ANN,
MLP-NN, etc. Transformation of the data from the lower dimension to the higher
dimension is done using a kernel function. Many kernel functions, such as the
polynomial, spline, radial basis function and sigmoid kernels, can be used in the
classification.

Figure 5.4: SVM classifier using hyperplane


Assume that a training data set of N points {(x_i, y_i)}, i = 1, 2, ..., N is given, where
the vector x_i denotes a pattern to be classified and the scalar y_i ∈ {+1, −1} is the
output pattern. If the training data is linearly separable, SVM determines a
hyperplane that separates the data in the feature space. The hyperplane should be
placed in such a way that it separates the classes with the largest margin. The
hyperplane is given as

w^T x + b = 0        (1)

where w is the adjustable weight vector and b is the bias. The classification is not
fixed by the hyperplane alone, so two boundary planes are used to obtain a perfect
classification: w^T x_i + b ≥ +1 when the point lies on the positive side of the
hyperplane, and w^T x_i + b ≤ −1 when it lies on the negative side. The margin
between these boundaries is equal to 2/||w||. For a perfect classification the margin
should be as large as possible, and it can be increased by reducing the weight through
minimizing the function

Φ(w) = (1/2) ||w||^2        (2)

subject to the constraints

y_i (w^T x_i + b) ≥ 1,  i = 1, 2, ..., N        (3)
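As a quick numeric illustration of the margin and constraint (3), with a hypothetical weight vector, bias, and data point chosen only for this sketch:

```python
import math

# Hypothetical separating hyperplane w^T x + b = 0 with w = (3, 4), b = -2.
w = (3.0, 4.0)
b = -2.0

norm_w = math.hypot(*w)   # ||w|| = 5
margin = 2.0 / norm_w     # distance between the two boundary planes = 2/||w||

# A labelled point satisfies constraint (3) if y * (w^T x + b) >= 1,
# i.e. it lies on or outside the boundary plane on its own side.
x, y = (1.0, 1.0), +1     # w^T x + b = 3 + 4 - 2 = 5
print(margin, y * (sum(wi * xi for wi, xi in zip(w, x)) + b) >= 1)  # 0.4 True
```

Shrinking ||w|| (equation (2)) widens the margin 2/||w||, which is exactly the trade-off the primal problem optimizes.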

72
The constrained optimization problem has linear constraints and a convex cost
function. This primal problem is solved using the method of Lagrange multipliers.
The Lagrangian function is constructed as

J(w, b, α) = (1/2) w^T w − Σ_{i=1}^{N} α_i [y_i (w^T x_i + b) − 1]        (4)

where the α_i are the Lagrange multipliers, which are auxiliary non-negative
variables. The solution of the constrained optimization problem is determined by the
saddle point of the Lagrangian function, which yields the two conditions

∂J(w, b, α)/∂w = 0        (5)

∂J(w, b, α)/∂b = 0        (6)

Applying condition (5) to equation (4), we get the weight as

w = Σ_{i=1}^{N} α_i y_i x_i        (7)

Applying condition (6) to equation (4), we get

Σ_{i=1}^{N} α_i y_i = 0        (8)

Substituting equations (7) and (8) into equation (4), the dual problem to be
maximized over the Lagrange multipliers is

Q(α) = Σ_{i=1}^{N} α_i − (1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} α_i α_j y_i y_j x_i^T x_j        (9)

subject to the constraints

Σ_{i=1}^{N} α_i y_i = 0,  α_i ≥ 0,  i = 1, 2, ..., N        (10)
If the training data is not completely separable by a hyperplane, a new set of
non-negative scalar variables, the slack variables ξ_i, is introduced; each ξ_i
represents the amount by which the corresponding linear constraint is violated:

y_i (w^T x_i + b) ≥ 1 − ξ_i,  i = 1, 2, ..., N        (11)

ξ_i ≥ 0,  i = 1, 2, ..., N        (12)

The slack variable measures the deviation of a data point from the ideal condition of
pattern separability. For 0 ≤ ξ_i ≤ 1, the data point falls inside the separation region
but on the correct side of the hyperplane. For ξ_i > 1, it falls on the wrong side of the
hyperplane. The cost function to be minimized becomes

Φ(w, ξ) = (1/2) ||w||^2 + C Σ_{i=1}^{N} ξ_i        (13)

73
subject to the constraints (11) and (12), where C is known as the regularization
parameter, which controls the trade-off between the complexity of the machine and
the number of non-separable points. Using the method of Lagrange multipliers and
proceeding as in the linearly separable case, the dual problem is formulated as

Q(α) = Σ_{i=1}^{N} α_i − (1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} α_i α_j y_i y_j k(x_i, x_j)        (14)

subject to the constraints

Σ_{i=1}^{N} α_i y_i = 0,  0 ≤ α_i ≤ C,  i = 1, 2, ..., N        (15)

Thus the decision function for an input pattern x becomes

f(x) = sgn( Σ_{i=1}^{N} α_i y_i k(x, x_i) + b )

where k(x, y) is the kernel function that maps the input into the higher-dimensional
feature space.
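The decision function above can be sketched directly; the support vectors, labels, multipliers, and bias below are hypothetical stand-ins for values that would normally come from solving the dual problem:

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Radial basis function kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def svm_decision(x, support_vectors, labels, alphas, b, kernel=rbf_kernel):
    """Evaluate f(x) = sgn(sum_i alpha_i * y_i * k(x, x_i) + b)."""
    s = sum(a * y * kernel(x, sv)
            for a, y, sv in zip(alphas, labels, support_vectors))
    return 1 if s + b >= 0 else -1

# Hypothetical support vectors and multipliers, one per class.
svs    = [(0.0, 0.0), (3.0, 3.0)]
ys     = [-1, +1]
alphas = [1.0, 1.0]
b      = 0.0

print(svm_decision((0.2, 0.1), svs, ys, alphas, b))  # -1, near the negative SV
print(svm_decision((2.9, 3.2), svs, ys, alphas, b))  # +1, near the positive SV
```

Note that only kernel evaluations against the support vectors are needed at test time; the high-dimensional feature space is never constructed explicitly, which is the point of the kernel trick.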
5.5. SVM CLASSIFIER FRAMEWORK:

[Flowchart: training samples → feature extraction → SVM classifier; testing
samples → classifier evaluation → acceptable/unacceptable ECG signal]

Figure 5.5: Classification framework of SVM

For the classification of data using SVM, we initially need to train the machine with
some examples. During training, we provide the input samples and their output
labels. SVM extracts the feature vectors, identifies the support vectors, and performs
the classification. When the training is done well, the machine is able to classify on
its own from its experience. Testing is then done with the test samples, and the
classification is evaluated to determine whether the machine classifies the data
correctly or still needs training. Hence, SVM is able to classify the data on its own
from experience.

74
5.6. PERFORMANCE EVALUATION:
Once the decision function is known, the confusion matrix is constructed. In many
real-life applications the data is imbalanced. Class imbalance presents a major
challenge for classification algorithms, where the risk loss for the minority class is
higher than for the majority class; the minority data points are often more important
than the majority, and the main goal is to classify those minority data points
correctly. When dealing with imbalanced datasets, overall accuracy is not a good
measure of classifier quality. Instead, the confusion matrix and information on TP
and FP are better indicators of classifier performance.
Table 5.1: Confusion matrix
Predicted/actual class Positive class Negative class
Positive class TP FP
Negative class FN TN
True Positive (TP) is the number of data points correctly classified from the positive
class. False Positive (FP) is the number of data points predicted to be in the positive
class but in fact belonging to the negative class. True Negative (TN) is the number of
data points correctly classified from the negative class. False Negative (FN) is the
number of data points predicted to be in the negative class but in fact belonging to the
positive class. With the help of the confusion matrix we can calculate the performance
metrics of sensitivity, specificity and accuracy.
Sensitivity is defined as a measure of how well a classification algorithm classifies the
data points in the positive class:

Sensitivity = TP / (TP + FN) × 100%

Specificity is defined as a measure of how well a classification algorithm classifies the
data points in the negative class:

Specificity = TN / (TN + FP) × 100%

Accuracy is defined as the fraction of data points correctly classified by the
classification algorithm:

Accuracy = (TP + TN) / (TP + TN + FP + FN) × 100%

5.7. EXPERIMENTAL RESULTS:


The confusion matrix obtained is

75
Table 5.2: Result of confusion matrix

Predicted yes Predicted no


Actual yes 9 6
Actual no 0 5
Sensitivity = 60%
Specificity = 100%
Accuracy = 70%
Precision = 100%

Figure 5.6: Experimental results of SVM with confusion matrix.
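The reported figures follow directly from the confusion matrix in Table 5.2, as a short check shows:

```python
# Confusion matrix from Table 5.2: TP = 9, FN = 6, FP = 0, TN = 5.
TP, FN, FP, TN = 9, 6, 0, 5

sensitivity = 100 * TP / (TP + FN)               # how well positives are found
specificity = 100 * TN / (TN + FP)               # how well negatives are found
accuracy    = 100 * (TP + TN) / (TP + TN + FP + FN)
precision   = 100 * TP / (TP + FP)               # predicted positives that are correct

print(sensitivity, specificity, accuracy, precision)  # 60.0 100.0 70.0 100.0
```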


5.8. CONCLUSION:
SVM is a relatively new technique, and many software tools support it. It is good at
text classification, and the kernel makes SVM a nonlinear algorithm. We have
successfully implemented this algorithm and solved large data set problems using
acceptable amounts of computer time and memory. From a broader point of view,
it would also be interesting to use the function approximation or regression
extensions of SVM in the many different areas where Neural Networks are currently
used.

76
CHAPTER 6
CONCLUSION AND FUTURE SCOPE
6.1. CONCLUSION:
The noise present in the ECG signal is removed using the Empirical Mode
Decomposition (EMD) technique. The QRS complex is detected using the Pan-
Tompkins algorithm, and the R peaks are detected throughout the signal as they play
a vital role in the diagnosis of the ECG signal. The automatic assessment of the signal
quality is done using the rules and adaptive template matching.
        It is observed that the signal is said to be acceptable if it satisfies all the rules
and adaptive template matching.
        If the signal is acceptable, then it is a noise-free signal and the disease
identification is done. The method is justified using the SVM classifier, and it is
observed that the performance is good.
6.2. FUTURE SCOPE:
Since ECG plays a vital role in the biomedical field, it is necessary to remove
the noise present in the signal. As the evaluation of the signal is performed
automatically, the method can be implemented in IoT devices, as it reduces the
power consumption and storage needed.

77
REFERENCES
[1] D. He and S. Zeadally, "An analysis of RFID authentication schemes for Internet
of Things in healthcare environment using elliptic curve cryptography," IEEE
Internet of Things J., vol. 2, no. 1, pp. 72-83, Feb. 2015.
[2] L. Catarinucci et al., "An IoT-aware architecture for smart healthcare systems,"
IEEE Internet of Things J., vol. 2, no. 6, pp. 515-526, Dec. 2015.
[3] C. Orphanidou, T. Bonnici, P. Charlton, D. Clifton, D. Vallance, and
L. Tarassenko, "Signal-quality indices for the electrocardiogram and
photoplethysmogram: Derivation and applications to wireless monitoring," IEEE J.
Biomed. Health Informat., vol. 19, no. 3, pp. 832-838, May 2015.
[4] L. W. Andersen et al., "The prevalence and significance of abnormal vital signs
prior to in-hospital cardiac arrest," Resuscitation, vol. 98, pp. 112-117, 2016.
[5] C. Orphanidou, T. Bonnici, P. Charlton, D. Clifton, D. Vallance, and
L. Tarassenko, "A method for assessing the reliability of heart rates obtained from
ambulatory ECG," in Proc. BIBE, 2012, pp. 193-196.
[6] P. S. Hamilton and W. J. Tompkins, "Quantitative investigation of QRS detection
rules using the MIT/BIH arrhythmia database," IEEE Trans. Biomed. Eng., vol.
BME-33, no. 12, pp. 1157-1165, Dec. 1986.
[7] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning internal
representations by error propagation," in Parallel Distributed Processing:
Explorations in the Microstructure of Cognition, MIT Press.
[8] Y. Ozbay, "Fast Recognition of ECG Arrhythmias," Ph.D. dissertation,
University of Selcuk, Konya, Turkey, 1999.
[9] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, pp.
273-297, 1995.
[10] C. J. C. Burges, "A tutorial on support vector machines for pattern recognition,"
Data Mining and Knowledge Discovery, vol. 2, pp. 121-167, 1998.
[11] J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis.
Cambridge University Press, New York, NY, USA, 2004.
[12] R. Andreao, B. Dorizzi, and J. Boudy, "ECG signal analysis through hidden
Markov models," IEEE Trans. Biomed. Eng., vol. 53, no. 8, pp. 1541-1549, 2006.
[13] C. M. Bishop, Pattern Recognition and Machine Learning (Information Science
and Statistics). Springer, 2006.
[14] B. Anuradha and V. Reddy, "ANN classification of cardiac arrhythmias," ARPN
Journal of Engineering and Applied Sciences, vol. 3, pp. 1-6, 2008.
78
