
Inter-patient Heart-beat Classification using complete ECG beat Time series by alignment of R-peaks using SVM and Decision Rule
Jagadeeswara Rao Annam and Bapi Raju Surampudi
School of Computer & Information Sciences, SCIS, University of Hyderabad, India
Email: ajagarao@gmail.com bapics@uohyd.ernet.in

Abstract: An electrocardiogram (ECG) inter-patient heart-beat time series classification method is proposed. It is a hierarchical system based on a support vector machine (SVM) and a decision rule, and it uses the full heart-beat time series after aligning the R-peaks of all beats. The PQRST time series of the heart-beats are converted into equal-length series by aligning the R-peaks of all heart-beats to the R-peak of the longest PQRST series in the data and by padding the shorter series with zeroes on either side. The main objective of this paper is to identify abnormalities in ECG heart beats based on the AAMI categorization. Experiments were conducted on ECG data of 44 patients obtained from the MIT-BIH Arrhythmia database. Results were compared with existing methods such as weighted SVM, hierarchical SVM and weighted linear discriminant analysis (LDA). The comparative analysis confirms the viability and superiority of the proposed approach in terms of total classification accuracy (TCA). The proposed system achieved sensitivities of 98.7%, 85.9%, 88.8% and 58.3%, and positive predictive values (PPV) of 98.53%, 82.3%, 89.9% and 85.6%, for the N, S, V and F classes respectively, and a TCA of 97.3%.

I. INTRODUCTION

Cardiovascular diseases (CVDs) are the leading cause of death globally: more people die annually from CVDs than from any other cause. 80% of these CVD deaths take place in low and middle income countries. Early detection followed by timely treatment of these diseases has prevented such premature deaths in high income countries. The ECG, a non-invasive diagnostic method, is the most economical of all cardiac-related investigations. Automated digitized ECG analysis can help doctors assess the condition of a heart patient effectively. The main objective of this work is to identify abnormalities in ECG heart beats based on the categorization scheme recommended by the Association for the Advancement of Medical Instrumentation (AAMI) [1], using ECG heart-beat data obtained from the MIT-BIH Arrhythmia database [2].

A. ECG Heart-beat Classification

An ECG heart-beat time series is sequential data in which observations are measured in time. The classification task is usually supervised: it first learns a classifier from labelled training data and then uses the learned classifier to attach suitable labels to test data [3]. The different waves in an ECG heartbeat, such as the P, QRS and T waves, occur due to polarization and depolarization of the auricles and ventricles, caused by the action potential generated by the sino-atrial (SA) node, the natural pacemaker of the heart. This polarisation and depolarisation causes the auricles or ventricles to relax and contract respectively. The P wave occurs by the depolarization of the auricles (atria), which makes the auricles contract. This takes approximately 0.1 sec. The QRS complex occurs by the depolarization of the ventricles. This depolarisation causes the ventricles to contract, so that the right ventricle and left ventricle pump the impure blood and pure blood to the lungs and arteries respectively. The QRS complex completes in about 0.04 sec to 0.12 sec. The T wave occurs due to the re-polarization of the ventricles. During the T wave, the ventricles are re-polarizing or relaxing, which takes about 0.27 sec. Heart diseases change the shapes of the P, QRS and T waves relative to the normal class [4]. Based on the recommendations of AAMI, the MIT-BIH labeled types are grouped into five more clinically relevant heart-beat classes, as shown in Table I.

TABLE I
AAMI LABELS IN MIT-BIH

#    Class    MIT-BIH Class (label)                   AAMI
1    . or N   Normal (1)                              N
2    L        Left Bundle Branch Block (2)            N
3    R        Right Bundle Branch Block (3)           N
4    j        Nodal (junctional) escape (11)          N
5    e        Atrial escape (34)                      N
6    a        Aberrated atrial premature (4)          S
7    J        Nodal (junctional) premature (7)        S
8    A        Atrial premature (8)                    S
9    S        Supraventricular premature (9)          S
10   V        Ventricular premature contraction (5)   V
11   E        Ventricular escape (10)                 V
12   F        Fusion of ventricular & normal (6)      F
13   / or p   Paced beat (12)                         Q
14   f        Fusion of paced and normal (38)         Q
15   Q        Unclassifiable beat (13)                Q

B. Support Vector Machines

The support vector machine (SVM) [19] is a supervised learning method for two-class classification. Its extension to multiple classes is straightforward: several binary classifiers are combined using either the one-against-all or the one-against-one method, as opposed to directly considering all data in one optimization problem. This work used the one-against-one approach, which uses
$k(k-1)/2$ SVM models, where each model is trained on data from two classes. Let the p-dimensional feature vector $x_t = [x_{t1}, x_{t2}, \dots, x_{tp}]$ be a sequence of p samples of a 1-dimensional signal, and let $y_t \in [1, k]$ be the associated class value in the multiclass case, for a given heart beat t with t ranging from 1 to N, N being the total number of heart beats in the training dataset. For the pair of classes i and j, the model solves

$$\min_{w^{ij},\, b^{ij},\, \xi^{ij}} \ \frac{1}{2} (w^{ij})^T w^{ij} + C \sum_t \xi_t^{ij} \tag{1}$$

such that

$$(w^{ij})^T \phi(x_t) + b^{ij} \ge +1 - \xi_t^{ij}, \quad \text{if } y_t = i,$$
$$(w^{ij})^T \phi(x_t) + b^{ij} \le -1 + \xi_t^{ij}, \quad \text{if } y_t = j, \tag{2}$$
$$\xi_t^{ij} \ge 0, \quad t = 1, 2, \dots, N \tag{3}$$

where $\phi$ is the kernel feature map (the identity for a linear kernel). If $\mathrm{sign}((w^{ij})^T \phi(x_t) + b^{ij})$ indicates that $x_t$ is in the i-th class, the vote for the i-th class is increased by one; otherwise the vote for the j-th class is increased by one. In this Max Wins voting approach, the class with the most votes is predicted.

C. Background

The proposed method is motivated by the work of Goras [14], which observed that misalignment of the R waves increases the Euclidean distance between waveforms of the same class in time series classification. In addition, an SVM requires each pattern to be a vector of equal dimension, so it cannot be fed variable-length time series directly; a transformation is required that aligns the variable-length time series to equal length [8].

Huang [15] used random projections in an SVM ensemble to detect the V class; the ratio of the RR interval was then compared to a predetermined threshold to detect the S class. This was evaluated on test data set 2 (DS2) of the MIT-BIH Arrhythmia database for 3 of the AAMI classes.

TABLE II
DATA SETS OF MIT-BIH ARRHYTHMIA DATABASE

Dataset   MIT-BIH Records
DS1       101, 106, 108, 109, 112, 114, 115, 116, 118, 119, 122,
          124, 201, 203, 205, 207, 208, 209, 215, 220, 223 & 230
DS2       100, 103, 105, 111, 113, 117, 121, 123, 200, 202, 210,
          212, 213, 214, 219, 221, 222, 228, 231, 232, 233 & 234

II. PREPROCESSING & FEATURE EXTRACTION

A. Preprocessing of ECG Heart-beat Data

The ECG data downloaded from MIT-BIH are de-amplified by the analog-to-digital conversion gain to reproduce the polarity. The signals are then processed for baseline wander correction and band-pass filtering to remove unwanted noise. Baseline wander in the ECG signal occurs due to respiration, and it is removed using the filtering procedure defined by Chazal [6]. Two median filters are used in the baseline wander removal procedure. The first median filter, of 200 msec width, removes the QRS complexes and the P waves. The second median filter, of 600 msec width, is applied to the resulting signal and removes the T waves. The signal resulting from the second median filter operation contains the baseline wander and is subtracted from the original signal. A second-order bidirectional Butterworth filter with a passband of 0.5 Hz to 35 Hz is then used, as described by Elgendi [13], because the main frequencies of the P and T waves lie in the range of 0.5 Hz to 35 Hz. This band-pass filter is required to smooth the wave used for calculating the onset and offset values of the P and T waves with the Chan slope method.

B. Feature Extraction of ECG Heart-beat Data

R-peak values are taken from the MIT-BIH annotation files. Q and S are computed using the Yeh difference method [20]. Poffset and Tonset are calculated using the Chan slope method [9]. Ponset and Toffset are delineated using a combination of the Chan method and the ecgpuwave software of MIT-BIH. The proposed method uses the complete time series of each heartbeat from P-wave onset to T-wave offset, converted into equal-length series by aligning the R-peaks of all heart-beats to the R-peak of the longest PQRST series in the data and by padding the shorter series with zeroes on either side.

III. PROPOSED METHOD

A hierarchical 2-stage system of SVM and decision rule for classification is proposed, following Huang [15]. In the first stage, the V-class and F-class are detected from the predicted output of the SVM. For the remaining heartbeats, a decision rule based on the normalised previous RR interval is applied: the ratio of the interval between the current R-peak and the previous R-peak to the mean previous RR interval is compared to a predetermined threshold to separate the S-class from the N-class. Use of the RR interval ratio can reduce the overlap between S-class and N-class heartbeats and thus increase the S detection rate.

[Fig. 1 flowchart: Train Data -> Train SVM; Test Data -> SVM classification -> V-class, F-class; remaining beats -> RR interval > 0.8 ? (yes / no) -> S-class, N-class]

Fig. 1. Hierarchical System of SVM and Decision Rule

First, the ECG heart-beat time series are divided into two data sets, one for training and another for testing, as used by Lannoy [10] and shown in Table II. The two sets of heart-beat time series are transformed into R-peak aligned time series as proposed in Section III-A; the result is shown in Figure 2 for test record 213.
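The R-peak alignment referred to here (Section III-A: shift each beat so that its R-peak lands on a common column O, then pad with zeros) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `align_beats`, the 0-based indexing, the toy values, and the check that no beat has a longer pre-R segment than the longest beat are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def align_beats(
    beats: Sequence[Sequence[float]], r_positions: Sequence[int]
) -> Tuple[List[List[float]], int]:
    """Zero-pad variable-length beats so that every R-peak shares one column.

    r_positions[i] is the 0-based index of the R-peak inside beats[i].
    Returns the aligned rows and the common R-peak column O.
    """
    if not beats or len(beats) != len(r_positions):
        raise ValueError("need one R-peak position per beat")

    # Step 1: O is the R-peak position in the longest PQRST series.
    longest = max(range(len(beats)), key=lambda i: len(beats[i]))
    origin = r_positions[longest]

    # Assumption of this sketch: no beat has more samples before its R-peak
    # than the longest beat; otherwise the shift (origin - r) would be negative.
    if any(r > origin for r in r_positions):
        raise ValueError("a beat has a longer pre-R segment than the longest beat")

    # Row width: the furthest any shifted beat extends to the right.
    width = max(origin - r + len(b) for b, r in zip(beats, r_positions))

    aligned = []
    for beat, r in zip(beats, r_positions):
        left_pad = origin - r                      # zeros before the P onset
        right_pad = width - left_pad - len(beat)   # zeros after the T offset
        row = [0.0] * left_pad + [float(v) for v in beat] + [0.0] * right_pad
        aligned.append(row)
    return aligned, origin


# Toy beats with made-up values; the R-peak sample is 9 in each beat.
toy_beats = [[1, 2, 9, 3, 4, 5], [7, 9, 1], [2, 9]]
toy_r = [2, 1, 1]
aligned, origin = align_beats(toy_beats, toy_r)
```

Padding with zeros relies on the paper's statement that the baseline is the isoelectric (zero) level of the signal; a different fill value would change the geometry seen by the classifier.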
Fig. 2. MIT-BIH ECG record 213 after de-amplification by gain

A. Translation transform for alignment of R-peaks

1) Find O, the position of the R-peak in the beat having the longest PQRST series in the data, i.e. in the train data and test data (the position of the R-peak counted from the beginning of the beat).
2) Translate the beats into a 2-d array F by applying the transform to all beats, so that all beats align their R positions at the common point O:
   a) F(O - Rpeakposition + 1 : O) = Beat(1 : Rpeakposition).
   b) F(O + 1 : O + Beatlength - Rpeakposition) = Beat(Rpeakposition + 1 : Beatlength).

B. SVM for Classification

The SVM for classification was developed by modifying libSVM [12] and applying the following steps.

1) The translation transform proposed in Section III-A is used to align the time series of all heartbeats at the R-peak of the longest PQRST series in the data, padding the shorter series with zeroes on either side; zero padding is appropriate because the baseline is isoelectric, that is, the physical zero level of the signal.
2) A numerical 2-d array, along with an ECG visualisation tool, is designed to convert the variable-length heart-beat time series into equal-length series. The final column length of this 2-d array, after applying the transform to all beats, is the length of the longest heart-beat time series after alignment. It can differ from the longest raw length because the PQRST time series of a heart-beat is not symmetric relative to the R-peak: the length of the PQR series is not equal to the length of the RST series in any heart-beat. Finding O, the point of alignment for the R-peaks, is therefore critical and challenging.
3) Normalize the 2-d array so that each column has mean 0 and standard deviation 1.
4) Train the SVM using a linear kernel, with the regularization parameter C selected by grid search under 10-fold cross-validation of the training data [12].
5) Test the SVM by predicting the test labels.

C. Hierarchical system of SVM and Decision Rule for Classification

From the predicted output of the SVM, the V-class and F-class are detected. For the remaining heartbeats, a decision rule based on the normalised previous RR interval is applied: the ratio of the interval between the current R-peak and the previous R-peak to the mean previous RR interval is compared to a predetermined threshold to separate the S-class from the N-class. Use of the RR interval ratio can reduce the overlap between SVEB and class N heartbeats and thus increase the SVEB detection rate.

IV. EXPERIMENTAL RESULTS

The classification experiments were conducted on the division of the MIT-BIH records into two data sets used by Chazal [6] and Lannoy [10], shown in Table II. For the comparative results in Table III, we considered the papers that used the complete data of 44 records of the MIT-BIH Arrhythmia Database, excluding the other 4 records, which contain the paced beats of the respective 4 patients, as recommended by the AAMI standard. The experimental results achieved using the proposed classification approach are shown in Table IV. The comparison with other approaches, in terms of sensitivity, PPV and TCA, is shown in Table III.

V. DISCUSSION

The two challenging issues in heart-beat classification are 1) inter-patient beat variations and 2) intra-patient beat variations [5]. Each class manifests inter-patient variations, i.e. across different patients, beats of the same class differ because of patient-specific characteristics. Within the data of a single patient, each class may also show variations. The proposed method addresses inter-patient variations within each class by R-peak alignment, and addresses intra-patient variations of different classes by the normalization method. Huang [15] used random projections and an SVM ensemble to detect VEB, and then compared the ratio of the RR interval to the mean RR interval against a predetermined threshold to detect SVEB on data set 2 of the MIT-BIH Arrhythmia database [2], but used only 3 classes in the experimentation. Our method uses a single linear SVM, whereas Huang [15] used an ensemble of 15 SVMs for V detection.

VI. CONCLUSION

We proposed a novel approach for inter-patient ECG heart-beat classification based on the AAMI recommended ECG heart-beat categorization scheme, using the full heart-beat PQRST time series. Benchmark datasets from the MIT-BIH Arrhythmia database were used for experimentation. As shown in Table III, the proposed method achieves the highest total classification accuracy among the compared methods, which establishes its viability. Compared to traditional feature-based classification approaches, the present complete-time-series approach shows good discrimination capability without losing any discriminative information in the available data. Future work will focus on improving the classification accuracy of each class while retaining the overall classification accuracy.
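As an illustration of the two-stage scheme of Section III-C, the following sketch applies the RR-ratio rule on top of a first-stage SVM label. It is only a sketch under stated assumptions: the threshold 0.8 is read from Fig. 1, the direction of the comparison (a short previous RR interval indicates a premature S beat) is an assumption based on physiology rather than something the text states explicitly, and the function name and the way the mean RR interval is supplied are hypothetical.

```python
def second_stage(svm_label: str, rr_previous: float, mean_rr: float,
                 threshold: float = 0.8) -> str:
    """Combine the SVM output with the RR-interval decision rule.

    svm_label   -- class predicted by the first-stage SVM ("N", "S", "V" or "F")
    rr_previous -- interval between the current and the previous R-peak
    mean_rr     -- mean previous RR interval (same unit as rr_previous)
    threshold   -- assumed value, read from Fig. 1
    """
    if mean_rr <= 0:
        raise ValueError("mean_rr must be positive")

    # Stage 1: V and F beats are taken directly from the SVM prediction.
    if svm_label in ("V", "F"):
        return svm_label

    # Stage 2: normalised previous RR interval against the threshold.
    # Assumed direction: a ratio at or below the threshold marks a premature
    # (S-class) beat, a larger ratio a normal (N-class) beat.
    ratio = rr_previous / mean_rr
    return "N" if ratio > threshold else "S"


# Made-up intervals in seconds, for illustration only.
label_v = second_stage("V", 0.50, 0.80)   # kept from the SVM
label_s = second_stage("N", 0.50, 0.80)   # ratio 0.625, at or below 0.8
label_n = second_stage("S", 0.80, 0.80)   # ratio 1.0, above 0.8
```

Because the rule overrides the SVM's own N/S decision for every beat not labelled V or F, the choice of threshold directly trades S sensitivity against S PPV.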
TABLE III
OVERALL COMPARISON OF CLASSIFICATION ACCURACY IN %

(Se = sensitivity, PPV = positive predictive value, TCA = total classification accuracy)

Method                                  Se N   Se S   Se V   Se F   PPV N  PPV S  PPV V  PPV F  TCA
Proposed SVM + Rule                     98.7   85.9   88.82  58.3   98.53  82.27  89.9   85.6   97.3
Proposed SVM                            98.25  82.1   88.82  58.3   98.3   73.7   89.6   85.6   96.7
Huang, 2014 [15]                        99.2   91.1   93.9   -      95.2   42.2   90.9   -      93.8
Weighted LDA, Llamedo, 2011 [16]        95     77     81     -      98     39     87     -      93.0
MLP, Tanis Mar, 2011 [17]               89.6   83.2   86.8   61.1   99.3   33.5   75.9   16.6   89.0
SVM ensemble, Zhang, 2014 [21]          88.9   79.1   85.5   93.8   98.9   35.9   92.8   13.74  86.6
Hierar. SVM, Park, 2008 [18]            86.3   82.6   80.9   54.9   -      -      -      -      85.6
Weighted CRF + L1, Lannoy, 2012 [11]    79.8   92.6   85.2   84.5   -      -      -      -      85.4
Weighted LDA, De Chazal, 2004 [6]       86.9   75.9   77.7   89.3   99.2   38.5   81.9   8.6    77.7

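TABLE IV
SVM+RULE RESULTS ON DATA SET 2

Column groups, left to right: File (record); Total Beats per class (N S V F); False Positives per class (N S V F); True Positives per class (N S V F); Total TP; Total Beats; ACC %.

File N S V F N S V F N S V F TP Beats %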
100 2239 33 1 0 23 30 22 0 2188 10 0 0 2198 2273 96.7
103 2082 2 0 0 2 19 7 1 2055 0 0 0 2055 2084 98.6
105 2526 0 41 0 2 0 29 0 2497 0 39 0 2536 2572 98.79
111 2123 0 1 0 1 0 12 0 2110 0 0 0 2110 2124 99.3
113 1789 6 0 0 0 0 0 0 1789 6 0 0 1795 1795 100
117 1534 1 0 0 0 0 0 0 1534 1 0 0 1535 1535 100
121 1861 1 1 0 1 0 7 0 1854 0 1 0 1855 1863 99.5
123 1515 0 3 0 0 0 15 0 1500 0 3 0 1503 1518 99
200 1743 30 826 2 118 3 6 0 1737 0 737 0 2474 2601 95.3
202 2061 55 19 1 35 0 17 0 2055 12 17 0 2072 2136 97.0
210 2423 22 195 10 67 27 0 0 2380 0 176 0 2556 2650 96.4
212 2748 0 0 0 0 71 21 0 2656 0 0 0 2656 2748 96.7
213 2641 28 220 362 110 0 57 24 2626 0 209 225 3060 3251 94.1
214 2003 0 256 1 37 0 12 2 1993 0 218 0 2211 2262 97.7
219 2082 7 64 1 42 0 0 0 2082 0 30 0 2112 2154 98.1
221 2031 0 396 0 8 2 36 11 1993 0 377 0 2370 2427 97.6
222 2274 209 0 0 119 82 10 0 2182 90 0 0 2272 2483 91.5
228 1688 3 362 0 87 2 9 0 1678 0 277 0 1955 2053 95.2
231 1568 1 2 0 3 0 0 0 1568 0 0 0 1568 1571 99.8
232 398 1382 0 0 23 347 0 0 51 1359 0 0 1410 1780 79.2
233 2230 7 831 11 73 1 6 0 2230 0 769 0 2999 3079 97.4
234 2700 50 3 0 23 50 17 0 2633 30 0 0 2663 2753 96.7
Total 44259 1837 3221 388 654 340 320 38 43689 1578 2861 226 48354 49707 97.3
Se (%): N 98.71, S 85.9, V 88.8, F 58.3. PPV (%): N 98.53, S 82.27, V 89.94, F 85.6. TCA (%): 97.3
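The summary figures can be recomputed from the "Total" row of Table IV. The sketch below assumes the usual definitions, which the paper does not spell out: sensitivity = TP / beats of that class, PPV = TP / (TP + FP), and TCA = total TP / total beats. The variable and function names are illustrative; the counts are copied from the table.

```python
# Counts from the "Total" row of Table IV.
total_beats = {"N": 44259, "S": 1837, "V": 3221, "F": 388}
false_pos = {"N": 654, "S": 340, "V": 320, "F": 38}
true_pos = {"N": 43689, "S": 1578, "V": 2861, "F": 226}
all_beats = 49707  # "Total Beats" entry of the Total row


def sensitivity(tp: int, n_class: int) -> float:
    """Percentage of beats of a class that were detected as that class."""
    return 100.0 * tp / n_class


def ppv(tp: int, fp: int) -> float:
    """Percentage of beats assigned to a class that truly belong to it."""
    return 100.0 * tp / (tp + fp)


se = {c: sensitivity(true_pos[c], total_beats[c]) for c in total_beats}
pp = {c: ppv(true_pos[c], false_pos[c]) for c in total_beats}
tca = 100.0 * sum(true_pos.values()) / all_beats

for c in total_beats:
    print(f"{c}: Se = {se[c]:.2f} %, PPV = {pp[c]:.2f} %")
print(f"TCA = {tca:.2f} %")
```

Under these definitions the recomputed values agree with the reported summary row to within rounding, which supports reading the table columns as described above.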

REFERENCES

[1] Association for the Advancement of Medical Instrumentation. Testing and reporting performance results of cardiac rhythm and ST segment measurement algorithms. ANSI/AAMI EC38:1998 (ANSI/AAMI EC13:2002). Arlington, VA: Association for the Advancement of Medical Instrumentation (2002)
[2] MIT-BIH Arrhythmia Database, http://www.physionet.org/physiobank/database/mitdb/
[3] E. Backer and A. K. Jain, A Clustering Performance Measure Based on Fuzzy Set Decomposition. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-3, No. 1, pp. 66-75, January (1981)
[4] R.M. Rangayyan, Biomedical Signal Analysis: A Case-Study Approach. Wiley, Inter-Science, New York (2001)
[5] Alexander Singh Alvarado, Choudur Lakshminarayan, Jose C. Principe, Time-based Compression and Classification of Heartbeats. IEEE Transactions on Biomedical Engineering, 99 (2012)
[6] Chazal, P. D., O'Dwyer, M., and Reilly, R.B., Automatic classification of heartbeats using ecg morphology and heartbeat interval features. Biomedical Engineering, IEEE Transactions on, 51:1196-1206 (2004)
[7] P. de Chazal and R. B. Reilly, A patient-adapting heartbeat classifier using ECG morphology and heartbeat interval features. IEEE Transactions on Biomedical Engineering, vol. 53, no. 12, pp. 2535-2543 (2006)
[8] H. Shimodaira, K.-I. Noma, M. Nakai, and S. Sagayama, Dynamic time-alignment kernel in support vector machine, in Advances in NIPS14. 2002, MIT Press.
[9] K. Tan, K. Chan, and K. Choi, Detection of the QRS complex P wave and T wave in electrocardiogram. Advances in medical signal and information processing, pages 41-47 (2000)
[10] G de Lannoy JD D Francois, Verleysen M, Feature Relevance Assessment In Automatic Inter-patient Heart Beat Classification. In Proceedings of the International Conference on Bio-inspired Systems and Signal Processing, Valencia, Spain, 13-20 (2010)
[11] de Lannoy G, Francois D, Delbeke J, Verleysen M, Weighted conditional random fields for supervised interpatient heartbeat classification. IEEE Trans Biomed Eng, 59:241-247 (2012)
[12] Hsu, C.-W. and Lin, C.-J., A comparison of methods for multi-class support vector machines. IEEE Trans. Neural Netw. 13, 2, 415-425 (2002)
[13] Mohamed Elgendi, Mirjam Jonkman, Friso De Boer, Premature Atrial Complexes Detection using The Fisher Linear Discriminant. 7th IEEE International Conference on Cognitive Informatics (ICCI), pp. 83-88, Aug (2008)
[14] L. Goras, M. Fira, Preprocessing Method for Improving ECG Signal Classification and Compression Validation. Fourth Int. Scientific Conference on Physics and Control PHYSCON, Catania, Italia, Paper ID 262, Proceeding IEEE (2009)
[15] Huang et al., A new hierarchical method for inter-patient heartbeat classification using random projections and RR intervals. BioMedical Engineering OnLine 2014 13:90. doi:10.1186/1475-925X-13-90 (2014)
[16] Llamedo-Soria M, Martinez JP, Heartbeat classification using feature selection driven by database generalization criteria. IEEE Trans Biomed Eng, 58:616-625 (2011)
[17] Mar Tanis, Zaunseder S, Martinez JP, Optimization of ECG classification by means of feature selection. IEEE Trans Biomedical Engineering, 58, 2168-2177 (2011)
[18] Park KS, Cho BH, Lee DH, Song SH, Lee JS, Chee YJ, Kim IY, Kim SI, Hierarchical Support Vector Machine Based Heartbeat Classification Using Higher Order Statistics and Hermite Basis Function. Proceedings of 35 Annual Computers in Cardiology Conference, IEEE Bologna (2008)
[19] Vapnik V N, The nature of statistical learning. Springer, Berlin (1995)
[20] Yeh Y.C., Wang, W.J., Chiou, C.W., A novel fuzzy c-means method for classifying heartbeat cases from ECG signals. Measurement, 43, pp. 1542-1545 (2010)
[21] Zhang ZC, Dong J, Luo XQ, Choi KS, Wu XJ, Heartbeat classification using disease-specific feature selection. Comput Biol Med, 46:79-89 (2014)
