
Clinical Neurophysiology 118 (2007) 1348–1359 www.elsevier.com/locate/clinph

Combination of EEG and ECG for improved automatic neonatal seizure detection
Barry R. Greene a,*, Geraldine B. Boylan b, Richard B. Reilly a,c, Philip de Chazal a, Sean Connolly d

a School of Electrical, Electronic & Mechanical Engineering, University College Dublin, Ireland
b Department of Paediatrics and Child Health, University College Cork, Ireland
c Cognitive Neurophysiology Laboratory, St. Vincent's Hospital, Fairview, Dublin, Ireland
d Department of Clinical Neurophysiology, St. Vincent's University Hospital, Dublin, Ireland

Accepted 7 February 2007
Available online 29 March 2007

Abstract

Objective: Neonatal seizures are the most common central nervous system disorder in newborn infants. A system that could automatically detect the presence of seizures in neonates would be a significant advance facilitating timely medical intervention.

Methods: A novel method is proposed for the robust detection of neonatal seizures through the combination of simultaneously-recorded electroencephalogram (EEG) and electrocardiogram (ECG). A patient-specific and a patient-independent system are considered, employing statistical classifier models.

Results: Results for the signals combined are compared to results for each signal individually. For the patient-specific system, 617 of 633 (97.52%) expert-labelled seizures were correctly detected with a false detection rate of 13.18%. For the patient-independent system, 516 of 633 (81.44%) expert-labelled seizures were correctly detected with a false detection rate of 28.57%.

Conclusions: A novel algorithm for neonatal seizure detection is proposed. The combination of an ECG-based classifier system with a novel multi-channel EEG-based classifier system has led to improved seizure detection performance. The algorithm was evaluated using a large data-set containing ECG and multi-channel EEG of realistic duration and quality.

Significance: Analysis of simultaneously-recorded EEG and ECG represents a new approach in seizure detection research and the detection performance of the proposed system is a significant improvement on previously reported results for automated neonatal seizure detection.

© 2007 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
Keywords: Neonatal seizure detection; EEG; ECG; EKG

1. Introduction

Seizures in the neonate require immediate medical attention and represent a distinctive sign of central nervous system dysfunction. There is increasing evidence that neonatal seizures have an adverse effect on neurodevelopmental outcome, and predispose to cognitive, behavioural, or epileptic complications in later life (Levene, 2002). Neonatal seizures

* Corresponding author. Tel.: +353 21 490 3793. E-mail address: barry.greene@ee.ucd.ie (B.R. Greene).

occur in 6% of low birth-weight infants (Volpe, 2001) and in approximately 2% of all newborns admitted to a neonatal ICU (Scher et al., 1993a). Seizures in this age-group are often subtle, difficult to diagnose and may be clinically silent, particularly after antiepileptic drug treatment, making diagnosis by clinical observation alone very unreliable (Boylan et al., 2002). Electroencephalography (EEG) is the most reliable method available to detect the majority of neonatal seizures, but interpretation requires special expertise that is not readily available in most neonatal intensive care units, least of all on a 24-h basis. A system that could automatically detect the presence of seizures in

1388-2457/$32.00 © 2007 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved. doi:10.1016/j.clinph.2007.02.015


newborn babies would be a significant advance, facilitating timely medical intervention. A number of studies have reported neonatal seizure detection methods based on the EEG (Liu et al., 1992; Gotman et al., 1997a; Celka and Colditz, 2002; Altenburg et al., 2003). Faul et al. (2005) provided a review and experimental comparison of three of the most commonly cited neonatal seizure detection algorithms. None performed sufficiently well to be deemed suitable for use in the neonatal intensive care unit (ICU). Karayiannis et al. (2001) reported a video-based method for distinguishing myoclonic from focal clonic seizures and differentiating these types of seizures from normal infant behaviours. However, this approach does not provide a complete solution to the problem, as many neonatal seizures are not accompanied by this spectrum of body movements.

The importance of autonomic changes may be underestimated in neonatal seizure detection research. Neonatal seizures are often associated with changes in heart and respiration rate (Greene et al., 2006b). Significant changes in heart rate may alert the clinician to the possibility of seizures and instigate further investigation with EEG. These findings led to the development of a neonatal seizure detection system based exclusively on the electrocardiogram (ECG) (Greene et al., 2006a). The aim of this study was to attempt to improve the neonatal seizure detection rate by combining simultaneously-acquired ECG and EEG data. To the best of our knowledge this is the first method to combine the ECG with the EEG for seizure detection.

2. Data-set

A data-set of 12 records from 10 term neonates containing 633 labelled seizure events, with mean seizure duration of 4.60 min, was recorded and analysed. The records had a mean duration of 12.84 h. Each record contained 7–12 channels of EEG and one channel of simultaneously-acquired ECG. Eleven records, sampled at 256 Hz, were made in the neonatal intensive care units of the Unified Maternity Hospitals in Cork, Ireland, using the Viasys NicOne video-EEG system. The remaining recording, sampled at 200 Hz, was recorded at King's College Hospital, London, on a Telefactor Beehive video-EEG system. A total of 154.1 h of EEG and ECG were analyzed.

The data-set used in this research is a resource of continuously-recorded digital video-EEG data and other physiological parameters in newborns with seizures in the first 3 days from birth. All newborns were full term (GA: 40–42 weeks) and had hypoxic ischaemic encephalopathy (HIE). All the data for each recording were included in the analysis regardless of record length or quality. Electrographic seizures were identified and annotated by an expert in neonatal EEG (GBB). Fig. 1 shows

Fig. 1. Example of a multi-channel electrographic seizure. Seizure onset and duration are marked.


Table 1
Data characteristics for each record: number of seizure events, duration of recording, mean seizure duration

Record   No. of seizure events   Record duration (h)   Mean seizure duration (min)
1        90                      10.01                 2.77
2        22                      10.42                 7.33
3        21                      24.53                 5.41
4        60                      14.25                 1.56
5        35                      14.40                 10.02
6        29                      10.01                 2.15
7        155                     24.04                 5.28
8        56                      13.17                 1.99
9        60                      5.20                  1.05
10       41                      5.69                  1.16
11       50                      17.33                 4.88
12       14                      5.05                  11.64
         Total 633               Total 154.10          Mean 4.60

3.1. ECG

The algorithm reported in this paper utilises the same ECG features described previously, based on the RR intervals for 60-s epochs of ECG (Greene et al., 2006a).

3.1.1. ECG pre-processing

All ECG signals were filtered with a 20th order FIR band-pass filter (corner frequencies 8 and 18 Hz) to remove baseline wander, power-line noise and out-of-band noise. Before filtering, the mean of the ECG was removed from the signal.

3.1.2. RR interval calculation

The RR interval is defined as the time in seconds between adjacent R-wave maximum (QRS) points. The QRS point was detected using the QRS detection algorithm described by Benitez et al. (2001). The Hilbert transform of the first derivative of the signal was used to emphasize the R peaks. A moving window peak search was carried out with an adaptive threshold. As neonatal ECG often manifests an elevated P-wave, a step-back search was performed to isolate the P peak, ensuring robust detection of the R-wave maximum. Correction for missing and extra QRS points was implemented as described by de Chazal et al. (2003).

3.1.3. ECG feature extraction

The six ECG feature types considered in this study were calculated on a 60-s (15,360 samples for a record sampled at 256 Hz, 12,000 samples for 200 Hz) non-overlapping epoch basis. Features are based on the RR intervals associated with each 60-s epoch. The features used in this study were:

Mean RR interval (μRR)
Std. Dev. RR intervals (σRR)
Mean RR interval spectral entropy (RR H)
Mean change in the RR interval (ΔRR)
RR interval coefficient of variation (δRR)
RR interval power spectral density (RR PSD)
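As an illustration only, the per-epoch RR features could be sketched as below. The PSD steps follow the description in Section 3.1.3 (zero-mean sequence, zero-padding to 256, FFT multiplied by its complex conjugate, averaging of four adjacent bins, first 32 points kept). The function name `rr_epoch_features`, the use of the sample standard deviation and the reading of "mean change" as the mean absolute successive difference are assumptions, and the spectral entropy feature is omitted because its exact definition is not given here.

```python
import numpy as np

def rr_epoch_features(rr, n_fft=256, n_psd=32):
    """Sketch of RR-interval features for one 60-s epoch (rr in seconds)."""
    rr = np.asarray(rr, dtype=float)
    mean_rr = rr.mean()
    std_rr = rr.std(ddof=1)              # assumption: sample standard deviation
    cv_rr = std_rr / mean_rr             # coefficient of variation
    # Assumption: "mean change" taken as mean absolute successive difference.
    delta_rr = np.abs(np.diff(rr)).mean()

    # Periodogram: remove the mean, zero-pad to n_fft, FFT, multiply by conjugate.
    spectrum = np.fft.fft(rr - mean_rr, n=n_fft)
    periodogram = (spectrum * np.conj(spectrum)).real
    # Average four adjacent bins (256 -> 64 points) and keep the first 32.
    psd = periodogram.reshape(-1, 4).mean(axis=1)[:n_psd]

    return {"mean": mean_rr, "std": std_rr, "cv": cv_rr,
            "delta": delta_rr, "psd": psd}
```

The sketch assumes fewer than 256 RR intervals per epoch, since `np.fft.fft` truncates longer inputs rather than padding them.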

an example of an electrographic seizure from record 12, detected by both the patient-specific and patient-independent systems. Annotations give information on the time of onset and the duration of each electrographic seizure. Table 1 details the number of seizure events per record, the duration of each record and the mean seizure duration for each record. As the ECG and EEG signals were recorded simultaneously, these annotations can be related directly in time to the ECG signal.

The data-set contained a wide variety of seizure durations and seizure types. While the mean seizure duration across the data-set was 4.60 min, the mean seizure duration for each patient ranged from 1.05 min to 11.64 min. The data-set contained electrographic-only seizures as well as electroclinical seizures. Four records (2, 3, 10 and 12) contained only electrographic-only seizures. Two records (9 and 11) contained only electroclinical seizures. The remaining recordings contained both electrographic-only and electroclinical seizures. Furthermore, the data-set contained focal, multi-focal and generalized seizures.

3. Method

The combination of EEG and ECG for neonatal seizure detection was considered in the context of both patient-specific and patient-independent seizure detection classifiers. While the ideal scenario for this application is a patient-independent system capable of identifying all seizures from any patient with a zero false detection rate, a patient-specific system might also represent an advance in neonatal ICU monitoring. The algorithms considered in this study are epoch-based, so each seizure event was rounded to the nearest epoch length when mapping time annotations to epochs. An epoch containing ≥50% electrographic seizure activity was labelled as a seizure epoch.
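The mapping from time annotations to epoch labels might look like the following minimal sketch. The annotation format (a list of onset and duration pairs in seconds) and the function name `label_epochs` are hypothetical, and the annotated seizures are assumed not to overlap one another.

```python
import numpy as np

def label_epochs(seizures, record_len_s, epoch_len_s):
    """Return one 0/1 label per epoch; 1 when >= 50% of the epoch is seizure.

    seizures: list of (onset_s, duration_s) pairs (hypothetical format).
    """
    n_epochs = int(record_len_s // epoch_len_s)
    labels = np.zeros(n_epochs, dtype=int)
    for k in range(n_epochs):
        start, end = k * epoch_len_s, (k + 1) * epoch_len_s
        # Total seizure time falling inside this epoch.
        overlap = sum(max(0.0, min(end, on + dur) - max(start, on))
                      for on, dur in seizures)
        labels[k] = int(overlap >= 0.5 * epoch_len_s)
    return labels
```

For example, a 30-s seizure starting 10 s into a record labels the first 60-s epoch as seizure, whereas the same seizure starting at 50 s straddles two epochs and labels neither.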

The mean RR interval, RR interval standard deviation, RR interval coefficient of variation, mean change in adjacent RR intervals per epoch (ΔRR) as well as RR interval spectral entropy each contributed one feature to the ECG feature vector. Relative features for μRR, σRR, ΔRR and δRR were obtained by subtracting the mean of the feature for the four preceding epochs as well as the mean for the four subsequent epochs (called μRR′, σRR′, ΔRR′ and δRR′ here) and each contributed one feature. The RR PSD features were calculated on an interval basis (Teich et al., 2000). The mean of the RR intervals for each epoch was subtracted to yield a zero mean sequence. The sequence was then zero-padded to length 256 and the fast Fourier transform (FFT) taken. The resulting sequence was multiplied by its complex conjugate to yield a periodogram estimate of the RR interval power


spectral density. A 64-point periodogram was obtained by averaging the values in four adjacent frequency bins. Only the first 33 of these constituted a valid PSD, with the first 32 of these points taken as a feature in its own right for each 60-s epoch. In total each epoch produced 41 features for each ECG feature vector.

3.1.4. ECG artefact detection

There are a variety of artefacts that may be found in an ECG signal. In this paper we have attempted to reject two kinds of artefact from subsequent analysis, namely: movement artefacts, which are large signal spikes caused by movement of the electrodes, and zero-signal artefact, caused by the amplifier being powered off in the course of a recording. A new signal, Qecg, was constructed with the same sample rate as the ECG and provided a binary flag for the presence or absence of artefact for each sample of the ECG signal.

To identify the artefact sections of the ECG, a zero-mean ECG signal was first calculated by subtracting the mean of the ECG from each sample; this signal was then processed as follows. The standard deviation of the absolute value of the signal was calculated and any signal samples greater than six times the standard deviation were flagged as movement artefact, with Qecg assigned the value 1 at these samples. Any 10-sample epoch whose mean was 100 times smaller than the 5% trimmed mean of the signal was flagged as zero-signal artefact and each corresponding sample of Qecg was assigned the value 1. Unflagged samples of Qecg were assigned the value 0. This artefact measure was then associated with each 60-s epoch, with the mean value for each epoch assigned as the Qecg value for that epoch. Fig. 2 shows the operation of the ECG artefact detector on a section of the ECG recording for Record 1.
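A rough sketch of the artefact flagging just described is given below. Several details are assumptions rather than the authors' implementation: the 10-sample blocks are taken as non-overlapping, block means and the trimmed mean are computed on absolute values, and "5% trimmed" is read as discarding 5% of samples at each end. The function name `ecg_artefact_quality` is hypothetical.

```python
import numpy as np

def ecg_artefact_quality(ecg, fs, epoch_s=60, block=10):
    """Return the per-epoch Qecg value (fraction of samples flagged as artefact)."""
    x = np.asarray(ecg, dtype=float)
    x = x - x.mean()                       # zero-mean ECG
    mag = np.abs(x)
    q = np.zeros(x.size)

    # Movement artefact: samples larger than six standard deviations.
    q[mag > 6.0 * mag.std()] = 1.0

    # Zero-signal artefact: 10-sample blocks far below the trimmed mean.
    srt = np.sort(mag)
    cut = int(0.05 * srt.size)             # assumption: trim 5% from each end
    trimmed_mean = srt[cut:srt.size - cut].mean()
    n_blocks = x.size // block
    block_means = mag[:n_blocks * block].reshape(n_blocks, block).mean(axis=1)
    flat = np.repeat(block_means < trimmed_mean / 100.0, block)
    q[:n_blocks * block][flat] = 1.0

    # Mean of the binary flag over each epoch gives the epoch-level Qecg.
    n = int(epoch_s * fs)
    n_epochs = x.size // n
    return q[:n_epochs * n].reshape(n_epochs, n).mean(axis=1)
```

Because the thresholds are derived from the whole signal, this form suits offline analysis; an online variant would need running estimates instead.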

3.2. EEG

3.2.1. EEG pre-processing

The EEG for each channel was low-pass filtered using a type II Chebyshev IIR filter with a corner frequency of 34 Hz to remove power-line noise along with out-of-band noise.

3.2.2. EEG feature extraction

A set of EEG features and a novel multi-channel EEG classifier architecture (as shown in Fig. 3) were used. In order to determine the optimum method for combining information across EEG channels, the authors carried out a separate study comparing score-level or late integration of EEG channels (i.e. each channel processed independently and the scores combined) with feature-level or early integration of multi-channel neonatal EEG (as discussed in this paper). We found that early integration provided greatly superior results to late integration for this application. Results from this study have recently been published (Greene et al., 2006c).

The fundamental difference between our use of multiple EEG channels and previous methods is that our method exploits the statistical inter-relationships between EEG channels. Processing EEG channels independently assumes an equal weighting for each EEG channel in the decision function and is less equipped to handle redundant features from not-involved EEG channels. This is a weak assumption when one considers neonatal seizure EEG, which is often multi-focal and migrates across EEG channels. Feature vectors containing m features from n channels were concatenated into a large feature vector which was then fed to a pattern classifier.

Six features were extracted from each 2048-sample non-overlapping EEG epoch for each channel. The six features calculated per EEG epoch were:

Dominant spectral peak (F)
Power ratio (P)
Bandwidth of dominant spectral peak (BW)
Nonlinear energy (N)
Spectral entropy (H)
Line length (L)
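Three of the listed features can be illustrated with commonly used definitions, which are assumptions here since the exact formulas are not stated in this section: line length as the sum of absolute sample-to-sample differences, nonlinear energy as the mean of x[n]² − x[n−1]x[n+1], and spectral entropy as the Shannon entropy of the normalised power spectrum. The function names are hypothetical.

```python
import numpy as np

def line_length(x):
    """Sum of absolute differences between consecutive samples."""
    return np.abs(np.diff(x)).sum()

def mean_nonlinear_energy(x):
    """Mean of x[n]^2 - x[n-1]*x[n+1] over the epoch."""
    x = np.asarray(x, dtype=float)
    return np.mean(x[1:-1] ** 2 - x[:-2] * x[2:])

def spectral_entropy(x):
    """Shannon entropy (bits) of the normalised power spectrum."""
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    p = power / power.sum()      # assumes the epoch is not constant
    p = p[p > 0]                 # skip empty bins to avoid log(0)
    return -np.sum(p * np.log2(p))
```

Under these definitions a pure sinusoid has near-zero spectral entropy while broadband noise scores high, which matches the interpretation of entropy as a complexity measure.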

Fig. 2. Example of the operation of the ECG artefact detector on a movement artefact for record 1 (amplitude in µV against time in s).
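The feature sorting step of Section 3.2.2, in which features are grouped by type across channels, each group is sorted numerically and the groups are concatenated, could be sketched as follows. The array layout (channels by features) and the name `sorted_super_vector` are assumptions made for illustration.

```python
import numpy as np

def sorted_super_vector(features):
    """Build a sorted, concatenated feature vector for one EEG epoch.

    features: array of shape (n_channels, m_features), one row per channel.
    """
    f = np.asarray(features, dtype=float)
    # Sort each feature type (column) across channels, then lay the
    # sorted groups end to end, one feature type after another.
    return np.sort(f, axis=0).T.ravel()
```

With three channels and two feature types, `sorted_super_vector([[3, 10], [1, 30], [2, 20]])` gives `[1, 2, 3, 10, 20, 30]`, and the result does not change if the channel rows are permuted, which is how channel location is removed from the training data.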

The features for each channel were sorted according to feature type and then each group of features sorted into numerical order. All the grouped-sorted features were then concatenated into a super feature vector. The sorting function removes information about the spatial location of the seizure from the training set, preventing the classifier from expecting seizure activity in a particular channel. The sorting function behaves as a numerical feature selector for the patient-independent classifier, using the numerical differences between feature values of channels involved in a seizure and channels not-involved. Features of involved and not-involved channels will be placed at opposing ends of the sorted,


Fig. 3. EEG classifier configuration (blocks shown: feature extraction for each of channels 1 to n, feature sort, classifier, decision).

concatenated feature vector. A classifier can then learn to distinguish the features of involved channels from those of not-involved channels using their rank in the sorted concatenated feature vector.

The dominant spectral peak, power ratio and bandwidth features employed are those reported by Gotman et al. (1997b). The frequency spectrum was calculated for each epoch using the FFT. The dominant frequency was defined to be the frequency in the spectrum with the largest average power in its bandwidth. The bandwidth of the dominant spectral peak was defined as the width in Hertz (Hz) between the two half power points of the dominant spectral peak. The power ratio was defined as the ratio of the power in the dominant spectral peak to the power at the same frequency in the background EEG, where the background EEG is the average of the three epochs 60 s behind the current epoch (Gotman et al., 1997b).

Recent evidence suggests that seizure activity represents a reduction in the complexity of the signal (Celka and Colditz, 2002). Spectral entropy can be interpreted as a measure of signal complexity and so represents a potential feature for seizure detection. D'Alessandro et al. (2003) employed the mean nonlinear energy of an EEG epoch in predicting epileptic seizure in adults. Esteller et al. (2001) proposed line length, an approximate measure of the fractal properties of the signal, as a potential feature for epileptic seizure onset detection.

3.2.3. EEG artefact rejection

The stability of an EEG epoch has been used as an EEG signal quality measure Qeeg (Gotman et al., 1997b). The larger this value is relative to unity, the more likely

it is to contain artefact. The mean Qeeg across n EEG channels was taken as the EEG signal quality measure, Qeeg, for each epoch, where Qeeg can be written as:

$$Q_{eeg} = \frac{1}{n}\sum_{i=1}^{n} Q_{eeg_i}$$

3.3. Classifier model

Classification based on a linear discriminant (LD) classifier model was employed for all signal modes and configurations. Linear discriminant classifier models utilise class conditional mean vectors and a common covariance matrix. They provide optimal performance when features within a class have a normal distribution and the same variance across classes. The class conditional mean vectors and a common covariance matrix were estimated separately from the training data for the patient-specific and patient-independent classifiers. Weighting of the class conditional mean vectors and common covariance matrix by the duration of each record was implemented for the patient-independent classifier (Greene et al., 2006a). This ensures that records of differing lengths contribute equally to the training of a patient-independent classifier.

3.4. Combining the ECG and EEG information

Two schemes for combining the information determined from the EEG and ECG signals were considered, the early integration (EI) scheme and the late integration (LI) scheme, and are discussed separately below.

3.4.1. Early integration of EEG and ECG features

The EI configuration, hereafter referred to as the EI fusion configuration, involves concatenating the EEG


and ECG feature vectors into a single feature vector and feeding this super feature vector to a pattern classifier. Fig. 4 gives a graphical description of the EI configuration. If the signal quality measures Qecg and Qeeg for an epoch were both over an empirically-derived threshold, that epoch is considered to contain artefact and the epoch is neglected from analysis for both signals.

3.4.2. Late integration of EEG and ECG classifications

The late integration (LI) configuration, hereafter referred to as the LI fusion configuration, employs separate classifiers for each signal to determine a probability of seizure for each signal mode. These two probabilities are then combined to provide an overall probability of seizure. Fig. 5 gives a graphical description of the LI fusion configuration. The combination of the EEG and ECG signal modes using the classifiers' confidence scores, as is performed in this configuration, allows each signal to be weighted for improved classification performance. Static and dynamic weighting of the two signals was investigated. An expression for the overall probability of seizure is given in Eq. (1). The scalars $\alpha_s$ and $\beta_s$ are the static weights for the ECG and EEG signals, respectively. For the static weighting case, $\alpha_s = 1 - \beta_s$. If $\alpha_s$ is varied over the range 0–1 the optimum static weights for each signal can be determined for both the patient-specific classifier and the patient-independent classifier. Fig. 6 shows the mean classification accuracy for the patient-specific classifier for the combined classifiers as the EEG static weight, $\alpha_s$, is varied from 0 to 1 in increments of 0.1.

$$P_{sz} = \alpha_s P_{ecg} + \beta_s P_{eeg} \qquad (1)$$
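Eq. (1) with static weights amounts to a weighted sum of the two classifier outputs followed by thresholding. A minimal sketch, with the hypothetical name `fuse_static` and the 0.5 decision threshold used in this study as the default, is:

```python
import numpy as np

def fuse_static(p_ecg, p_eeg, alpha_s, threshold=0.5):
    """Combine per-epoch seizure probabilities with static weights (Eq. (1)).

    alpha_s weights the ECG probability; the EEG weight is 1 - alpha_s.
    Returns the fused probability and the per-epoch seizure decision.
    """
    beta_s = 1.0 - alpha_s
    p_sz = alpha_s * np.asarray(p_ecg, dtype=float) \
        + beta_s * np.asarray(p_eeg, dtype=float)
    return p_sz, p_sz > threshold
```

Sweeping `alpha_s` over 0, 0.1, ..., 1 and recording the classification accuracy at each value would reproduce the kind of weight search described above.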

Dynamic weighting takes account of a measure of quality in each signal (shown as Q in Fig. 5). If an epoch is determined to contain artefact in either mode, the corresponding weight for that mode is reduced appropriately, causing the system to favour the decision for the other signal. The dynamic weight for each signal is calculated by subtracting the quality measure, scaled by dividing by the maximum value of that quality measure, from the static weight for each signal. (It should be noted that a real-time system would require an empirically determined value in place of the maximum value of the quality measure used here; such a value should be chosen to be the largest value that may reasonably occur for this parameter and should not significantly differ from the Qmax value used here.) Expressions for the dynamic weights for both signals are given in Eqs. (2) and (3).

$$\alpha_d = \alpha_s - \frac{Q_{ecg}}{\max\{Q_{ecg}\}} \qquad (2)$$

$$\beta_d = \beta_s - \frac{Q_{eeg}}{\max\{Q_{eeg}\}} \qquad (3)$$
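The dynamic weights of Eqs. (2) and (3) can be written directly as array operations. The sketch below is an assumption-laden illustration: the name `fuse_dynamic` is hypothetical, the maxima default to the largest value in the supplied arrays (which must be positive), and no renormalisation or clipping of the weights is applied because none is described here.

```python
import numpy as np

def fuse_dynamic(p_ecg, p_eeg, q_ecg, q_eeg, alpha_s, beta_s,
                 q_ecg_max=None, q_eeg_max=None):
    """Fuse per-epoch probabilities using quality-adjusted weights."""
    p_ecg, p_eeg = np.asarray(p_ecg, float), np.asarray(p_eeg, float)
    q_ecg, q_eeg = np.asarray(q_ecg, float), np.asarray(q_eeg, float)
    # Offline default: scale by the largest observed quality value.
    q_ecg_max = q_ecg.max() if q_ecg_max is None else q_ecg_max
    q_eeg_max = q_eeg.max() if q_eeg_max is None else q_eeg_max

    alpha_d = alpha_s - q_ecg / q_ecg_max   # Eq. (2)
    beta_d = beta_s - q_eeg / q_eeg_max     # Eq. (3)
    return alpha_d * p_ecg + beta_d * p_eeg
```

Passing fixed values for `q_ecg_max` and `q_eeg_max` corresponds to the empirically determined constants that a real-time system would need.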

Consequently, the probability of seizure Psz with dynamic weighting can be determined from Eq. (1) using $\alpha_d$ and $\beta_d$ as the mode weights in place of $\alpha_s$ and $\beta_s$. An epoch is labelled as seizure if Psz is over a given decision threshold. A decision threshold equal to 0.5 is used in this study.

3.4.3. Interpolation of mode frame rates

The ECG was considered in non-overlapping epochs of 60 s (15,360 samples at 256 Hz). However the EEG was

Fig. 4. Early integration (EI) neonatal seizure detection configuration.


Fig. 5. Late integration (LI) neonatal seizure detection configuration.

considered in 2048 sample non-overlapping epochs. In order to facilitate direct comparison and fusion of the two signals, the ECG frame rate must be matched to the EEG frame rate by means of interpolation. The EEG frame rate is a multiple of the ECG frame rate. The interpolation factor is the integer closest to this multiple; the frames are then shifted to ensure that the EEG and ECG windows remain synchronized. In the EI configuration this interpolation was performed at a feature level. A super feature vector with a frame rate matching that of the EEG was passed to the classifier. In the LI configuration, this interpolation was performed at the score level. The output probability from the ECG classifier was

interpolated after sub-dividing the output for each 60 s ECG epoch into eight 8-s epochs (for the 256 Hz case) to match the frame rate from the EEG classifier and the two combined, as discussed in Section 3.4.2, for an overall probability of seizure.

3.5. Classifier performance estimation

Each classifier configuration was considered as both a patient-specific and a patient-independent classifier. The performance of each patient-specific classifier was estimated using m-fold cross-validation on each record. This involves randomly splitting each record into m sections or folds: m − 1 of these folds are then used to train the classifier and the remaining fold is then used to test the performance of the classifier. By shuffling the data, repeating this procedure q times and averaging the resulting accuracies for the training and test sets, an unbiased, low variance estimate of the classifier performance can be obtained. In this study 10 folds and 10 shuffles of the data were used. The patient-specific performance measures were then taken as the average of each measure across records.

The performance of the generalized or patient-independent classifier was estimated using cross-validation across all records. This involved training the classifier model on (z − 1) of the z records and using the zth record to test the classifier performance and then rotating through the z possible combinations of training and test sets. The mean of the results for all iterations is taken as the patient-independent performance estimate. This test provides a measure of the classifier's ability to generalize from the training set and classify from unseen records.
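The rotation through records used for the patient-independent estimate can be expressed as a small generator. This is a sketch only; `leave_one_record_out` and the per-epoch `record_ids` array are hypothetical names, and the classifier training itself is left out.

```python
import numpy as np

def leave_one_record_out(record_ids):
    """Yield (held_out_record, train_idx, test_idx) for each record in turn.

    record_ids: one record identifier per epoch in the pooled data-set.
    """
    ids = np.asarray(record_ids)
    for rec in np.unique(ids):
        test_idx = np.flatnonzero(ids == rec)    # epochs of the held-out record
        train_idx = np.flatnonzero(ids != rec)   # epochs of the other z - 1 records
        yield rec, train_idx, test_idx
```

Averaging a performance measure over the z iterations then gives the patient-independent estimate described above.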

Fig. 6. Patient-specific classification accuracy (%) for LI EEG fusion classifier as the EEG static weight is varied in the range 0–1 (ECG weight equals 1 − EEG weight). Curves shown: mean fusion accuracy, mean EEG accuracy, mean ECG accuracy.


3.6. Classifier performance measures

The classifier architectures considered in this study were epoch-based. As a result, classification accuracy is an epoch-based measure. In contrast, the percentage of seizures detected by the system is an event-based measure. There is much inconsistency in the literature regarding the format of reported results. For this reason, we have presented our results in both epoch format and event format.

The classification accuracy is defined as the percentage of epochs correctly classified by the system. The sensitivity is defined as the percentage of seizure epochs (as labelled by an expert in neonatal EEG) correctly identified as seizure epochs by the system. The specificity is defined as the percentage of labelled non-seizure epochs correctly classified as non-seizure by the system. The false detection rate (FDR) is defined as the percentage of non-seizure epochs incorrectly identified as seizure epochs and is equivalent to 100 − specificity (%).

Caution must be exercised in reporting false detection results. Many algorithms report these in terms of clusters of false detections per hour (Gotman et al., 1997b). Although this can be a useful measure of clinical utility of the algorithm, it is not always an accurate assessment of algorithm performance as it is possible for an entire hour of false detections to be taken as a single false detection for that hour. False detections less than 30 s apart are grouped as a single false detection. The mean false detection per hour (FD/h) is included here for completeness.

The seizure sensitivity or good detection rate (GDR) is defined as the percentage of electrographic seizure events as labelled by an expert in neonatal EEG (G.B.B.) correctly identified by the system. If a seizure was detected any time between the start and end of a labelled seizure this was considered a good detection.
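The epoch-based and event-based measures defined above might be computed as in this sketch. The function names and the representation of seizure events as inclusive (first epoch, last epoch) index pairs are assumptions for illustration; the FD/h measure is not included.

```python
import numpy as np

def epoch_measures(labels, preds):
    """Accuracy, sensitivity, specificity and FDR, all in percent."""
    labels = np.asarray(labels, dtype=bool)
    preds = np.asarray(preds, dtype=bool)
    tp = np.sum(labels & preds)
    tn = np.sum(~labels & ~preds)
    fp = np.sum(~labels & preds)
    fn = np.sum(labels & ~preds)
    specificity = 100.0 * tn / (tn + fp)
    return {"accuracy": 100.0 * (tp + tn) / labels.size,
            "sensitivity": 100.0 * tp / (tp + fn),
            "specificity": specificity,
            "fdr": 100.0 - specificity}

def good_detection_rate(events, preds):
    """Percentage of seizure events with at least one detected epoch.

    events: list of (first_epoch, last_epoch) inclusive index pairs.
    """
    preds = np.asarray(preds, dtype=bool)
    hits = sum(bool(preds[a:b + 1].any()) for a, b in events)
    return 100.0 * hits / len(events)
```

A single detected epoch inside a long seizure counts as a good detection for the event while contributing little to epoch sensitivity, which is why the two kinds of measure can differ widely.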
A receiver operating characteristic (ROC) curve is a graphical representation of class sensitivity against specificity as a threshold parameter is varied. The area under the ROC curve (calculated using trapezoidal numerical integration) is an effective way of comparing the performance of different features or classifiers and is equivalent to the Mann–Whitney version of the Wilcoxon rank-sum statistic (Zweig and Campbell, 1993). A random discrimination will give an area of 0.5 under the curve while perfect discrimination between classes will give unity area under the ROC curve.

4. Results

The presented system is an epoch-based system so it makes intuitive sense to quantify its performance in terms of epoch-based measures such as accuracy, sensitivity and specificity as detailed above. From a clinical viewpoint, the most important measure of the clinical utility of a seizure detection system is the percentage of seizure events correctly detected by the system (GDR) along with the number of false detections. For this reason we have given

our results in terms of both epoch measures and event-based measures. The results are divided into two sections: the patient-specific classifier results appear in Section 4.1 and the patient-independent classifier results appear in Section 4.2. Within each section results are presented for each signal individually as well as the results from the combination of the EEG and ECG signals. It should be noted that although artefact rejection was employed in both fusion configurations, each of the 633 seizure events in the data-set was included in our analysis.

4.1. Patient-specific results

The mean patient-specific ECG GDR was found to be 99.36% with an FDR of 29.80%. On an epoch basis the ECG classifier had a mean classification accuracy of 69.09% with associated sensitivity and specificity of 60.06% and 69.48%, respectively. The patient-specific EEG classifier had a GDR of 93.64% and an FDR of 11.47%. The EEG classification accuracy was 84.55% with sensitivity 71.02% and specificity 88.53%.

The EI fusion patient-specific classifier had a mean GDR of 95.82% with an FDR of 11.23%. The EI fusion classification accuracy was 86.32% with sensitivity of 76.37% and specificity of 88.77%.

When static weighting of modes was employed, the LI fusion patient-specific classifier had a GDR of 95.18% and an FDR of 10.77% when $\alpha_s$ = 0.7 and $\beta_s$ = 0.3. The accuracy was 85.99%, sensitivity 73.69% and specificity 89.23%. Fig. 6 shows the classification accuracy as $\alpha_s$ is varied from 0 to 1. With dynamic weighting of modes the LI patient-specific GDR was 97.52% with an FDR of 13.18%. The mean classification accuracy was 84.66% with sensitivity 74.08% and specificity 86.82%.

Table 2 gives a breakdown for the patient-specific results for all modes and configurations. The results given for the LI fusion configuration are those for dynamic weighting of modes as this method gave superior performance to static weighting.

4.2. Patient-independent results

The mean patient-independent ECG GDR was 63.54% with an FDR of 38.00%. The mean classification accuracy
Table 2
Patient-specific results

                  ECG      EEG      Fusion EI   Fusion LI
GDR (%)           99.19    93.64    95.82       97.52
FD/h              0.68     5.52     5.63        3.96
FDR (%)           30.79    11.47    11.23       13.18
Accuracy (%)      68.98    84.55    86.32       84.66
Sensitivity (%)   59.69    71.02    76.37       74.08
Specificity (%)   69.21    88.53    88.77       86.82

Results for LI fusion with dynamic weighting for each signal.

Table 3
Patient-independent results

                  ECG      EEG      Fusion EI   Fusion LI
GDR (%)           82.33    80.41    81.44       81.27
FD/h              1.71     3.42     3.15        3.05
FDR (%)           37.78    26.05    28.57       33.05
Accuracy (%)      63.97    72.45    71.51       68.89
Sensitivity (%)   69.51    68.18    71.73       74.39
Specificity (%)   62.22    73.95    71.43       66.95

Results for the LI fusion classifier employed dynamic weighting of each signal.

was 63.54% with associated sensitivity and specificity of 69.63% and 61.63%, respectively. The mean patient-independent EEG GDR was 80.41% with an FDR of 26.05%. The mean classification accuracy was 72.45% with associated sensitivity and specificity of 68.18% and 73.95%.

The EI fusion patient-independent GDR was 81.44% with an FDR of 28.57%. The mean classification accuracy was 71.51% with sensitivity of 71.73% and specificity of 71.43%. With static weighting of modes the LI fusion patient-independent classifier had a GDR of 80.41% and an FDR of 27.87% when the mode weights were $\alpha_s$ = 0.7 and $\beta_s$ = 0.3. With dynamic weighting the GDR for this classifier was 81.27% with an FDR of 33.05%.

Table 3 outlines the patient-independent results. The results given for the LI fusion configuration are those for dynamic weighting of modes as this method gave superior performance. These results were confirmed by ROC analysis. Fig. 7 shows the ROC curves for the ECG, EEG and LI Fusion classifiers. The ECG ROC area was 0.68 while the EEG ROC area was 0.76. The LI fusion ROC area was 0.76 while the EI fusion ROC area was 0.77 (Fig. 7).

[Fig. 8b panels: EEG amplitude (µV) against time (s); "Patient Independent Classifier: Output Probability", probability of seizure against epoch, with the seizure label, a good detection and false detections marked; decision threshold = 0.5.]

Fig. 8. (a) An example of a good detection for the patient-independent classifier, for a seizure in record 5. The top panel shows an EEG channel with seizure onset marked with a black arrow. Bottom panel shows the probability of seizure generated by the system for that seizure. (b) An example of a good detection and false detection for the patient-independent classifier, for a seizure in record 6. The top panel shows the EEG seizure event denoted by the dashed black line in the bottom panel.
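The probability traces in Fig. 8 are compared against a decision threshold to label each epoch. The sketch below illustrates one way a late-integration output of this kind could be formed and thresholded. It rests on several assumptions that are not confirmed by the text here: that the fused probability is a weighted sum of the EEG and ECG classifier outputs, that the static mode weights of 0.7 and 0.3 reported in the results apply to EEG and ECG respectively, and that the threshold is 0.5 as marked in Fig. 8b. All function names and input values are hypothetical.

```python
def fuse_late(p_eeg, p_ecg, w_eeg=0.7, w_ecg=0.3):
    """Weighted sum of per-epoch seizure probabilities from two classifiers.

    Assumption: the weights sum to 1 and 0.7 is assigned to the EEG mode.
    """
    if len(p_eeg) != len(p_ecg):
        raise ValueError("both classifiers must score the same epochs")
    if abs(w_eeg + w_ecg - 1.0) > 1e-9:
        raise ValueError("weights are assumed to sum to 1")
    return [w_eeg * a + w_ecg * b for a, b in zip(p_eeg, p_ecg)]


def classify_epochs(p_sz, threshold=0.5):
    """Label an epoch as seizure (1) when its probability meets the threshold."""
    return [1 if p >= threshold else 0 for p in p_sz]


# Hypothetical classifier outputs for four consecutive epochs.
p_eeg = [0.2, 0.6, 0.9, 0.4]
p_ecg = [0.4, 0.2, 0.7, 0.8]
p_sz = fuse_late(p_eeg, p_ecg)
decisions = classify_epochs(p_sz)
print(decisions)
```

In the second epoch the EEG output alone would exceed the threshold but the fused value does not, while in the fourth the ECG output lifts the fused value above it. This shows how the choice of weights shifts individual epoch decisions.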

Fig. 8 gives two examples of a good detection of a seizure event, showing the system output in terms of the probability of seizure Psz. Fig. 8b also shows an example of a false detection. In both cases an epoch was classified as seizure if Psz was greater than or equal to the decision threshold.

Fig. 7. Patient-independent LI Fusion ROC curves with ECG and EEG ROC curves. [Axes: sensitivity (%) against specificity (%); curves: ECG, EEG, Fusion EI and Fusion LI.]

5. Discussion

An approach is proposed for combining simultaneously-recorded ECG and EEG signals for more accurate and robust detection of neonatal seizures. Recent research by the authors has suggested that the ECG is suitable in its own right for use in seizure detection algorithms, because changes in RR interval timing, complexity and variability appear to be associated with neonatal seizures (Greene et al., 2006b).

There are a number of existing methods for EEG-based neonatal seizure detection. Many are either based on a single channel of EEG (Gotman et al., 1997b; Celka and Colditz, 2002; Hassanpour et al., 2004) or use empirically-based decision thresholds (Altenburg et al., 2003; Liu et al., 1992) as opposed to a classifier model trained on real multi-channel EEG. The novel EEG-based classifier architecture reported here exploits the statistical inter-relationships and synchronously recorded nature of the EEG by processing all available EEG channels, while employing a statistical classifier model.

Manifestations of seizure were observed simultaneously in the EEG and ECG signals. The combination of the two signals supplies the neonatal seizure detection system with a broader seizure-specific information base, offering potentially superior seizure detection performance.

The ECG and multi-channel EEG data used to evaluate the algorithms developed in this study are of the same duration and quality as those found in the neonatal ICU and so can be said to faithfully reflect the performance of the algorithms under real-world conditions. Many previous studies have selected small numbers of seizure and non-seizure EEG epochs instead of long-duration EEG recordings to evaluate their algorithms. Scher and co-workers have reported that the neonatal sleep cycle is approximately 1 h in duration (Scher et al., 1993b). Results for algorithms trained and validated on small non-continuous tracts of EEG shorter than 1 h will not reflect the performance of such algorithms on the unique stage-specific characteristics of the neonatal EEG sleep cycle. A similar observation can be made of algorithms that are defined and validated using a single channel of EEG.
Performance figures given for such algorithms do not reflect their performance on real multi-channel EEG data. A robust system must be able to cope with all EEG records regardless of record quality and duration.

Gotman et al. (1997b) reported a GDR of 71% and Liu et al. (1992) reported a GDR of 84% for their patient-independent neonatal seizure detection methods, with 1.7 false detections per hour and a false detection rate of 1.7%, respectively. Celka and Colditz reported a GDR of 93% with an FDR of 4% for their patient-specific neonatal seizure detection method (Celka and Colditz, 2002). An independent evaluation of these three methods, performed on the same data-set as is used here, found that the results reported in the source papers overestimated the performance of these algorithms and found none were suitable for use in a clinical environment (Faul et al., 2005). Our patient-independent results for EEG alone and for LI and EI combined ECG and EEG are an improvement on those reported by the evaluation of Faul et al. The results for the Gotman method were validated in a subsequent paper using a separate data-set containing
281 h of EEG data from 54 patients in three centres (Gotman et al., 1997a). The mean seizure detection rate for this set was 69% with a mean of 2.3 false detections per hour. The size of this data-set lends credence to these results. Our patient-independent results for EEG and ECG combined were an improvement on those reported by Gotman, and were achieved using a methodology designed to ensure robust, reproducible results.

Celka and Colditz used a data-set of 4 neonates and do not detail the number of seizures or the duration of the recordings used (Celka and Colditz, 2002). Furthermore, their results are based on a single channel of EEG. The data-set of Liu et al. used 58 30-s seizure epochs, selected for prototypicality; this may have had a biasing effect on their results, as noted by Gotman et al.

It has been noted that different classifier models offer potentially complementary information about the patterns to be classified, which could be harnessed to improve the performance of the selected classifier (Kittler et al., 1998). As a result it has been found that combining classifiers from different modes with generalized knowledge of the patterns to be classified generally yields improved, more robust, classification performance. Our results confirm this finding. The combined EEG and ECG classifiers out-performed both the ECG and the EEG classifiers individually. While the GDR or FDR for an individual signal may have been comparable to that for the early integration (EI) or late integration (LI) fusion classifiers, when taken together, results for fusion were always superior to those for each signal individually. As a result, the combination of EEG and ECG has led to a more robust system for neonatal seizure detection than a system based exclusively on the EEG.

Two methods for combining ECG and EEG were considered in this study. The EI fusion configuration was generally found to give better performance than the LI fusion configuration.
The one exception to this trend was the patient-specific case, in which the LI fusion classifier had a higher GDR than the EI classifier. An LI configuration would possess a distinct advantage over an EI configuration in a real-world patient-monitoring scenario. The use of dynamic weighting allows the system to deal with the presence of artefact, electrode drop-off or interference. Such a configuration could also take into account local variations in feature characteristics, allowing weighting of each signal in the decision function.

Patient-specific neonatal seizure detection may have utility in the modern neonatal ICU. When an electroencephalographer is alerted to the presence of electrographic seizure, relevant sections and channels of the preceding EEG could then be labelled as seizure. These annotated seizures could then be used to adaptively train a base patient-independent classifier towards the individual patient's electroclinical seizure characteristics. A patient-specific system as discussed in this paper, while an improvement on current systems, falls short of the ideal neonatal seizure detection system. This is because it would require
significant manual intervention from an electroencephalographer, as outlined above, to ensure robust operation. However, such a system, which we have found to have potentially lower false detection rates, may prove to be more clinically useful than a patient-independent system.

Monitors such as the cerebral function monitor (CFM) are often used in the neonatal intensive care unit (NICU) despite the fact that they were originally designed for adult intensive care use. They are currently used in both term and preterm neonates for seizure detection, prognosis, and to assess the severity of encephalopathy in trials of therapeutic hypothermia. The CFM produces a one-channel amplitude-integrated EEG signal. Despite attempts to develop more sophisticated cerebral function monitors such as the compressed spectral array (CSA) system, it is the CFM which remains dominant in the NICU today. There have been criticisms of the CFM because of its limitation to a single EEG channel (plus a simultaneous artefact detection channel) and the lack of detailed information compared with the conventional multi-channel EEG, especially when used for detection of neonatal seizure discharges (Eaton et al., 1994; Klebermass et al., 2001; Toet et al., 2002; Rennie et al., 2004). In the study by Rennie et al., up to 50% of seizures were missed, particularly those that were of short duration, focal or of low amplitude. While these devices are in use in the NICU, they have serious limitations, and the need for automated seizure detection from multi-channel EEG is therefore even greater.

The signal framework introduced here raises the possibility of multi-channel, multi-signal intelligent neonatal monitoring by taking account of, and combining, all available physiological parameters for monitoring the state and wellbeing of newborns in the ICU. This framework could be further extended to all clinical patient-monitoring situations.
It should be noted that the clinical utility of our patient-independent system is limited by the relatively high false detection rates reported for the patient-independent classifier in this paper. In future research we hope to reduce false detection rates for both patient-specific and patient-independent classifiers through the use of more advanced artefact detection and rejection algorithms. A more sophisticated normalisation scheme may lead to improved patient-independent performance. Further increases in system performance might be achieved by taking account of other recorded physiological signals such as the electrooculogram and cerebral blood flow velocity. However, in order for these signals to be included in multi-signal neonatal seizure detection systems, the spatial and temporal relation of these signals to the electrographic seizure must first be quantified. The inclusion of non-signal information, such as gestational age, weight, maternal history, etc., into an automatic neonatal monitoring system has the potential to greatly improve the performance of neonatal seizure detection systems due to the highly variable age-dependent characteristics of the neonatal period.

6. Conclusion

We describe a novel algorithm for neonatal seizure detection. Combination of an ECG-based classifier system with a novel multi-channel EEG-based classifier system led to improved seizure detection performance. The algorithm was evaluated using a large data-set containing ECG and multi-channel EEG of realistic duration and quality. Future work is needed to develop improvements in these algorithms and to explore the possible added diagnostic value of other combinations of physiological data in the automatic identification of seizures in this age-group.

Acknowledgements

This project was funded by an Irish Higher Education Authority grant (HEA 9300) and an interdisciplinary grant from the Health Research Board of Ireland. The authors would like to acknowledge the helpful technical assistance of Dr. Edmund Lalor and Mr. Brian O'Mullane. We also acknowledge the help and support of the nursing and medical staff of the Unified Maternity Services, Cork, and the parents and families of the babies involved in this study.

References
Altenburg J, Vermeulen RJ, Strijers RLM, Fetter WPF, Stam CJ. Seizure detection in the neonatal EEG with synchronization likelihood. Clin Neurophysiol 2003;114:50–5.
Benitez D, Gaydecki PA, Zaidi A, Fitzpatrick AP. The use of the Hilbert transform in ECG signal analysis. Comput Biol Med 2001;31:399–406.
Boylan GB, Rennie JM, Pressler RM, Wilson G, Morton M, Binnie CD. Phenobarbitone, neonatal seizures, and video-EEG. Arch Dis Child Fetal Neonatal Ed 2002;86:165–70.
Celka P, Colditz P. A computer-aided detection of EEG seizures in infants: a singular-spectrum approach and performance comparison. IEEE Trans Biomed Eng 2002;49:455–62.
D'Alessandro M, Esteller R, Vachtsevanos G, Hinson A, Echauz J, Litt B. Epileptic seizure prediction using hybrid feature selection over multiple intracranial EEG electrode contacts: a report of four patients. IEEE Trans Biomed Eng 2003;50:603–15.
de Chazal P, Heneghan C, Sheridan E, Reilly R, Nolan P, O'Malley M. Automated processing of the single-lead electrocardiogram for the detection of obstructive sleep apnoea. IEEE Trans Biomed Eng 2003;50:686–96.
Eaton DM, Toet M, Livingston J, Smith I, Levene M. Evaluation of the Cerebro Trac 2500 for monitoring of cerebral function in the neonatal intensive care. Neuropediatrics 1994;25:122–8.
Esteller R, Echauz J, Tcheng T, Litt B, Pless B. Line length: an efficient feature for seizure onset detection. In: Proceedings of the 23rd annual international conference of the IEEE engineering in medicine and biology society, vol. 2; 2001. p. 1707–10.
Faul S, Boylan G, Connolly S, Marnane L, Lightbody G. An evaluation of automated neonatal seizure detection methods. Clin Neurophysiol 2005;116:1533–41.
Gotman J, Flanagan D, Rosenblatt B, Bye A, Mizrahi EM. Evaluation of an automatic seizure detection method for the newborn EEG. Electroencephalography Clin Neurophysiol 1997a;103:363–9.
Gotman J, Flanagan D, Zhang J, Rosenblatt B, Bye A, Mizrahi EM. Automatic seizure detection in newborns: methods and initial evaluation. Electroencephalography Clin Neurophysiol 1997b;103:356–62.
Greene BR, de Chazal P, Boylan GB, Reilly RB, Connolly S. Electrocardiogram based Neonatal Seizure detection. IEEE Trans Biomed Eng, TBME-00350-2005.R2, in press.
Greene BR, de Chazal P, Boylan GB, Reilly RB, O'Brien C, Connolly S. Heart and respiration rate changes in the neonate during electroencephalographic seizure. Med Biol Eng Comput 2006b;44:27–34.
Greene BR, Reilly RB, Boylan G, de Chazal P, Connolly S. Multichannel EEG based neonatal seizure detection. In: 28th international conference of the IEEE-EMBS; 2006c.
Hassanpour H, Mesbah M, Boashash B. Time-frequency feature extraction of newborn EEG seizure using SVD-based techniques. EURASIP J Appl Signal Process 2004;16:2544–54.
Karayiannis NB, Srinivasan S, Bhattacharya R, Wise MS, Frost Jr JD, Mizrahi EM. Extraction of motion strength and motor activity signals from video recordings of neonatal seizures. IEEE Trans Med Imag 2001;20:965–80.
Kittler J, Hatef M, Duin RPW, Matas J. On combining classifiers. IEEE Trans Patt Anal Mach Intell 1998;20:226–39.
Klebermass K, Kuhle S, Kohlhauser-Vollmuth C, Pollak A, Weninger M. Evaluation of the cerebral function monitor as a tool for neurophysiological surveillance in neonatal intensive care patients. Child's Nerv Syst 2001;17:544–50.
Levene M. The clinical conundrum of neonatal seizures. Arch Dis Child 2002;86:75–7.
Liu A, Hahn JS, Heldt GP, Coen RW. Detection of neonatal seizures through computerized EEG analysis. Electroencephalography Clin Neurophysiol 1992;82:30–7.
Rennie JM, Chorley G, Boylan GB, Pressler R, Nguyen Y, Hooper R. Non-expert use of the cerebral function monitor for neonatal seizure detection. Arch Dis Child Fetal Neonatal Ed 2004;89:F37–40.
Scher MS, Aso K, Beggarly ME, Hamid MY, Steppe DA, Painter MJ. Electrographic seizures in preterm and full-term neonates: clinical correlates, associated brain lesions, and risk for neurologic sequelae. Pediatrics 1993a;91:128–34.
Scher MS, Hamid MY, Steppe DA, Beggarly ME, Painter MJ. Ictal and interictal electrographic seizure durations in preterm and term neonates. Epilepsia 1993b;34:284–8.
Teich MC, Lowen SB, Jost BM, Vibe-Rheymer K, Heneghan C. In: Akay M, editor. Nonlinear biomedical signal processing, vol. II. Piscataway (NJ): IEEE Press; 2000.
Toet MC, van der Meij W, de Vries LS, Uiterwaal CSPM, van Huelen KC. Comparison between simultaneously recorded amplitude integrated electroencephalogram (cerebral function monitor) and standard electroencephalogram in neonates. Pediatrics 2002;109:772–9.
Volpe JJ. Neurology of the newborn. Philadelphia (PA): Saunders; 2001.
Zweig M, Campbell G. Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine. Clin Chem 1993;39:561–77.