
Int. J. Human-Computer Studies 72 (2014) 717–727

Contents lists available at ScienceDirect

Int. J. Human-Computer Studies

journal homepage: www.elsevier.com/locate/ijhcs

Comparative analysis of emotion estimation methods based on physiological measurements for real-time applications

Davor Kukolja, Siniša Popović, Marko Horvat, Bernard Kovač, Krešimir Ćosić
Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, 10000 Zagreb, Croatia

Article info

Article history:
Received 28 August 2013
Received in revised form 28 April 2014
Accepted 16 May 2014
Available online 24 May 2014
Communicated by Winslow Burleson

Keywords:
Affective computing
Physiology
Emotion estimation
Feature reduction
Machine learning

Abstract

In order to improve intelligent Human-Computer Interaction it is important to create a personalized adaptive emotion estimator that is able to learn over time the emotional response idiosyncrasies of an individual person and thus enhance estimation accuracy. This paper, with the aim of identifying preferable methods for such a concept, presents an experiment-based comparative study of seven feature reduction and seven machine learning methods commonly used for emotion estimation based on physiological signals. The analysis was performed on data obtained in an emotion elicitation experiment involving 14 participants. Specific discrete emotions were targeted with stimuli from the International Affective Picture System database. The experiment was necessary to achieve uniformity in the various aspects of emotion elicitation, data processing, feature calculation, self-reporting procedures and estimation evaluation, in order to avoid the inconsistency problems that arise when results from studies that use different emotion-related databases are mutually compared. The results of the performed experiment indicate that the combination of a multilayer perceptron (MLP) with sequential floating forward selection (SFFS) exhibited the highest accuracy in discrete emotion classification based on physiological features calculated from ECG, respiration, skin conductance and skin temperature. Using the leave-one-session-out crossvalidation method, 60.3% accuracy in classification of 5 discrete emotions (sadness, disgust, fear, happiness and neutral) was obtained. In order to identify which methods may be the most suitable for real-time estimator adaptation, execution and learning times of emotion estimators were also comparatively analyzed. Based on this analysis, the preferred feature reduction method for real-time estimator adaptation was minimum redundancy – maximum relevance (mRMR), which was the fastest approach in terms of combined execution and learning time, as well as the second best in accuracy, after SFFS. In combination with mRMR, the highest accuracies were achieved by k-nearest neighbor (kNN) and MLP with negligible difference (50.33% versus 50.54%); however, mRMR + kNN is the preferable option for real-time estimator adaptation due to the considerably lower combined execution and learning time of kNN versus MLP.

© 2014 Elsevier Ltd. All rights reserved.

This paper has been recommended for acceptance by Winslow Burleson.
Corresponding author. Tel.: +385 1 6129 521; fax: +385 1 6129 705.
E-mail addresses: davor.kukolja@fer.hr (D. Kukolja), sinisa.popovic@fer.hr (S. Popović), marko.horvat2@fer.hr (M. Horvat), bernard.kovac@fer.hr (B. Kovač), kresimir.cosic@fer.hr (K. Ćosić).
http://dx.doi.org/10.1016/j.ijhcs.2014.05.006
1071-5819/© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

In the last few years, research in automated emotion recognition methods has been steadily gaining momentum due to its applicability in various domains which would benefit from an accurate understanding of human emotional states, like entertainment, safe driving, training and e-learning, telemedicine and home robotics (Nasoz and Lisetti, 2006; Picard, 1997; Rani et al., 2006). Furthermore, various mental health applications may benefit from automated estimation of the patient's emotions, like treatment of stress-related disorders (Ćosić et al., 2010).

For a variety of these applications, individually adjusted emotion estimators rather than generic emotion estimation may achieve higher accuracy (Kim and André, 2008; Picard, 2010), particularly if the estimator can learn the emotional response idiosyncrasies of a particular individual over the course of multiple sessions. Such a personalized adaptive emotion estimator system should perform real-time estimation of the user's emotion and concurrently adapt itself over time based on the measured user's responses.

As a step toward this goal, this paper presents a comparative analysis of emotion estimation methods in order to find the most suitable methods for the development of a personalized adaptive emotion estimator. Therefore, the criteria for

comparison are not only related to estimation accuracy, but also to the execution and learning times of each emotion estimation method.

In previous research related to emotion estimation, the underlying emotional states have been determined based on cues such as facial expressions, speech, and physiology. This paper, however, focuses solely on physiological signals, which in comparison to facial expressions and vocal features are dominantly related to autonomic nervous system activity. This makes voluntary and conscious manipulation of physiological signals more difficult than either vocal or facial emotional expressions (Kim and André, 2008), and allows continuous monitoring of emotional states, even in the absence of the user's motor activity. Physiological sensors are becoming less and less disruptive as their size is decreasing, and wearable, small and wireless physiological sensors may be an appropriate solution for unobtrusive real-time emotional state monitoring (Fletcher et al., 2010; Katsis et al., 2011; Liu, 2009).

Even though several feature reduction and machine learning methods have so far been successfully employed in previous research to build emotional state estimators from physiological indices, a comparison of the various methods used by different research groups has been precluded due to the following reasons:

a) Emotion elicitation method diversity.
b) Emotional state representation method – discrete emotions or dimensional (valence-arousal) space.
c) Properties of used physiological signals and features.
d) Referent emotional state selection – subjective ratings or stimuli annotations.
e) Estimator evaluation method.

As noted in previous research (Rani et al., 2006), given these issues, finding a common ground for comparing methods and analyzing their features is very challenging. Therefore, this paper uses an appropriate experimentally collected dataset to compare accuracy, execution and learning times of seven feature reduction and seven machine learning methods commonly employed in emotion estimation based on physiological features. Tested methods of feature reduction and machine learning are listed in Table 1. During the comparative analysis, each feature selection method, alone or in combination with FP, is paired with every listed machine learning method. Therefore, a total of 84 combinations of feature reduction and machine learning methods were considered.

The comparative analysis of methods for physiology-based emotion estimation was based on an experiment involving 14 participants, conducted in cooperation with the Department of Psychology at the University of Zagreb, Faculty of Humanities and Social Sciences. The experiment compared the performance of all aforementioned combinations of feature reduction and machine learning methods using the same acquired dataset.

The experimental paradigm was designed to correspond with the idea of incremental adjustment of the emotion estimator over the course of multiple sessions in the context of its personalization for a particular participant. The corresponding conceptual timeline is shown in Fig. 1, where real-time emotion estimation takes place during the sessions while the participant's physiological signals are acquired, and estimator adaptation/learning with a search for the most appropriate features can be continuously performed during the sessions, as well as between consecutive sessions. The duration of a single session is typically minutes or up to a couple of hours, while the period between two consecutive sessions can last days, weeks or months. Before the first session, the participant's data are collected, such as demographics and lifestyle, and the participant fills out relevant questionnaires, such as emotional expressiveness and anxiety sensitivity questionnaires, a life events list, etc. In the period between consecutive sessions, off-line learning is performed, which includes longitudinal personalization of the estimator to a particular participant, with enough time at disposal to perform the most complex machine learning and feature reduction algorithms. During a session, the acquisition of physiology and emotional state estimation are continuously

Table 1
Tested feature reduction and machine learning methods. Methods for feature reduction are divided into feature selection and feature transformation methods. During comparative analysis, each feature selection method, alone or in combination with FP, is paired with every listed machine learning method.

Feature selection methods:
- Sequential floating forward selection (SFFS)
- Minimum redundancy – maximum relevance (mRMR)
- ReliefF
- Information gain (IG)
- OneR classifier (OneR)
- Chi-squared (Chi2)

Feature transformation method:
- Fisher projection (FP)

Machine learning methods:
- K-nearest neighbor (kNN)
- Support vector machine (SVM)
- Random forest (RF)
- Multilayer perceptron (MLP)
- RIPPER algorithm for production rule generation
- C4.5 decision tree
- Naive Bayes classifier (NB)
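The pairing scheme behind the 84 combinations can be sketched as a simple enumeration: the six selection methods of Table 1, each used alone or followed by Fisher projection, crossed with the seven learners (a sketch only; the method names are labels, not estimator implementations):

```python
from itertools import product

# Feature selection methods (Table 1); each may be used alone or followed by
# Fisher projection (FP) as a subsequent transformation step.
selection = ["SFFS", "mRMR", "ReliefF", "IG", "OneR", "Chi2"]
reduction = selection + [s + "+FP" for s in selection]  # 12 reduction variants

# Machine learning methods (Table 1).
learners = ["kNN", "SVM", "RF", "MLP", "RIPPER", "C4.5", "NB"]

combinations = list(product(reduction, learners))
print(len(combinations))  # 84
```

Twelve reduction variants times seven learners gives the 84 combinations considered in the comparative analysis.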

Fig. 1. The timeline of emotion estimation.
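Consistent with the timeline in Fig. 1, a real-time estimator can only ever be trained on observations from the past, unlike the classical leave-one-out scheme discussed in Section 2, which may train on observations that come after the held-out one. A minimal sketch of the two split schemes over time-ordered observations (illustrative indices only):

```python
def loocv_splits(n):
    """Leave-one-out: every other observation, past OR future, trains the model."""
    for i in range(n):
        yield [j for j in range(n) if j != i], [i]

def chronological_splits(n, min_train=1):
    """Real-time-compatible: only observations strictly before the test point train."""
    for i in range(min_train, n):
        yield list(range(i)), [i]

# With 5 time-ordered observations, the first LOOCV fold trains on future
# data (indices 1..4), which a real-time estimator could never have seen.
train, test = next(loocv_splits(5))
print(train, test)   # [1, 2, 3, 4] [0]
train, test = next(chronological_splits(5))
print(train, test)   # [0] [1]
```

The chronological scheme guarantees that every training index precedes the test index, which is the property required for real-time estimator adaptation.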



executed concurrently with estimator adaptation, and the time to perform the necessary machine learning and feature reduction algorithms is much more limited. The specific experimental paradigm employed in this paper, which is in line with the conceptual timeline given in Fig. 1, is described in a separate later section.

2. Related work

Recently, numerous studies concerned with computer-based emotion estimation have been published. In Table 2 we summarize the most relevant results regarding emotion estimation through the peripheral physiological response. Although the published research outlined in Table 2 cannot be easily compared, it can be seen that the physiology-based approach holds promise for automated detection of human emotions. These works further substantiated the findings in the psychophysiology literature that physiological responses are closely correlated with the underlying emotional states (Liu, 2009).

Most systems have attempted emotion recognition using cardiovascular activities (e.g., ECG and BVP), EMG signals, and skin conductance. These signals are easy to record and analyze and have a well-understood basis. The approach of discrete emotion recognition has been adopted by most of the works, where the goal was to determine which of several target emotions was present. On average, an accuracy of 60–80% has been achieved in distinguishing between 4 and 5 target emotional states. It should be noted that hardly any of these systems attempt to distinguish between varying levels of a single emotional state, or to estimate valence and arousal, which is regarded as a more challenging task than distinguishing between several discrete emotions (for instance anger, joy, sadness, etc.) (Liu, 2009).

All research outlined in Table 2 uses the same methodology of emotion estimation, as shown in Fig. 2. After emotion elicitation and acquisition of physiological signals has been performed to create an emotion-related database, an automatic emotion recognition system requires the following stages: processing of physiological signals, feature calculation, feature reduction, learning, feature transformation and emotion estimation.

However, a systematic comparison of the research outlined in Table 2 is not possible. There is a large diversity in emotion elicitation method, emotional state representation method, elicited emotional states, number of subjects who participated in the experiment and number of sessions. Also, there is a considerable variety of used physiological signals and calculated physiological features.

The experimental results in Table 2 were, among other contributing factors, also determined by the method of estimator validation. Emotion estimation research usually relies on the leave-one-out crossvalidation (LOOCV) procedure when analyzing the accuracy of estimation. In this procedure, one data observation that represents one emotional state of one participant is left out and used for validation, while the remainder of the data is used for construction of the estimator's model in the course of supervised learning. LOOCV is not suitable for validation of real-time emotional state estimators, because the single observation used in

Table 2
Overview of published research in estimation of emotion. Bold print indicates data mining methods which exhibited the highest emotion estimation accuracy in a particular study.

Author | Emotional states elicited | Physiological signals (a) | Feature reduction (b) | Machine learning (c) | Evaluation (d) | Results (e)
Picard et al., 2001 | Neutral, anger, hate, grief, platonic love, romantic love, joy, and reverence | EMG, BVP, SC, RSP | SFFS, FP, SFFS-FP | kNN, MAP | LOOCV; IS, SD | 81.25%
Lisetti and Nasoz, 2004 | Sadness, anger, fear, surprise, frustration, and amusement | ECG, SC, ST | – | kNN, LDA, MLP | LOOCV; PD, SD | 84.1%
Kim et al., 2004 | Sadness, anger, stress, surprise | ECG, SC, ST | – | SVM | 33:0:17 participants; PI, SD | 61.8%
Healey and Picard, 2005 | 3 stress levels | EMG, ECG, SC, RSP | FP | LDA | LOOCV; PD, SD | 97.5%
Rani et al., 2006 | Engagement, anxiety, boredom, frustration and anger | ECG, ICG, BVP, HS, SC, EMG, ST | Person-specific | kNN, BN, RT, SVM | LOOCV; IS, SD | 85.81%
Rainville et al., 2006 | Anger, fear, happiness and sadness | ECG, RSP | PCA | SDA | LOOCV; PD, SD | 65.3%
Leon et al., 2007 | Neutral, negative and positive emotional states | SC, BVP, HR | DBI | AANN + SPRT | 6:0:2 participants; PI, SD | 80.0%
Kreibig et al., 2007 | Fear, sadness and neutral | ECG, ICG, ST, BVP, SC, RSP, Capnography | ANOVA | PDA | Jack-knife; PD, SD | 69.0%
Mandryk and Atkins, 2007 | Valence, arousal, 5 discrete emotions | SC, ECG, EMG | – | fuzzy logic | 6:0:6 participants; PI, SD | –
Katsis et al., 2008 | High stress, low stress, disappointment, euphoria | EMG, ECG, RSP, SC | – | ANFIS, SVM | 10-fold CV; PD, SD | 79.3%
Kim and André, 2008 | Joy, anger, sadness, pleasure | EMG, ECG, SC, RSP | SBS | EMDC, pLDA | LOOCV; IS, SI | 95%
van den Broek et al., 2010 | Neutral, positive, negative, mixed | SC, EMG | ANOVA + PCA | kNN, SVM, MLP | LOOCV; PD, SD | 61.31%
Kolodyazhniy et al., 2011 | Fear, sadness and neutral | ECG, ICG, BVP, SC, ST, RSP, EMG, Capnography, Piezo-electrics | SFS, SBS | LDA, QDA, MLP, RBFN, kNN | PSICV; PI, SI | 77.5%

(a) BVP – Blood Volume Pressure, ECG – ElectroCardioGram, EMG – ElectroMyoGram, HS – Heart Sound, ICG – Impedance CardioGram, RSP – ReSPiration, SC – Skin Conductance, ST – Skin Temperature.
(b) ANOVA – Analysis of Variance, DBI – Davies-Bouldin Index, PCA – Principal Component Analysis, SFS – Sequential Forward Selection, SBS – Sequential Backward Selection.
(c) AANN – AutoAssociative Neural Network, ANFIS – Adaptive Neuro Fuzzy Inference System, BN – Bayesian Network, EMDC – Emotion-specific Multilevel Dichotomous Classification, LDA – Linear Discriminant Analysis, MAP – Maximum a Posteriori, PDA – Predictive Discriminant Analysis, pLDA – Pseudoinverse LDA, QDA – Quadratic Discriminant Analysis, RBFN – Radial Basis Function Networks, RT – Regression Tree, SDA – Stepwise Discriminant Analysis, SPRT – Sequential Probability Ratio Test.
(d) LOOCV – leave-one-out cross-validation, k-fold CV – k-fold cross-validation, LOPOCV – leave-one-participant-out cross-validation, PSICV – participant- and stimulus-independent cross-validation, x:y:z – ratio of training, validation and testing samples/sessions/participants.
(e) IS – individual specific, PD – participant dependent, PI – participant independent, SD – stimulus dependent, SI – stimulus independent.

Fig. 2. Block diagram of a supervised system for emotion estimation.
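The stage sequence of Fig. 2 can be sketched as a chain of transformations. All function bodies below are hypothetical toy placeholders; only the ordering of stages (signal processing, feature calculation, feature reduction, learned estimation) follows the text:

```python
# Hypothetical placeholder stages mirroring Fig. 2; a real implementation would
# filter the physiological signals, compute the feature set, reduce it, and
# classify with a trained model.
def process_signals(raw):
    mean = sum(raw) / len(raw)
    return [x - mean for x in raw]                      # signal processing (detrend)

def calculate_features(sig):
    mean = sum(sig) / len(sig)
    var = sum((x - mean) ** 2 for x in sig) / len(sig)
    return [var, max(sig), min(sig)]                    # feature calculation

def reduce_features(features, keep):
    return [features[i] for i in keep]                  # feature reduction

def estimate_emotion(features, model):
    return model(features)                              # learned estimator

raw = [0.1 * i for i in range(100)]                     # toy single-channel signal
feats = reduce_features(calculate_features(process_signals(raw)), keep=[0, 1])
label = estimate_emotion(feats, model=lambda f: "neutral" if f[0] < 10 else "fear")
print(label)  # neutral
```

The toy model here is a stand-in for any of the supervised learners in Table 1; the point is only the fixed order in which the stages are composed.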

estimator validation may appear earlier in time than a portion of the data used in estimator model construction. Since in real-time estimation only past and present data are available for estimator model construction, LOOCV cannot be used for real-time applications analysis.

Review of the previous research indicates that a key challenge in developing the emotion estimator model by supervised learning is referent emotional state selection. Two different approaches for referent emotional state selection exist: (1) subjective ratings – an approach based on personal self-reported judgments regarding the elicited emotional state, and (2) stimuli annotations – an approach using compliance with the accepted social norms. According to previous studies, e.g. (Leon et al., 2007), it is better to use subjective ratings than stimuli annotations for referent emotional states, as well as to eliminate non-evocative or contradictory stimuli based on comparison between subjective ratings and stimuli annotations. Consequently, in this paper the referent emotional state is selected in line with these recommendations.

Another challenge for emotional state estimation is the phenomenon of person-stereotypy (e.g., different individuals expressing the same emotion differently under the same contexts), which makes it difficult to obtain universal patterns of emotions across individuals (Lacey and Lacey, 1958). This suggests that an individual-specific approach should be applied in order to accommodate the differences encountered in emotion expression, and an intensive study of each individual is demanded.

3. Experimental setting and data collection

In cooperation with the Department of Psychology at the University of Zagreb, Faculty of Humanities and Social Sciences, an emotion elicitation experiment was conducted with the goal of evaluating accuracy, execution and learning times of emotion estimators based on data mining of acquired physiological signals.

3.1. Emotion elicitation stimuli

During the experiment design, it was decided that emotion would be elicited with a standardized database of emotionally annotated multimedia. Therefore, the International Affective Picture System (IAPS) database (Lang et al., 2008) was selected as the preferred source of stimuli for the experiment, since it is the most widely used and referenced database in the field of emotion elicitation. The IAPS contains more than 1000 static pictures which are emotionally annotated regarding the dimensions of valence, arousal and dominance.

In the emotion elicitation experiment, the target discrete emotions were sadness, disgust, fear and happiness, in addition to neutral. IAPS pictures suitable for elicitation of discrete emotion states were selected based on research in categorizing the dimensional model to normative emotional states (Barke et al., 2011; Libkuman et al., 2007; Mikels et al., 2005). Due to the categorization of discrete emotions in these studies, we were aiming at Ekman's basic emotions: happiness, surprise, sadness, anger, disgust and fear. However, although all aforementioned studies categorize negative emotions as Ekman's sadness, anger, disgust and fear, just a few images could be labeled with only anger when taking a closer look at the picture labels in Mikels et al. (2005) and the multidimensional normative ratings in Libkuman et al. (2007) or Barke et al. (2011). These findings are consistent with the definition of anger as a combination of appraisals of extreme unpleasantness and high certainty (Smith and Ellsworth, 1985), which are difficult to achieve with passive viewing of static pictures (Mikels et al., 2005). Therefore, we considered only three negative emotions: disgust, sadness, and fear. With passive viewing of static pictures it is also hard to elicit surprise, so surprise was also removed from consideration due to an insufficient number of pictures.

3.2. Participants

In total, N = 14 females participated in the experiment. The selected participants were physically and mentally healthy young people, especially without any heart-related medical conditions, students, aged 19–22 years with an average of 20.3 years and a standard deviation of 0.8 years. Thus, they were a homogenous group in terms of sex and age, in order to minimize the potential negative impact of these variables on emotion estimation accuracy. To overcome possible problems due to static pictures' low stimulus intensity and low ability to sufficiently capture their attention, the participants were carefully chosen among a larger group of 150 University students based on the results of an emotional expressivity questionnaire (Kring et al., 1994). Another selection criterion, based on estimation of students' anxiety sensitivity (Jurin et al., 2012; Reiss et al., 1986), was added to eliminate students who might feel excessively uncomfortable in an experimental environment. The selected participants had to meet two conditions: (1) they had to have a relatively low score in the anxiety sensitivity questionnaire (score ≤ 40), and (2) a relatively high score in the emotional expressivity questionnaire (score ≥ 57).

3.3. Emotion elicitation protocol

In order to collect data and to test accuracy, execution and learning times of emotion estimation procedures, each participant had to partake in two sessions, which were held on two different days. During the sessions, the participants were shown sequences containing 10 IAPS pictures each, that were meant to elicit a particular emotional state. To identify the emotions actually elicited, after exposure to each IAPS picture the participant expressed her judgments about the elicited emotions using a written questionnaire.

In each session, every participant was exposed to two consecutive sequences separated by a pause of at least 150 s, which was intended to bring the participant back to the neutral state. Each sequence of pictures was designed to specifically elicit one particular emotional state. The targeted emotional states were sadness, disgust, fear and happiness, in addition to the neutral emotional state during the neutral-stimulus period. The first eight participants were shown sequences of fear and happiness in the first session, while during the second session they were exposed to sequences of disgust and sadness. Exposure sessions for the remaining six participants were reversed – they were first exposed to sequences of disgust and sadness, and in the second session they were exposed to fear and happiness sequences. However, one of these participants dropped out after the disgust sequence, so only data from her disgust sequence were included in the analysis. The order of stimuli presentation in each sequence was the same for all participants.

To counter physiological signal drift (Rottenberg et al., 2007), the elicitation protocol always included a neutral stimulus before every emotionally non-neutral stimulus. Therefore, a session began with a 30 s neutral stimulus, a simple blue-green neutral background, to establish the participants' baseline response. This particular appearance of the neutral screen was selected based on a study (Kaya and Epps, 2004) that identified blue-green, i.e. cyan, as the color with the best ratio between elicited positive and negative emotions. After the baseline had been measured, the participant was stimulated with a sequence containing 10 non-neutral stimuli. The duration of exposure to one non-neutral IAPS picture stimulus was 15 s. When each elicitation stimulus finished, participants were given an opportunity to write down their subjective ratings using a written questionnaire, after which a 10 s neutral stimulus was displayed, and this was repeated until the end of the exposure sequence. With the subjective ratings, the participants indicated their own judgments regarding the emotional state they were in, on a scale from 0 to 9 for individual discrete emotions and on a scale from 1 to 9 for the emotional dimensions valence and arousal. The participants rated the intensity of sadness, disgust, fear and happiness, and they were free to add the intensity of any other emotion that they felt. Time for completing the questionnaire was not limited. An illustration of the timeline for the experimental paradigm is shown in Fig. 3.

As mentioned in Section 3.1, IAPS pictures suitable for elicitation of the discrete emotional states of sadness, disgust, fear and happiness were selected based on research in categorizing the dimensional model to normative emotional states (Barke et al., 2011; Libkuman et al., 2007; Mikels et al., 2005). Based on these research results and in consultation with experts from the Department of Psychology at the University of Zagreb, four sets of 10 IAPS pictures relevant for stimulation of sadness, disgust, fear and happiness were selected.

3.4. Acquisition of emotion-specific physiological signals

A photograph of the experimental setup is shown in Fig. 4. For acquisition and recording of participants' physiology, the BIOPAC MP 150 system with AcqKnowledge data acquisition and analysis software was used. This system was synchronized in real-time with SuperLab stimulus presentation software via a Measurement Computing PCI card. The stimuli were displayed to participants on a 19″ LCD monitor with 4:3 picture aspect ratio. The channels measured were skin conductance, ECG, respiration and skin temperature. The acquisition frequency of all channels was 1250 Hz.

Fig. 3. Illustration of a timeline for experimental paradigm.
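The protocol of Section 3.3, illustrated in Fig. 3, can be laid out programmatically. This sketch assumes, as in the text, a 30 s baseline, 15 s pictures, untimed questionnaires and 10 s neutral screens; whether a neutral screen also follows the final picture is an assumption of the sketch:

```python
def build_sequence(n_pictures=10, baseline_s=30, picture_s=15, neutral_s=10):
    """One exposure sequence: a 30 s neutral baseline, then each 15 s IAPS picture
    followed by an untimed questionnaire and a 10 s neutral screen."""
    events = [("neutral_baseline", baseline_s)]
    for i in range(1, n_pictures + 1):
        events.append(("iaps_picture_%d" % i, picture_s))
        events.append(("questionnaire", None))   # self-report; not time-limited
        events.append(("neutral", neutral_s))
    return events

timeline = build_sequence()
timed_total = sum(d for _, d in timeline if d is not None)
print(len(timeline), timed_total)  # 31 events, 280 s of timed stimuli
```

A full session would contain two such sequences separated by the pause of at least 150 s described above.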

Fig. 4. Photograph of the experimental setup.



Table 3
Description of the normalization methods used for heart rate, skin conductance, breathing rate and skin temperature signals within the segment for feature computation.

NORM1: Division of the physiological signal by its baseline value (mean of segment X(tB) : X(tB + 30 s))
NORM2: Division of the physiological signal by the X(tS) value
NORM3: Division of the physiological signal by the mean of segment X(tS - 15 s) : X(tS)
NORM4: Division of the physiological signal by the mean of segment X(tB) : X(tS + 15 s)
NORM5: Scaling of the physiological signal to the interval [0, 1] based on the minimum and maximum values of the part of the signal from the beginning of the baseline (tB) until the end of the segment for feature computation (tS + 15 s)
NORM6: Division of the physiological signal by the mean of segment X(tS) : X(tS + 15 s)
NORM7: Division of the physiological signal by the standard deviation of segment X(tB) : X(tS + 15 s)
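The seven normalizations of Table 3 can be sketched as follows, with t_b the baseline start and t_s the stimulus onset; the uniform-sampling index arithmetic is a simplification of this sketch:

```python
def normalize(x, fs, t_b, t_s, method):
    """Sketch of NORM1-NORM7 from Table 3 for a uniformly sampled signal x.
    t_b and t_s are baseline start and stimulus onset times in seconds; fs in Hz."""
    def seg(a, b):                       # samples of X between times a and b
        return x[int(a * fs):int(b * fs) + 1]
    def mean(s):
        return sum(s) / len(s)
    if method == "NORM1":                # divide by baseline mean X(tB)..X(tB+30 s)
        return [v / mean(seg(t_b, t_b + 30)) for v in x]
    if method == "NORM2":                # divide by the value at stimulus onset X(tS)
        return [v / x[int(t_s * fs)] for v in x]
    if method == "NORM3":                # divide by mean of X(tS-15 s)..X(tS)
        return [v / mean(seg(t_s - 15, t_s)) for v in x]
    if method == "NORM4":                # divide by mean of X(tB)..X(tS+15 s)
        return [v / mean(seg(t_b, t_s + 15)) for v in x]
    if method == "NORM5":                # min-max scale over X(tB)..X(tS+15 s)
        s = seg(t_b, t_s + 15)
        lo, hi = min(s), max(s)
        return [(v - lo) / (hi - lo) for v in x]
    if method == "NORM6":                # divide by mean of X(tS)..X(tS+15 s)
        return [v / mean(seg(t_s, t_s + 15)) for v in x]
    if method == "NORM7":                # divide by std. dev. of X(tB)..X(tS+15 s)
        s = seg(t_b, t_s + 15)
        m = mean(s)
        return [v / ((sum((u - m) ** 2 for u in s) / len(s)) ** 0.5) for v in x]
    raise ValueError(method)
```

Note that all seven methods use only the portion of the signal available up to the end of the feature-computation segment, which is what makes them applicable in real time; e.g., under NORM5, samples arriving after tS + 15 s may fall outside [0, 1].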

4. Feature calculation

In total 356 different physiological features were calculated for


each particular stimulus presented to the participant.

4.1. Features calculated from heart rate, skin conductance, breathing


rate and skin temperature signals Fig. 5. Position of segment for feature computation from physiological signal X.

While skin conductance and skin temperature signals were waveforms. The latter is usually called the skin conductance
measured directly, heart rate signal was obtained from ECG using response (SCR) and is considered to be useful as it signifies a
Pan Tompkins R wave detection algorithm (Pan and Tompkins, response to internal/external stimuli. We developed an algorithm
1985), and breathing rate was obtained via peak detection from that detects the occurrence of SCR. The algorithm measures the
respiration signal. magnitude and the duration of the rise time. From this informa-
Then, from extracted heart rate and breathing rate signals, as tion, the following features were calculated: the frequency of
well as measured skin conductance and skin temperature signals, occurrence (FREQSCR), the maximum magnitude (SM_MAXSCR),
numerous physiological features were calculated using the same the mean magnitude value (SM_MEANSCR), the first SCR magni-
linear time series analysis methods for each of these physiological tude (SM_FIRSTSCR), the mean duration value (SD_MEANSCR) and
signals. the area of the responses (SMnSDSCR).
Because physiology depends on many factors, to increase the In order to increase robustness, features SM_MAXSCR, SM_
robustness of emotion estimation, physiological signals were MEANSCR, SM_FIRSTSCR and SMnSDSCR were also calculated from
normalized. In previous studies, commonly used normalization is normalized skin conductance signals using all 7 normalization
based on the minimum and maximum of the signal for the entire methods. In this way, the total of 34 features was calculated. The
session, which cannot be implemented in real-time applications. total is detailed in (Kukolja, 2012).
Therefore, in this paper, seven normalization methods were
investigated, which rely only on the part of the signal from the
4.3. Heart rate variability features
start of the session until the moment of estimation. In this way, the
methods are applicable for development of estimators that can
Heart rate variability (HRV) is one of the most often used
operate in real-time. Detailed description of all seven normal-
measures for ECG analysis. HRV is a measure of the continuous
ization methods is given in Table 3. For explanation of timestamps
interplay between sympathetic and parasympathetic influences on
for physiological signal see Fig. 5.
From original and normalized heart rate, skin conductance, respiration rate and skin temperature signals for each stimulation, 288 statistical features were calculated, based on 14 statistical methods: mean, standard deviation, mean of the first derivative, minimum, maximum, difference between maximum and minimum, mean of the offset, minimum of the offset, maximum of the offset, difference of means between two consecutive segments, difference of standard deviations between two consecutive segments, difference of means of the first derivative between two consecutive segments, mean of the absolute values of the first differences, and mean of the absolute values of the second differences. However, among all 448 possible combinations of original and normalized physiological signals with statistical methods, only those with a foundation in prior literature were included. A detailed description of the 14 statistical methods used for feature calculation and a list of the 288 calculated features are given in the Supplementary data, adapted from Kukolja (2012).

4.2. Skin conductance response features

The skin conductance signal includes two types of electrodermal activity: the DC level component and the distinctive short

4.3. Heart rate variability features

heart rate that yields information about autonomic flexibility and thereby represents the capacity for regulated emotional responding (Appelhans and Luecken, 2006). Therefore, HRV analysis is emerging as an objective measure of regulated emotional responding.

In the time domain, we calculated the following statistical features: the standard deviation of all NN intervals (SDNN), the square root of the mean of the sum of the squares of differences between adjacent NN intervals (RMSSD), the standard deviation of differences between adjacent NN intervals (SDSD), the proportion derived by dividing NN50 (the number of pairs of successive NN intervals differing by more than 50 ms) by the total number of NN intervals (pNN50), the proportion derived by dividing NN20 (the number of pairs of successive NN intervals differing by more than 20 ms) by the total number of NN intervals (pNN20), and the Fano factor (FF) (Camm et al., 1996; Teich et al., 2000).

In the frequency domain, we calculated the power of the low-frequency (LF) band (0.04–0.15 Hz) and of the high-frequency (HF) band (0.15–0.4 Hz) from the power spectral densities of the detrended HRV time series, obtained using the Burg algorithm (Stoica and Moses, 1997).
D. Kukolja et al. / Int. J. Human-Computer Studies 72 (2014) 717–727 723

We also calculated LF and HF measured in normalized units (LFnorm, HFnorm), which represent the relative value of each power component, as well as the ratio of the power within the LF band to that within the HF band (LF/HF) (Camm et al., 1996). To increase the robustness of emotion estimation, the features SDNN, RMSSD, SDSD, pNN50, pNN20, LF, LFnorm, HF, HFnorm and LF/HF were normalized by dividing them by their baseline values.

Considering the complex control systems of the heart, it is reasonable to assume that nonlinear mechanisms are involved in the genesis of HRV. The nonlinear properties of HRV were analyzed using approximate and sample entropy (Richman and Moorman, 2000), obtaining the features ApEn1 (m = 1, r = 0.15 SDNN), ApEn2 (m = 2, r = 0.15 SDNN), ApEn3 (m = 3, r = 0.15 SDNN), ApEn4 (m = 4, r = 0.15 SDNN) and SampEn (m = 2, r = 0.15 SDNN).

In this way, a total of 26 heart rate variability features were calculated.

4.4. Features calculated from respiration signal

Features were also calculated from the raw respiration signal. We calculated the power mean values of four subbands within the following ranges: 0–0.1 Hz, 0.1–0.2 Hz, 0.2–0.3 Hz and 0.3–0.4 Hz. The power spectral densities of the detrended respiration signal were obtained using the Burg algorithm (Stoica and Moses, 1997). To increase the robustness of emotion estimation, features were also calculated from the detrended respiration signal normalized by dividing it by the mean peak-to-peak magnitude of the respiration signal in the baseline. In this way, a total of 8 features were calculated from the respiration signal.

5. Results and discussion

In order to classify discrete emotions based on participants' subjective ratings, every segment of a participant's physiology acquired during stimulation was associated with a referent emotional state. During the experiment, each participant gave subjective ratings of the emotional state that a particular stimulation elicited in her. She was asked to report the perceived intensity of all discrete emotions, and in many instances co-occurring emotions (Harley et al., 2012) appeared, in which the participant perceived more than one discrete emotion as very intense. Therefore, an algorithm was developed for finding the referent elicited emotions even in such ambiguous cases. The algorithm resolves the referent emotion based on the intensities of sadness, disgust, fear, happiness and other reported discrete emotions, depending on the intended emotion that a particular stimulus sequence was expected to elicit (Fig. 6).

By running the algorithm over all subjective ratings of the participants, the following numbers of samples were obtained for each discrete emotion:

- 89 samples of sadness.
- 99 samples of disgust.
- 38 samples of fear.
- 78 samples of happiness.
- 226 samples that the algorithm annotates as "Other", which are not used in the analysis of the classification of discrete emotions.

Additionally, it was assumed that the participants reached a neutral emotional state in the last 90 s of the neutral-stimulus period between two consecutive sequences. By dividing that end period into 15 s segments, an additional 156 samples of a neutral emotional state were included in the classification of discrete emotional states. On the basis of the number of samples included in the classification analysis for each discrete emotion, it can be concluded that reliable and unambiguous elicitation of fear using static images was the most challenging.

Fig. 6. Illustration of the algorithm for finding the referent elicited emotions for each stimulus. To define the dominant emotion as referent, the intensity of the dominant emotion must be larger than 5 and the difference in intensities between the dominant and the second most intensive emotion must be larger than 2. The necessary difference is increased by 1 if the dominant emotion is not expected to be elicited by the sequence in which the stimulus appears, or if the second most intensive emotion cannot be uniquely defined.
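The decision rule described in the Fig. 6 caption can be sketched in a few lines. This is our own illustrative reconstruction, not code from the study: the function and variable names are ours, and we assume a single +1 increment of the margin when either exception (unexpected dominant emotion, or a non-unique runner-up) holds.

```python
def referent_emotion(ratings, expected):
    """Resolve the referent elicited emotion from self-reported intensities.

    ratings  -- dict mapping emotion name to reported intensity
    expected -- the emotion the stimulus sequence was intended to elicit
    Returns the dominant emotion, or "Other" when the Fig. 6 criteria
    (intensity > 5, sufficient margin over the runner-up) are not met.
    """
    ranked = sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)
    dominant, top = ranked[0]
    second = ranked[1][1] if len(ranked) > 1 else 0
    # the runner-up "cannot be uniquely defined" if two or more emotions tie for it
    runner_up_tied = sum(1 for _, v in ranked[1:] if v == second) > 1

    margin = 2
    if dominant != expected or runner_up_tied:
        margin += 1  # stricter margin for unexpected or ambiguous cases

    return dominant if top > 5 and top - second > margin else "Other"
```

For example, a sadness rating of 8 against a runner-up of 3 in a sadness sequence resolves to "sadness", whereas ratings that violate the intensity or margin criteria fall into the "Other" category described above.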
Table 5 contains the accuracy comparison results of discrete emotion classification using different feature reduction and machine learning methods. The comparison was performed with the leave-one-session-out crossvalidation (LOSOCV) model evaluation procedure, which is, unlike LOOCV, suitable for the evaluation of real-time estimators. Several machine learning methods were used for classification: kNN, SVM, RF, the MLP artificial neural network, the RIPPER production rules algorithm, the C4.5 decision tree and NB (Breiman, 2001; Chang and Lin, 2011; Witten et al., 2011). The parameters with which the machine learning methods were tested are listed in Table 4. Parameter tuning was performed, but it had a negligible impact on accuracy in comparison to the selected feature reduction method. Consequently, program defaults or the most common parameters from the literature were used.

The classifiers were used in combination with the SFFS (Pudil et al., 1994), mRMR (Peng et al., 2005), ReliefF (Robnik-Šikonja and Kononenko, 2003), IG (Witten et al., 2011), OneR classifier feature evaluation (Witten et al., 2011) or Chi2 (Liu and Setiono, 1997) feature selection methods, both standalone and coupled with the FP algorithm (Fisher, 1936). The tested feature selection methods fall into two main groups: filters and wrappers (Kohavi and John, 1997). Filter methods, like mRMR, ReliefF, IG, OneR or Chi2, heuristically select a subset of features based on the statistical properties of the given data set, regardless of the estimation (or classification) method used. Wrappers use the estimator (or classifier) itself to assess the performance of the selected subset of features (Kohavi and John, 1997).

When feature selection methods were coupled with the FP algorithm, a well-known method of reducing dimensionality by finding a linear projection of the data onto a space of fewer dimensions in which the classes are well separated, the selection procedure was applied as a simple preprocessing step that reduced the feature set before the FP algorithm was applied.

The evaluation of the filter selection methods was performed in two steps. First, by filtering the data with the mRMR, ReliefF, IG, OneR and Chi2 procedures, 35 different physiological features were selected and ranked. Second, for each classification procedure the number of features that yields the highest classification accuracy was identified, with and without FP postfiltering feature reduction. When using SFFS wrapper feature reduction, either standalone or in combination with FP, 35 physiological features were selected based on the same criterion that was used for the evaluation of the classifiers. Table 5 contains only the classification accuracies for the subsets of features that produced the best results (for an illustration see Fig. 7(a)).

By examining the results in Table 5, it is obvious that the best classification accuracy of discrete emotions is achieved by the combination of SFFS selection and MLP (60.3%). Every machine learning method yields its best results using SFFS selection, followed by the mRMR, ReliefF, OneR, IG and, finally, Chi2 algorithms. Generally, MLP is the best classifier, but the differences in accuracy between the various machine learning methods are very small. Therefore, for real-time applications it is important to use efficient classifiers that quickly build their models. For example, naïve Bayes, which is second best in classification accuracy, learns faster than the most accurate MLP method, and kNN, which as an instance-based algorithm does not require a model at all, yields only 4% worse classification results than MLP in the best and average cases.

Comparison of Fig. 7(a) and (b) illustrates how the choice of evaluation method can profoundly influence accuracy results. We therefore believe that the choice of evaluation method is an important reason why the best-case accuracies in Table 5 are in the range 55–60%, even though the related work analysis has shown accuracies of around 60–80% for classification of a comparable number of distinct emotions.

Table 4
Machine learning methods parameters.

Method   Parameters

kNN      k = 5; all neighbors have the same weight
SVM      C-Support Vector Classification type of SVM; radial basis kernel function: exp(-gamma*|u-v|^2); gamma = 1/n, where n is the number of input features
RF       10 trees; number of randomly selected features at each split = int(log(n) + 1), where n is the number of input features
MLP      One hidden layer; logsig activation function for the hidden layer and purelin for the output; number of neurons in the hidden layer = (n + 5)/2, where n is the number of input features; Levenberg-Marquardt backpropagation training algorithm (maximum number of epochs = 100, initial mu = 0.001, mu decrease factor = 0.1, mu increase factor = 10)
RIPPER   Number of folds for REP = 3; minimal weights of instances within a split = 2.0; number of runs of optimizations = 2
C4.5     Confidence threshold for pruning = 0.25; minimum number of instances per leaf = 2; number of folds for reduced error pruning = 3
NB       –
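Several Table 4 entries are formulas rather than fixed values. As an illustrative sketch (ours, not code from the study, which used Matlab and Java implementations), the RBF kernel and the parameters derived from the number of input features n can be written out as follows; the base of the logarithm and the rounding of (n + 5)/2 are not stated in Table 4 and are assumptions here.

```python
import math

def rbf_kernel(u, v, gamma):
    """Radial basis kernel from Table 4: exp(-gamma * |u - v|^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)

def derived_parameters(n):
    """Parameters that Table 4 derives from the input dimension n."""
    return {
        "svm_gamma": 1.0 / n,                       # gamma = 1/n
        "rf_split_features": int(math.log(n) + 1),  # int(log(n) + 1); natural log assumed
        "mlp_hidden_neurons": (n + 5) // 2,         # (n + 5)/2, assumed rounded down
    }

params = derived_parameters(15)  # the 15 original physiological features
```

For the 15 original physiological features this gives gamma of roughly 0.067, 3 candidate features per RF split and 10 hidden MLP neurons.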

Table 5
Comparison of classification accuracy using different feature reduction and machine learning methods.

Feature reduction method   kNN (%)  SVM (%)  RF (%)  MLP (%)  RIPPER (%)  C4.5 (%)  NB (%)  Average (%)  Range (%)

SFFS 49.24 50.00 54.88 60.30 55.97 53.15 56.83 54.34 11.06
SFFS-FP 56.18 57.61 55.31 54.01 56.18 54.23 59.00 56.07 04.99
mRMR 50.33 49.13 49.89 50.54 47.51 44.69 47.29 48.48 05.85
mRMR-FP 49.02 48.70 47.07 47.51 47.51 48.37 47.51 47.96 01.95
ReliefF 42.73 46.74 47.72 47.72 47.07 47.07 35.57 44.95 12.15
ReliefF-FP 41.00 43.91 39.91 43.38 44.69 44.69 39.70 42.47 04.99
IG 36.01 41.52 42.52 44.47 45.77 42.08 42.08 42.06 09.76
IG-FP 42.52 41.52 40.78 40.78 44.69 44.69 41.21 42.31 03.91
OneR 36.96 40.22 47.39 50.43 47.82 46.52 34.35 43.38 16.08
OneR-FP 40.34 40.87 42.17 40.87 41.74 42.61 39.57 41.17 03.04
χ2 36.23 44.57 46.20 46.00 46.85 43.38 34.92 42.59 11.93
χ2-FP 40.78 40.22 41.00 41.87 42.52 40.13 39.26 40.83 03.26
Best 56.18 57.61 55.31 60.30 56.18 54.23 59.00 – –
Average 43.45 45.42 46.24 47.32 47.36 45.97 43.11 – –
Range 20.17 17.39 15.40 19.52 14.44 14.10 24.65 – –
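The Average and Range summaries in Table 5 are plain per-row statistics over the seven classifiers; for instance, the SFFS row can be reproduced as follows (accuracies copied from Table 5):

```python
# Accuracies (%) for SFFS feature selection, in the order
# kNN, SVM, RF, MLP, RIPPER, C4.5, NB (first data row of Table 5)
sffs_accuracies = [49.24, 50.00, 54.88, 60.30, 55.97, 53.15, 56.83]

average = round(sum(sffs_accuracies) / len(sffs_accuracies), 2)  # mean over classifiers
spread = round(max(sffs_accuracies) - min(sffs_accuracies), 2)   # best-to-worst difference

print(average, spread)  # 54.34 11.06
```

The same computation applied column-wise over the feature reduction methods yields the Best, Average and Range rows at the bottom of the table.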
Fig. 7. Dependence of classification accuracy on the number of features. Dashed lines indicate the number of features for the subsets that produced the best results. The classifier used was kNN (k = 5). (a) LOSOCV evaluation, accuracy 56.18%; (b) LOOCV evaluation, accuracy 78.96%.
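The two panels differ only in how the folds are formed: LOOCV holds out a single sample per fold, while LOSOCV holds out all samples recorded in one session, so that no session contributes to both the training and the test data. A minimal sketch of the LOSOCV split, with hypothetical session labels of our own:

```python
def leave_one_session_out(session_ids):
    """Yield (train_indices, test_indices), holding out one whole session per fold."""
    for held_out in sorted(set(session_ids)):
        test = [i for i, s in enumerate(session_ids) if s == held_out]
        train = [i for i, s in enumerate(session_ids) if s != held_out]
        yield train, test

# six samples recorded in two sessions -> two folds
sessions = ["s1", "s1", "s1", "s2", "s2", "s2"]
folds = list(leave_one_session_out(sessions))
```

Because the held-out samples share a session, within-session correlations cannot inflate the accuracy estimate, which is why LOSOCV is the more realistic choice for real-time estimators.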

Table 6
Execution times of 9 wrapper feature selection methods during selection of the original 15 physiological features.

Feature reduction method  Machine learning method  Programming language  Execution time (h)
SFFS      kNN     Matlab          1
SFFS-FP   kNN     Matlab          1
SFFS      NB      Matlab + Java   26.5
SFFS-FP   NB      Matlab + Java   16.5
SFFS      RIPPER  Matlab + Java   19
SFFS-FP   RIPPER  Matlab + Java   12.5
SFFS      MLP     Matlab          101
SFFS-FP   MLP     Matlab          67
SFFS      RF      Matlab + Java   50

Table 7
Execution times of 5 filter feature selection methods during selection of the original 15 physiological features.

Feature reduction method  Programming language  Execution time (s)
mRMR      Matlab + C++    0.7436
Chi2      Matlab + Java   21.2056
ReliefF   Matlab + Java   27.5045
IG        Matlab + Java   21.2956
OneR      Matlab + Java   23.4103

Table 8
The confusion matrix for the combination of kNN with mRMR after using the leave-one-session-out crossvalidation method.

             Sadness  Disgust  Fear  Happiness  Neutral  Accuracy (%)
Sadness         38        5      6        9        31       42.70
Disgust          5       66      7        4        17       66.67
Fear             3       15      9        3         8       23.68
Happiness       13        7      2       28        28       35.90
Neutral         15       21      1       29        91       57.96

Table 9
The confusion matrix for the combination of MLP with SFFS after using the leave-one-session-out crossvalidation method.

             Sadness  Disgust  Fear  Happiness  Neutral  Accuracy (%)
Sadness         50       16      1        9        13       56.18
Disgust         18       57      6        3        15       57.58
Fear             5       23      4        3         3       10.53
Happiness       20        8      1       33        16       42.31
Neutral         15       12      3       16       111       70.70

While the reviewed studies typically used the LOOCV method to see how well the estimator generalizes from the training set, for our comparative analysis we used the LOSOCV method instead, for the reasons explained in Section 2. As Fig. 7(b) shows, when using the LOOCV method with our dataset, we obtain accuracies similar to other works.

To examine the different machine learning and feature reduction methods more thoroughly, it is necessary to consider real-time constraints and the execution times of the individual emotion classification procedures. Tables 6 and 7 contain the execution times of the feature reduction procedures for the selection of the 15 original physiological features using LOSOCV evaluation. Because SFFS is a wrapper method, the data in Table 6 implicitly indicate the combined learning and execution times of the individual machine learning methods. For example, the kNN machine learning method has a considerably lower combined execution and learning time than MLP.

According to Tables 6 and 7, the compared procedures were implemented in different programming languages (e.g. Matlab, Java). For program profiling the following configuration was used:

- Processor: Intel(R) Core(TM) 2 Duo E4500 (2.20 GHz).
- Memory: 2 GB.
- Operating system: Windows 7 Enterprise (64-bit).

As can be seen in Table 6, in some cases SFFS has a very long execution time. Although the SFFS-based selection procedure can be accelerated by prefiltering the initial set of physiological features, SFFS is not applicable for feature selection during sessions. However, SFFS is recommended for selection before sessions because of its high attained accuracy. Among the tested filter methods, mRMR is both the most accurate and the fastest, and as such is the best choice for feature selection within sessions.

In order to provide a more comprehensive insight into the accuracy of the most accurate SFFS+MLP method, as well as the accuracy of the preferred kNN+mRMR method under time constraints, confusion matrices are provided in Tables 8 and 9. The confusion matrices show the expected result that negative and highly arousing emotions, i.e. fear and disgust, are mutually confused more often than either of them is confused with happiness. Moreover, the confusion matrices show that recognition of fear negatively impacts the results, which may be related to the previously described difficulty of unambiguous fear elicitation using static images.
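The per-class accuracies in Tables 8 and 9 are simply the diagonal count divided by the row total, i.e. the number of samples of that true class. The Table 8 values can be reproduced as follows:

```python
# Table 8 confusion matrix (kNN + mRMR, LOSOCV): true class -> predicted counts,
# columns ordered as sadness, disgust, fear, happiness, neutral
confusion = {
    "sadness":   [38, 5, 6, 9, 31],
    "disgust":   [5, 66, 7, 4, 17],
    "fear":      [3, 15, 9, 3, 8],
    "happiness": [13, 7, 2, 28, 28],
    "neutral":   [15, 21, 1, 29, 91],
}
classes = ["sadness", "disgust", "fear", "happiness", "neutral"]

# per-class accuracy (%) = correct predictions / samples of that class
accuracy = {
    cls: round(100.0 * row[classes.index(cls)] / sum(row), 2)
    for cls, row in confusion.items()
}
# e.g. accuracy["fear"] -> 23.68, the weakest class
```

The overall accuracy is obtained analogously, as the sum of the diagonal divided by the total number of samples.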
Table 10 contains the accuracy comparison results of discrete emotion classification using different physiological channels as input, in order to assess which physiological channels contribute the most to a correct classification. The implementation of discrete emotion classification with the two best algorithms (SFFS+MLP overall, mRMR+kNN in real time) using all available features and selected feature subsets has revealed that some physiological signals seem to provide more informative features than others. As can be seen in Table 10, when the skin conductance signal is excluded, the accuracy decrease is the largest, especially for the mRMR+kNN combination. By contrast, the skin temperature signal only marginally improves the performance obtained with the 3 other physiological channels.

Table 10
Comparison of classification accuracy using different physiological feature sets.

Feature set              Accuracy with SFFS+MLP (%)   Accuracy with mRMR+kNN (%)
Without SC features            53.04                        39.78
Without RESP features          54.78                        49.13
Without ECG features           54.78                        49.35
Without TEMP features          57.17                        50.00
All features                   60.30                        50.33

6. Conclusion

This paper has addressed the gap in comparative analysis of popular feature reduction and machine learning methods in physiology-based real-time emotion estimation, and thereby provides additional insight into the possibilities of creating a personalized adaptive emotion estimator. To realize the comparative analysis of methods for physiology-based emotion estimation, an emotion elicitation experiment was conducted using a set of static images from the IAPS database to elicit sadness, happiness, disgust, fear and a neutral emotional state. Emotional responses were measured by a standard set of physiological signals, which included ECG, respiration, skin conductance and skin temperature, as well as by individual subjective ratings of the displayed images.

In order to establish which emotion estimation methods may be the most suitable for real-time applications, the comparative analysis was based on the estimation accuracy and on the execution and learning times of emotion estimators using the various feature reduction and machine learning methods.

The highest classification accuracy was achieved with the MLP machine learning and SFFS feature selection algorithms. Generally, SFFS selection yielded the best results, followed by the mRMR, ReliefF, OneR, IG and Chi2 methods, with best-to-worst accuracy differences from 14% to 24%.

The results showed that the average execution times of the SFFS method are too long to apply the method for feature selection during emotion estimation sessions. Because of its high accuracy, SFFS is recommended for feature analysis before sessions. Among the other studied feature selection methods, mRMR was shown to be the fastest and the most accurate, and as such should preferably be chosen for feature selection during sessions.

MLP produced the best overall classification, but the best-to-worst accuracy differences of the machine learning methods, from 2% to 12%, were smaller than those of the feature reduction methods. Therefore, in real-time applications the learning time is the most important factor that determines the choice of the machine learning method. In this sense, all methods are faster than MLP. For example, naïve Bayes, which is the second best in classification accuracy, converges faster than MLP, and kNN, which does not require a supervised training phase at all, yields best-case results only 4% less accurate than MLP. Combined with mRMR, kNN and MLP had the highest, negligibly different, accuracy results; however, mRMR+kNN is the preferable option for real-time estimator adaptation due to the considerably lower combined execution and learning time of kNN versus MLP.

Future work involves an in-depth analysis of longitudinal personalization of the estimator to a particular person and of real-time estimator adaptation during the session. This analysis will include an experimental verification of the improvements that can be obtained under certain conditions, such as the number of sessions, the number of data types that can be obtained during the session, the homogeneity of the participant group, etc.

Acknowledgments

This research has been partially supported by the Ministry of Science, Education and Sports of the Republic of Croatia. We thank Dr. Dragutin Ivanec, Dr. Mirjana Tonković and Dr. Anita Lauri Korajlija from the Department of Psychology at the University of Zagreb, Faculty of Humanities and Social Sciences, for valuable assistance with the experiment design. We also thank the anonymous reviewers for their valuable comments.

Appendix A. Supplementary material

Supplementary data associated with this article can be found in the online version at http://dx.doi.org/10.1016/j.ijhcs.2014.05.006.

References

Appelhans, B.M., Luecken, L.J., 2006. Heart rate variability as an index of regulated emotional responding. Rev. Gen. Psychol. 10, 229–240.
Barke, A., Stahl, J., Kröner-Herwig, B., 2011. Identifying a subset of fear-evoking pictures from the IAPS on the basis of dimensional and categorical ratings for a German sample. J. Behav. Ther. Exp. Psychiatry 43, 565–572.
Breiman, L., 2001. Random forests. Mach. Learn. 45, 5–32.
Camm, A.J., Malik, M., Bigger, J.T., Breithardt, G., Cerutti, S., Cohen, R.J., Coumel, P., Fallen, E.L., Kennedy, H.L., Kleiger, R.E., 1996. Heart rate variability: standards of measurement, physiological interpretation, and clinical use. Circulation 93, 1043–1065.
Chang, C.C., Lin, C.J., 2011. LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2, 27.
Ćosić, K., Popović, S., Kukolja, D., Horvat, M., Dropuljić, B., 2010. Physiology-driven adaptive virtual reality stimulation for prevention and treatment of stress related disorders. Cyberpsychol. Behav. Soc. Netw. 13, 73–78.
Fisher, R.A., 1936. The use of multiple measurements in taxonomic problems. Ann. Hum. Genet. 7, 179–188.
Fletcher, R.R., Dobson, K., Goodwin, M.S., Eydgahi, H., Wilder-Smith, O., Fernholz, D., Kuboyama, Y., Hedman, E.B., Poh, M.Z., Picard, R.W., 2010. iCalm: wearable sensor and network architecture for wirelessly communicating and logging autonomic activity. IEEE Trans. Inf. Technol. Biomed. 14, 215–223.
Harley, J.M., Bouchet, F., Azevedo, R., 2012. Measuring learners' co-occurring emotional responses during their interaction with a pedagogical agent in MetaTutor. In: Cerri, S.A., Clancey, W.J., Papadourakis, G., Panourgia, K. (Eds.), Intelligent Tutoring Systems, ITS '12, Lecture Notes in Computer Science, vol. 7315. Springer-Verlag, Berlin, Heidelberg, pp. 40–45.
Healey, J.A., Picard, R.W., 2005. Detecting stress during real-world driving tasks using physiological sensors. IEEE Trans. Intell. Transp. Syst. 6, 156–166.
Jurin, T., Jokic-Begic, N., Korajlija, A.L., 2012. Factor structure and psychometric properties of the anxiety sensitivity index in a sample of Croatian adults. Assessment 19, 31–41.
Katsis, C.D., Katertsidis, N., Ganiatsas, G., Fotiadis, D.I., 2008. Toward emotion recognition in car-racing drivers: a biosignal processing approach. IEEE Trans. Syst. Man Cybern. Part A: Syst. Hum. 38, 502–512.
Katsis, C.D., Katertsidis, N.S., Fotiadis, D.I., 2011. An integrated system based on physiological signals for the assessment of affective states in patients with anxiety disorders. Biomed. Signal Process. Control 6, 261–268.
Kaya, N., Epps, H.H., 2004. Relationship between color and emotion: a study of college students. Coll. Stud. J. 38, 396–405.
Kim, J., André, E., 2008. Emotion recognition based on physiological changes in music listening. IEEE Trans. Pattern Anal. Mach. Intell. 30, 2067–2083.
Kim, K.H., Bang, S.W., Kim, S.R., 2004. Emotion recognition system using short-term monitoring of physiological signals. Med. Biol. Eng. Comput. 42, 419–427.
Kohavi, R., John, G.H., 1997. Wrappers for feature subset selection. Artif. Intell. 97, 273–324.
Kolodyazhniy, V., Kreibig, S.D., Gross, J.J., Roth, W.T., Wilhelm, F.H., 2011. An affective computing approach to physiological emotion specificity: toward subject-independent and stimulus-independent classification of film-induced emotions. Psychophysiology 48, 908–922.
Kreibig, S.D., Wilhelm, F.H., Roth, W.T., Gross, J.J., 2007. Cardiovascular, electrodermal, and respiratory response patterns to fear and sadness-inducing films. Psychophysiology 44, 787–806.
Kring, A.M., Smith, D.A., Neale, J.M., 1994. Individual differences in dispositional expressiveness: development and validation of the Emotional Expressivity Scale. J. Personal. Social Psychol. 66, 934–949.
Kukolja, D., 2012. Real-time Emotional State Estimator Based on Physiological Signals Mining (Ph.D. thesis). University of Zagreb, Croatia.
Lacey, J.I., Lacey, B.C., 1958. Verification and extension of the principle of autonomic response-stereotypy. Am. J. Psychol. 71, 50–73.
Lang, P.J., Bradley, M.M., Cuthbert, B.N., 2008. International Affective Picture System (IAPS): Affective Ratings of Pictures and Instruction Manual. University of Florida, Gainesville, FL (Technical Report A-8).
Leon, E., Clarke, G., Callaghan, V., Sepulveda, F., 2007. A user-independent real-time emotion recognition system for software agents in domestic environments. Eng. Appl. Artif. Intell. 20, 337–345.
Libkuman, T.M., Otani, H., Kern, R., Viger, S.G., Novak, N., 2007. Multidimensional normative ratings for the International Affective Picture System. Behav. Res. Methods 39, 326–334.
Lisetti, C.L., Nasoz, F., 2004. Using noninvasive wearable computers to recognize human emotions from physiological signals. EURASIP J. Appl. Signal Process. 2004, 1672–1687.
Liu, C., 2009. Physiology-based Affect Recognition and Adaptation in Human-machine Interaction (Ph.D. thesis). Vanderbilt University, Nashville, TN.
Liu, H., Setiono, R., 1997. Feature selection via discretization. IEEE Trans. Knowl. Data Eng. 9, 642–645.
Mandryk, R.L., Atkins, M.S., 2007. A fuzzy physiological approach for continuously modeling emotion during interaction with play technologies. Int. J. Hum.-Comput. Stud. 65, 329–347.
Mikels, J.A., Fredrickson, B.L., Larkin, G.R., Lindberg, C.M., Maglio, S.J., Reuter-Lorenz, P.A., 2005. Emotional category data on images from the International Affective Picture System. Behav. Res. Methods 37, 626–630.
Nasoz, F., Lisetti, C.L., 2006. MAUI avatars: mirroring the user's sensed emotions via expressive multi-ethnic facial avatars. J. Vis. Lang. Comput. 17, 430–444.
Pan, J., Tompkins, W.J., 1985. A real-time QRS detection algorithm. IEEE Trans. Biomed. Eng., 230–236.
Peng, H., Long, F., Ding, C., 2005. Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell. 27, 1226–1238.
Picard, R.W., 1997. Affective Computing. MIT Press, Cambridge, MA.
Picard, R.W., 2010. Emotion research by the people, for the people. Emot. Rev. 2, 250–254.
Picard, R.W., Vyzas, E., Healey, J., 2001. Toward machine emotional intelligence: analysis of affective physiological state. IEEE Trans. Pattern Anal. Mach. Intell. 23, 1175–1191.
Pudil, P., Novovičová, J., Kittler, J., 1994. Floating search methods in feature selection. Pattern Recognit. Lett. 15, 1119–1125.
Rainville, P., Bechara, A., Naqvi, N., Damasio, A.R., 2006. Basic emotions are associated with distinct patterns of cardiorespiratory activity. Int. J. Psychophysiol. 61, 5–18.
Rani, P., Liu, C., Sarkar, N., Vanman, E., 2006. An empirical study of machine learning techniques for affect recognition in human-robot interaction. Pattern Anal. Appl. 9, 58–69.
Reiss, S., Peterson, R.A., Gursky, D.M., McNally, R.J., 1986. Anxiety sensitivity, anxiety frequency and the prediction of fearfulness. Behav. Res. Ther. 24, 1–8.
Richman, J.S., Moorman, J.R., 2000. Physiological time-series analysis using approximate entropy and sample entropy. Am. J. Physiol. Heart Circ. Physiol. 278, 2039–2049.
Robnik-Šikonja, M., Kononenko, I., 2003. Theoretical and empirical analysis of ReliefF and RReliefF. Mach. Learn. 53, 23–69.
Rottenberg, J., Ray, R.D., Gross, J.J., 2007. Emotion elicitation using films. In: Coan, J.A., Allen, J.J.B. (Eds.), Handbook of Emotion Elicitation and Assessment. Oxford University Press, New York, USA, pp. 9–28.
Smith, C.A., Ellsworth, P.C., 1985. Patterns of cognitive appraisal in emotion. J. Personal. Social Psychol. 48, 813–838.
Stoica, P., Moses, R.L., 1997. Introduction to Spectral Analysis. Prentice Hall, Upper Saddle River, NJ.
Teich, M.C., Lowen, S.B., Jost, B.M., Vibe-Rheymer, K., Heneghan, C., 2000. Heart rate variability: measures and models. In: Akay, M. (Ed.), Nonlinear Biomedical Signal Processing. IEEE Press, Piscataway, NJ.
van den Broek, E.L., Lisý, V., Janssen, J.H., Westerink, J.H.D.M., Schut, M.H., Tuinenbreijer, K., 2010. Affective man-machine interface: unveiling human emotions through biosignals. Biomed. Eng. Syst. Technol., 21–47.
Witten, I.H., Frank, E., Hall, M.A., 2011. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann, San Francisco, CA.
