
Emotion Recognition

Deb * Mal * Ya Sin * Ha

Felicitous Computing Institute

Means of Emotion Recognition

[Arousal-valence diagram: Anger (high arousal, low valence); Joy (high arousal, high valence); Sadness (low arousal, low valence); Love (low arousal, high valence)]

Modalities: hand writing, speech, gestures, brain imagery, skin conductance, dance, heartbeat, movement features, facial features, biochemical signals, social presence, eye & pupil.

Active User Participation


- Users usually have to consciously take part
- Chances of suppressing emotive cues
- Chances of showing an inverted affective state
- Results tend to be biased

Passive User Participation


- Users do not need to take part actively
- Users have much less control over the results
- Less attention and control means less bias

Active → Passive Transition
- Over time, users tend to become familiarized
- Their actions slowly become more passive
- The tendency to control responses decreases
- Faking of emotion decreases
Two other key factors for the transition:
1. Device invisibility
2. Sensor distance

Device invisibility

Instead of requiring direct interaction, an invisible tracker/sensor sits in the background and does the work for the user.

Example: MoodScope

Problem of devices with direct contact:


In various situations (trauma, sadness, forlornness) users will not have the mindset to actively engage. Sensor distance is important in these situations.

Sensor Distance
Modes of passive recognizers: skin conductance, brain imagery, biochemical, heartbeat, eye & pupil, facial features, movement and gestures. These fall into two groups: with external attachments and without attachments.

Without attachments:
- No body attachments
- Large sensor distance
- Can be operated more passively
- Much more unconscious participation
- Bias can be minimized further

Passive, ambient sensors


Eye & pupil and facial-feature recognition need:
- Focus
- Facing the camera
- A degree of attachment to the sensors

Movement and Gestures

In many cases where:
1. the face is not visible,
2. there is no provision for attaching sensors to the body,
3. there is no speech input,
movement and gesture detection is a much more feasible way to detect affect.

Movements and Gestures: A scenario


Situations where body movements and gestures are crucial:
1. A Post-Traumatic Stress Disorder (PTSD) patient pacing in the room.
2. A schizophrenic patient at an asylum growing impatient and angry, making frivolous, jerky movements.
3. A patient with chronic depression seen pacing slowly, hands in pockets, head drooping.

An automated system that detects emotive states in such situations can even save lives.

HaiXiu -

Records gestures and movement

Computes a unique feature set

Trains a Neural Net for later detection

Continuous Emotion Detection

HaiXiu -

- Microsoft Kinect is used for movement detection.
- Rather than discrete affective states, our target is to detect arousal and valence levels in continuous space.
- This model of continuous affective-level detection can be applied to other continuous affective spaces, e.g. Plutchik's Emotion Wheel or the PAD model.
- Presently HaiXiu detects only arousal levels; work is going on to include the valence level.
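To make the idea of a continuous affective space concrete, here is a minimal sketch (not part of HaiXiu; the function name and zero thresholds are illustrative) of how a continuous (valence, arousal) estimate, each in [-1, +1], relates to the four quadrant labels of the arousal-valence diagram.

```python
# A minimal sketch (not part of HaiXiu): map a continuous (valence, arousal)
# estimate, each in [-1, +1], onto the diagram's four quadrant labels.

def quadrant_label(valence: float, arousal: float) -> str:
    """Map a continuous affect estimate onto the quadrant labels."""
    if arousal >= 0.0:
        return "Joy" if valence >= 0.0 else "Anger"
    return "Love" if valence >= 0.0 else "Sadness"

print(quadrant_label(valence=0.7, arousal=-0.4))   # -> "Love"
```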

Feature Set for Arousal level detection


Kinect gives us position data for 20 different joints. We calculate:
1. Minimum coordinates for the X, Y and Z axes (relative to the spine)
2. Maximum coordinates for the X, Y and Z axes (relative to the spine)
3. Speed = Δs/Δt
4. Peak acceleration = max(Δu/Δt)
5. Peak deceleration = -max(Δu/Δt)
6. Average acceleration = (Σ(Δu/Δt))/f
7. Average deceleration = -(Σ(Δu/Δt))/f
8. Jerk index = (Σ(Δa/Δt))/f
where Δt = 0.2 s and f = total time / Δt.
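As an illustration of how such features might be computed from one joint's trajectory, here is a hedged Python sketch; the array layout, helper name, and the exact way speed is aggregated are assumptions, not HaiXiu's implementation.

```python
import numpy as np

DT = 0.2  # seconds per window (the t in the slide)

def movement_features(pos, spine):
    """pos, spine: (frames, 3) arrays of joint and spine positions, one sample per DT."""
    rel = pos - spine                                    # coordinates relative to the spine
    mins, maxs = rel.min(axis=0), rel.max(axis=0)        # min/max per X, Y, Z axis

    disp = np.linalg.norm(np.diff(pos, axis=0), axis=1)  # displacement per window
    speed = disp / DT                                    # speed per window
    accel = np.diff(speed) / DT                          # change of speed per window
    jerk = np.diff(accel) / DT                           # change of acceleration

    f = len(speed)                                       # number of windows (total time / DT)
    return {
        "min_xyz": mins,
        "max_xyz": maxs,
        "mean_speed": speed.mean(),
        "peak_accel": accel.max(),
        "peak_decel": accel.min(),
        "avg_accel": accel[accel > 0].sum() / f,
        "avg_decel": accel[accel < 0].sum() / f,
        "jerk_index": np.abs(jerk).sum() / f,
    }
```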

Training the Neural Net


Initially we took 20 movement features (without the position features) and asked 2 subjects to walk at various arousal levels. We measured speed, acceleration, deceleration and jerk index for the upper-body joints.
- Type: bipolar feedforward ANN
- Layers: 3 (20 : 6 : 1)
- Learning: backpropagation learning
- Sample size: 34 walks (at different arousal levels) of 2 subjects
- Error limit of the learned net: 0.0956
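A minimal NumPy sketch of a network with the stated shape follows. Only the topology (20 : 6 : 1), the bipolar (tanh-style) activation, and the error limit come from the slide; the learning rate, weight initialization, error metric and stopping rule here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.uniform(-0.5, 0.5, (20, 6))   # input -> hidden weights
W2 = rng.uniform(-0.5, 0.5, (6, 1))    # hidden -> output weights

def forward(x):
    h = np.tanh(x @ W1)                # bipolar hidden layer (6 units)
    return h, np.tanh(h @ W2)          # bipolar output in [-1, +1]

def train(X, y, lr=0.05, error_limit=0.0956, max_epochs=10_000):
    """X: (samples, 20) movement-feature vectors; y: (samples, 1) arousal targets."""
    global W1, W2
    for _ in range(max_epochs):
        h, out = forward(X)
        err = y - out
        if np.mean(err ** 2) < error_limit:      # stop once the error limit is reached
            break
        d_out = err * (1.0 - out ** 2)           # gradient through the output tanh
        d_hid = (d_out @ W2.T) * (1.0 - h ** 2)  # backpropagate to the hidden layer
        W2 += lr * (h.T @ d_out)                 # in-place weight updates
        W1 += lr * (X.T @ d_hid)
```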

Detection
The ANN outputs one variable for the arousal level. The output range is from -1 (totally relaxed) to +1 (very aroused).
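Purely as an illustration of reading that single output, here is a tiny helper; the band boundaries and labels are arbitrary examples, not part of HaiXiu.

```python
# Illustrative only: interpret the network's single output on the arousal axis.
# The band boundaries and labels below are arbitrary examples.

def describe_arousal(level: float) -> str:
    level = max(-1.0, min(1.0, level))   # clamp to the documented [-1, +1] range
    if level < -0.33:
        return "relaxed"
    if level < 0.33:
        return "neutral"
    return "aroused"

print(describe_arousal(+0.8))   # -> "aroused"
print(describe_arousal(-0.9))   # -> "relaxed"
```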

Challenges
1. Short working range of the Kinect: 0.8 m to 4.0 m
2. Shorter than the range needed in practical scenarios
3. Data not consistent enough for precise movement-feature calculation
4. Fault tolerance in recording and detection is needed
5. Kinect does not follow the BVH format, so available gesture databases in BVH cannot be used natively without a converter module (less efficiency)

Next Step
1. Introducing the position coordinates
2. Fine-tuning the arousal-level recognizer
3. A robust gesture-recognition module
4. Building a valence-recognizer module
5. Getting more test data from a larger number of subjects
6. Multiple-Kinect integration for better recognition
7. A slightly better user interface



Integrated Emotion Detection

1. Each mode of recognition has its own merits.
2. There is a plethora of existing facial-expression detectors, such as Affectiva.
3. Speech-based emotion recognition has also been done extensively.
4. MoodScope has changed smartphone-based affect detection.
5. Powerful tools like AmbientDynamix make it easy to integrate various sensor inputs for processing and use on small devices such as a smartphone.

Thank You
