
Henry Hilt

The Interpretation of Emotion in Tone of Voice


Abstract:
This paper reviews research concerning various aspects of emotion in tone of voice in
spoken language. In 1872, Charles Darwin was perhaps the first to formally theorize that
emotion produces changes in the speaker's voice (Murray and Arnott 1993). Early research
determined that humans could identify different emotions from the speaker's tone of
voice. Since then, research has progressed to focus on understanding the acoustic qualities that
define tone of voice. Current research largely focuses on the listener's processing of tone of
voice. Few of the results to date, however, are entirely conclusive. Studies disagree about
the basic components of tone of voice and how the brain processes it.
Emotional inflection has long been considered an important part of communication. Despite the
unresolved questions, the scientific literature has almost universally affirmed this view.
Review:
"I heard you had a great day today." While the preceding sentence has a simple literal
meaning, the statement has a plethora of possible interpretations in conversation. The speaker
could be expressing happiness or surprise, thus inviting the listener to elaborate. Alternatively, the
speaker might be using sarcasm to point out that the listener had a poor day. The emotion the speaker intends to
convey is often expressed in the speaker's tone of voice, which is then interpreted by the listener.
Research has confirmed long-held theories that listeners can distinguish specific emotions
from tone of voice. In 1872, Darwin was one of the first to suggest scientifically that emotion
produces changes in qualities of voice (Murray and Arnott 1993). Tone of voice itself can
convey emotion even in the absence of visual cues given by the speaker. Even recorded
statements provide enough auditory cues for listeners to distinguish between fear, sadness, anger,
disgust, surprise and happiness (Bleichner et al. 2013).
The types of auditory cues that give an emotion its recognizable and distinct qualities have
been a major topic of research. This area of research has had limited success in determining which
types of auditory cues inform the listener of the speaker's tone of voice. One of the most common
theories is that emotion is interpreted by the brain based on three variables: valence, arousal and
strength (Murray and Arnott 1993). Studies using this model propose that emotion itself varies along
a three-dimensional scale, and that a specific emotion is directly linked to a tone of voice. Valence
is the level of positivity or negativity, arousal relates to the speaker's level of stress, and strength
corresponds to the difference between response and activity by the subject. These studies show little
agreement on the names of the dimensions. Similar theories have attempted to use only two
of the three aforementioned dimensions to describe the range of emotions (Bleichner et al.
2013).

Different levels of focus further complicate research efforts to determine standardized
acoustic components of tone of voice. Some experiments have focused on dissecting tone of
voice into distinct prosodic properties in the manner the auditory system would process these
cues. These acoustic qualities include tempo, volume and pitch of the voice (Nygaard and Queen 2008).
Other studies have suggested that changes in these qualities during speech produce what we
interpret as tone of voice (Rockwell 2000). Studies have made some attempt to correlate these
acoustic qualities of tone of voice with valence, arousal and strength. As of yet, however, no
concrete model exists that accurately integrates a multidimensional theory of auditory emotion with
acoustic cues (Murray and Arnott 1993).
Research into the sarcastic or ironic tone of voice tends to focus on acoustic qualities.
The sarcastic tone of voice is so recognizable that some studies have succeeded
in manipulating statements to sound sarcastic or sincere (Voyer et al. 2014; Voyer and Vu 2015).
The sarcastic tone of voice is typically characterized as having a slow tempo and a high level of
arousal (Kreuz and Roberts 1995). Other possible markers of sarcasm (Rockwell 2000) include nasal
sounds and greater volume. The literature has begun to characterize verbal irony as slower, louder,
and different in pitch from normal speech (Cheang and Pell 2008).
For all tones of voice, recent research has largely focused on the listener's processing of
information. Despite this focus on the listener, only at certain levels has the listener's reaction to
emotional prosody been thoroughly examined. Literature concerning neural activity evoked by
various emotional inflections is very sparse. Until this gap in knowledge is addressed by future
studies, it will be hard to determine the mechanisms underlying prosodic processing in the brain.
Much of the current knowledge on listener processing comes from cognitive-based experiments
(Voyer et al. 2014) or analyses of spoken language (Cheang and Pell 2008).
Experiments focusing on listeners' reactions after cognitive processing of
prosodic stimuli have produced a wealth of research on how the brain understands the
content of verbal communication. Specifically, many recent studies have focused on how
prosody facilitates or hinders processing of the content itself. Many theories of prosodic
processing have been proposed, which can be grouped into two distinct categories. The first
category, abstract theories, hypothesizes that during voice processing, information about the
prosody of the language is filtered out by neural circuits and processed separately, leaving only
the content to be processed as language (Krestar 2013). The process by which the content of
spoken language is separated from prosody is referred to as normalization (Goldinger 1996).
Abstract theories are widespread in the current literature.
The second category, episodic theories, holds that prosodic information is
processed in conjunction with the content of the language. Episodic theories state that prosodic
information is included with the textual information because prosody can give important clues to
deciphering the meaning of ambiguous spoken language (Goldinger 1996).
Both abstract and episodic theories have strong scientific grounding in the literature, and
current research is focused on finding some way to reconcile the two schools of thought, or to find
strong evidence for either category of theory. One attempt to resolve the dichotomy has been the
proposal of weakly episodic schools of thought, which suggest that different pathways allow
both abstract and episodic processing to take place. One theory is that abstract processes take
place early during word recognition, whereas episodic processing happens later if necessary
(Krestar 2013). The assumption is that episodic cues take longer to process, but can provide
contextual cues if the listener has trouble understanding the word.
While tone of voice is not yet entirely understood by researchers, significant progress has
been made toward understanding the influence of tone of voice on listeners and on human
speech. All that has been conclusively established after decades of research is that tone of voice
plays a major role in the interpretation of language. Sometimes, as is the case with sarcasm, inflection and
prosody override the literal meaning of what was said (Rockwell 2000). The future of tone of
voice research will probably progress towards a unified and coherent model of how prosodic
elements of speech convey recognizable human emotions. In all likelihood, such a model would
incorporate how prosodic information is processed by the auditory system and the brain and how
it is interpreted as meaning and emotion.
References:
Bleichner, Martin et al. "Valence, Arousal and Task Effects in Emotional Prosody Processing."
Frontiers in Psychology Vol 4 (2013).
Cheang, Henry S., and Marc D. Pell. "The Sound of Sarcasm." Speech Communication Vol 50(5)
(2008): 366-381.
Goldinger, Stephen D. "Words and Voices: Episodic Traces in Spoken Word Identification and
Recognition Memory." Journal of Experimental Psychology: Learning, Memory, and Cognition
Vol 22(5) (1996): 1166-1183.
Jacob, Haike. "Cerebral Integration of Verbal and Nonverbal Cues: Impact of Individual Nonverbal
Dominance." Elsevier Science Vol 61(3) (2012): 738-747.
Krestar, Maura L. "Examining the Effects of Variation in Emotional Tone of Voice on Spoken Word
Recognition." Quarterly Journal of Experimental Psychology Vol 66(9) (2013): 1793-1802.
Kreuz, Roger J., and Richard M. Roberts. "Two Cues for Verbal Irony: Hyperbole and the Ironic
Tone of Voice." Metaphor and Symbolic Activity Vol 10(1) (1995): 21-31.
Murray, Iain R., and John L. Arnott. "Toward the Simulation of Emotion in Synthetic Speech: A
Review of the Literature on Human Vocal Emotion." Journal of the Acoustical Society of
America Vol 93(2) (1993): 1097-1108.
Nygaard, Lynne C., and Jennifer S. Queen. "Communicating Emotion: Linking Affective
Prosody and Word Meaning." Journal of Experimental Psychology: Human Perception and
Performance Vol 34(4) (2008): 1017-1030.
Rockwell, Patricia. "Lower, Slower, Louder: Vocal Cues of Sarcasm." Journal of
Psycholinguistic Research Vol 29(5) (2000): 483-495.
Voyer, Daniel H., Sophie-Hélène J. Thibodeau, and Breanna J. Delong. "Context, Contrast, and
Tone of Voice in Auditory Sarcasm Perception." Journal of Psycholinguistic Research
(7 October 2014).
Voyer, Daniel P., and Janie P. Vu. "Using Sarcasm to Compliment: Context, Intonation, and the
Perception of Statements with a Negative Literal Meaning." Journal of Psycholinguistic
Research (22 April 2015).
