Beruflich Dokumente
Kultur Dokumente
I. INTRODUCTION
Over the last half century, there has been growing research interest in the natural vocal communication systems
of nonhuman primates, in part because their phylogenetic
affinity to humans recommends them as models for studying
potential precursors to human vocal communication, including language. Such comparatively based research has revealed a number of differences between human and nonhuman primates, for example, in detailed features of vocal
anatomy that are likely to influence the range and quality of
possible sounds that can be produced Negus, 1949; Lieberman et al., 1969; Schon Ybarra, 1995; Mergill et al., 1999;
reviewed in Fitch and Hauser, 1995; Fitch, 2000; Hewitt
et al., 2000, and seemingly also in the relative contribution
of experiential influences to vocal development reviewed in
Owren et al., 1993; Seyfarth and Cheney, 1997. It has also
revealed a number of important parallels. These include
similarities between the two groups in some of the functional
properties of vocal signals e.g., Seyfarth et al., 1980; Gouzoules et al., 1984; in the basic structure, layout, and modes
of operation of the principle sound production components
of the vocal tract i.e., larynx and supralaryngeal airways:
reviewed in Fitch and Hauser, 1995; Owren and Linker,
1995; Fitch, 2000; and thus also in the resulting acousticstructural features of the signals produced including certain
speech sounds e.g., Owren et al., 1997. Given the latter
vocal-anatomical and acoustic-structural similarities, acoustic research on primates has also begun to make productive
a
3390
0001-4966/2003/113(6)/3390/13/$19.00
Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 200.130.19.171 On: Sat, 27 Jun 2015 01:15:30
3391
Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 200.130.19.171 On: Sat, 27 Jun 2015 01:15:30
Research to date has described some of these phenomena in nonhuman primate vocalizations. For example, vervet
monkeys produce several discrete types of alarm call, each of
which is given to a different class of predator. However, the
amplitude, amount, and rate of calling appear to vary with
the level of arousal associated with the degree of predator
threat Seyfarth et al., 1980. These same acoustic features
vary with the intensity of hunger-related arousal in rhesus
monkeys producing food calls, or coos Hauser and Marler, 1993, as do the F 0 and relative tonality or harshness of
the calls Hauser, 2000. Similarly, the amplitude, F 0, and
relative tonality of calling for several different calls in the
repertoire of squirrel monkeys appear to reflect variation in
the intensity of the affective state experienced Jurgens,
1979; Hammerschmidt and Fichtel, 2001, while the amplitude and F 0 of alarm calls given by redfronted lemurs to
terrestrial predators likewise vary with the degree of threat
involved Fichtel and Hammerschmidt, 2001; see also Fichtel et al., 2001.
Many of these same features have been identified as
correlates of affect intensity in humans. Thus, variation in
amplitude, tempo, F 0 , F 0 range, and F 0 contours are among
the most commonly cited features associated with arousal
variation reviewed in Murray and Arnot, 1993; Scherer,
1995; Bachorowski, 1999. Although their connection to different discrete emotions e.g., joy, sadness, anger, surprise is
more equivocal, increases in each feature are typically associated with an increase in generalized psychophysiological
arousal. Several additional features related to voice quality
are also commonly examined as indices of affect intensity in
humans. These include short term variation in the frequency
and amplitude of vocal-fold vibration jitter and shimmer, respectively, and the relative emphasis of F 0 compared to its harmonics. These various measures of voice
quality are often connected to perceptions of roughness,
hoarseness, or breathiness in the voice and have been studied
as potential indices of age and vocal pathology, but also of
arousal variation e.g., Williams and Stevens, 1972; van Bezooyen, 1984; Bachorowski and Owren, 1995a; Protopapas
and Eimas, 1997; Protopapas and Lieberman, 1997.
Using such relationships as a guide, this paper reports
acoustic analyses of a large sample of grunts recorded from
wild baboons under circumstances associated with varying
affect intensity. It begins with a general review of the acoustic features of grunts, including further analysis of the potential correlates of individual identity and their possible primary connection to features associated with vocal tract
resonance. It then examines variation in the acoustic structure of grunts associated with variation in the circumstances
of call production during concerted group movements and
interactions with mothers and their infants as these relate to
basic distinctions in the intensity of the underlying affective
states.
II. METHODS
A. Study site and subjects
Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 200.130.19.171 On: Sat, 27 Jun 2015 01:15:30
Mean
SD
CV
I. Tempo-related
Calls per bout no.
Intercall interval ms
Call duration ms
4.4
550.9
137.9
3.4
319.3
38.9
0.78
0.58
0.28
II. Source-related
Amplitude dB
F 0 Hz
F 0 Contour-begin Hz
F 0 Contour-end Hz
Jitter %
H1AH2A dB
79.7
118.4
5.7
2.8
4.6
0.8
3.0
22.4
6.2
6.8
6.8
5.3
0.04
0.19
1.08
2.34
1.47
3.47
458
1383
2680
15.2
54.9
127.0
198.7
9.4
0.12
0.09
0.07
0.62
III. Filter-related
F1F Hz
F2F Hz
F3F Hz
LPC slope dB
D. Grunt sample
3393
Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 200.130.19.171 On: Sat, 27 Jun 2015 01:15:30
Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 200.130.19.171 On: Sat, 27 Jun 2015 01:15:30
TABLE II. Confusion matrix for split-sample, discriminant function classification of calls based on caller
identity.
Predicted ID
Actual ID
LX
BT
CT
SH
SS
WR
AL
BL
LX
BT
CT
SH
SS
WR
AL
BL
31
0
8
0
0
0
1
4
0
26
1
8
6
0
4
1
3
0
15
0
5
0
0
6
0
5
2
26
2
0
2
3
2
3
6
1
29
0
0
1
0
1
0
1
0
23
1
0
0
7
0
8
1
0
23
1
3
2
3
3
2
0
0
23
general, formant frequencies showed relatively low variability, whereas many but not all of the tempo- and
source-related features showed comparatively high variability.
A. Acoustic correlates of caller identity
To evaluate the distinctiveness of grunts by caller identity and the relative contribution of specific acoustic features,
a stepwise discriminant function analysis was conducted.
Variables were entered in this analysis according to the magnitude of their contribution to individual differentiation of
calls as determined by the criterion of largest F-value, or
greatest percentage change in the test statistic Wilks
lambda used to evaluate overall distinctiveness. The results
of discriminant function classification of calls are given in
Table II, and the order of entry of each acoustic feature in the
stepwise process, its F-value and associated probability
level, and its percentage contribution to Wilks lambda are
given in Table III.
With all acoustic features making significant contributions to differentiation between individuals entered in the
analysis, the overall Wilks lambda was 0.07. The value of
this test statistic ranges from 0 to 1, where 1 indicates no
differentiation among groups in this case, individuals and
TABLE III. Relative influence of specific acoustic features in discriminant
function analysis based on caller identity.
F value
% change in
Wilks lambda
1. F1F
2. F 0
3. F2F
4. LPC slope
5. Call duration
6. F3F
7. H1AH2A
8. F 0 Contour-end
9. Jitter
10. Intercall interval
11. Amplitude
12. F 0 Contour-begin
13. Calls per bout
61.04a
37.50a
36.40a
27.68a
20.90a
20.59a
20.49a
10.65a
10.14a
7.27a
5.90a
3.63a
2.86b
41.68
30.54
29.95
24.56
19.76
19.58
19.48
11.20
10.74
7.95
6.56
4.15
3.30
P0.001.
P0.01.
% Correct
79.5
59.1
42.9
55.3
64.4
100.0
74.2
59.0
19630364.7
3395
Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 200.130.19.171 On: Sat, 27 Jun 2015 01:15:30
TABLE IV. Acoustic features showing significant variation between highand low-arousal conditions of the infant and move contexts.
Infant context
Wilks lambda0.653
df 13, 421
F17.21, P0.001
Move context
Wilks lambda0.660
df 13, 157
F6.23, P0.001
Tempo-related
Source-related
F 0a
F 0 Contour-begina
F 0 Contour-enda
H1AH2Ab
Filter-related
F2Fa
F 0a
F 0 Contour-beginb
F 0 Contour-endb
H1AH2Ab
Jitterb
F2Fa
P0.01.
P0.05.
individuals calls was accounted for by acoustic features related to the effects of vocal tract filtering.
B. Acoustic correlates of affect intensity
Potential acoustic variation in grunts attributable to differences in caller arousal, or affect, was evaluated using multivariate analysis of variance MANOVA with arousal condition high versus low as the factor. Separate MANOVA
tests were conducted on arousal condition in each context
infant and move as a test of the consistency of any possible
effects. To remove variation in grunt acoustics attributable to
the effects of different callers, the data for all acoustic features were first standardized within individuals.
Results of MANOVA tests on arousal condition for
grunts produced in each context are given in Table IV. Both
tests produced a statistically significant overall effect for
arousal condition infant context: Wilks lambda0.653, F
17.21, P0.001; move context: Wilks lambda0.660,
F6.23, P0.001). In each context, this was attributable to
statistically significant variation in several acoustic features
between high- and low-arousal conditions. The specific features showing significant variation according to arousal condition were extremely consistent between the infant and
move contexts listed in Table IV. In fact, with one exception, the acoustic features varying significantly between
high- and low-arousal conditions of the infant context were
identical to those that varied significantly between the highand low-arousal conditions of the move context. The one
exception was jitter, which varied significantly in the move
but not the infant context.
As in the case of individuals, there was clear patterning
in the set of acoustic features that varied according to arousal
condition in both contexts, this time involving all three of the
tempo-related features calls per bout, intercall interval, call
duration, four of the six source-related features (F 0 , F 0
contour-begin, F 0 contour-end, H1AH2A, and only one of
the four filter-related features F2F. Thus, while much of the
variation between individuals was attributable to filterrelated features, that associated with arousal differences
stemmed largely from tempo- and source-related features.
3396
To further evaluate relationships among the acoustic features identified as varying significantly by arousal condition,
a principal components analysis PCA was conducted. The
goal of this analysis was to construct a small set of orthogonal factors principle components that would summarize the
major dimensions of independent acoustic variation in the
grunts when the effects attributable to individuals had been
controlled. To this end, a standard PCA with no factor rotation was conducted using the data standardized within individuals. The resulting scree plot and eigenvalues were then
examined for obvious discontinuities in order to select the
minimum number of factors needed to best account for the
greatest share of variance in the data. This process highlighted four factorsall with eigenvalues greater than
1which accounted for 55% of the variation. The PCA was
then run again, this time specifying retention of only four
factors and using a varimax rotation to improve the interpretability of the resulting factors i.e., to maximize variance in
the loadings of different acoustic features on each factor.
The resulting factors, proportion of variance accounted for
by each, and the sign of the loading of specific acoustic
features on each factor are given in Table V.
The first factor, accounting for nearly double the variation accounted for by any of the remaining three, was associated with all three acoustic features related to the tempo of
call production i.e., intercall interval, calls per bout, call
duration as well as F 0 . The signs associated with each feature indicate that as the number of calls per bout increased,
so too did the duration and F 0 of each call, while the interval
between successive calls decreased. The second factor was
associated with the two F 0 contour features (F 0 contourbegin, F 0 contour-end and H1AH2A. In this case, when
the F 0 contour was steeper the amplitude difference between
H1 and H2 was greater i.e., relatively more energy in H1.
The third factor was associated with the three formant frequencies, which varied directly with one another. Finally, the
fourth factor was associated with amplitude and LPC slope.
When the amplitude of calls was high, LPC slope was flatter
i.e., less negative.
As an internal check on the results of MANOVA and
PCA, one final analysis was conducted, a MANOVA on
Drew Rendall: Identity and affect cues in baboon grunts
Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 200.130.19.171 On: Sat, 27 Jun 2015 01:15:30
3397
Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 200.130.19.171 On: Sat, 27 Jun 2015 01:15:30
Factor 2a
Factor 3
Factor 4
F1F ()
Amplitude ()
F2F ()
LPC slope ()
F 0 Contourbegin ()
F 0 Contourend ()
H1AH2A ()
F3F ()
12.62%
12.57%
11.11%
Paralleling another result of the earlier study, the acoustic structure of adult female grunts was found to be highly
individually distinctive. Where successful classification
based on chance for the sample of eight females studied here
would have been 12.5%, discriminant analysis using 13
acoustic features of grunts successfully classified 65% of the
calls.
Individual differences in grunts could be traced to a variety of acoustic features including tempo-, source- and
filter-related features, suggesting that various dimensions of
vocal production might differ between individuals. However,
a stepwise discriminant analysis of individual distinctiveness
3398
Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 200.130.19.171 On: Sat, 27 Jun 2015 01:15:30
3399
Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 200.130.19.171 On: Sat, 27 Jun 2015 01:15:30
amplitude difference between F1 and F3) and sound pressure level, higher frequencies increasingly emphasized in
vowels as voice loudness increases see also Holmberg
et al., 1988; Traunmuller and Eriksson, 2000. Extrapolating
from this work, any similar effort- or intensity-related effects
on the spectral slope of baboon grunts were predicted to
reflect variation in the intensity of arousal underlying call
production. And, in fact, amplitude and LPC slope were correlated in the baboon grunts as evidenced by their grouping
together in the same factor of the principal components
analysis. However, neither feature varied significantly by
arousal condition. This is surprising for two reasons. First, as
noted, call amplitude is probably the feature of vocal production most widely assumed to index arousal intensity. Second,
the voice quality feature, H1AH2A, did vary significantly
by arousal condition, and one reliable correlate of this
featureat least in humansis variable emphasis of higher
frequencies and thus variable LPC slope Holmberg et al.,
1995.
One possible explanation for the association between
arousal condition and H1AH2A but not either amplitude or
LPC slope is that the latter two acoustic features may be
more adversely affected by variation in subjectmicrophone
i.e., recording distances. Although all recordings used in
this analysis were made under ideal field conditions at relatively close range, there was nevertheless some inevitable
variation in the recording distance from 0.5 to 2 m. Such
variation is a routine obstacle to the use of amplitude in
analyses of field-based recordings, and so amplitude results
must always be interpreted cautiously. In this case, variable
recording distances might have obscured some of the natural
amplitude difference between arousal conditions with similar
effects on spectral slope given the association between the
two, but have had comparatively little impact on the low
frequency end of the spectrum from which the H1AH2A
measure derives.
Another possibility for LPC slope is that this acoustic
feature of grunts also varies with behavioral context. Owren
et al. 1997 found that LPC slope was the acoustic feature
that best distinguished grunts produced in the move and infant contexts, a finding later confirmed by field playback
experiments Rendall et al., 1999. Hence, this feature of baboon grunts may reflect an interaction of the effects of context and vocal intensity that precluded it from sorting cleanly
according to the two arousal conditions studied here.
One additional acoustic variable that did not vary consistently by arousal condition was the voice quality feature,
jitter. Although a second voice quality feature, H1AH2A,
did vary consistently between high- and low-arousal conditions of both contexts, jitter did so only in the move context.
Although equivocal, this variable pattern of results is actually consistent with the history of work on features of voice
quality in humans, in which jitter, and other measures of
vocal perturbation, sometimes show up as related to arousal
or anxiety and other times do not reviewed in Protopapas
and Lieberman, 1997. The failure to find consistent effects
for jitter across both contexts of grunting, then, might indicate that it is not a reliable marker of affect intensity, or that
it interacts with other production processes in complex ways
Drew Rendall: Identity and affect cues in baboon grunts
Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 200.130.19.171 On: Sat, 27 Jun 2015 01:15:30
that produce this sort of variable outcome. Another possibility is that the inconsistent effect reflects variable association
with different emotional states Scherer, 1986. This study
focused only on variation in the intensity of nonspecific
arousal in two contexts of grunt production. While this operationalization seems to have captured some of the arousalrelated variation in these calls, the tone or quality of emotional state associated with grunt production might well vary
between the two contexts in addition to varying in its intensity within each, and jitter might not be associated with
variation in the intensity of the emotion accompanying
grunting in the infant context.
In sum, the results of this study tend to support theoretical predictions and a small set of empirical findings on the
relative importance of formants in identity cueing in nonhuman primate vocalizations. They also reveal a variety of potential acoustic correlates of affect intensity in baboon
grunts, many of which show signs of continuity with vehicles of vocal affect expression in other species, including
humans. Both results are promising vis a vis the development of general, production-based schemes that can relate
enduring constitutional attributes of callers e.g., sex, age,
individual identity, general health and vigor, as well as their
more transient physiological and psychological states e.g.,
arousal, motivational, and cognitive states; acute pathologies, to specific dimensions of variation in their vocal output.
ACKNOWLEDGMENTS
3401
Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 200.130.19.171 On: Sat, 27 Jun 2015 01:15:30
Mergill, P., Fitch, T., and Herzel, H. 1999. Modeling of the role of nonhuman vocal membranes in phonation, J. Acoust. Soc. Am. 105, 2020
2028.
Murray, I. R., and Arnot, J. L. 1993. Toward the simulation of emotion in
synthetic speech: A review of the literature on human vocal emotion, J.
Acoust. Soc. Am. 93, 10971108.
Negus, V. E. 1949. The Comparative Anatomy and Physiology of the Larynx Hafner, New York.
Nolan, F. 1983. The Phonetic Bases of Speaker Recognition Cambridge
U. P., Cambridge, MA.
Nygaard, L. C., Sommers, M. S., and Pisoni, D. B. 1995. Speech perception is a talker-contingent process, Psychol. Sci. 5, 42 46.
Owren, M. J., and Bernacki, R. H. 1988. The acoustic features of vervet
monkey alarm calls, J. Acoust. Soc. Am. 83, 19271935.
Owren, M. J., and Bernacki, R. H. 1998. Applying linear predictive coding LPC to frequency-spectrum analysis of animal acoustic signals, in
Animal Acoustic Communication: Sound Analysis and Research Methods,
edited by S. L. Hopp, M. J. Owren, and C. S. Evans Springer-Verlag,
Heidelberg, pp. 129162.
Owren, M. J., and Linker, C. D. 1995. Some analysis methods that may
be useful to acoustic primatologists, in Current Topics in Primate Vocal
Communication, edited by E. Zimmerman, J. D. Newman, and U. Jurgens
Plenum, New York, pp. 127.
Owren, M. J., Seyfarth, R. M., and Cheney, D. L. 1997. The acoustic
features of vowel-like grunt calls in chacma baboons Papio cynocephalus
ursinus: implications for production processes and functions, J. Acoust.
Soc. Am. 101, 29512963.
Owren, M. J., Dieter, J. A., Seyfarth, R. M., and Cheney, D. L. 1993.
Vocalizations of rhesus Macaca mulatta and Japanese M. fuscata
macaques cross-fostered between species show evidence of only limited
modification, Dev. Psychobiol. 26, 389 406.
Palmeri, T. J., Goldinger, S. D., and Pisoni, D. B. 1993. Episodic encoding of voice attributes and recognition memory for spoken words, J. Exp.
Psychol. Learn. Mem. Cogn. 19, 309328.
Porter, F. L., Miller, R. H., and Marshall, R. E. 1986. Neonatal pain cries:
Effect of circumcision on acoustic features and perceived urgency, Child
Dev. 57, 790 802.
Protopapas, A., and Eimas, P. D. 1997. Perceptual differences in infant
cries revealed by modifications of acoustic features, J. Acoust. Soc. Am.
102, 37233734.
Protopapas, A., and Lieberman, P. 1997. Fundamental frequency of phonation and perceived emotional stress, J. Acoust. Soc. Am. 101, 2267
2277.
Remez, R. E., Fellowes, J. M., and Rubin, P. E. 1997. Talker identification based on phonemic information, J. Exp. Psychol. Hum. Percept.
Perform. 23, 651 666.
Rendall, D. 1996. Social communication and vocal recognition in freeranging rhesus monkeys Macaca mulatta, Ph.D. dissertation, University
of CaliforniaDavis.
Rendall, D., Owren, M. J., and Rodman, P. S. 1998. The role of vocal
tract filtering in identity cueing in rhesus monkey Macaca mulatta vocalizations, J. Acoust. Soc. Am. 103, 602 614.
Rendall, D., Rodman, P. S., and Emond, R. E. 1996. Vocal recognition of
individuals and kin in free-ranging rhesus monkeys, Anim. Behav. 51,
10071015.
Rendall, D., Cheney, D. L., and Seyfarth, R. M. 2000. Proximate factors
3402
Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 200.130.19.171 On: Sat, 27 Jun 2015 01:15:30