Stoichita, Victor A and Bernd Brabec - Postures of Listening-An Ontology of Sonic Percepts From An Anthropological Perspective

Terrain
Anthropologie & sciences humaines

Lectures et débats
Postures of listening
An ontology of sonic percepts from an anthropological perspective
Victor A. Stoichita and Bernd Brabec de Mori
Publisher
Association Terrain
Electronic version
URL: http://terrain.revues.org/16418
ISSN: 1777-5450
Brought to you by Centre national de la recherche scientifique (CNRS)
Electronic reference
Victor A. Stoichita and Bernd Brabec de Mori, « Postures of listening », Terrain [Online], Symposia and
Debates, Online since 14 November 2017, connection on 24 November 2017. URL: http://terrain.revues.org/16418
This text was automatically generated on 24 November 2017.
Terrain is made available under the terms of the Creative Commons Attribution - NonCommercial -
NoDerivatives 4.0 International License.
Listening as an anthropological issue

1 In his argument “against soundscape”, Ingold (2007) highlights the problematic status of
sound in anthropology. Most researchers (Ingold included) agree with the physicalist
definition of sound as pressure waves propagating through the air. Sticking to this
definition, sound should be the auditory equivalent of light. However, Ingold remarks, it is
more often, and apparently unproblematically, compared to sight (leading to concepts
such as “soundscape”, which Ingold criticizes). In Ingold’s terms, this incoherence
“reveals much about our implicit assumptions regarding vision and hearing, which rest
on the curious idea that the eyes are screens which let no light through, leaving us to
reconstruct the world inside our heads, whereas the ears are holes in the skull which let
the sound right in so that it can mingle with the soul” (Ingold 2007: 10).
2 The confusion between sounds as waves and sounds as auditory objects is partly
anglocentric. In German, for instance, a distinction between Schall (sonic waves) and
Klang (perceived sound) is well established and largely reflected upon. But Ingold’s
critique also reveals a deeper gap between the physicalist description and the way people
experience their auditory realms. It is not just the incorrigible anthropologists who
“misunderstand” the nature of sound. If you ask someone to locate the sound of a distant
waterfall and that person answers by pointing towards the waterfall, he already departs
from the wave definition. To follow it, he would need to point to a vast volume of air
around the waterfall and encompass himself within it. The fact that many people do not
locate sounds like this could indicate that they simply do not experience them as air
waves. Casati and Dokic (2005) summarize several arguments in this respect, leading to
the conclusion that “if sounds were sound waves, we would be almost always mistaken in
our aural perceptions on important aspects, which fact … amounts to accounting for
auditory perception in terms of a massive error theory”. Massive error theories may not
be a problem in physics, but they are hardly ever satisfactory in anthropology.
3 If we are to understand what people experience through their ears, we then need to leave
aside the fact that ears sense pressure waves. Just as with the eyes perceiving light rays,
this is not untrue from a biological point of view, but it is not of immediate interest for
understanding cultural representations and social interactions. What we need is a more
specific description of the kind of things people sense with their ears, and the general
interactions these things afford.
4 After providing more background to auditory perception in the following section, we will
present the main thesis of this paper, which is a model of three alternative modes of
listening that can be adopted by any listener. The modes are akin to postures a listener
occupies when paying attention to a sonic event. We call these postures of listening: (A)
indexical listening; (B) structural listening; and (C) enchanted listening. These will be
described in detail. During discussions with colleagues, we found that some specific
questions were repeatedly raised.1 After proposing the model of listening postures, we
include a brief section for answering these frequently asked questions (FAQs). Finally, we
conclude with a section dealing with some of the consequences of our model, specifically
for the case of “music” as a form of enchanted listening.
From hearing to listening

5 Some of the cognitive processes involved in the treatment of auditory information are
considered universal in cognitive sciences. Such is the case of “auditory scene analysis”
(Bregman 1994), which describes the segregation and processing of auditory streams. The
segregation is a prerequisite for scene analysis and relies on low-level “bottom-up”
mechanisms (Alain et al. 2000). Although it has been shown that auditory stream
segregation must be learned, it is considered universal among humans as well as many
non-human animals (Bregman 2013; Faragó et al. 2014 for emotional valence in dog’s
streams; Shamma et al. 2011).2 An auditory stream receiving focus of attention appears in
the foreground and is perceived as more salient than other streams that are blended into
the background or blocked out entirely.
6 From a cognitive point of view, “attention” can be defined as the “mechanism that allows
certain information to be more thoroughly processed in the cortex than non-selected
information” (Cohen et al. 2012: 411). It is tempting to compare auditory attention to a
kind of “bottleneck” (e.g. Pilcher et al. 2016: 1039), although in detail it is more likely a
network of several subsystems which sometimes operate in parallel (Caporello Bluvas &
Gentner 2013: 12; Cohen et al. 2012: 412; Demany et al. 2015; Irsik et al. 2016). Attention
should be distinguished from “awareness”, by which “we often become conscious of [an
item’s] attributes at the expense of unattended items” (Cohen et al. 2012: 411). People are
able to control their attention and focus selectively on specific parts of an auditory scene.
The unattended parts receive less cognitive processing, do not reach awareness, and can
induce spectacular effects of “change deafness” (e.g. Cherry 1953; Fenn et al. 2011; Irsik et
al. 2016; Vitevitch 2003). On the other hand, it has also been shown that unexpected
auditory events in unattended streams may receive increased cortical processing (i.e.
attention) while remaining below the level of conscious awareness (reviews in Cohen et
al. 2012 and Snyder et al. 2012: 9 sqq.). Finally, some sounds should ideally always capture
attention and make their way “bottom up” into awareness (alarms, for instance, are
meant to bypass any “change deafness”). The distinction between attention and
awareness helps to understand where auditory processes stop being universally
predictable. In fieldwork, anthropologists and ethnomusicologists gather data relevant to
people’s aware perceptions. But the allocation of attention is probably the earliest stage
where perception can be modulated by cultural preferences.
7 Beyond low-level processes of auditory stream segregation, most cognitive operations
involved in hearing arguably depend on learned systems of knowledge and meaning. This
was shown in Steven Feld’s work (1990, 2000) about hearing and meaning among the
Papuan Kaluli, and even more specifically by Rafael José de Menezes Bastos (1999, 2013).
This author shows how the analysis of native terminologies and axionomies leads us to
observe that, for example, among the Xinguano indigenous people, “sound is as material
as stones are for us” (de Menezes Bastos 2013: 292), or that in the same society, sound is
not merely perceived, but “sound is actively sought and captured by the ears” (de
Menezes Bastos 2013: 292). The very base of auditory perception acquires a different
range of meaning and agentive power when conceptualized in such a way. Building upon
this, Menezes Bastos’s former students Mello and de C. Piedade (2005) also demonstrate
that the construction of space based on auditory perception among the Central Brazilian
Wauja shows particularities that are different from an alleged Western classical tradition.
They assume that the ontological characteristics of sound depend on culture, so that
although psychophysical basic processes may be similar, the characteristics of the things
heard are particular.
8 Ethnography shows that anthropologists cannot really presume what kind of things
people hear. The internals of the biological ear (an organ which clearly senses pressure
waves) do not tell much about the things that actually constitute people’s auditory
realms. This applies to “sounds” but also to “language” or “music”. Firstly, these concepts
are rather cumbersome in cross-cultural comparisons. “Language”, for example, can have
different extensions in various societies, including or not the sounds of animals, rivers
and so on. Likewise, “music” does not even exist as a category in most non-European
languages (sounds for healing, hunting or having fun are not necessarily linked by an
overarching concept). More importantly perhaps, “language”, “music”, or “the acoustic
environment” refer us to phenomena. We propose not to take phenomena for granted,
and to start by describing the modes of awareness which allow for auditory phenomena
to occur. This is one significant difference with most existing theories of listening: we do
not ask how many ways there are to hear “music”, or how listening to “music” compares
with listening to “language”. We propose to start our inquiry before, at a stage where we
do not know at all what people listen to.3 From there, we can simply observe how humans
interact with their surroundings through their ears: what they do, how they react, what
they say about it. On this basis, we try to give “functional specifications to the structures
that must be present” (Hutchins 1995: 131) beyond the reach of our methods of
investigation.
9 This approach is not meant to reveal especially new things about how people listen to
“sounds”, “music” or “language” (although it might serve to clarify a few points along the
way). The expected benefit would rather be to understand how they listen to things like
“gods-as-vibrations”. Beings such as these are not unusual encounters in ethnography,
but they remain critically absent from typologies of listening (when they are allowed
entrance, it is only as “sounds” or as “music”; see for instance Becker 2004). Another
objective of our inquiry is to determine whether some listening modes are found in all
human societies. On the one hand, ethnographies show great variability in the way
systems of auditory knowledge are built. On the other hand, humans engage in
remarkably cross-cultural activities: they listen for prey when they go hunting, they talk
to each other (sometimes to animals and rivers too), they dance … Beyond the general
ability to pay attention to sounds, are all modes of listening culturally variable? Or are
there some modes which are shared by all human beings? By investigating this question,
we might ultimately even be able to say something about temptingly universal
phenomena such as “language” and “music”.
10 Here we will deal exclusively with listening, leaving aside sound production. Listening, as
opposed to hearing, implies awareness. There is, however, more to listening than
awareness. In many cases, the sound one hears affords a certain reaction to the hearing
individual. Gibson suggested that the perception of affordances is embedded at a very low
level in our appraisal of the world (Gibson 1977, 1986).4 As Martin Clayton puts it for
sounds, “we do not passively perceive and subsequently decode sonic information, so
much as actively scan sound energy for patterns of which we can make sense” (Clayton
2001: 11). Cognitive and acoustic experiments have shown that this “scanning” requires a
mixture of short-term memory and anticipation (for a recent overview see Castellengo
2015: 146 sqq.). This means that auditory objects like tunes, words or familiar noises are
never “sensed” literally. The present, through our eardrums, brings us only a short
glimpse of them. In Husserl’s terms, “the objectivity of the sound that lasts is constituted
in the ‘continuum’ of an action that is in part remembrance, in a very short, punctual
part, perception, and in a larger part, expectation” (Husserl 1964: 36–37, quoted in
Castellengo 2015: 487, our translation). By “listening” we refer to this complex of action,
memory and imagination.
11 We define alternatives of listening as distinct ways of using a given item of sensory
information by the same being to construct different kinds of listening objects. These
objects are of different kinds when their affordances for that being differ. In a given
acoustic environment, the listener adopts one of these alternative listening postures, and
thereby perceives a specific set of entities, which opens the way to specific relational
possibilities. As their name implies, alternatives yield incompatible results: the same
auditory field can be apprehended in one way or another, but probably not in two ways at
the same time (see FAQ section for discussion).
12 We propose that three alternative postures can be identified in all human societies. We do
not claim that these listening modes are the only ones which humans have developed. We
do think, however, that these modes of listening are used and applied among all humans,
and they are probably the only ones with such universal character. Such an assertion can
of course not be proved. The best one can do to address something that might be common
to all humans is to make one’s claim refutable (in the sense of Popper 1959). To that
effect, we try to describe each alternative and the kind of auditory world it constructs in a
precise way, while also setting out its most obvious empirical consequences.
Alternative A: Indexical listening

13 We sometimes infer from the sounds that we hear the physical state of the world. Animals
can do this too. It is a matter of forming a hypothesis about the cause of the pressure
wave.

14 For such a hypothesis to be obtained, sounds are treated as indexes in the sense of Peirce
(see also Peirce 1992; 1998: 13). Through this way of listening – moving upstream to the
physical cause of the sound heard – animals endowed with hearing attain three different
kinds of knowledge.
15 Firstly, they discover and locate other entities acting in their surroundings (“something is
walking there”). From the sounds heard, they assume the mere existence of “something”.
16 Secondly, they can form a hypothesis about the thing or being which they hear. Acoustic
signatures allow us, for example, to infer that what is there in the branches is a fruit
dove, or that the person on the phone is Aïcha. This is a hypothesis about the entity’s
identity. Although it may occur simultaneously with the first inference, existence and
identity are distinct. We can indeed be wrong about the species of the bird in the tree or
the identity of the person on the phone, but something or someone is definitely there.
Both inferences are indexical, because they are built on causal associations. These
associations are grounded in observed recurrences: this kind of sound is usually produced
by a fruit dove, that one by Aïcha, etc. (for a perspective from cognitive psychology, see
e.g. Keller & Stevens 2004).
17 There is a third kind of indexical inference in sound. It enables listeners to build a
hypothesis about the interior state of other beings. This is what occurs, for example,
when we think that a person’s voice “betrays” the speaker’s thoughts or feelings. The
feelings may not be apparent in the words uttered by that person, but we still think that
we sense fear, happiness or anger in the tone of the voice. We rely for this on an indexical
interpretation of vocal prosodic features (intonation, rhythm, timbre, etc.) that we
understand causally as being produced by specific interior states. Note how this is
different from intentional communication: when we say that a voice “betrays” the
speaker’s inner feelings, the inference cannot be said to rely on the deliberate use of a
shared symbolic system. The “betraying” indexes are in fact not symbols at all in the
sense of Peirce. There is no convention stating how they should be interpreted. The
listener can only guess Aïcha’s interior state through the observation of previous
regularities (e.g. usually, when Aïcha sounds like this, she is in an angry state of mind),
and the inference is affected by gradual variations in sound (Aïcha can sound more or less
angry depending on how salient the corresponding prosodic features are in her voice).
18 Of course, indexical interpretations can be wrong, and indexes can also be faked. Let’s
consider the first inference about existence. It is usually accurate in ecological
environments: if you hear something there, something is probably there. But illusions can
be built on this inferential process with special tools like recording and playback
equipment. Listen to a recording of your favourite string quartet for example.
19 How many things do you hear? If you listen in stereo, the direct indexical answer should
be two. Most Europeans, however, will feel that they are listening to “the quartet” – two
violins, a viola and a cello, that is four – not to the loudspeakers. In such a case, the
recording and playback apparatus function as an extension of the situation in which four
actual sound sources were recorded and mapped onto the stereo panorama.5
20 The second kind of indexical inference – identification – can also be faked. The fruit dove,
for example, can be tricked into believing that it hears the calls of a potential mate, when
in fact a hunter lies hidden in the bushes. It is worth noting that tricks of this kind work
precisely because they are exceptional. Humans too rely on voice recognition in many
daily interactions, because voices are difficult to imitate.

21 Lastly, interior states can also be falsely attributed. By using vocal “icons of crying”
(Urban 1988), professional mourners can convey feelings of sadness while mourning for a
family they hardly know (cf. Amy de la Bretèque 2013: 89). Here again, it takes some
special skills to get the imitation right and make it convincing. Stage actors also need to
train their voices in order to acquire the special ability to embody various states of mind
on demand. Because such skills remain uncommon, listeners tend to infer the interior
states of other beings from the way they sound, even when they are well aware that the
feelings are enacted and not spontaneously felt.
22 One of the reasons why acoustic indexes are hard to fake is that the listener’s
inferences are very sensitive to infinitesimal variations. In indexical listening, it makes a
difference whether the voice we hear is a little higher or lower pitched, whether
consonants receive a bit more or less stress, whether the steps we hear sound a bit
further or closer. As shall be seen, many of these infinitesimal variations are stripped
away in structural listening, which will be outlined hereafter. Some of them are relevant,
again, in enchanted listening, but there they pertain to a totally different ontology.
23 Another specific characteristic of indexical listening is the way it maps sounds in space.
When we hear them as indexes, sounds share the same location as their physical source. If
we hear the fruit dove on that tree, for example, we expect to also find it there physically.
In indexical listening, hearing, sight and touch should map a similar world. We will see
that this contrasts with the two other kinds of listening to be described.
Alternative B: Structural listening

24 Contrary to indexical listening, structural listening does not strive to reach the physical
cause of the sounds heard. Instead, its main scope is to abstract relevant patterns from
auditory data. Relevance can take on different forms, depending on the cognitive task at
hand.
25 This mode of listening is used, for example, in understanding language. Saussure (1916)
and later Martinet (1965: 15; 1991: 20) have insisted on the functional importance of
double articulation in all human languages: utterances can be broken into minimal
semantic units (signs, or monemes in Martinet’s terminology), and the latter can
themselves be broken down into a limited number of non-semantic units (phonemes).
The core of this theory is that meaning is not achieved through the “positive” content of
the units used to construct the message, but rather through their oppositional “value” in
a given system (Saussure’s term, see 1916, Chap. 4). This approach has been exported
from linguistics to various other research domains, including anthropology and
(ethno)musicology.
26 Our own proposal is much simpler than most structuralist theories. It states merely that
listening in a structural way is possible. In other words, we claim that acoustic data can be
parsed for oppositional units, according to a system. We do not presume to say that this is
the most relevant way of describing a specific activity, not even linguistic
communication. Like the other alternatives described here, it is simply an available
option for the listener.
27 The interesting fact, for our purpose, is that if the listener adopts the structural listening
posture, the cognitive result is, by definition, entirely abstract. Structural elements do
not depend on the medium that conveys the information. For example, the same words
can be uttered by various voices, with various pitches, various intonations, louder or
softer, or at various speeds.6 And they can also be written down, flashed over the sea in
Morse code, or transmitted as binary data over computer networks. Such operations
imply an abstract level, where the “value” of the different phenomena, notwithstanding
their different material forms, remains the same.
28 This is how Saussure reached his conclusion that “it is impossible for sound, which is
material, to belong by itself to language” (de Saussure 1916: 164, our translation). In a
broader definition of language, which would include notably prosodic factors, gestures
and contextual implications, structural listening is only a small part of what we do in
linguistic communication. But when we listen for structures, we set aside most acoustic
features of sound, and the resulting auditory object is hardly acoustic at all. Whereas
infinitesimal variations are important in both indexical and enchanted listening postures,
here only a few oppositions matter.
29 Structural listening is not limited to linguistic communication. It can be triggered at will
by the listener, on any kind of sound. For example, bird calls can be “understood” in a
structural sense, when the way in which the bird vocalizes transmits an omen (cf. Walker
2010). In entirely different contexts, data sonification is often used to convey information
that is clearly abstracted from the actual sounds (see e.g. Supper 2014). Musical notations
are yet another achievement of structural listening. In order to write music down
(whether we compose or transcribe it from what we hear), we need an abstraction layer,
where sound and graphic signs share some common structure. The mundane ability to
consider that, say, a xylophone and a flute play “the same” melody points to a similar
abstraction. To assert their equivalence, the listener must retain only the structural
relations of the pitches, discarding the obvious differences in the sound spectra of these
instruments.
30 Structural listening contrasts with both the indexical and enchanted postures in several
ways. We have already mentioned its insensitivity to many variations of sound. This
extends to spatial location. Physical space is important in indexical listening. As we shall
see, enchanted listening constructs a space of its own. But space is abolished altogether in
structural listening. Whether the person telling you Cinderella’s story is sitting to your
right or your left does not change the representation you form of the narrative. A
conventional musical transcription will be similarly unaffected by the position of the
sound source being transcribed.
Alternative C: Enchanted listening

31 We have distinguished indexical listening, which relates the sounds to their physical
causes, and structural listening, which searches them for abstract patterns. Humans have
at least one more way of directing their consciousness to sounds. It is characterized by a
split between sounds and their physical causes. When listening in this way, sounds seem
to form an autonomous realm “which stands apart from the ordinary workings of cause
and effect, and which is irreducible to any physical organization. At the same time, it
contains a virtual causality of its own, which animates the elements that are joined by it”
(Scruton 1997: 39).
32 The elements mentioned by Scruton are objects of listening, hence “sounds” in a broad
sense. But they have properties that sounds lack in the other two ways of listening.
“Virtual causality” and “animation”, for example. A listener may sense entities which
“move”, relate to each other in various ways, and possibly embody an agency of their
own. She might also perceive them as “coloured”, “shaped” or “textured”. Our listener
can be well aware that colours cannot normally be heard (even true synaesthetes do not
actually “see” colours in sound7). She hears them nevertheless, and they have nothing to
do with the colours of the sound sources. She may also check that these properties vanish
away when she adopts another manner of listening. She can “step back” and listen
indexically, for example, to the physical source: where is it, what is it, does it move, etc.
In that world, there are no more lines, movements or textures, just some sound coming
from some being or object. She can then step in once more, whereupon only the
original movements, colours and textures will exist once again.
33 If she uses a European language, our listener will probably label what she perceives in this
listening as “music”. Ethnomusicology has shown, however, that this concept is not found
in many other cultures. In the Amazon, for example, indigenous people often use terms
that refer to specific ways of singing, at the same time excluding the sounds of
instruments, which, on the other hand, are understood as voices of spirits (Brabec de
Mori 2012; de Menezes Bastos 2013; Piedade 2013). Even in those societies where the
concept of “music” is used, its extension is ambiguous. In many Muslim societies, for
example, what is called musīqī (with this loanword from Greek) explicitly excludes calls to
prayer (adhān) and recitation of the Qur’an, both of which are nevertheless definitely
perceived as “music” by the average European, and as peculiarly agentive vocal
productions by the pious Muslim (Hirschkind 2004; Shiloah 1997). For this reason, it is
better to stick to the listening ability itself. We will refer to it as “enchanted listening”.
34 We understand enchantment in reference to the “technologies of enchantment”
discussed by Alfred Gell (1988, 1992, 1996, 2006). Gell’s proposal stemmed initially from an
analysis of the concept of technology (Gell 1988). He observed that some technologies are
meant to modify the world, while others target instead the way the world is perceived.8
The latter he named “technologies of enchantment”. Enchantment remained, however,
an obscure concept throughout his work, which focused primarily on the “technological”
side of the proposal. Some years later, on a different path, Philippe Descola pointed out
that the beings and things which constitute people’s experiential worlds are first
recognized and categorized through a set of low-level cognitive processes. He called these
processes “schemas”, in the general sense of “abstract structures that organize
understanding and practical action without mobilizing mental images or any knowledge
conveyed in declarative statements” (Descola 2005: 149, translation from 2013: 59).
Descola proposed a cross-cultural survey of collective schemas which govern
identification and relationships. He compared in particular the distribution of
“interiorities” and “physicalities” in different ethnographies, and arrived at the
conclusion that “these principles of identification define four major types of ontology,
that is to say systems of the properties of existing beings” (Descola 2005: 176, translation
from 2013: 64). Now auditory experience is normally not an autonomous ontological
realm. Most of the time human beings do not consider it under a distinct “system of the
properties of existing beings”. Indexical listening points to non-auditory causes.
Structural listening points to non-sonic structures. On occasion, however, human
audition can materialize specific systems of sonic beings which then display particular
sets of sensory and relational properties. People typically describe them in terms of
unhearable dimensions which are not linked to any physical causes. Quite often too, sonic
beings are endowed with autonomous agencies, meaning the capacity to initiate actions
by themselves.9 We call enchanted listening the fact of experiencing a properly auditory
ontology.
35 This ability appears to be shared by all human beings. In all human societies there exist
interactions that rely on sounds being apprehended as a world of their own. In such a
world, sound events are related to each other rather than to their physical sources. They
obey specific intrasonic causalities. For example, people may feel “tensions” and
“releases” in sound, and react to them emotionally, as well as bodily through dance. This
is a classic way of experiencing tonal music. The building of “tension” and “release” is
taught in composition classes, and is also used as a basic analytical framework by many
musicologists.
36 “Tension” and “release” refer to a world where sounds have the ability to build up in an
equilibristic pile, oppose “their own” antagonistic energies, lose “their own” momentum,
leading “by themselves” to new sound events which resolve the instability and release the
tensions accumulated. None of this happens in a world where sounds occur because a
musician pressed some buttons on her instrument. It happens in a world where sounds
obey their own causal rules, a world where sounds occur because of other sounds.10
37 “Tension” and “release” are merely examples that apply to specific kinds of music. They
are by no means universal concepts, and other representations can be shown to fulfil
similar roles elsewhere. In Papua New Guinea, for instance, the Kaluli use (or at least used
in the 1980s) a wide range of water-derived terms to describe positions and movement
tendencies in the sound realm. According to Feld (1981), in the Kaluli language, sa is a
standalone term for a waterfall. It can also be used as a prefix for many things related to
water, and also to song. For example, a sa-we:l refers to “the ledge or upper place from
which the waterfall drops”, which in song corresponds to “the leading pitch in a line or
phrase from which the melody descends”. Hence one can correct someone’s singing using
such a sentence as “the waterfall ledge is too long before the fall”. Feld gives numerous
other examples outlining a consistent use of hydraulic representations by the Kaluli
people in commenting on their songs.
38 In Kaluli aesthetics, references to water flows play a role similar to the mechanics of
“tension” and “release” in Western tonal music. Listening to something like a water flow
or an equilibrium of energies demanding resolution are both enchanted experiences,
because sounds then interact with each other in a suspended realm, according to rules of
their own.
Enchanted sounds or verbal metaphors?

39 The phenomenal reality of such worlds has often been overlooked in anthropology and
musicology, possibly because the corresponding verbalizations have been categorized as
“metaphors” by the concerned scholars. The usual assumption is that “metaphors” are
not “for real”. With a few exceptions (Gay 1998; Rice 2001), they tend to be treated as
mere linguistic devices with only a vague relation to the speaker’s experiences. 11
Ethnomusicologists often strive to map these metaphors to the “real” features of sound.
Even though acousticians have generally acknowledged Schaeffer’s early warnings that
“signals”, as measured by instruments, should not be confused with “sounds”, as
experienced by humans (see e.g. Schaeffer 1966: 269; Castellengo 2015: 34, 139),
frequency, amplitude and spectral composition are treated as actual parameters of
“sound” perception throughout anthropological and ethnomusicological literature.
Tension, colour or “waterfallness” are not, although they are probably closer to the
actual percepts experienced by the listener.
40 In their classic study of metaphorical thinking, Lakoff and Johnson (1980) have shown
that metaphors are not merely ways of speaking, but actual cognitive devices which
enable humans to grasp the world through analogies, and which influence in return the
way humans experience the world. In their general definition, “the essence of metaphor
is understanding and experiencing one kind of thing in terms of another” (Lakoff &
Johnson 1980: 5). “Tension” and “release” are metaphors mapping mechanical processes
onto the sound realm. “Flowing like a waterfall” is a metaphor mapping hydraulic
processes and geographical features onto the voice. In the early twentieth century,
musicians in Cairo used the Arabic word tawriq, meaning “covering with a thin layer of
paper or plaster”, to designate a subtle heterophonic outline of the vocal part added by
the instruments to a singer’s voice (Racy 1988). Ethnography is replete with such
examples where sound is understood in terms of something else.
41 Metaphors bridge ontological gaps. It does not follow that they are arbitrary poetic
figures contrasting with more “literal” uses of language. For one thing, the concept of
literal meaning – as opposed to “figures” of speech – remains a puzzle in experimental
linguistics (Gibbs & Colston 2007: 582). Secondly, Lakoff and Johnson have shown that
conventional metaphors can be used to say true or false things in just the same way as
non-metaphorical propositions (Lakoff & Johnson 1980: 159–184). It follows that even
wordings like “covering with a thin layer of paper or plaster”, which are obviously not
grounded in auditory perception, can be used by listeners to understand and structure
their actual sound experiences (a fact demonstrated ad absurdum when metaphors are
used on the wrong occasions and rejected by listeners as false). The problem,
then, is to determine which aspects of the source domain are mapped onto the target.
42 Enchanted listenings often behave as collective schemes of identification and
relationality (Descola 2005: 149 sqq.). Let us illustrate this through another example from
Mount Bosavi. Aside from the water flow vocabulary, in some contexts Kaluli people also
use a completely different set of representations which describe the “hardening” or
“softening” of sound (Feld 1986). Through the “hardening” (halaido) of its sound (a
complex process involving both woodcraft and magical operations), the ilib drum, Feld
explains, can leave the listeners with “the sensation of hearing the voice of a bird ‘inside’
the sound of the drum, and then hearing a further reflection, the voice of a spirit child
coming through the voice of the bird to call ‘father’” (Feld 1986: 94).
43 The ilib drum “‘talks’ (tolan), and actually says dowo, ‘father’” (Feld 1986: 94). For the
Kaluli, this drum ideally embodies a ghostlike being endowed with its own agency. A
complex net of non-verbal interactions depends on this being’s existence. There is, for
example, the reaction of extreme sadness, which prompts some listeners to “thrust their
way out onto the dance floor, sobbing loudly, and, brandishing a resin torch, strike and
burn the drum whose sound moved them so deeply” (Feld 1986: 93). For this to happen,
the listeners need to really experience the sensation of a bird reflecting the spirit of a
dead child calling for his father from inside the drum. Referring to Boyer’s analysis of
“supernatural” beings, one could say that this experience is “counterintuitive” or
“counterontological” for the Kaluli (Boyer 2001: 64 sqq.). They are perfectly aware that
there is no actual bird inside the drum. The bird at which they aim their torch in despair
is a purely auditory object, a sonic being. The Kaluli experience of ilib drumming bears
the mark of enchanted listening, because it causes, or results from, ontological shifts and
the reframing of human/non-human agencies.
44 Enchanted listening does not actually require articulate cosmologies. “Movement”, for
instance, is a basic metaphor for sound processes throughout Western music (e.g. “rise”
and “fall” in a tune, rhythmic “swing”, “walking” bass).
“Buildup” and “breakdown” in a trance music track, analyzed in Butler 2006: 315.
Abbreviations: bass drum (BD), riff 1 (R1), snare drum 2 (SD2), snare drum 3b (SD3b), riff 2 (R2).
Musical excerpt from Communication (Somebody Answer The Phone) by Mario Più, 1999, Incentive –
CENT2T.
45 It is nevertheless an enchanted experience. The loudspeakers stand still, the band
members on stage stand still, but something moves in the music. Clarke (2001) has argued
that musical “movement” is best understood as a truly perceptual phenomenon, rather
than as a figure of speech (see also Becker 2010). Clear signs indicate that people experience
it, even without their verbalizing it. The “movement” in what they hear sets them in
motion; they dance, bounce their heads, tap their feet … As with ilib drumming, the
ontological shift in sound is manifested by a vocabulary, but also by a complex net of
interactions, which would be absurd without it. It should be noted that one cannot dance
to indexes or to structures: they have no “movement” to communicate. Contagious
auditory movements operate from a time-space of their own.
46 They can be personified too. The bourgeois salon is in this respect no less a stage for
supernatural encounters than more “exotic” rituals. Here is a famous introspection by
Marcel Proust into his fictional Swann’s mind:
“With a slow and rhythmical movement she led him here, there, everywhere, towards a state
of happiness, noble, unintelligible, yet clearly indicated. And then, suddenly having reached
a certain point from which he was prepared to follow her, after pausing for a moment,
abruptly she changed her direction, and in a fresh movement, more rapid, multiform,
melancholy, incessant, sweet, she bore him off with her towards a vista of joys unknown.
Then she vanished. He hoped, with a passionate longing, that he might find her again, a
third time”. (Proust 1987: 327, adapted from the translation by Scott Moncrieff 1922)
47 This is about a musical phrase, or more precisely, about its relation to Swann. Although
the translation by Moncrieff uses the inanimate “it”, which seems logical in English, this
pronoun does not exist in French. A “little phrase” (petite phrase) is feminine, and Proust
originally writes about elle (her), which makes the description even more ambiguous. Out
of context, the reader could well believe that Swann encounters an actual female person.
This is intentional, and over the course of the novel, the petite phrase, which always
appears personified, becomes a sort of emblem for Swann’s relation with Odette de Crécy.
48 It probably takes Proust’s mastery to verbalize sound impressions with such delicacy. On
the other hand, the fact that most readers understand his description suggests that this
kind of interaction is not completely foreign to them. Watt and Ash (1998) have shown
that British listeners are prone to relate musical excerpts to traits of personality such as
age, gender and emotional states. Once you admit that auditory objects can be “female”,
“move” and have affects of “their own”, it is indeed easy to see how they might
“charm” a listener like Swann. Or, if you admit that a drum can “speak” like a bird, and
that a bird can reflect a dead spirit calling “father” (which is admittedly more
complicated for European minds), it becomes understandable that one could relate
emotionally to the pulsating sound of the Kaluli ilib drum.
49 Enchanted listening implements collective auditory schemes pertaining to distinctly
auditory ontologies. The ontologies are collective in the sense that they are shared by a
significant number of people, allowing for interactions and mutual understandings. Only
the listening mode however – the fact of switching to a specifically auditory ontology –
can be considered universal.
50 To conclude this section, let us summarize how enchanted listening differs from the other
two modes. Indexical listening is the “default” mode. Its objects are auditory aspects of
non-auditory beings or events. The objects of structural listening are hardly auditory at
all: the whole process is oriented towards abstract patterns. Enchanted listening is the
only mode where auditory objects stand by themselves. This entitles them to a distinct
mode of identification (neither indexes nor structures will do) and a distinct mode of
relation (auditory objects relate directly to other auditory objects): in other words, an
ontology which exists only in audition.
51 One of the interesting properties of such an ontology is that it brings new beings to social
interactions. Contrary to indexes or abstract structures, the objects of enchanted
listening do not refer us to something else. They instantiate other beings here and now.
Such beings materialize into sound and literally “possess” the acoustic vibrations (see
Sartre 1940: 165 for an equivalent in images). Western musical theory also has a
competing interpretation of a process similar to what we call enchantment. Following
Pierre Schaeffer, the severing of indexical ties, together with a disregard for abstract
meanings, has been described as a “reduced” form of listening (Chion 1983: 32 sqq., 1994;
Schaeffer 1966, Chap. 15 and 20). In Schaeffer’s terms, the listening is “reduced” because
it puts the world into “parenthesis”. This is supposed to lead to an objectivation of sound
“for its own sake” (Chion 1983: 33). We do not contest that this might be what some
listeners are seeking. But what Schaeffer seems to have missed is that when auditory
experience is apprehended “for itself”, the outcome is often a transient irruption of new
things and beings into human interactions. All over the world, ethnography shows the
actual opposite of a “reduction”: enchanted listening practised as a path to augmented
social realities.

Discussion of this distinction and its relevance for anthropology
52 After presenting this model in various seminars and conferences, the authors have
gathered many insightful remarks and objections. Some of these were integrated above,
in the proposal itself. Some other frequently asked questions are addressed here. The
most important one – regarding the usefulness of such a model – is discussed extensively
in the conclusion.
Q: How can you say that these ways of hearing are universal, without a systematic review
of all available ethnographies?
→ A claim that something is “universal” can only be proven false (Popper 1959).
Therefore, we propose a hypothesis, under a form which can be tested and falsified. If
there exists at least one human society where at least one of our A/B/C alternatives is
not an option, our hypothesis is false. Until now, we have been unable to find such a
society across any of the ethnographies we are aware of. Posture A (indexical listening)
is obviously available to all mammals endowed with hearing: all of them are able to
infer the presence of something or the occurrence of an event from the sound it makes.
Posture B (structural listening) is demonstrated each time humans use systems of
sounds with double articulation to convey meaning. Such systems exist in all human
societies, and we usually refer to them as “language”. We have shown that structural
listening is not used only for “language”, and we are also aware that “language” can
involve far more processes than structural listening. Nevertheless, the existence of
“language” in any human society is a strong indicator that posture B (structural
listening) is a shared human capacity. The real question is whether posture C
(enchanted listening) is practised in all human societies. A society without posture C
would be one where sounds cannot have unhearable properties (colour, texture or
depth, for example), are never connected to each other by causal laws, move only if
their source moves physically, and of course cannot be endowed with autonomous
agencies. Does such a society exist? We have at least been unable to find it.
Q: Why are there only three alternatives?
→ Each of the three listening postures described here could be broken down further
into subcategories. A fourth alternative would not be a subcategory, but rather a
posture operating at the same level of generality, and incompatible with the three
postures already described. We are not aware of such an alternative, although we
cannot exclude that it exists. If it exists, it remains to be seen whether it is observable
in all human societies. Our attempt here is not to address all the ways of listening, but
only those which are available to all human beings.
Q: Isn’t listening to language distinct from structural listening? You cite Saussure and
Martinet who insisted on the peculiarity of double articulation, compared to other sound
systems. Wouldn’t double articulation deserve a fourth alternative?
→ Linguistic systems, in the sense of Saussure, are indeed special among other sound
systems because they articulate two systems of value: an utterance can be broken into
semantic units, which can further be broken down into non-semantic ones. Each kind of
unit has value only through its opposition to the other units of its kind. This double
articulation has important consequences for the system (Martinet 1965: 8 sqq.), which
set human languages apart from other sound systems like musical scales. Nevertheless,
this is a special property of an object. It doesn’t imply that the listening mode which
apprehends it is special too. In our view, parsing acoustic data for oppositional units is
a distinct listening mode, whether the underlying system is articulated at one or two
levels.
Q: Are you sure that A/B/C modes of listening are alternative? Could they not be used in
conjunction too?
→ The question is whether a person can focus on the same auditory streams in
different ways at the same time. There are indeed sound activities which are defined,
culturally, as having an interest for several modes of listening. “Songs”, for instance,
are vocal productions whereby Western listeners deem it interesting to focus on the
voice and its melody just as much as on the lyrics. Cognitive experiments have yielded
contrasting evidence, showing that, in the perception of sung words, pitch and semantic
content were treated either independently of each other (e.g. Besson et al. 1998; Bonnel
et al. 2001), or as an integrated percept (Gordon et al. 2010; discussion in Schön et al.
2005). The main problem here is that these studies assess only how listeners identify
words, pitch patterns and semantic or syntactic incongruities. These are all instances of
the same listening mode (i.e. structural). To answer our question, an experiment should
investigate, for example, the interplay of structural listening with the particular
ontologies of the enchanted mode. To the best of our knowledge this has not yet been
tried. There are some indications, however, regarding indexical vs structural
processing. Vitevitch (2003) asked people to repeat a list of words spoken by a
recorded voice; the recorded speaker was changed midway through the list, but only
half of the participants noticed the change. In a more realistic simulation of a phone
conversation, Fenn et al. (2011) found even lower detection rates (only 6% of their participants
noticed that the interlocutors at the other end of the line had changed). People could of
course reliably detect the changes if cued to do so, or if the differences between the
voices were overtly salient (Fenn et al. 2011). These findings show that it is certainly
possible to use indexical and structural listening in a conversation, but that the normal
mode of attending to someone speaking is structural. This sends to the background
significant acoustic features which would have been relevant in the indexical mode.
Such clues prompted us to present the listening modes as alternatives. This is in line
with our general definition of listening, which implies selective focus. However, since
attentional processes can, to some extent, be allocated in parallel (Bonnel et al. 2001;
Cohen et al. 2012; Demany et al. 2015), we do not make a strong claim about an essential
incompatibility between the three modes.
Q: What about the sound producers? Where do they fit into your system?
→ We set aside the logic and postures of sound production as a different topic. In our
view, sound producers can adopt any of the three listening postures mentioned above.
For example, in a logic of “music” production, the producers are simultaneously
primary listeners of their own sound. They can do it in an enchanted or a structural
posture, depending on the moment and the kind of music (whether they improvise or
play from a score, for example). In “speech”, on the other hand, people hardly ever
listen to their own voice. Instead, they usually concentrate on the meanings they wish
to convey. These examples illustrate that listening postures are only loosely correlated
with sound production postures. The latter would deserve a study of their own.

Q: What exactly do we gain from this theory? After all, we already knew that daily sounds,
language and music were different. Is your demonstration not just a sophisticated way of
rediscovering the wheel?
→ Our search for listening postures started a few years ago, precisely from our
repeated frustrations with concepts such as “sound”, “language” and “music”. We
found that these concepts were used inconsistently in current anthropology and
ethnomusicology alike. Definitions were seldom given, or, when given, seldom followed
(see Ingold’s remark on the “sound” in “soundscape”, for example). We realized that
adding yet another definition for these concepts was not the way to go. Instead, we
looked for what people did with sounds, and what that revealed about their auditory
experience. One crucial fact, for which we tried to maintain a central place throughout
our proposal, is that human audition is never entirely constrained by the outside world.
The same vibrations become alternative kinds of things in audition, depending on the
posture adopted by the listener. The differences are ontological. They affect which
things exist in audition and what properties they have, including their interactional
affordances. We believe that this is an interesting and possibly new way of framing
auditory experiences in anthropology. One of the problems our distinction could
address is the relevance and the extension of the concept of “music”.
In place of a conclusion: Music, an old anthropological challenge
53 As mentioned above, many languages do not have a word to encompass the same
activities as the English term “music” (Nettl 2001). It is therefore not an option to “let the
informants decide” on what they consider “music”. Anthropologists have to choose
between either following local vocabularies of sound activities, or seeking an overarching
definition in order to bypass linguistic differences.
54 The first path (advocated most notably by Feld 1994, 2004, and de Menezes Bastos 2013)
has the advantage of sticking closely to the relations drawn by indigenous people. Words,
birds, waterfalls, rattles, chanting … all of these can be linked and traversed fluidly, as no
prior concepts or distinctions are brought in by the researcher. A problem arises,
however, if one attempts to compare such descriptions cross-culturally. “Sound” is no
better a concept for these comparisons than “music”. In its acoustic definition (pressure
waves propagated from a source and sensed by the ears), it is simply a medium, akin in
this respect to light. Hence Ingold’s critique, quoted at the beginning of this article: to
look at the way people relate to their surroundings in “sound” (as a medium) would be
just as insipid as to look for their relations in “light”. We have shown that the
phenomenal objects of auditory experience can be of very different kinds, and that their
distinction has little to do with the pressure waves themselves. Moreover, we argue that
universal alternatives exist in listening. If this is true, researchers can investigate and
cross-culturally compare something somewhat more precise than relations in “sound”.
55 Many researchers actually already do this by following the second path mentioned above.
Throughout anthropological and ethnomusicological literature, “music” is used to refer
to a general human ability, comparisons are made between “musics” of various societies,
researchers gather for conferences about “music”, and publish in journals about “music”.
Outside the specific domain of ethnomusicology, in anthropology at large, the use of
“music” and related terms like “singing” goes largely unquestioned. They appear
commonly, for example, in ethnographic descriptions of rituals, parties and ceremonial
gatherings, with no real discussion of their relevance to local categories of sound
production. In ethnomusicology, Blacking’s definition is still pivotal in any discussion of
the topic.
56 At the core of Blacking’s pioneering work stands the idea that music is “humanly
organized sound” (Blacking 1973: 3). Blacking’s definition constituted a powerful scientific
proposal in its own time, compounded by a political stance: that music was indeed a
shared human capacity (rather than a “gift”, which some people had and others not). But
problems arise when attempting to specify how to understand “organized”. Indeed, the
definition should automatically encompass not only music, but also language. Most
researchers, however, consider these to be distinct, and an extensive bibliography strives
to map the relations of music and language, therefore implying their heterogeneity (see
Feld & Fox 1994 for a general overview, and Levman 1992 for a discussion specifically
related to Blacking’s definition).
57 It is likewise debatable whether “humanity” binds together an interesting set of
phenomena. Some animals are known to produce sounds which are neither
predetermined genetically nor clearly functional from a semiotic point of view. On these
grounds, some researchers suggest that “music” is a biological ability shared with other
species (Keller 2012; Mâche 1992; Martinelli 2009).
58 On yet another line, anthropologists have challenged the centrality of “humans” in the
understanding of social interactions in general (Ingold 2011, and Latour 2005, among
others; for ethnomusicology see Brabec de Mori & Seeger 2013). Considering the auditory
realm in this perspective, the ontological coherence of “humanly organized sounds”
cannot be taken for granted. From an ethnographic point of view, it is actually quite
common for humans to locate musical agency outside the human realm. Examples range
from Amazonian shamanic cures to Melanesian medium performances, or to rock and
hip-hop musicians driven by a “flow”. Computer music adds to this array, because
algorithms organizing sounds can only ambiguously be understood as “human”. The
“humanly organized sound” conception overlooks the variety of these interactions in an
effort to trace back all agencies to human beings (see also Brabec de Mori in print).
59 Notwithstanding its shortcomings, Blacking’s elegant definition has grounded many
ethnomusicological endeavours. Slight alternatives have been suggested, like “human
sound communication outside the scope of spoken language” (Nettl 2005: 25), or “sounds
produced and organized by a culture” (Nattiez 1987: 95). All such definitions frame music
as an object, rather than a process. In a thorough critique of this conception, Hennion
(2007) proposed to understand music as an emergent phenomenon. The epistemic
consequence is to no longer study the music, but rather how it is fleshed out from a
multitude of social interactions. This approach is, however, incompatible with the idea
that music could be something universal (Hennion 2007: 351). There is indeed little
likelihood that the same social processes give rise to the same emergent phenomenon in
different societies. Understanding music as a social process takes us back to following the
variety of indigenous wordings, with no overarching concept to enable comparisons.
60 The approach we suggest bypasses many of the problems outlined above. To begin with,
we do not need to define “music”. We identified three alternative ways of relating to
sounds. One of them (enchanted listening) is often triggered by the kind of sound
activities that are subsumed under the term “music”. We have shown above that “music”
can also be apprehended through the other two listening modes. But, if there is anything
specific to it, it is probably due to its privileged link with auditory enchantment.
61 We believe that the general properties we described for this way of listening – the
ontological shift and the mapping of agency onto the sound realm – can account for many
effects attributed to “music”. Specific agencies operate in “music” because the things it is
made of have particular ontological properties. But we can intentionally switch back and
forth between this and other ways of listening. In other words, the enchanted alternative
is always (just) an option. As with the other alternatives, it is adopted by an individual,
often according to culturally formed suggestions. One of these suggestions could be the
ontological category of the sound producer – enchanted listening is linked to human
sound productions in some societies – although it cannot be considered a universal
condition. The same holds for criteria such as organization. We do not see these as
constituting a specific kind of object or activity (“music”). We see them as cultural
determinants that tend to orientate the listeners in given contexts towards specific ways
of listening.
62 Music is not universal, in any sense of the word. But enchanted listening is, as a capacity
to consider a distinct realm where sounds interact primarily with each other. If this is
true, we should also question the implicit assumption that what people describe as
colours, movements or beings in sound are “in the end” frequencies, amplitudes and
spectral components of air waves. It should be possible to take people seriously and give a
positive empirical status to the enchanted things and beings that appear at times in their
auditory experiences.
BIBLIOGRAPHY
ALAIN CLAUDE, STEPHEN R. ARNOTT and others, 2000.
“Selectively attending to auditory objects,” Frontiers in Bioscience no. 5, D202–D212.
AMY DE LA BRETÈQUE ESTELLE, 2013.

Paroles mélodisées. Récits épiques et lamentations chez les Yézidis d’Arménie, Paris, Classiques Garnier.
BECKER JUDITH, 2004.

Deep listeners. Music, emotion and trancing, Bloomington, Indiana University Press.
BECKER JUDITH, 2010.

“L’action-dans-le-monde. Émotion musicale, mouvement musical et neurones miroirs,” Cahiers
d’ethnomusicologie no. 23, pp. 29–52. Available online: ethnomusicologie.revues.org/961 [last
accessed November 2017].
BESSON MIREILLE, FRÉDÉRIQUE FAITA, ISABELLE PERETZ and others, 1998.

“Singing in the brain: Independence of lyrics and tunes,” Psychological Science no. 9/6, pp. 494–98.
BLACKING JOHN, 1973.

How musical is man?, Washington, University of Washington Press.

BONNEL ANNE-MARIE, FRÉDÉRIQUE FAITA, ISABELLE PERETZ & MIREILLE BESSON, 2001.
“Divided attention between lyrics and tunes of operatic songs: Evidence for independent
processing,” Perception & Psychophysics no. 63/7, pp. 1201–13.
BOYER PASCAL, 2001.

Religion explained: The evolutionary origins of religious thought, New York, Basic Books.
BRABEC DE MORI BERND, 2012.

“About magical singing, sonic perspectives, ambient multinatures, and the conscious
experience,” Indiana, no. 29, pp. 73–101.
BRABEC DE MORI BERND & ANTHONY SEEGER, 2013.

“Introduction: Considering Music, Humans, and Non-humans,” Ethnomusicology Forum no. 22,
pp. 269–86.
BRABEC DE MORI BERND, in print.

Music and Non-human Agency. In Ethnomusicology. A Contemporary Reader, Vol. II, Jennifer Post (ed.),
New York/London, Routledge.
BREGMAN ALBERT S., 1994.

Auditory Scene Analysis: The Perceptual Organization of Sound, Cambridge, Mass., MIT Press.
BREGMAN ALBERT S., 2013.

“Three directions in research on auditory scene analysis,” Proceedings of Meetings on Acoustics
no. 19, 010021.
BUTLER MARK J., 2006.

Unlocking the groove. Rhythm, Meter and Musical Design in Electronic Dance Music, Bloomington/
Indianapolis, Indiana University Press.
CAPORELLO BLUVAS EMILY & TIMOTHY Q. GENTNER, 2013.

“Attention to natural auditory signals,” Hearing Research no. 305, pp. 10–8.
CASATI ROBERTO & JEROME DOKIC, 2005.

“Sounds”, in Stanford Encyclopedia of Philosophy. Available online: www.science.uva.nl/~seop/
entries/sounds/.
CASTELLENGO MICHÈLE, 2015.

Écoute musicale et acoustique: avec 420 sons et leurs sonagrammes décryptés, Paris, Eyrolles.
CHERRY COLIN E., 1953.

“Some experiments on the recognition of speech, with one and with two ears,” Journal of the
Acoustical Society of America, no. 25/5, pp. 975–9.
CHION MICHEL, 1983.

Guide des objets sonores. Pierre Schaeffer et la recherche musicale (trans. C. North & J. Dack), Paris,
Buchet Chastel/Institut national de la communication audiovisuelle. Available online:
monoskop.org/log/?p=536 [last accessed November 2017].
CHION MICHEL, 1994.

“The three listening modes,” in Audio-Vision, C. Gorbman (trans.), New York, Columbia University
Press, pp. 25–34.
CLARKE ERIC, 2001.

“Meaning and the Specification of Motion in Music,” Musicae Scientiae, no. 5/2, pp. 213–34.

CLAYTON MARTIN, 2001.

“Introduction: Towards a Theory of Musical Meaning (In India and Elsewhere),” British Journal of
Ethnomusicology, no. 10, pp. 1–17.
COHEN MICHAEL A., PATRICK CAVANAGH, MARVIN M. CHUN & KEN NAKAYAMA, 2012.
“The attentional requirements of consciousness,” Trends in Cognitive Sciences no. 16/9, pp. 411–7.
DEMANY LAURENT, MAYALEN ERVITI & CATHERINE SEMAL, 2015.

“Auditory attention is divisible: Segregated tone streams can be tracked simultaneously,” Journal
of Experimental Psychology: Human Perception and Performance, no. 41, pp. 356–63.
DESCOLA PHILIPPE, 2013.

Beyond nature and culture, trans. Janet Lloyd, Chicago/London, The University of Chicago Press
[French ed. 2005, Par-delà nature et culture, Paris, Gallimard].
FARAGO TAMÁS, ATTILA ANDICS, VIKTOR DEVECSERI and others, 2014.

“Humans rely on the same rules to assess emotional valence and intensity in conspecific and dog
vocalizations,” Biology Letters no. 10/1, 20130926.
FELD STEPHEN, 1981.

“‘Flow like a Waterfall’: The Metaphors of Kaluli Musical Theory,” Yearbook for Traditional Music
no. 13, pp. 22–47.
FELD STEPHEN, 1986.

“Sound as a Symbolic System: The Kaluli Drum,” in David P. McAllester & Charlotte Frisbie (eds.),
Explorations in Ethnomusicology: Essays in Honor of David P. McAllester, Detroit, Information
coordinators, vol. 9, pp. 147–58.
FELD STEPHEN, 1990.

Sound and Sentiment. Birds, Weeping, Poetics, and Song in Kaluli Expression, Philadelphia, University of
Pennsylvania Press [2nd ed.].
FELD STEPHEN, 1994.

“From Ethnomusicology to Echo-Muse-Ecology,” The Soundscape Newsletter no. 8, pp. 9–13.
FELD STEPHEN, 2000.

“Sound Worlds,” in Patricia Kruth & Henry Stobart (eds.), Sound, Cambridge/New York,
Cambridge University Press, pp. 173–200.
FELD STEPHEN & DONALD BRENNEIS, 2004.

“Doing anthropology in sound,” American Ethnologist, no. 31/4, pp. 461–74.
FELD STEPHEN & AARON FOX, 1994.

“Music and Language,” Annual Review of Anthropology, no. 23, pp. 25–53.
FENN KIMBERLY M., HADAS SHINTEL, ALEXANDRA S. ATKINS and others, 2011.
“When less is heard than meets the ear: Change deafness in a telephone conversation,” The
Quarterly Journal of Experimental Psychology, no. 64/7, pp. 1442–56.
GAY LESLIE C., 1998.

“Acting up, Talking Tech: New York Rock Musicians and Their Metaphors of Technology,”
Ethnomusicology, no. 42/1, pp. 81–98.
GELL ALFRED, 1988.

“Technology and Magic,” Anthropology Today no. 4/2, pp. 6–9.

GELL ALFRED, 1992.

“The Technology of Enchantment and the Enchantment of Technology,” in J. Coote & A. Shelton
(eds.), Anthropology, Art and Aesthetics, Oxford, Oxford University Press, pp. 40–63.
GELL ALFRED, 1996.

“Vogel’s Net. Traps as Artworks and Artworks as Traps,” Journal of Material Culture, no. 1, pp. 15–
38.
GELL ALFRED, 1998.

Art and Agency. An Anthropological Theory, Oxford, Clarendon Press.
GELL ALFRED, 2006.

“Parfum, symbolisme et enchantement,” Terrain, no. 1, pp. 19–34.
GIBBS RAYMOND W., Jr. & HERBERT L. COLSTON, 2007.

“The Future of Irony Studies,” in R. W. J. Gibbs & H. L. Colston (eds.), Irony in language and thought: a
cognitive science reader, New York/London, Taylor & Francis, pp. 581–93.
GIBSON JAMES J., 1977.

“The Theory of Affordances,” in Robert Shaw & John Bransford (eds.), Perceiving, Acting, and
Knowing: Toward an Ecological Psychology, Hillsdale, Lawrence Erlbaum, pp. 67–82.
GIBSON JAMES J., 1986.

The Ecological Approach to Visual Perception, London, Taylor and Francis.
GORDON REYNA L., DANIELE SCHÖN, CYRILLE MAGNE and others, 2010.
“Words and Melody Are Intertwined in Perception of Sung Words: EEG and Behavioral Evidence,”
PLoS ONE, no. 5, e9889.
HENNION ANTOINE, 2007.

La passion musicale. Une sociologie de la médiation, Paris, Métailié [2nd ed.].
HIRSCHKIND CHARLES, 2004.

“Hearing Modernity: Egypt, Islam, and the Pious Ear,” in Veit Erlmann (ed.), Hearing Cultures. Essays on
Sound, Listening and Modernity, Oxford/New York, Berg, pp. 131–52.
HUSSERL EDMUND, 1964.

Leçons pour une phénoménologie de la conscience intime du temps, trans. Henri Dussort, Paris, PUF.
HUTCHINS EDWIN, 1995.

Cognition in the Wild, Cambridge/London, MIT Press.
INGOLD TIM, 2007.

“Against soundscape,” in Angus Carlyle (ed.), Autumn leaves: Sound and the environment in artistic
practice, Paris, Double Entendre, pp. 10–3. Available online: lajunkielovegun.com/
AcousticEcology-11/AgainstSoundscape-AutumnLeaves.pdf, last accessed November 2017.
INGOLD TIM, 2011.

Being alive: Essays on movement, knowledge and description, London/New York, Routledge.
IRSIK VANESSA C., CHRISTINA M. VANDEN BOSCH DER NEDERLANDEN & JOEL S. SNYDER,
2016.
“Broad attention to multiple individual objects may facilitate change detection with complex
auditory scenes,” Journal of Experimental Psychology: Human Perception and Performance no. 42,
pp. 1806–17.

KELLER MARCELLO S., 2012.

“Zoomusicology and Ethnomusicology: A marriage to Celebrate in Heaven,” Yearbook for
Traditional Music, no. 44, pp. 166–83.
KELLER PETER & CATHERINE STEVENS, 2004.

“Meaning From Environmental Sounds: Types of Signal-Referent Relations and Their Effect on
Recognizing Auditory Icons,” Journal of Experimental Psychology: Applied, no. 10/1, pp. 3–12.
LAKOFF GEORGE & MARK JOHNSON, 1980.

Metaphors We Live By, Chicago/London, University of Chicago Press.
LATOUR BRUNO, 2005.

Reassembling the social: An introduction to actor-network-theory, Oxford/New York, Oxford University
Press, Clarendon lectures in management studies.
LEVMAN BRIAN G., 1992.

“The Genesis of Music and Language,” Ethnomusicology, no. 36/2, pp. 147–70.
MÂCHE FRANÇOIS B., 1992.

Music, myth, and nature, or, The Dolphins of Arion, Chur, Switzerland/Philadelphia, Harwood
Academic Publishers.
MARTINELLI DARIO, 2009.

Of birds, whales, and other musicians: An introduction to zoomusicology,
Scranton/Chicago, IL, University of Scranton Press.
MARTINET ANDRÉ, 1965.

La linguistique synchronique. Études et recherches, Paris, PUF.
MARTINET ANDRÉ, 1991.

Éléments de linguistique générale, Paris, Armand Colin.
MELLO MARIA I. C. & ACÁCIO T. DE C. PIEDADE, 2005.

“Diferentes escutas do espaço: hipóteses sobre o relativismo da percepção e o caráter espacial da
audição,” Simpósio Internacional de Cognição e Artes Musicais. Curitiba: Editora do Departamento de
Artes da UFPR. MENEZES, Maria Lúcia Pires, pp. 219–48.
MENEZES BASTOS RAFAEL J. DE, 1999.

A musicológica kamayurá: para uma antropologia da comunicação no Alto-Xingu, Florianópolis, Editora
da UFSC.
MENEZES BASTOS RAFAEL J. DE, 2013.

“Apùap World Hearing Revisited: Talking with ‘Animals’, ‘Spirits’ and other Beings, and Listening
to the Apparently Inaudible,” Ethnomusicology Forum, no. 22, pp. 287–305.
NATTIEZ JEAN-JACQUES, 1987.

Musicologie générale et sémiologie, Paris, Christian Bourgois.
NETTL BRUNO, 2001.

“An Ethnomusicologist Contemplates Universals in Musical Sound and Music Cultures,” in Nils L.
Wallin, Björn Merker & Steven Brown (eds.), The origins of music, Cambridge, MIT Press, pp. 463–
72.
NETTL BRUNO, 2005.

The Study of Ethnomusicology. Thirty-one Issues and Concepts, Urbana, University of Illinois Press.
OCKELFORD ADAM, 1991.

“The Role of Repetition in Perceived Musical Structures,” in Peter Howell, Robert West & Ian
Cross (eds.), Representing Musical Structure, London/San Diego/New York, Academic Press,
pp. 129–159.
OCKELFORD ADAM, 2004.

“On similarity, derivation and the cognition of musical structure,” Psychology of Music, no. 32/1,
pp. 23–74.
PEIRCE CHARLES S., 1992.

“What Is a Sign?,” in Nathan Houser, André de Tienne, Jonathan R. Eller and others (eds.), The
Essential Peirce: Selected Philosophical Writings. Volume 2 (1893–1913), Bloomington, Indiana
University Press, pp. 4–10.
PIEDADE ACÁCIO T. DE C., 2013.

“Flutes, Songs and Dreams: Cycles of Creation and Musical Performance among the Wauja of the
Upper Xingu (Brazil),” Ethnomusicology Forum, no. 22/3, pp. 306–22.
PILCHER JUNE J., KRISTEN S. JENNINGS, GINGER E. PHILLIPS & JAMES A. MCCUBBIN, 2016.
“Auditory Attention and Comprehension During a Simulated Night Shift: Effects of Task
Characteristics,” Human Factors, no. 58/7, pp. 1031–43.
POPPER KARL R., 1959.

The logic of scientific discovery, New York, Hutchinson & Co.
PROUST MARCEL, 1987.

Un amour de Swann, Paris, Flammarion [American ed., 1922, Swann’s Way. Remembrance Of Things
Past, C.K. Scott Moncrieff (trans.), New York, Henry Holt and Company].
RACY ALI J., 1988.

“Sound and society: The ‘takht’ music of early twentieth-century Cairo,” in James Porter & Ali
J. Racy (eds.), Selected Reports in Ethnomusicology, Los Angeles, Dept. of Ethnomusicology, UCLA, no. 7,
pp. 139–70.
RICE TIMOTHY, 2001.

“Reflections on Music and Meaning: Metaphor, Signification and Control in the Bulgarian Case,”
British Journal of Ethnomusicology, no. 10/1, pp. 19–38.
SACKS OLIVER, 2008.

Musicophilia: Tales of Music and the Brain, Revised and Expanded Edition, London, Picador.
SAFFRAN JENNY R. & GREGORY J. GRIEPENTROG, 2001.

“Absolute pitch in infant auditory learning: Evidence for developmental reorganization,”
Developmental Psychology, no. 37/1, pp. 74–85.
SARTRE JEAN-PAUL, 1940.

L’imaginaire. Psychologie phénoménologique de l’imagination, Paris, Gallimard.
SAUSSURE FERDINAND DE, 1916.

Cours de linguistique générale, Paris, Payot.
SCHAEFFER PIERRE, 1966.

Traité des objets musicaux: Essai interdisciplines, Paris, Le Seuil.
SCHÖN DANIELE, REYNA L. GORDON & MIREILLE BESSON, 2005.

“Musical and Linguistic Processing in Song Perception,” Annals of the New York Academy of Sciences,
no. 1060, pp. 71–81.
SCRUTON ROGER, 1997.

The Aesthetics of Music, Oxford, Clarendon Press.

SHAMMA SHIHAB A., MOUNIA ELHILALI & CHRISTOPHE MICHEYL, 2011.

“Temporal coherence and attention in auditory scene analysis,” Trends in Neurosciences, no. 34/3,
pp. 114–123.
SHILOAH AMNON, 1997.

“Music and Religion in Islam,” Acta Musicologica, no. 69/2, pp. 143–55.
SNYDER JOEL S., MELISSA K. GREGG, DAVID M. WEINTRAUB & CLAUDE ALAIN, 2012.
“Attention, Awareness, and the Perception of Auditory Scenes,” Frontiers in Psychology, no. 3.
Available online: ncbi.nlm.nih.gov/pmc/articles/PMC3273855/, last accessed November 2017.
SOLOMOS MAKIS, 1999.

“Schaeffer phénoménologue,” in Ouïr, entendre, écouter, comprendre après Schaeffer, Paris, Buchet
Chastel/INA-GRM, pp. 53–67. Available online: univ-montp3.fr/~solomos/Schaeff.html, last
accessed 8 February 2017.
SOURIAU ÉTIENNE, 2009.

Les différents modes d’existence, Paris, PUF.
SUPPER ALEXANDRA, 2014.

“Sublime frequencies: The construction of sublime listening experiences in the sonification of
scientific data,” Social Studies of Science, no. 44, pp. 34–58.
URBAN GREG, 1988.

“Ritual Wailing in Amerindian Brazil,” American Anthropologist, no. 90, pp. 385–400.
TAYLOR HOLLIS, 2010.

“Blowin’ in Birdland: Improvisation and the Australian pied butcherbird,” Leonardo Music Journal,
no. 20, pp. 79–83.
VITEVITCH MICHAEL S., 2003.

“Change deafness: The inability to detect changes between two voices,” Journal of Experimental
Psychology: Human Perception and Performance, no. 29/2, pp. 333–342.
WALKER HARRY, 2010.

“Soulful voices: birds, language and prophecy in Amazonia,” Tipití, no. 8/1, article 1. Available
online: digitalcommons.trinity.edu/cgi/viewcontent.cgi?article=1111&context=tipiti, last
accessed November 2017.
WATT ROGER J. & ROISIN L. ASH, 1998.

“A psychological investigation of meaning in music,” Musicae Scientiae, no. 2/1, pp. 33–53.
Available online: journals.sagepub.com/doi/abs/10.1177/102986499800200103, last accessed
November 2017.
ZOBEL BENJAMIN H., RICHARD L. FREYMAN & LISA D. SANDERS, 2015.

“Attention is critical for spatial auditory object formation,” Attention, Perception, & Psychophysics,
no. 77/6, pp. 1998–2010.
NOTES
1. We are particularly grateful to the participants in the workshop “Sonic beings? The
ontologies of musical agency”, which we convened at the EASA Conference 2012. At the
Research Centre for Ethnomusicology in Nanterre (CREM-LESC/CNRS/UMR 7186) and the
Institute of Ethnomusicology in Graz (University of Music and Performing Arts), many
colleagues and students helped us shape our argument over the years. We received
important intellectual contributions from Estelle Amy de la Bretèque, Emmanuel de
Vienne, Matei Candea, Malik Sharif and Thibaud Aimard-Kesraoui, who reviewed in detail
and discussed with us preliminary versions of this text. We are also grateful to the
anonymous reviewers who expressed helpful comments on our proposal.
2. Attention plays a crucial role at later stages of auditory scene analysis. It actually also
modulates “from the top down” some very early processes of stream segregation
(Caporello Bluvas & Gentner 2013; Zobel et al. 2015).
3. This may seem similar to (and is probably inspired by) Pierre Schaeffer’s discussion of
preobjective modes of listening (Schaeffer 1966: 113 sqq.). Our study differs in method
and ethnographic coverage, but the most important distinction will perhaps appear in
relation to “enchanted listening”. In our analysis, the suspension of indexical and
structural/semantic interpretations is not a “reduced” listening, as Schaeffer posits, but
rather the “augmented” experience of a new auditory realm. We agree with Solomos
(1999) in his argument that Schaeffer did not actually consider the dissolution of the
“sound object” into distinct ontologies dependent on the listener’s system of knowledge
and intentions.
4. “The affordances of the environment are what it offers the animal, what it provides or
furnishes, either for good or ill” (Gibson 1986: 127). As summarized by Gibson himself, the
core of his thesis is that “the composition and layout of surfaces constitute what they
afford. If so, to perceive them is to perceive what they afford.” In other words, appraising
action possibilities does not occur after perception but right within it. A liquid surface, for
example, is “sink-into-able” for heavy mammals but “stand-on-able” for water bugs.
Mammals and bugs never actually perceive the same surface.
5. This is true as long as the extension is within the range of possibility: compare the
string quartet recording with a progressive rock album, where “stereo effects” are
employed, so that, for example, the guitar solo circles around the listener or the drums
jump from right to left: in that case, the upstream inference of indexical listening would
invoke a space with flying guitarists and teleporting drums. This space is not possible,
because it cannot exist without changing the ontological properties of reality (on
possibility as a mode of existence, see Souriau 2009: 134 sqq.). If such ontological shifts in
space occur, we are confronted with another kind of auditory space that is explored in
detail in the section about enchanted listening.
6. This competence probably needs learning. Babies, for example, are initially more
sensitive to vocal pitch than adults. They must learn to lose some of this sensitivity in
order to acquire language (Sacks 2007: 138; Saffran & Griepentrog 2001).
7. Sacks (2007: 182) relates the following discussion with Michael Torke, a “true
synaesthete” (who happens to be a composer). Torke explained to Sacks that he vividly
saw the colour blue when he heard a D-major chord. Sacks asked Torke what would
happen if he listened to D-major when looking at a yellow wall. Would he see green?
Torke’s answer was negative: both the musical and the visual colours were “true” colours
for him, but they would not mix together. This indicates that even for “true
synaesthetes”, auditory colours remain distinct from optical ones.
8. To illustrate: “a flute, no less than an axe, is a tool, an element in a technical sequence;
but its purpose is to control and modify human psychological responses in social settings,
rather than to dismember the bodies of animals” (Gell 1988: 6).

9. We adopt Gell’s definition of agency: “whereas chains of physical/material cause-and-effect
consist of ‘happenings’ which can be explained by physical laws which ultimately
govern the universe as a whole, agents initiate ‘actions’ which are ‘caused’ by themselves,
by their intentions, not by the physical laws of the cosmos. An agent is the source, the
origin, of causal events, independently of the state of the physical universe” (Gell 1998:
16). In theory there are no limits to a chain of cause and effect. In practice, however,
infinite causalities are unmanageable for finite brains. At any given time, it is a cognitive
requirement to consider that some objects can just do things “by themselves”. At that
particular moment, such an object is an “agent” in the subject’s cognition.
10. An in-depth exploration of intra-sound causal principles in Western tonal music can
be found in Ockelford (1991, 2004).
11. See the distinction between “true” and “metaphorical” synaesthesia in psychiatry:
“For most of us the association of color and music is at the level of metaphor. ‘Like’ and
‘as if’ are the hallmarks of such metaphors. But for some people one sensory experience
may instantly and automatically provoke another. For a true synaesthete, there is no ‘as
if’ – simply an instant conjoining of sensations” (Sacks 2007: 177).
ABSTRACTS
This essay identifies and describes three ways of listening that are available to all human beings.
Beforehand, we argue that the concept of “sound”, as borrowed from acoustics and commonly
used in anthropology, is too vague and too limited. In order to be able to understand the full
range of human auditory experiences as found in ethnography, as well as the social interactions
which they afford, we propose a distinction of at least three postures of listening. We define
these as “indexical”, “structural” and “enchanted”, by contrasting their interactional salience in
various settings. The auditory “things” that exist for each of the three stances (their ontologies)
are also shown to be different. This trichotomy provides a promising theoretical framework for
some longstanding problems in anthropology. After discussing some critical questions and
possible shortcomings of our model, we conclude by looking closely at one of these issues: the
definition of “music” and its ethnographic relevance throughout the world.
INDEX
Keywords: audition, sound, ontology, language, music, enchantment, agency
