


Submitted in partial fulfillment of the requirements for the Degree of

Master of Fine Arts

in Electronic Music and Recording Media, Mills College, Spring 2011

Approved by:

Reading Committee

Chris Brown Director of Thesis

James Fei Reader of Thesis


Peter Ho-Kin Wong

Chris Brown Head of the Music Department

Dr. Sandra C. Greer Provost and Dean of the Faculty


1. Introduction

2. Background

2.1. The mapping question

2.2. Conceptual metaphor

3. Interface strategies

3.1. Manifested metaphors: simple yet telling examples

3.1.1. A thought experiment: a prototype conforming to an alternate pitch metaphor

3.1.2. Our prototype as a methodological proposal: what really happened here?

3.2. Existing interfaces and conceptual coherence

3.2.1. Tangible user interfaces (TUIs)

3.2.2. The “CHOAM” fiducial ball controller

3.3. Proposed future work

3.3.1. Input: gestural imaging and marked objects

3.3.2. Mapping image–schemas to low or high level outputs

4. Informing the conceptual sphere

4.1. “Choices:” just pitch space, the conduit metaphor

5. Conclusion

6. Appendix: contents of accompanying media


1. Introduction

Arguably the quintessential musical instrument of the early 21st century,

the computer as a musical performance instrument has received generous

attention. Practitioners of computer music have devoted their energy to detailed

study of very narrow aspects of its use; by small changes in key places in the

chain of causation, it can be made into nearly any kind of instrument. How

then, within a nearly infinite realm of possibility with regard both to generable

sounds and to input mechanisms, can one decide what kind of instrument to

build into it? After some background on the “mapping question” as it pertains

to computer music, and on some particularly pertinent aspects of human

cognition, I will examine a crude controller prototype which will illustrate the

fundamentals behind a design procedure that maintains coherence between

mappings of gestural controls to sonic outputs and metaphorically based

cognitive structures. After a discussion of how this method pertains to existing

interfaces, both of others’ construction and of my own, I will propose a

direction for future work which maintains consistency with this methodology.

Finally, I will examine ways in which, beyond the more technical

correspondences in the previous sections, the cognitive structure discussed can

inform the conceptual and compositional grounding of musical works, using a

piece of my own as an example.


The ideas which I will bring up in this paper are incredibly simple.

However, their simplicity belies a subtlety which should not be discounted; the

structure of the language we must use inadvertently encourages eliding some

important distinctions. 1

2. Background

There are two areas with which the reader will need to be familiar before

going further: the question of mapping as it pertains to computer music, and

the contemporary theory of conceptual metaphor.

2.1. The mapping question

The issue of input to output mapping through the computer as a musical

instrument is a vexing problem. Jon Drummond defines mapping as

“connecting gestures to processing and processing to response.” 2 Thus at its

most general, it is little more than connecting what goes into the black box with

what comes out. 3 (see Figure 1)

If one considers the case of traditional acoustic instruments, the

1 Michael J. Reddy, “The conduit metaphor: A case of frame conflict in our language about language,” in Metaphor and Thought, 2nd ed., ed. Andrew Ortony (Cambridge: Cambridge University Press, 1993).

2 Jon Drummond, “Understanding Interaction in Contemporary Digital Music: from instruments to behavioral objects,” Organised Sound 14/2 (2009): 131.

3 Drummond's paper goes into much further detail about the different ways to conceive of this metaphor, with varying degrees of complexity. This simpler conception will provide better clarity here.


relationship is straightforward: A performer's physical gestures, breath and movement, for example, “go into the box,” directly causing some kind of resonance, which then “comes out of the box” as sound in response to the gestures of that performer.

Figure 1: The computer as a musical instrument is essentially a black box.

The mapping

here is direct physical coupling and cannot be altered without an alteration of

the instrument itself, which may or may not be possible, but will always have

limitations. This relationship is further simplified in that there is a direct

correlation between the processing and the response; the resonance of the

instrument’s body is, in a real sense, both these things. In addition, as Bown, et

al. point out, for traditional acoustic instruments “a musician is adaptive

towards an instrument,” 4 meaning that the instrument, by its synchronic

inalterability, forces change on the musician’s part during the interaction. In

order for the instrument to change, an instrument builder would need to

incorporate any modifications in the next iteration. Thus the mapping is largely

constant for a given instrument and performer pair.

The computer’s mapping is in no way as rigidly coupled. Because the

4 Oliver Bown, Alice Eldridge, and Jon McCormack, “Understanding Interactive Systems,” Organised Sound, 14/2 (2009): 191.


processing is, in contrast to acoustic instruments, something very open, its

connection to the sound output is arbitrary and can become so complicated

that it often is considered an integral part of the compositional process. Further,

especially with the current arsenal of human interface devices (HIDs), 5 the

kinds of gestures that can be captured as input to the processing are even more

open-ended than the already huge realm of possibility in translating a single

gesture to a single datum for processing. Several strategies have surfaced to deal

in particular with how to map the input half of the black box metaphor: one-to-one, one-to-many, many-to-one, and many-to-many. 6 A one-to-one mapping is

the most transparent, but as a proliferation of such mappings can affect either

system performance or the instrument’s performability, a one-to-many strategy

is often employed to reduce processing load on input mappings and mental

load on the performer. A possible manifestation of this strategy would be a

single control updating several synthesizer parameters, each of which uses a

differently scaled value from the control. To reduce output mappings while

keeping a greater number of input mappings, a many-to-one strategy is used.

This could be useful if, as one example, several performers each have separate

controls for the same parameter. Many-to-many combines the two in any

5 Commonly found on laptops at the time of writing are joysticks, trackballs, trackpads (many multi-touch), keyboards, cameras, accelerometers, photosensors, fingerprint readers, infrared sensors, bluetooth modems, wireless ethernet, and microphones, to name a few. This listing excludes any attachable peripherals, which only increase the possibilities.

6 Drummond, “Understanding Interaction,” 131.


number of ways, and is probably the most commonly used in practice. 7 Notably,

with this freedom in gesture/processing/response mapping, and the clear notion

that the computer user is a programmer, 8 the adaptive relationship of the

musician toward the instrument breaks down, and one need not wait for the next iteration of the instrument, or on the builder's whims, to allow the instrument to adapt toward the musician. 9 Furthermore, this

adaptation can take place immediately or even dynamically.
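These strategies can be made concrete with a brief sketch. The code below is purely illustrative (the parameter names and scaling ranges are my own inventions, not drawn from any particular synthesis environment), but it shows the one-to-many and many-to-one cases described above: a single control updating several differently scaled synthesizer parameters, and several performers' controls reduced to one parameter.

```python
# Illustrative sketch of mapping strategies; the parameter names and
# ranges are hypothetical, not taken from any real synthesis system.

def scale(value, lo, hi):
    """Linearly rescale a normalized control value (0.0-1.0) to [lo, hi]."""
    return lo + value * (hi - lo)

def one_to_many(control):
    """One-to-many: a single control drives several synthesis parameters,
    each receiving a differently scaled copy of the same value."""
    return {
        "cutoff_hz": scale(control, 200.0, 8000.0),
        "resonance": scale(control, 0.1, 0.9),
        "grain_ms": scale(control, 5.0, 120.0),
    }

def many_to_one(controls):
    """Many-to-one: several inputs (e.g., one per performer) are reduced
    to a single parameter -- here by averaging, though any reduction works."""
    return sum(controls) / len(controls)

params = one_to_many(0.5)             # one fader position fans out to three parameters
level = many_to_one([0.2, 0.4, 0.9])  # three performers share one parameter
```

A many-to-many mapping would simply compose the two: a bank of such controls, each fanning out through its own set of scalings.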

Given the degree of complexity in dealing with the mapping question,

electronic musicians have been known to approach the design of systems and

algorithms from a compositional standpoint. Even before the widespread use of

computers, for Gordon Mumma, his “designing and building of circuits is really

‘composing,’” and his “‘instruments’ are inseparable from the compositions

themselves.” 10 In light of this attitude, which is by no means Mumma’s alone, 11 it

seems unsurprising that a culture of behavioral objects 12 has arisen in the

internet-connected virtual community that is inseparable from the computer

7 Ibid.

8 A programmer for these purposes can be anyone who causes a change in a computer's behavior through intentional manipulation of that behavior. This manipulation can be accomplished through writing original software or by manipulating pre-written software.

9 Bown et al., “Understanding Interactive Systems.”

10 Gordon Mumma, “Creative Aspects of Live-Performance Electronic Music Technology,” Papers of 33rd National Convention (1967): 1.

11 Among proponents are David Tudor (reported in John D.S. Adams, “Giant Oscillators,” Musicworks 69 (1996)), Chris Brown, and John Bischoff (Chris Brown and John Bischoff, Indigenous to the Net: Early Network Music Bands in the San Francisco Bay Area, (2002) <> (15 April


12 Bown et al., “Understanding Interactive Systems.”


music world. Interface systems and other components of the sound production

process, written by programmer/musicians for their own purposes, have been

and are being shared as code snippets and modular patches. These objects

(Bown, et al. use the term in both its material and its programmatic sense) can

take a number of different forms, and have varying degrees of utility. They may

be nearly whole programs that could almost be considered pieces in their own

right, or modules that require some manipulation to be usable at all. That these

objects “can be shared, modified and repurposed and are the currency and

building blocks both functionally and aesthetically in contemporary music

culture” 13 also bespeaks an undercurrent of openness and a propensity for

hacking. Where the traditional instrumentalist must be content to be handed a completed and largely unmodifiable object with which to produce sounds, the

computer musician feels compelled both to build most of the instrument she or

he will use, from new or modified components, and to then use the instrument

to produce interesting sounds. 14

Another way in which the mapping question poses interesting problems

is brought to light when Thor Magnusson exhorts us to

take Michel Waisvisz’s The Hands instrument. Every sensor of the complex interface is mapped to a specific parameter in the software-based sound engine. A change in the engine will result in a new (or altered) instrument. Although

13 Ibid., 195.

14 “Live coding” as practiced by a small community would be a notable counter-example where the building of the instrument is done in real time. However, this type of interface is far from intuitive.


the interface has not been altered by a change in the mapping algorithm, the instrument behaves differently. For Waisvisz, changing the algorithms that constitute the sound engine means learning a new instrument, which involves the re-incorporation of the conceptual understanding of the engine’s functionality into bodily memory. 15

Despite an immutability that parallels an acoustic instrument, a physical

interface can, for the purposes of motor memory, effectively become a new

instrument purely by virtue of a change in the internal mapping: in this case at

a lower level than the gesture-to-processing as discussed above, perhaps even at

the level of processing itself. This example brings forth yet another level of

mapping involved in the overall question, though one not specific to the

computer-as-musical-instrument. At the level of human cognition,

representation can be thought of as manifested in the form of symbols 16 or

signs, 17 and in the act of living we develop a mapping schema between these

symbols and our perceptions of the outside world. 18 When one learns an

instrument, as in the cases of The Hands or a traditional instrument, one

performs such a mapping of a symbolic relation to a sonic result and another of

a motor-memory symbol, or motor program, that maps to the gesture necessary

15 Thor Magnusson, “Of Epistemic Tools: musical instruments as cognitive extensions,” Organised Sound, 14/2: 169 (footnote).

16 Douglas Hofstadter, Gödel, Escher, Bach: an Eternal Golden Braid (New York: Basic Books, 1979).


17 Charles Morris, Foundations of the Theory of Signs (Chicago: University of Chicago Press, 1938).


18 Gerald Edelman, Bright Air, Brilliant Fire: on the Matter of the Mind (New York: Basic Books, 1992), 81-98. (though Edelman and others in the school of embodied cognition might argue that these symbols are our perceptions, and that representation is obviated)


Figure 2: the mapping schema from sound to motor program on the cognitive side

to produce that result. 19 (see Figure 2)

It is interesting to note that this

implies a full circle from the black box

metaphor; the formerly implicitly

related input and output ends of the

box can now be seen as forming one

side of a feedback loop that the cognitive symbolic map now completes (Figure

3). Sensibly, a break in one part of that loop destroys it, and a new loop of

relationships must be built. Given this tenuous hold on connectivity and the

tendency of the computer music community toward rapid fluctuation in

technique and materials (it is a young field), and given also the rapid changes in

the hardware itself, it makes sense that one could fail to settle on a methodical

answer to the mapping question. 20

Along with the internal mappings from gesture to sonification is the

issue of how to actually capture the gestures themselves. Most HIDs that come

with computers, though they can be used as such, are not intended to be

musical interfaces. The devices beyond the usual keyboard, monitor, and

19 Gerald Edelman refers to this pattern as a re-entrant activation in what he calls a global mapping, implying that the separation portrayed here is false. The motor response in a global mapping is co-occurrent with the perceptual response (and thus is part of the same mechanism). (see §2.2 below)

20 For an interesting discussion in this vein with a comparison of classically trained acoustic instrumentalists to live-coders, see: Nick Collins, “Live Coding Practice,” (paper presented at the International Conference on New Interfaces for Musical Expression, New York, USA, June 6–10, 2007).


pointing device, though increasingly common on most laptops, are still not as widely used for gesture-capture.

Figure 3: the feedback loop between the performer and the instrument in an interactive/reactive system

Beyond

that, the integration of those standard

interfaces within the body of the machine

on laptops, and the resulting machine–centric body language on the part of the computer performer lead many to frown

upon them as musical control devices. To

combat this disengagement, many

performers are, at the very least, using

detached controllers, or, like Waisvisz, designing and building their own custom

interfaces. Dan Trueman built the Bowed-Sensor-Speaker-Array (BoSSA), a

multi-directional speaker clustered with various sensors that the performer

holds in his or her lap and bows much as an orchestral instrument would be

played. Trueman has also designed multi-directional speakers and has used

various corded peripheral interfaces (e.g., tablets and drum pads) with

laptops in the Princeton Laptop Orchestra (PLOrk). His aim here was to pull

the laptop performer away from the disengaging stance of integrated controller

use. He also challenged the problem of the disembodiment of the resultant


sound 21 with spatial localization via the speaker clusters. 22 These solutions show

that in addition to the problem of designing a mapping schema at the software

level to translate gesture data into processable data for sonification, a similar

bottleneck also exists at the level of performer/instrument interaction; the

computer musician/performer often needs to find her or his own solution to the

dilemma of gesture capture within a realm of near infinite possibility. This area

would, however, be one where thoughtful design could improve the ability to

capture musical gestures. With these factors in mind, perhaps the computer,

rather than be compared to “the saxophone” or “the dulcimer,” should be

likened to “an instance of all possible acoustic instruments.”

For these reasons, the question of mapping in the computer as a musical

instrument brings to light the strange fact that when it comes to the ability of

this instrument to be used in a live setting one is largely concerned with

questions that have traditionally had more to do with instrument building and

design than with performance. Because of its relative youth in the performance

world, furthermore, there exist few pre–designed systems that would allow

someone to approach it purely as a performer learning an instrument. The

question then becomes: how can one pare down the realm of possibility into a

design schema relevant to a musical performer?

21 Laptop concerts are nearly always amplified through a P.A. system of some kind, displacing the sound from its source of origin. See Powerbooks Unplugged ( (15 April 2010)) for a notable exception.

22 Dan Trueman, “Why a Laptop Orchestra?” Organised Sound 12/2 (2007).


2.2. Conceptual metaphor

Before we attempt an answer to that question, let’s pose another, more

basic question. What is relevant to a musical performer?

A musical performer, being a human, is also something of a black box.

However, it is one on which research in the last thirty years in cognitive science

and linguistics has shed some light. Of enormous explanatory potential and

central relevance to this essay is the contemporary theory of metaphor, which

builds upon its foundation in the theory of embodied cognition. This discipline

has given rise to a conception of cognitive organization that has direct relevance

to issues of music in general, but especially to music in relation to the human–computer interface (HCI).

The theories are grounded in recent research in neuroscience, in what

Gerald Edelman calls “neural Darwinism.” 23 Edelman postulates that neural

development proceeds along lines explained by population thinking, as

understood in evolutionary biology, and that the units of selection are groups of

neurons which activate together when they receive a particular input stimulus.

These groupings become relevant to cognitive theories because they don’t

simply activate as a single group upon stimulation, but in networks of

interconnected groups 24 which Edelman calls maps. Key to music is the fact that

23 Edelman, Bright Air, Brilliant Fire, ch. 9.

24 He refers to this interconnection as “reentry.”


these maps are also activated together with non-mapped parts of the brain 25 and

with the motor behavior of the animal in question, in this case, the musician. 26

These findings in neuroscience are relevant because they indicate a

physical basis for Mark Johnson’s image schemas, on which conceptual

metaphor 27 depends. When Johnson says that “[r]ecurring adaptive patterns of

organism–environment interaction are the basis for our ability to survive and

flourish,” 28 he is referring to just this type of neuro–motor cross–activation.

Johnson and Rohrer describe it succinctly:

The patterns of our ongoing interactions define the contours of our world and make it possible for us to make sense of, reason about, and act reliably within this world. Thousands of times each day we see, manipulate and move into and out of containers, so containment is one of the most fundamental patterns of our experience. Because we have two legs and stand up within a gravitational field, we experience verticality and up–down orientation. Because the qualities (e.g., redness, softness, coolness, agitation, sharpness) of our experience vary continuously in intensity, there is a scalar vector in our world. 29

The experiences they mention are by no means arbitrary; they are some of the

fundamental image schemas on which we base our cognition through the

process of metaphor, which George Lakoff defines as “a cross–domain mapping

in the conceptual system.” 30 The plasticity of the human mind comes from an

25 i.e., specialized brain structures whose function is not mainly for cognition

26 Edelman, Bright Air, Brilliant Fire, 83–93. (cf. global mapping)

27 Hereafter, I will assume the reader understands that by “metaphor” I mean “conceptual metaphor” as defined by Lakoff and Johnson (1980), and not “metaphorical linguistic expression.”

28 Mark Johnson and Tim Rohrer, “We are live creatures: Embodiment, American Pragmatism and the cognitive organism,” in Cognitive Linguistics Research, 35.1: Body, Language, and Mind, Volume 1: Embodiment, eds. Tom Ziemke, Jordan Zlatev, Roslyn M. Frank (Berlin: Mouton de Gruyter, 2008): 32.

29 Ibid.

30 George Lakoff, “The contemporary theory of metaphor,” in Metaphor and Thought, 2nd ed., ed.


ability to extend inferences one can make through anticipation of and

understanding of the structure of these general physical experiences to

inferences about unrelated but correlated experiences that don’t necessarily

have such a physical manifestation.

As an illustration of the relationship between image schemas and

language, consider a part of speech: the preposition. These are familiar words

we use every day, and which are such a vexing problem to native speakers of

other languages learning English: words like in and into. However, even native

speakers are hard pressed to define them. The reason for this difficulty is that

they are linguistic representations of image schemas, which are so basic to

cognition as to be below the level of conscious attention. Johnson and Rohrer

have already introduced us to the container image schema, which is the basis of

the preposition in. At its most basic, the word can only be taken to represent the

most salient part of our idea of a container, which is the location of its

containment: a simple enough idea. With into, the situation is slightly more

complicated, as it demonstrates the fact that image schemas can be

compounded to aid in inference about situations or ideas that exhibit

characteristics which can map to attributes of more than one schema. The word

to is an expression of what Lakoff calls the SOURCE–PATH–GOAL 31 schema, which

Andrew Ortony (Cambridge: Cambridge University Press, 1993): 203.

31 A convention of the literature is to use small caps to refer to a set of conceptual mappings. The normal admonition accompanying them is that they are not referents to any particular linguistic expression but to a static set of conceptual correspondences.


is an extension of the PATH schema. We understand into as superimposing the

container schema onto the goal/destination of the SOURCE–PATH–GOAL

schema. 32 The interesting part, however, arises when people say things like,

“Falling in love is just getting yourself into trouble.” When interpreting this

sentence, we extend our inferences about the space inside containers to

reasoning about states of being, and about transitioning between states of being

in terms of moving along a path toward a destination. Inferences we can make

about being inside containers map to inferences we make about being in

certain states, and so we speak of being in and out of love, despite the fact that

there is no such clearly defined line between these states that corresponds to

the shell of a physical container. In this way metaphor can be both an

indispensable aid and an insidious hindrance to thought.

A second type of image schema comprises what are clumsily known as “vitality

affect contours.” Johnson and Rohrer describe them as

the swelling qualitative contour of a felt experience. We can experience an adrenaline rush, a rush of joy or anger, a drug–induced rush, or the rush of a hot–flash. Even though these rushes are felt in different sensory modalities, they are all characterizable as a rapid, forceful building up or swelling contour of the experience across time. 33

Such abstract experiences are of extreme importance to art, and their time–based form understandably lends them relevance to time–based arts like

music. Contours like the envelope of a rush are metaphorically extended to

32 Johnson and Rohrer, “We are live creatures,” 34.

33 Ibid., 36.


inferences about the course of actions, a means through which we generate

expectation about those actions. 34 As Johnson and Rohrer note, “We crave the

emotional satisfaction that comes from pattern completion, and witnessing even

just a portion of the pattern is enough to set our affect contours in motion.” 35

This kind of metaphor is the mechanism by which suspense, resolution,

cadence, and other such musical ideas work. In fact, Candace Brower brings

many of the metaphors discussed thus far into her analysis of Edgard Varèse’s

Density 21.5. 36 She analyses the first seventeen bars in a series of phrases, in

each of which the melody, seen as an agent whose will is a driving force toward

goal–directed motion, strains toward the boundary of a container defined by the

pitches in the given phrase. Each phrase builds tension by slowly expanding the

container’s boundaries as the agent battles both against those boundaries and

against opposing forces (another very basic image schema), encountering

barriers, and resting on the metaphorical platforms of stable pitches. Her

analysis differs from others’, but, by means of the metaphors she applies, a

coherent pattern emerges.

Despite the enormous descriptive power of this mode of thought,

metaphor can also serve to obscure reasoning in subtle ways, as I briefly alluded

34 Johnson and Rohrer give the example of a child quieting down as soon as it sees its parent begin to reach for the bottle. (Ibid., 34.)

35 Johnson and Rohrer, “We are live creatures,” 34.

36 Candace Brower, “Pathway, Blockage, and Containment in Density 21.5,” Theory and Practice 22–23 (1997–98): 35–54.


to above. Michael Reddy was an early pioneer of this discipline, and the parable

he invents in his classic paper “The Conduit Metaphor” serves to illustrate one of the most pervasive ways that it does.

Figure 4: the toolmaker's paradigm (after Reddy)

Reddy invites us to imagine a situation he

calls “the toolmaker’s paradigm” 37 wherein a

number of people live alone in a wheel

structure in wedges separated by spoke–like

walls, the outer circumference, and a hub.

(see Figure 4) In this hypothetical world,

there are no possible means of communication except through the hub, and no

information can be gained in any other way about each neighbor’s space. Each

inhabitant can only pass notes through the hub, and when one invents a new

tool with which she improves her own life, she passes a note to the others with

instructions to build it. The instructions being fundamentally imperfect, and

there being a drastic difference in the environment and resources in each cell,

the tools are always manifested differently by each toolmaker, unless one or

more of them engage in a dialogue to figure out more about what was intended

to be built versus what was actually built. Reddy presents this paradigm as a

model of the way communication must actually work: as a cooperative effort

between speaker and listener, or else as only a shadow of the speaker’s intent.

37 Reddy, “The conduit metaphor,” 171–176.


However, the conduit metaphor, which is the operative metaphor in common

speech about communication, in which, for example, words are metaphorized as

containers for ideas which are given, packaged and ready, to the listener,

obscures this cooperative effort, implying that the lion’s share of the effort in

communication lies with the speaker/packager. Reddy points to the conceptual

development of mathematical information theory as an illustration of the

insidiousness of the conduit metaphor. He first establishes that

“[i]nformation is defined as the ability to make nonrandom selections from some set of alternatives. Communication, which is the transfer of this ability from one place to another, is envisioned as occurring in the following manner. The set of alternatives and code relating these alternatives to physical signals are established, and a copy of each is placed at both the sending and receiving ends of the system. …The whole point of the system is that the alternatives themselves are not mobile, and cannot be sent, whereas the energy patterns, the ‘signals’ are mobile.” 38

In light of this simple but incredibly subtle distinction, Reddy portrays the

English language, and the underlying conceptual structure, as an “evil

magician” who flies over the toolmaker’s world and modifies the hub such that

the toolmakers believe they are receiving the tools themselves instead of

instructions to build those tools.

With the power of metaphor both to bring conceptual richness and to

wreak conceptual havoc fresh in the reader’s mind, consider another of Reddy’s

well–worded cautions:

“A code is a relationship between two distinct systems. It does not ‘change’ anything into anything else. It merely preserves in the second system the

38 Ibid., 181. (emphasis in original)


pattern of organization present in the first system. Marks or sounds are not transmuted into electronic pulses. Nor are thoughts and emotions magically

metamorphosed into anything.” 39







All of this information is relevant to any human but, as I will show

below, it is of special relevance to the composer/performer of computer music

once the question of interface design arises.

3. Interface strategies

The brief survey presented above suggests that a cognitive theory of

metaphor can account simply and elegantly for a number of structures upon

which we call to manifest our language and our music. Given this frame of

organization, which, quite relevantly, is tied directly to bodily movement, one

should permit oneself to be informed by it when approaching the mapping

question in computer music performance.

After another brief reportage of relevant research, the reader will be

asked to consider an example of a simple controller which can create an

experience coherent with an image–schematic expectation. For the sake of

clarity and simplicity, the controller will be one that can be varied over a single

dimension, and which will then be mapped to control a single audible

parameter. The procedure followed in the conception, construction, and use of

this example controller will be proposed as a general methodology for

39 Ibid., 183–184. (emphasis in original)


metaphorically coherent interface design. A look at existing interfaces and

their relationship to metaphor, and at an interface I built for a performance

piece, then precedes a proposal for future work.

3.1. Manifested metaphors: simple yet telling examples

A pervasive metaphor in English, as well as in many other languages, is

MORE–IS–UP, an extension of the VERTICALITY image schema. Such metaphors

can be realized in physical objects through which the metaphors they manifest

are reinforced in successive generations of users as they grow up with

experience of these objects. 40 Consider for a moment how a mercury bulb

thermometer works; as the temperature of the metal increases, its density

decreases but its volume increases. These may be just two ways of thinking about

the same physical result, but it’s important to distinguish that the physical

result of higher temperature, which can be likened to the input to the

thermometer’s “processing,” doesn’t have an inherent vector; it has only a delta

in its physical state which is physically mapped to the output, the

thermometer’s gauge. Whichever way one thinks of it, as an increase in volume

or a decrease in density or any arbitrary change in another attribute entirely, is

irrelevant, as the important aspect is that when the temperature gets hotter, the

substance behaves in a predictable way which can be factored into the design of

40 Lakoff, “Contemporary theory of metaphor,” 241.


the thermometer such that it expresses this change by means of the MORE–IS–

UP metaphor. Such objects “exhibit a correlation between MORE and UP and are

much easier to read and understand than if they contradicted the metaphor.” 41

It is key, though, that one realize that in this case one has no control over what

is taken here as the crux of the mapping question, because this mapping is

determined by physical laws. Though one could modify the scale on the output

to indicate Fahrenheit, Celsius, or even mood or DEFCON level for that matter,

the mapping from temperature (input gesture) to metal density/volume

(processing) remains constant and fixed in a one–to–one relationship.

Such an unbreakable mapping is not the case with computer input to

output. Take an example of a computer control which starts out analogously to the thermometer example: the fader. (see Figure 5)

Figure 5: the author actuating a fader

The most common use

for this control is to change amplitude

of an audio signal, which, unless the

aim is to be contrarian or malicious, is

now always mapped to the fader’s path in correspondence to the MORE–IS–UP

metaphor. Certainly, there have been manifestations of both mappings

historically, but it is perhaps testament to the influence of metaphor that the

current standard has been the one to persevere; we map loud to more in our

41 Ibid.


thinking about amplitude in English, so one side of the fader, usually the

extreme that is farthest from the body of the actuating agent, is considered the

top of the fader’s throw. 42 Thus, just like the thermometer example, a fader maps

control motion to delta in amplitude coherently with the MORE–IS–UP

metaphor. However, since the hypothetical fader in question controls a

computer output, the mapping is not limited to this one application. What if it

were a pitch slider? How would it succeed or fail at manifesting our

metaphorical understanding of pitch?
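The remappability at issue can be made concrete with a short sketch (Python is used purely for illustration; the function names and the 220–880 Hz range are invented). The same normalized fader position can feed either an amplitude mapping or a pitch mapping; the hardware gesture is identical, and only the software side changes.

```python
def fader_to_amplitude(position):
    """Map a normalized fader position (0.0 at bottom, 1.0 at top)
    to a linear amplitude, honoring MORE-IS-UP."""
    return position

def fader_to_pitch_hz(position, low=220.0, high=880.0):
    """Map the same fader position to a frequency in Hz, honoring
    HIGH-PITCH-IS-UP: the top of the throw is the highest pitch."""
    return low * (high / low) ** position  # exponential, perceptually even

# One physical control, two interchangeable mappings:
assert fader_to_amplitude(1.0) == 1.0          # full volume at the top
assert abs(fader_to_pitch_hz(1.0) - 880.0) < 1e-9  # highest pitch at the top
assert abs(fader_to_pitch_hz(0.5) - 440.0) < 1e-9
```

The exponential curve in the pitch mapping is itself a design choice: equal fader increments then correspond to equal musical intervals rather than equal hertz.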

Consideration of this remapping by an English speaker, who conceives of

pitch in terms of the metaphor HIGH–PITCH–IS–UP and is conditioned by the

conventional mapping of a fader, will likely seem so straightforward as to elicit a

hypothetical scoff. The reader is invited to scoff away, but to know that in fact,

though they are less common, English also exhibits other metaphors for pitch;

low pitch can be deep and high pitch can be shrill, for example. 43 Cross–

linguistically, however, HIGH–PITCH–IS–UP is not the only default metaphor used

conventionally to conceptualize pitch. In Kpelle, a language in the Mandé

family of the Niger–Congo macro–family spoken mainly in Liberia, speakers

distinguish wóo su kéte (“voice with a large inside”) from wóo su kuro têi (“voice with a small inside”). … These concepts of large and small apply to singing voices, instrumental sounds, and speaking voices, and the idea incorporates both pitch and resonance attributes. A large voice is both lower in

42 I am indebted to James Fei (personal communication) for bringing the BBC–style fader to my attention.

43 The words used are not always nicely paired antonyms.


pitch and more resonant than a smaller voice. 44

Also, Shayan et al. have reported a consistent use of a thick/thin metaphor in

Farsi, Turkish, and Zapotec, three unrelated languages. 45 Further, Eitan and

Timmers presented an empirical study of a comprehensive set of pitch

metaphors tested against speakers of languages that did not necessarily

conventionally use them. They found that when asked to describe pitch,

subjects would consistently map pairs of antonyms to the pitch vector in the

same way as would native speakers of languages which did conventionally use

those antonym pairs. 46 To complicate matters, the coincidence that exists in

English between the top of a slider being both louder and higher pitch

wouldn’t necessarily correspond in a language that uses a different metaphor, as

metaphorical attributes of a complicated concept such as pitch vary along

different vectors. Small sounds (our high sounds) are quieter in languages that

use the big/small metaphor, while big sounds (our low sounds) are louder. 47

44 Ruth M. Stone, “Toward a Kpelle Conceptualization of Music Performance,” Journal of American Folklore 94/372 (1981): 196. (Please note that for formatting reasons, some diacritics on the Kpelle transcriptions are not properly reflected in this quotation. Please refer to the original source for the authoritative orthography.)

45 Shakila Shayan, Ozge Ozturk, and Mark A. Sicoli, “The Thickness of Pitch: Crossmodal Metaphors in Farsi, Turkish, and Zapotec,” Senses & Society 6/1 (2011). The authors note that though Farsi and Turkish are not related, there is a cultural exchange between the two speech communities. However, such is not the case between Zapotec and either of the other two.

46 Zohar Eitan and Renee Timmers, “Beethoven’s last piano sonata and those who follow crocodiles: Cross–domain mappings of auditory pitch in a musical context,” Cognition 114 (2010): 405–422. Subjects even mapped the Shona crocodile/those who follow crocodile (for low/high) with statistical consistency despite its unusualness for most speakers of other languages.

47 Ibid., 420.


However, following from the above discussion about metaphor and its

grounding in image schematic knowledge of our physical environment, the

finding that these ideas can be readily understood in novel situations is not

surprising; the metaphors exist because they coherently map to an underlying

image schematic structure which is, if not universal, then at least quickly

understandable by our common physical experience.

3.1.1. A thought experiment:

a prototype conforming to an alternate pitch metaphor

Perhaps the simplest way to demonstrate my proposition for

metaphorical coherence with a computer interface is to offer a test case in

which the relevant principles are taken into account. The first step in designing

a coherent controller is to choose a metaphor that can be successfully realized.

Were we to choose, for example, the Shona pitch metaphor crocodile/those who

follow crocodile, the implementation might be nontrivial and the mechanism

would probably not be immediately transparent to a potential user. 48 If, however,

we were to choose the thick/thin metaphor mentioned above, a physical

realization would be far more intuitive; a brief internet search revealed,

however, that practical acquisition of sensors capable of digitizing such a

change would be prohibitively expensive, and a feasible control interface would

48 No, I don’t actually have an idea how to do this.


be harder to implement than others. Furthermore, whether or not its physical

form would afford 49 thinning and thickening would be debatable. For these

reasons, the current attempt will involve the big/small pitch metaphor used in

Kpelle and other languages. 50

A good shape for a big/small pitch controller would be a squeezable foam

ball. 51 Such a shape and material choice would encourage squeezing, which


Figure 6: prototype control for big/small pitch metaphor

would be consistent physically

with the chosen metaphor;

squishing the ball smaller would

correspond to higher pitch and

releasing the ball into its bigger,

relaxed state would correspond

to lower pitch. 52 However, rather than the ideal sphere, the prototype for the

controller (see Figure 6) is a cube the size of a small handful, composed of anti-

49 On the notion of affordance, see: Orit Shaer and Eva Hornecker, “Tangible User Interfaces:

Past, Present, and Future Directions,” Foundations and Trends in Human–Computer Interaction 3/1–2 (2009): 62–63. For incisive clarification, see: Shaleph O’Neill, Interactive Media: The Semiotics of Embodied Interaction, (London: Springer–Verlag, 2008): 49–65.

50 Zbikowski cites the use of this metaphor in Bali and Java in: Lawrence M. Zbikowski, “Metaphor and Music Theory: Reflections from Cognitive Science,” Music Theory Online 4/1 (1998): note 12.

51 A nod must go to Andrew Mead who, unbeknownst to me until revision of this paper, proposed a very similar thought experiment in a footnote of his article: Andrew Mead, “Bodily Hearing: physiological metaphors and musical understanding,” Journal of Music Theory 43/1 (1999): 17, note 13.

52 An added benefit to such a design is that the controller would also be coherent with a tense/relaxed opposition for other sonic parameters, although that application will not be explored here.


static foam rectangles found in the packaging of integrated circuits, collected

together with two wired electrodes on opposing sides of the cube. 53 This whole


Figure 7: circuit diagram for the squeeze ball big/small pitch controller. R1 and C1 can be varied to change the frequency of the oscillator (approximate values suggested here). A 2 kΩ resistor was added on the oscillator output to reduce the extremely hot signal.

assembly is a resistor

which is attached to a

simple inverter

oscillator, such as can

be built with a 74C14

integrated circuit, 54 as

half of a voltage

divider which feeds

the audio input to a

computer. (see Figure 7) In this way one can control the amplitude of the analog

oscillator circuit with the squeezable cube. Importantly, the effect of the

control’s variance on the output of this circuit is irrelevant, so long as it varies

in a scalar fashion between two differentiable extremes; in this case, however,

squeezing the controller causes the amplitude to increase because a shorter

distance between the electrodes yields less resistance. The computer tracks the

amplitude of the incoming signal, using that value to vary the frequency of

some sound generator. The prototype controls a simple oscillator written in SuperCollider:

53 A version of this resistor design can be found in: Nicolas Collins, Handmade Electronic Music:

the art of hardware hacking, 2nd ed., (New York: Routledge, 2009): 102. For the current demonstration, altered battery terminals were used instead of coins.

54 Ibid., 135.



Ndef(\hiamp_to_hipitch, {
    // track the amplitude of the audio input and rescale it into a frequency
    var in = SoundIn.ar(0), amptrack;
    amptrack = Amplitude.kr(in, 0.01, 0.01, 1200, 400);
    SinOsc.ar(amptrack ! 2, 0, 0.5);
}).play;

Since, in this case, the amplitude increases as the cube is squeezed, the

amplitude tracker’s output can simply be plugged into the frequency argument

of the sine oscillator. We now have a working prototype of a controller that

causes a sound to move to a higher pitch when the controller gets smaller.
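The electrical reasoning can be checked with a little arithmetic. In this Python sketch the component values are invented, and the arrangement (foam cube as the variable half of the divider, output taken across a fixed resistor) is one plausible configuration consistent with the behavior described above; the actual circuit is the one in Figure 7.

```python
def divider_out(v_in, r_foam, r_fixed=2000.0):
    """Output of a two-resistor voltage divider, taken across the fixed
    resistor. Squeezing the cube shortens the path between the electrodes,
    lowering r_foam, so more of the oscillator signal reaches the output."""
    return v_in * r_fixed / (r_foam + r_fixed)

relaxed = divider_out(5.0, r_foam=20000.0)   # big, relaxed cube
squeezed = divider_out(5.0, r_foam=2000.0)   # compressed cube
assert squeezed > relaxed   # less resistance, more amplitude at the computer
```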

3.1.2. The prototype as methodological proposal:

what really happened here?

Concisely, and in list form, what we did was:

1. pick a metaphor that could be realized physically.

2. build a controller that generates an arbitrary output that can correlate somehow to the chosen metaphor.

3. receive the controller output as an input signal to the computer.

4. map the input through some processing to a sonic parameter metaphorically coherent with the controller gesture. 55

5. output the signal.
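The five steps above can be caricatured in code. The following Python sketch (every name and value is hypothetical) keeps the stages deliberately separate, since the point of the list is precisely that they are distinct steps:

```python
# Step 2: the controller generates an arbitrary scalar (simulated here).
def controller_output(squeeze_amount):      # 0.0 = relaxed, 1.0 = fully squeezed
    return 0.1 + 0.9 * squeeze_amount       # some raw sensor value

# Step 3: the computer receives that value as an input signal.
def read_input(raw):
    return max(0.0, min(1.0, raw))          # normalize and clip

# Step 4: map the input to a metaphorically coherent sonic parameter.
def map_to_pitch_hz(normalized, lo=400.0, hi=1600.0):
    return lo + (hi - lo) * normalized      # smaller cube -> higher pitch

# Step 5: output (here, just return the parameter for a synth to use).
def pipeline(squeeze_amount):
    return map_to_pitch_hz(read_input(controller_output(squeeze_amount)))

assert pipeline(1.0) > pipeline(0.0)        # squeezing raises the pitch
```

Nothing in steps two and three knows anything about pitch; the metaphorical correlation only comes into existence in step four.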

Following these steps is sufficient to yield a metaphorically coherent interface

both for other instances of simple metaphors and also for more complicated

mappings and metaphors. (see §3.3.2 below) If one were to build on this

example, however, some caveats should be kept in mind.

55 Though the processing in this example is fairly transparent (tracking amplitude and assigning to pitch) it certainly still counts.


What is most important to keep track of is twofold. First, the

correspondence between the physical form of the controller and the intended

output in step two does not exist until the successful completion of step four,

and only then if step four is approached in keeping with the aim of

metaphorical coherence. Before step four, it is merely a potential correlation.

Second, as is completely obvious when laid out this way, steps one through four

are not one step. These points are stressed to lay out the areas where conduit

metaphor thinking can get in the way of conceptualizing control

interfaces and mapping. Though the simplicity of this example makes it easy to

grasp, one should remember that the signal is not contained within the controller.

Only once we built both the control interface and its chain of causation up to

step four could we generate a signal to transmit, and only when a listener

interprets the signal can a message then be of concern.

In keeping with the spirit of our attempt, we chose to map the

controller’s state to a signal that maintains consistency with our embodied

metaphorical understanding of the controller’s form. It would be entirely

possible to place the variable resistor, the squeezable foam, on the other side of

the voltage divider, thus causing a drop in amplitude from the output of the

analogue circuit when the cube is squeezed. Had we chosen to do it this way,

and then rewritten the computer code such that a drop in amplitude caused a


higher frequency output, then there would be no perceptible difference

between the two versions of the controller. However, had we instead chosen to

correlate the bigger, relaxed state to a higher frequency, by moving the resistor

and leaving the code the same, the controller would no longer be

metaphorically coherent. With such a simple one–to–one mapping, it would

clearly be no less performable, but it would no longer correlate to an attested

metaphorical understanding of pitch. Were we then to build a second controller

to vary the amplitude of the output, creating a foam–ball theremin of sorts, the

“reversed” mapping would correlate well to our metaphorical understanding of

loudness. Squeezing the resistor to create a drop in input amplitude could map

coherently to our output amplitude. Again, reversing the relationship would

destroy the coherence, but not necessarily the usability. The choice in this case

of correlating the compressed state to higher frequency or lower amplitude

demonstrates the crux of a metaphorically coherent control interface.
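This equivalence, and the one incoherent combination, can be stated compactly in a hypothetical Python sketch, where `sensor_inverted` stands in for moving the resistor to the other side of the divider and `code_inverted` for rewriting the computer code:

```python
def make_mapping(sensor_inverted, code_inverted, lo=400.0, hi=1600.0):
    """Return a function from squeeze amount (0..1) to frequency in Hz."""
    def mapping(squeeze):
        amp = 1.0 - squeeze if sensor_inverted else squeeze   # analogue side
        x = 1.0 - amp if code_inverted else amp               # computer side
        return lo + (hi - lo) * x
    return mapping

original    = make_mapping(sensor_inverted=False, code_inverted=False)
double_flip = make_mapping(sensor_inverted=True,  code_inverted=True)
single_flip = make_mapping(sensor_inverted=True,  code_inverted=False)

# Inverting both sides is imperceptible to the performer ...
assert abs(original(0.8) - double_flip(0.8)) < 1e-9
# ... while inverting only one side reverses the mapping entirely.
assert single_flip(0.8) < single_flip(0.2)
```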

Along these lines, there has been a fair amount of research

demonstrating that computer interfaces, graphical or tangible, benefit in

usability when they cohere with our conceptual expectations of their behavior.

Jacob et al. have demonstrated this empirically, noting that a controller which

can simultaneously vary three parameters is a faster interface for completing a

task that requires the matching of three conceptually integral attributes, while


one that can only vary two at a time is faster for matching conceptually

separable attributes. Thus, their three–dimensional pointer provides a faster

means to match two shapes in X–Y position and size, while matching X–Y

position and color works better with a mouse; their experiment allows

positioning the shape with an unmodified mouse movement and changing the

color through mouse movement along one axis in conjunction with a button

depressed. They posit that this match between controller and task comes about

because of the conceptual separability of color versus the conceptual integrality

of size in our understanding of the shapes’ attributes. 56 Wanderley and Orio 57

have also applied these ideas specifically to musical tasks, and Antle, Corness

and Droumeva 58 have approached the specific question of interface and

mapping in gesture capture that is of concern in this paper. However, all these

studies exhibit a particular focus; as Wanderley and Orio ask, “[w]hat is part of

the composition, and what is part of the technology? How can we rate the

usability of an input device if the only available tests were done by few–possibly

one–expert and motivated performers?” 59 Their concern is more in empirically

testing the validity of these ideas, and applying them to the design of interfaces

56 Robert Jacob et al., “Integrality and Separability of Input Devices,” ACM Transactions on Computer–Human Interaction 1/1 (1994): 3–26.

57 Marcelo Wanderley and Nicola Orio, “Evaluation of Input Devices for Musical Expression:

Borrowing Tools from HCI,” Computer Music Journal 26/3 (2002): 62–76.

58 Alissa Antle, et al., “Human–computer–intuition? Exploring the cognitive basis for intuition in embodied interaction,” International Journal of Arts and Technology, 2/3 (2009): 235–254.

59 Ibid., 62.


for general musical use. However, on the strength of their work, as well as on

more performance–minded work such as that of Wessel and Wright, 60 I am

advocating an adoption of image schema and metaphor into the individual

practice of interface design, partly in answer to Wanderley and Orio’s question;

in computer music today, interface design and mapping choice, along with the

more traditional elements of structure, method, material, form and aesthetic are

all definitively part of the compositional process.

3.2. Existing interfaces and conceptual coherence

At this point, the question bears asking: is metaphorical coherence really

necessary? Palle Dahlstedt, in speaking about mapping schemas for live

synthesizer improvisation, offered that

It has been said that a good mapping should be intuitive, in the sense that you should immediately understand the internals of the system. But this is not true for most acoustic instruments. Many musicians do not know their instrument from a physics point of view. Some phenomena are extremely complex, e.g., multiphonics in wind instruments, but instrumentalists learn to master them. 61

In his paper, Dahlstedt is advocating a mapping system that involves a large

degree of randomness, and thus raises this objection in order to defend his

position that a fundamentally incomprehensible mapping system can still yield

60 David Wessel and Matthew Wright, “Problems and Prospects for Intimate Musical Control of Computers,” Computer Music Journal 26/3 (2002): 11–22.

61 Palle Dahlstedt, “Dynamic Mapping Strategies for Expressive Synthesis Performance and Improvisation,” Computer Music Modeling and Retrieval: Genesis of Meaning in Sound and Music. 5 th International Symposium, CMMR Revised Papers (2008): 237.


an expressive performance instrument. Along similar lines are concerns raised

by Ian Whalley on the idea of software-agents, stating that one “should then

allow each interactive session to develop something of its own language.

Machine agency can then lead or follow in the interactive process with human

agency, acknowledging that not all conversations are symmetrical in terms of

knowledge and participation.” 62 He advocates the other extreme, proposing to

use adaptive and semi-autonomous software-agents to perform the mapping on

the computer side.

These are completely valid points whose approaches and results I would

be sorry to see gone from the world of computer music. I would venture one

point of clarification, however, with regard to Dahlstedt’s characterization of

“intuitive” interfaces. The system of mapping proposed above does not suppose

or even advocate that the user of a metaphorically coherent interface

“immediately understand the internals of the system.” On the contrary,

concerning the point of mapping the computer’s input to modulation of an

output, as we saw with our input amplitude controlling the output pitch, it is

irrelevant to the user of such an interface what the internals of the system

actually are. It is precisely what is limited to the externals of the system that is

of concern for the use of such a controller. Indeed, as advocated by the IUUI

62 Ian Whalley, “Software Agents in Music and Sound Art Research/Creative Work: current state and possible direction,” Organised Sound, 14/2 (2009): 165.


research group (Intuitive Use of User Interfaces), the very notion of intuitive use

precludes any conscious understanding on the part of the user; they define

intuitive use as the unconscious use of pre–existing knowledge. 63

Certainly, however, it is usually the case in computer music that the

builder of the interface and the mapper of the stimulus to the response is also

the composer and the performer, and so that person likely does “immediately

understand the internals of the system.” Indeed, in computer music, choosing

your mappings is part of the practice. However, the composer/performer, as a

human animal, is still beholden to image schemas, which “[b]ased on embodied

experience, … are learnt early in life, shared by most people and processed

automatically. Violating the metaphorical extensions results in increased

reaction times and error rates. Violating the metaphorical extensions results in increased reaction times and error rates.” 64 As a young person in the 21st century

composing and performing electronic music, and like many of my compatriots,

lacking confidence about my real–time musical performance skills while also

feeling the pressure to build and learn to play a new instrument for each piece,

as is often the case, I welcome the possibility of “decreasing reaction times and

error rates” so that I can focus on the music itself. That said, however, the music

itself, these days like no earlier time in history, is so incredibly varied that all

approaches should be welcome.

63 Anja Naumann, Jörn Hurtienne, Johann Israel, et al., “Intuitive Use of User Interfaces: Defining a Vague Concept,” Engineering Psychology and Cognitive Ergonomics: Lecture Notes in Computer Science 4562 (2007): 128–136.

64 Shaer and Hornecker, “Tangible User Interfaces,” 64.


Bearing these ideas in mind, let us now turn to examination of some

existing interfaces: the tangible user interface object in general, and then a

particular adaptation of that technology that I built for a piece of my own.

3.2.1. Tangible user interfaces (TUIs)

TUIs (Tangible User Interfaces) are a form of computer interface whose

focus places them in opposition to GUIs (Graphical User Interfaces), and they pertain to

all uses of computers. Shaer and Hornecker cite three basic types of TUIs:

interactive surfaces on which objects, usually marked somehow, are placed,

constructive assemblies of smaller interactive modules, and token and constraint

systems that limit the movements or positioning of objects by means of physical

constraints. 65

One of the most powerful ideas with regard to TUIs is the notion of

“space–multiplexing,” which is given as a property of graspable user interfaces, a

subset of TUIs. Shaer and Hornecker describe it:

When only one input device is available, it is necessarily time–multiplexed: the user has to repeatedly select and deselect objects and functions. A graspable user interface on the other hand offers multiple input devices so that input and output are distributed over space, enabling the user to select an object or function with only one movement by reaching for its physical handle. 66

TUI research also touts the “integration of physical representations and



basically eliminates the distinction between input and output

65 Ibid., 49–50.

66 Ibid., 47.


devices.” 67 User feedback is collocated with the input device, thus giving the

seductive illusion that the user is directly touching the digital information. The

degree to which this coupling takes place is measured in the literature on a

continuum called, problematically, embodiment, which axis “represents how

closely the input focus is tied to the output focus in a TUI application, or in

other words, to what extent does the user think of the state of computation as

being embodied within a particular physical housing.” 68 In situations where the

“embodiment” index is high, the objects can help to extend memory during

reasoning or other kinds of thought:

Actions such as pointing at objects, changing their arrangement, turning them, occluding them, annotating, and counting all recruit external elements (which are not inside the mind) to decrease mental load. 69

Perhaps because the TUI discipline is largely concerned with general

computing and comes in a large part from product design, an arena where

mappings are predetermined by the designer and usually not alterable by the

end user, the coupling between TUIs, which generate an input, and their output

is often discussed as effectively a direct connection. For example, Overbeek and

Wensveen discuss action to function coupling in a way that implies that the

design process obviates the basic concern in this essay. In their paradigm,

although they cite six parameters that need to coincide for a “natural coupling,”

67 Ibid., 48.

68 Ibid., 52–53.

69 Ibid., 67.


all the parameters map directly to the output; the designer chooses and fixes the

entirety of the black box. 70 Thus, much of the discussion about what is

essentially mapping conflates the first four steps above (in §3.1.2), and the

discourse is replete with conduit metaphor phrases implying that the TUI

objects contain the information read from the outputs with which they are coupled.


Nonetheless, these interfaces are physical objects which were designed

by humans, for use by humans, and as such do exhibit a high degree of

metaphorical coherence despite any discussion about them. For a brief example,


Figure 8: a fiducial marker


consider the interactive surface class of TUIs. The surface

of interaction is, both in a physical and in a metaphorical

sense, a bounded space. The visual patterns recognized

by the computer, called fiducial markers (see Figure 8),

are objects in this bounded space, and are therefore

beholden to the “physical” laws that define that space. In the commercial

implementation of the reacTable, for example, some actions are triggered by

proximity between fiducials representing certain functions. These interactions

can be understood by the user in terms of attraction forces, as magnetism can

be, when inputs and outputs between objects are connected dynamically and

70 Kees Overbeeke and Stephan Wensveen, “From perception to experience, from affordances to irresistibles,” Proceedings of DPPI03 (Designing Pleasurable Products and Interfaces) (New York: ACM, 2003): 95–6.


automatically based on physical collocation. That these markers are objects in a

bounded space implies that they are also destinations on an indeterminate path

through this space, 71 which lends such interfaces to exploration via the journey

metaphor as a point of departure.
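The attraction–like behavior can be sketched as a simple rule. The following Python fragment is an invented illustration, not taken from the reacTable implementation: two fiducials are treated as dynamically patched when the distance between them on the (normalized) surface falls below a threshold.

```python
import math

def connected(a, b, threshold=0.2):
    """Treat two fiducials as automatically patched together, like
    magnetic attraction, when they are close enough on the surface."""
    return math.dist(a, b) < threshold

oscillator = (0.50, 0.50)
filter_obj = (0.55, 0.50)   # nearby: patched
lfo        = (0.90, 0.10)   # far away: unpatched

assert connected(oscillator, filter_obj)
assert not connected(oscillator, lfo)
```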

Beyond the interactive surface category, constructive assemblies are also open

to metaphorical interpretation, as the bounded space they occupy is the same

one which we also occupy. Objects in this category are often manifested as

sequencers in a way that corresponds well to our metaphorical understanding

of time in terms of space and movement along a path. 72 However, these

interfaces are only intuitive so long as their chosen mappings pan out with

respect to our physical expectations.

In practice, while from a design perspective token and constraint systems

form a separate category, they can be seen with respect to the present

discussion to fall into subcategories of interactive surfaces or constructive

assemblies. Whether their plane of function is taken to be a surface one level

abstracted from our environment or to be part of our environment itself would

be the distinguishing factor in a given implementation.

With these ideas as background, I sought with the piece discussed below

71 See Lakoff’s discussion of duality in metaphorical representation: Lakoff, “Contemporary theory of metaphor,” 218–229.

72 Lakoff, “Contemporary theory of metaphor,” 216–18. For examples of this kind of interface, see: Martin Kaltenbrunner, “Musical Building Blocks,” Tangible Music, <> (15 Apr 2011).


to incorporate aspects of this type of interface in a simple and metaphorically

coherent controller.

3.2.2. The “CHOAM” fiducial ball controller

“Combine Honnette Ober Advancer Mercantiles” 73 is a solo electronic

performance piece that I composed and performed in 2010. With many of the

ideas discussed thus far floating nebulously apart from the verbal level of my

mind, I attempted to build a new interface which would yield a more human–

controlled sound than I had been able to achieve to that point with other live

electronic pieces. I had been able to develop sounds which were expressively

modulated by the rotation and position data from fiducial markers, recognized

through the built–in camera on my laptop by means of the open–source

reacTIVision software and fed through Max/MSP to SuperCollider. The original

idea was to place them on large objects with which one or more performers

would interact in a rule–based game piece. However, as most of the time was

spent with sound design, and the concert was impending, that idea was

scrapped for a much simpler approach.

I duct–taped over the disagreeable pattern covering an inflatable rubber

ball, and placed the fiducial markers in various locations on its surface. Some

73 After the interstellar trade conglomerate in Frank Herbert’s Dune series. (The piece was written for a science fiction–themed concert.)


were repeated, and they were grouped in clusters such that different areas on

the ball would have distinct sonic characters. The compositional process was

then the choice of sounds, parameters of variance, and the arrangement of the

markers on the sphere.
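A sketch of such a mapping layer follows, in Python for brevity; the actual piece routed reacTIVision data through Max/MSP to SuperCollider, and the marker IDs, sound names, and rotation mapping below are all invented stand–ins.

```python
import math

# Each visible marker ID selects a sonic character (invented examples).
SOUNDS = {4: "drone", 7: "grains", 12: "noise_wash"}

def marker_to_control(marker_id, angle_radians):
    """Map a tracked fiducial to (sound, normalized modulation value),
    using its rotation angle as an expressive parameter."""
    sound = SOUNDS.get(marker_id)
    if sound is None:
        return None                              # unknown marker: ignore
    mod = (angle_radians % (2 * math.pi)) / (2 * math.pi)
    return sound, mod

assert marker_to_control(7, math.pi) == ("grains", 0.5)
assert marker_to_control(99, 0.0) is None
```

Clustering several IDs onto one region of the ball then amounts to giving nearby markers the same entry in the lookup table, so that an area of the surface has a consistent sonic character.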

Though I didn’t realize it at the time, the layout of the controller

manifested the MUSIC–IS–A–JOURNEY metaphor, an extension of the SOURCE–

PATH–GOAL schema. Because I had arranged the markers as stopping points,

they fulfilled the entailment SOUNDS–ARE–DESTINATIONS, and the interface

afforded exploration of the sonic “landscape,” which exploration I embarked

upon semi–improvisationally. An important point to note is that this type of

interface allows tactile control over rather high–level aspects of the musical

performance. Indeed, a major concern with my inquiry into interface is finding

a way to resolve questions of form in real–time and by means of a physical


A shortcoming of this interface was the use of the built–in camera on the

laptop display; its position opposing me necessitated a mental remapping of the

image, as the metaphorical agent should have been navigating the surface from

my point of view rather than from the computer’s. Though similar visual

distortions have been shown to be tolerable and even to be elided

experimentally with habituation, 74 the point with this interface was to be usable

74 George Stratton, “Vision without inversion of the retinal image,” Psychological Review 4/4


without any training. A vast improvement would be to use an external camera

mounted somewhere on my head such that the computer’s input image would

be from the same point of view as mine. Exploration of the surface between

sound–destinations would then be completely straightforward and require no

training or wasted compositional effort whatsoever.

3.3. Proposed future work

Along the lines of the ideas briefly skirted in “CHOAM,” my future work

will incorporate gestural imaging technology to more directly map gesture to

sound in conjunction with the use of TUI objects. This mapping, however, will

not be limited to the simplistic one–to–one correspondence of the thought

experiment in the above demonstration (in §3.1.1).

3.3.1. Gestural imaging and marked objects

Within the last year, affordable full–body gesture recognition through three–dimensional depth imagers has finally become a reality. The cheapest such imager is the Xbox Kinect controller, but open–source versions of similar function exist, such as ofxStructuredLight; these use a projection of a known image onto a surface along with stereo imaging, and can calculate the distance of everything in the field of view from a combination of distortions in the known image and triangulation between the two cameras. This technique can acquire similar images without proprietary equipment. Of

particular relevance to this line of exploration is research by linguist Eve

Sweetser, who focuses on gesture as co–articulated with normal conversational

speech. Sweetser sees gesture of this nature, in TUI terminology, as space–

multiplexing metaphorical expression, often inconsistently, though never

incoherently, with the metaphors that are co–expressed verbally in the time–

multiplexed flow of speech. 75 I have plans to use this technology both to

gesturally 76 explore virtual representations of pitch space and of other

multidimensional spaces and to sonify dance, along similar lines as Antle et al. 77
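As a minimal illustration of the triangulation mentioned above: the depth of a point matched between two cameras follows from its pixel disparity, the cameras' baseline, and the focal length. The numeric values below are illustrative assumptions, not the parameters of any particular device.

```python
# Depth from stereo disparity: a point seen at horizontal pixel offset
# (disparity) d between two cameras a baseline b apart, with focal
# length f expressed in pixels, lies at depth z = f * b / d.
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Return the depth in meters for one matched point."""
    if disparity_px <= 0:
        raise ValueError("point at infinity or bad match")
    return focal_px * baseline_m / disparity_px

# Illustrative values: f = 600 px, baseline = 0.1 m.
# A disparity of 30 px puts the point 2 m away.
print(depth_from_disparity(600.0, 0.1, 30.0))  # → 2.0
```

Nearer points produce larger disparities, which is why the structured-light pattern's distortions and the stereo match together pin down every visible surface.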

Besides exploration of full–body gesture, I plan to continue development

of tangible objects. Of the many metaphors that can be applied to performance

and composition of music, I find MUSIC–IS–A–JOURNEY the most amenable both

to these objects and to real–time exploration of musical form. From this point of

view, I plan to expand on the “CHOAM” interface by building a larger ball

covered with velcro on which movable and recombinable fiducial badges and/or

barcodes can be placed. The new interface would use barcodes or smaller

fiducials to “zoom into” the fiducial destinations beside them, thus redefining

75 Eve Sweetser, “Looking at space to study mental spaces: Co–speech gesture as a crucial data source in cognitive linguistics,” in Methods in Cognitive Linguistics, eds. Monica Gonzalez– Marquez, Irene Mittelberg, and Seana Coulson (Amsterdam: John Benjamins Publishing, 2007).


76 That is, “metaphorically by means of physical movement.”

77 Antle, et al., “Human–computer–intuition?” 242.


the explorable space bounded by the object in hand. “Out–zooming” codes

would bring it back “up” a level, or physical zooming would shift focus to the

larger fiducials. One ball could thus effect multiple levels of control. The

compositional effort would then be in effectively systematizing the distribution

of markers to allow flexible and intuitive control with relatively constant

placement, keeping in mind that the markers would be movable.
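The multi-level “zooming” control described above could be organized as a stack of nested marker spaces, where scanning an in-zoom fiducial descends a level and an out-zoom code ascends. A minimal sketch, with entirely hypothetical marker names and sound labels:

```python
# Sketch of multi-level control via a zoom stack (all marker IDs hypothetical).
class ZoomNavigator:
    def __init__(self, root: dict):
        self.stack = [root]  # each level maps a marker id to a sub-space or a sound

    def current(self) -> dict:
        return self.stack[-1]

    def scan(self, marker: str):
        target = self.current().get(marker)
        if isinstance(target, dict):          # an "in-zoom" fiducial: descend
            self.stack.append(target)
        elif marker == "OUT" and len(self.stack) > 1:
            self.stack.pop()                  # an "out-zoom" code: ascend
        else:
            return target                     # a sound destination at this level

# Hypothetical layout: one ball, two levels of control.
ball = {"A": {"A1": "drone", "A2": "bells"}, "B": "noise"}
nav = ZoomNavigator(ball)
nav.scan("A")                 # zoom into region A
print(nav.scan("A1"))         # → drone
nav.scan("OUT")               # back "up" a level
print(nav.scan("B"))          # → noise
```

One physical ball can thus address arbitrarily many nested control spaces, with the compositional work residing in the layout of the marker dictionary.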

Along the lines of the TUI constructive assembly model, marking physical

objects with symbols, either barcodes or fiducials, is also in the works. I am

planning small sculptures which bear barcodes that can be scanned to produce

sound, effectively making the sculptural object the score, though it may only be

machine–readable. A simple long sheet of paper will come first, with barcodes as sound

destinations, where a branching path will be defined along the sheet; sounds

will be both signified by and encoded in the barcodes along this path. The

piece would essentially be a graphical score, and would provide a smooth

conceptual segue from more traditional forms of representation into slightly

less traditional three–dimensional manifestations.

3.3.2. Mapping image schemas to low or high level outputs

Aside from these mainly technical directions, I plan also to look deeper

into the mapping question. While one can usefully think of mapping in


Drummond’s more mechanical terms, taking a step back and re–acknowledging

the end goal can also be of help. Wanderley and Orio do so by positing different

levels of musical control, which they refer to as “note–level” and “score–level”

control. 78 However, rather than adopt their terminology here, I prefer to use the

more generic “low–level” versus “high–level” control, as these terms don’t invite

the possibility of drawing false distinctions in a style of music that favors

transgression of traditional demarcations. They instead bound a continuum along which any level of complexity is possible, without arbitrarily privileging particular areas along it.

How can we approach this low– or high–level control? Johnson posited

several basic image schemas beyond the few discussed above, some of which

may be appropriate to inform the construction of interfaces in keeping with the

proposed methodology. Borrowed from Brower, a list of some promising

schemas follows:

1. containment

2. balance

3. blockage

4. diversion 79

Low–level one–to–one mappings of these image schemas would manifest like

the pitch ball, and higher–level mappings like the “CHOAM” controller. As an

example, given the near ubiquity of three–axis accelerometers at the time of

78 Wanderley and Orio, “Evaluation of Input Devices for Musical Expression,” 69.

79 See Brower, “Pathway, Blockage, Containment,” 36.


writing, both in laptops and in mobile phones, the balance schema seems ripe

for appropriation in a control interface. A semi–fixed composition could be

guided through tension/resolution–based musical structures by tilting a control

device in and out of the level plane. 80
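A minimal sketch of such a balance-schema mapping, assuming a three-axis accelerometer reading gravity in g units; the linear mapping from tilt angle to a normalized “tension” value is an illustrative choice, not a prescription:

```python
import math

# Map accelerometer tilt to a musical "tension" value in [0, 1]:
# level (gravity entirely on the z-axis) = balance = resolution;
# tilting out of the level plane raises tension.
def tilt_tension(ax: float, ay: float, az: float) -> float:
    g = math.sqrt(ax * ax + ay * ay + az * az)
    if g == 0:
        return 0.0
    # angle between the device normal and gravity: 0 when level, pi/2 on edge
    angle = math.acos(max(-1.0, min(1.0, az / g)))
    return min(angle / (math.pi / 2), 1.0)

print(tilt_tension(0.0, 0.0, 1.0))  # → 0.0 (level device: resolved)
print(tilt_tension(1.0, 0.0, 0.0))  # → 1.0 (on edge: maximum tension)
```

The tension value could then drive whatever tension/resolution structure the semi–fixed composition exposes.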

The balance schema could be combined with the containment schema

using a balance scale as an interface. Tokens could be placed on the pans of the scale and recognized by the computer as triggers for sound sources, either by RFID or by a visual recognition scheme; the tokens’ physical weight would then offset the scale’s balance, and thus shift the composition of the musical structure. Because the scale is a free–swinging structure, the blockage and diversion schemas could be factored in by tracking the rotation of the torque arm and pans,

executing appropriate changes when reversal (on the X–Y plane) or deflection

(on the Z plane) of direction occurs. Force and agency in such an interface

would come directly from the performer.
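The scale interface might be prototyped along these lines; the token names, weights, and the reading of net imbalance are all invented for illustration:

```python
# Hypothetical token weights (grams) recognized on each pan of the scale.
def scale_imbalance(left_tokens, right_tokens, weights):
    """Net imbalance in grams: positive tips the scale toward the left pan."""
    left = sum(weights[t] for t in left_tokens)
    right = sum(weights[t] for t in right_tokens)
    return left - right

weights = {"drone": 40, "bells": 25, "noise": 60}   # invented values
tilt = scale_imbalance(["drone", "bells"], ["noise"], weights)
print(tilt)  # → 5 (40 + 25 - 60 grams toward the left pan)
```

The sign and magnitude of the imbalance would be one input among several; the tracked rotations of arm and pans would supply the blockage and diversion events.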

These hypothetical forays into metaphorically coherent interface design

serve to illustrate the kind of possibilities that pursuit of this line of thought

can open up. However, the mechanics of generating computer music is not the

only area that can benefit from application of these ideas.

80 Along vaguely similar lines, the pitch ball example above could also be made to conform to a tense/relaxed concept by mapping to higher–level musical structures in a similar way.


4. Informing the conceptual sphere

Metaphor is also a rich conceptual domain that can be drawn upon for

compositional inspiration. The most elaborate piece I have produced to date serves as an example of a work informed by this area of inquiry, both in its musical and

in its conceptual content.

4.1. “Choices:” just pitch space, the conduit metaphor

“Choices” incorporates much of what I have learned during my study at

Mills. It is a multi–modal dance and electronic music piece made in

collaboration with Rebecca Gilbert, a Bay Area modern dancer and the main

choreographer. 81 In the piece, six dancers first bring plain, empty boxes, affixed with two–dimensional Data Matrix barcodes bearing SuperCollider code, to the musician, who plays the role of the checker and scans the boxes for them; the result is an atmosphere that is polyrhythmic, pitched, and stable. Later they take it upon themselves to scan codes on different parts of these boxes, resulting in a much different sonic atmosphere, one that is chaotic, arrhythmic, and textural. The first six or so minutes, the pitched atmosphere, consists of more homogeneous group movement, from which some individual dancers show

a desire to break away, while the second half, about six minutes of the chaotic

81 Five other dancers, Kate Knuttel, Sergio Lobito, Mica Miro, Jeanne Platt, and Natalie Rael, also contributed to the development of the movement in the piece.


atmosphere, is characterized by highly social interactions between pairs of

dancers and socially motivated actions by individual dancers. The course of the

melody in the first half is indeterminate with respect to performance, varying

depending on the randomly selected box. The pitch material comes from a scale

composed of sixty–three pitches in a single octave in a seven–limit just–intoned

pitch space from which the harmonic progression is built by transposing a

single chord structure gradually to different center pitches within the octave.

The musical structure is complicated by the fact that the melodic line, derived

from a path moving from pitch to pitch within the chord (see Figure 11 below),

is separated into three parts, with rests holding places for pitches that appear in

another part. If the three parts were played with identical rhythmic structures, they would together yield the underlying melody; in practice, however, each part has a different number of beats, which lends a polyrhythmic texture and a systematically varying pitch contour.

codes for parts of the pattern around a particular tonal center, and so each time

a dancer brings up a box to be scanned, the chord goes through a transposition,

which happens in three stages by virtue of its split into three parts. The chaotic

material is mainly synthesized sounds modulated in various ways both by the

din of shopping carts rattling through a large warehouse and by the kernel of

my computer’s operating system read as a sound file.
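As a concrete illustration of how such a sixty-three-pitch scale might be constructed, the sketch below folds a block of the 3–5–7 lattice into one octave. The exponent ranges are an assumption chosen only because they yield exactly sixty-three distinct pitches; the actual scale used in the piece may have been built differently.

```python
from fractions import Fraction

def octave_reduce(r: Fraction) -> Fraction:
    """Fold a frequency ratio into the octave [1, 2)."""
    while r >= 2:
        r /= 2
    while r < 1:
        r *= 2
    return r

# Hypothetical lattice block: 7 steps along the 3-axis, 3 each along
# the 5- and 7-axes, giving 7 * 3 * 3 = 63 distinct pitches per octave.
scale = sorted({
    octave_reduce(Fraction(3) ** a * Fraction(5) ** b * Fraction(7) ** c)
    for a in range(-3, 4)
    for b in range(-1, 2)
    for c in range(-1, 2)
})
print(len(scale))   # → 63
print(scale[0])     # → 1 (the unison is the lowest scale degree)
```

Because every ratio's odd part is unique to its exponent triple, folding by powers of two never collapses two lattice points onto one pitch, so the block's size is the scale's size.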


The program notes were presented as follows:

Fredric Rzewski says of our choice to participate in the politics of our communities, “If people choose to ignore this fact [of their responsibility] (whether consciously and spontaneously, or because they are manipulated into doing so), if they choose to turn their minds away from politics, they give up their right to share in the political life of the community. They in fact abandon their duty to contribute to the collective organization of that community's future.” 82 (emphasis added) It is of the utmost importance that we choose whether or not to be manipulated.

By this oblique reference to politics, the piece is meant to convey that as

members of a society, despite being pulled in many directions by many

subgroups with their own self–interests in mind, it is our responsibility to

participate in the course of our collective progress. 83 However, many powerful

elements seek to manipulate the majority into believing they haven’t the right to

this participation, that only those elements’ own idea of a proper course should

be followed. The ritual of which the dancers are part in the first half, “the

purchase,” is a familiar one to us, and is also one that is a powerful form of

manipulation, though by no means the only one.

The piece incorporates metaphor in two ways that demonstrate the

fruitfulness of conceptual metaphor in composition: firstly, in the use and

exploration of just–intoned pitch–space, and secondly in its discourse on the

conduit metaphor.

82 Fredric Rzewski, “Music and Political Ideals,” in Nonsequiturs: writings and lectures on improvisation, composition and interpretation, eds. Gisela Gronemeyer and Reinhard Oehlschlägel (Köln: MusikTexte, 2007): 188–200.

83 Progress in the sense of movement somewhere, not necessarily betterment. In other words, in our evolution, in its correctly construed sense.


The present discussion about just–intonation must necessarily assume

some background knowledge. 84 The piece explores pitch in terms of the seven–limit just–intoned pitch space, which is a three–dimensional representation of pitch relationships in whole–number ratios involving up to three primes (one prime factor per XYZ axis).

Figure 9: chord shapes pictured in a 7–limit pitch space: dominant 9th (top) and the "4th/6th" chord (bottom)

The portrayal of these relationships in what one can understand as three–dimensional space entails a number of correspondences which can carry over from that understanding and

apply to reasoning about pitch relationships and movement within that space.

Thus, chords can be represented as shapes within that space (see Figure 9),

which remain consistent in their internal relationships regardless of their

transposition within that space (see Figure 10), purely by virtue of the system of

its organization. With regard to this piece, the melody was built thinking of

these chord shapes as path descriptors, with the pitch sequence manifesting the

course of travel along that path (see Figure 11). Thus, the compositional

approach to transposition and melodic structure can be seen as a metaphorical

84 For the reader who would like more background, the author would suggest: David B. Doty, The Just Intonation Primer: an introduction to the theory and practice of just intonation, 3 rd ed. (San Francisco: Other Music, Inc., 2002–6).


extension of the SOURCE–PATH–GOAL schema.
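The transposition-invariance of chord shapes can be demonstrated in a few lines: representing each tone as a vector of exponents on the 3, 5, and 7 axes, transposition is vector addition, which cannot change the differences between tones. The triad below is an illustrative lattice shape, not one of the chords pictured in Figure 9.

```python
# Each tone is a vector (a, b, c) of exponents, denoting 3**a * 5**b * 7**c
# (octave placement by powers of 2 is ignored, as in the lattice diagrams).

def transpose(chord, step):
    """Transposition is vector addition: shift every tone by the same step."""
    return [tuple(x + s for x, s in zip(tone, step)) for tone in chord]

def shape(chord):
    """A chord's internal shape: every tone expressed relative to its first tone."""
    root = chord[0]
    return [tuple(x - r for x, r in zip(tone, root)) for tone in chord]

# Illustrative triad: 1/1, 5/4, 3/2 as lattice points.
triad = [(0, 0, 0), (0, 1, 0), (1, 0, 0)]
moved = transpose(triad, (1, -1, 1))   # an arbitrary transposition
print(shape(moved) == shape(triad))    # → True: the shape survives transposition
```

A melodic path along such a shape can then be transposed wholesale, which is exactly the behavior the chord-by-chord progression of “Choices” exploits.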

Perhaps at a deeper level than the metaphors involved in the

composition of the pitch material is the conceptual relationship of the piece to

the conduit metaphor. Of central importance is the distinction Reddy raised between the signal and the message.

Figure 10: chord transpositions as movement in pitch space

During the course of the average

person’s humdrum, everyday

manipulation into complacency,

conduit metaphorical messages are

abundant. He is made to feel that his

life is empty, that there are things that

can be acquired that can fill it, or that

his spirit is empty and that some

beliefs can be imbibed that can fill it,

or made to believe in any number of conceivable lacunae and offered their

purported remedies. Throughout this type of discourse is the underlying

assumption that there is an indefinable something that can be received that can

fill these lacunae, and the unspoken further assumption about these things is

that they will be received, unpacked, and taken in without effort, as that effort

has already been expended by the packager, the giver as he would be led to


believe. In the piece, the checker/musician represents the mouthpiece of the

forces who bear this message, and the computer, the sound source, is the hub in

Reddy’s toolmaker’s paradigm. The music is the set of instructions which are to be interpreted.

Figure 11: melodic structure as path traversal along the chord shape (pitches are destinations)

In this case,

the “fourth wall” of the

toolmaker’s wheel is broken in

that the instructions are passed between the performers and the audience, rather than among the performers themselves. In the first half, the signal, which is the sound itself, open for

interpretation, is constrained by the ritual in which the dancers must

participate. They present these empty boxes hoping for the solutions they are

purported to hold, and leave confused when the boxes are immediately

discarded. Once they realize that the information they seek is not contained

within these shells, they are freed from the group bondage and interpersonal

growth and relationships can ensue. Thus, the composition can be interpreted

as a loose portrayal and extension of Reddy’s toolmaker’s paradigm.

It should be noted that this piece was originally conceived long before

my research into conceptual metaphor and embodied cognition began in

earnest. However, the ideas in Reddy’s article had been simmering below the

level of verbal cohesion at least since then. It is testament to their coherence


with life experience that this piece on which I had expended so much thought

conformed so cleanly to interpretation through their lens.

Though the conduit metaphor can be a dangerous concept when

theorizing about many ideas that require conceptual precision, on an artistic

level it can provide rich and nuanced possibilities, as it underlies so much of

our thought about social mores and conventions. “Choices” is a telling

illustration of how it, and the theory of conceptual metaphor Reddy’s article

helped spark, have informed my work.

5. Conclusion

A theory as pervasive as the contemporary theory of metaphor, and one with such clear generative conceptual power, cannot be ignored when considering the study or production of a symbolic system as fundamentally important to the human as music. Building a system that takes

into account this fact of our status as animals in a physical world, one that is

informed by the linguistic clues which betray our understanding of that

situation, can open the door onto a world of expressiveness that might

otherwise remain hidden.

It is of paramount importance that one keep in mind the different stages

of mapping computer input to output, as this rapidly changing technology may


metamorphose into something that, possibly by virtue of our metaphorical conception of it, obscures these distinctions. Nonetheless, that metaphorical

understanding is a deep source domain which can be explored artistically to

great benefit.


6. Appendix: contents of accompanying media

video partition:

1. Combine Honnette Ober Advancer Merchantiles

2. Choices

data partition:

1. source code and documentation for “CHOAM”

2. source code and documentation for “Choices”

3. copies of open-source or free software used in performance


7. Bibliography

Adams, John D.S. “Giant Oscillators.” Musicworks 69 (1996): as reprinted at <> (15 April 2010).

Antle, Alissa N., Greg Corness, and Milena Droumeva. “Human–computer– intuition? Exploring the cognitive basis for intuition in embodied

interaction.” International Journal of Arts and Technology, 2/3 (2009): 235–


Bown, Oliver, Alice Eldridge, and Jon McCormack. “Understanding Interactive Systems.” Organised Sound 14/2 (2009): 188-196.

Brower, Candace. “Pathway, Blockage, and Containment in Density 21.5.” Theory and Practice 22–23 (1997-98): 35–54.

Brown, Chris and John Bischoff. “Indigenous to the Net: Early Network Music Bands in the San Francisco Bay Area.” 2002. < tml> (15 April 2010).

Collins, Nicolas. Handmade Electronic Music: the art of hardware hacking. 2nd ed., New York: Routledge, 2009.

Collins, Nick. “Live Coding Practice.” Paper presented at the International Conference on New Interfaces for Musical Expression, New York, USA, June 6–10, 2007.

Dahlstedt, Palle. “Dynamic Mapping Strategies for Expressive Synthesis Performance and Improvisation.” Computer Music Modeling and Retrieval:

Genesis of Meaning in Sound and Music. 5th International Symposium, CMMR 2008 Revised Papers (2008): 227-242.

Doty, David B. The Just Intonation Primer: an introduction to the theory and practice of just intonation. 3rd ed. San Francisco: Other Music, Inc., 2002–6.

Drummond, Jon. “Understanding Interaction in Contemporary Digital Music:

from instruments to behavioral objects.” Organised Sound 14/2 (2009):



Eitan, Zohar and Renee Timmers. “Beethoven’s last piano sonata and those who follow crocodiles: Cross–domain mappings of auditory pitch in a musical context.” Cognition 114 (2010): 405–422.

Edelman, Gerald. Bright Air, Brilliant Fire: on the Matter of the Mind. New York:

Basic Books, 1992.

Hofstadter, Douglas R. Gödel, Escher, Bach: an Eternal Golden Braid. 20th anniversary ed. New York: Basic Books, 1999.

Jacob, Robert J. K., Linda E. Sibert, Daniel C. McFarlane, and M. Preston Mullen, Jr. “Integrality and Separability of Input Devices.” ACM Transactions on Computer–Human Interaction 1/1 (1994): 3–26.

Johnson, Mark and Tim Rohrer. “We are living creatures: Embodiment, American Pragmatism and the cognitive organism,” in Cognitive Linguistics Research, 35.1: Body, Language, and Mind, Volume 1: Embodiment. Edited by Tom Ziemke, Jordan Zlatev, Roslyn M. Frank. Berlin: Mouton de Gruyter, 2008.

Kaltenbrunner, Martin. “Musical Building Blocks.” Tangible Musical Interfaces. <> (15 Apr 2011).

Lakoff, George and Mark Johnson. Metaphors We Live By. Chicago: University of Chicago Press, 1980.

Lakoff, George. “The contemporary theory of metaphor.” in Metaphor and Thought, 2nd ed. Edited by Andrew Ortony. Cambridge: Cambridge University Press, 1993.

Magnusson, Thor. “Of Epistemic Tools: musical instruments as cognitive extensions.” Organised Sound 14/2 (2009): 168–176.

Mead, Andrew. “Bodily Hearing: physiological metaphors and musical understanding.” Journal of Music Theory 43/1 (1999): 1–19.

Morris, Charles. Foundations of the theory of signs. Chicago: University of Chicago Press, 1938.


Mumma, Gordon. “Creative Aspects of Live-Performance Electronic Music Technology.” Papers of 33rd National Convention. New York: Audio Engineering Society, 1967.

Naumann, Anja, Jörn Hurtienne, Johann Habakuk Israel, Carsten Mohs, Martin Christof Kindsmüller, Herbert A. Meyer, Steffi Hußlein, and the IUUI research group. Engineering Psychology and Cognitive Ergonomics: Lecture Notes in Computer Science. 4562 (2007): 128–136.

O’Neill, Shaleph. Interactive Media: The Semiotics of Embodied Interaction. London: Springer–Verlag, 2008.

Overbeeke, Kees and Stephan Wensveen. “From perception to experience, from affordances to irresistibles.” Proceedings of DPPI03 (Designing Pleasurable Products and Interfaces). New York: ACM, 2003.

Reddy, Michael J. “The conduit metaphor: A case of frame conflict in our language about language.” in Metaphor and Thought, 2nd ed. Edited by Andrew Ortony. Cambridge: Cambridge University Press, 1993.

Rzewski, Fredric. “Music and Political Ideals.” in Nonsequiturs: writings and lectures on improvisation, composition and interpretation. Edited by Gisela Gronemeyer and Reinhard Oehlschlägel. Köln: MusikTexte, 2007.

Shaer, Orit and Eva Hornecker. “Tangible User Interfaces: Past, Present, and Future Directions.” Foundations and Trends in Human–Computer Interaction 3/1–2 (2009): 1–137.

Shayan, Shakila, Ozge Ozturk, and Mark A. Sicoli. “The Thickness of Pitch: Crossmodal Metaphors in Farsi, Turkish, and Zapotec.” Senses & Society 6/1 (2011): 96–105.

Stone, Ruth M. “Toward a Kpelle Conceptualization of Music Performance,” Journal of American Folklore 94/372 (1981): 188–206.

Stratton, George. “Vision without inversion of the retinal image.” Psychological Review 4/4 (1897): 341–360.


Sweetser, Eve. “Looking at space to study mental spaces: Co–speech gesture as a crucial data source in cognitive linguistics.” in Methods in Cognitive Linguistics, Edited by Monica Gonzalez–Marquez, Irene Mittelberg, and Seana Coulson. Amsterdam: John Benjamins Publishing, 2007.

Trueman, Dan. “Why a laptop orchestra?” Organised Sound 12/2 (2007): 171-179.

Wanderley, Marcelo and Nicola Orio. “Evaluation of Input Devices for Musical Expression: Borrowing Tools from HCI.” Computer Music Journal 26/3 (2002): 62–76.

Wessel, David and Matthew Wright. “Problems and Prospects for Intimate Musical Control of Computers.” Computer Music Journal 26/3 (2002): 11–22.

Whalley, Ian. “Software Agents in Music and Sound Art Research/Creative Work: current state and possible direction.” Organised Sound 14/2 (2009):


Zbikowski, Lawrence M. “Metaphor and Music Theory: Reflections from Cognitive Science.” Music Theory Online 4/1 (1998).