
Hearing Tsur’s “Poetic Mode”:

The Text/Music Relationship from Monody to Björk

by

Lorena M. Guillén

May 25, 2007

A Dissertation submitted to the


Faculty of the Graduate School of State
University of New York at Buffalo
in partial fulfillment of the requirements for the
degree of

Doctor of Philosophy

Department of Music
UMI Number: 3262037

Copyright 2007 by
Guillen, Lorena M.

All rights reserved.

UMI Microform 3262037


Copyright 2007 by ProQuest Information and Learning Company.
All rights reserved. This microform edition is protected against
unauthorized copying under Title 17, United States Code.

ProQuest Information and Learning Company


300 North Zeeb Road
P.O. Box 1346
Ann Arbor, MI 48106-1346
Copyright by
Lorena M. Guillén
2007

ACKNOWLEDGEMENTS

Several people made possible the realization of this work. First, I wish to

acknowledge my advisor Michael Long who patiently and wisely guided me through this

process. I also really appreciate the time and insightful comments provided by my

committee members—Martha Hyde, Peter Schmelz and Jeffrey Stadelman.

The last years of research and writing were made possible by The Doctoral

Dissertation Fellowship from the College of Arts and Sciences of the State University of

New York at Buffalo. I want to thank them for their support.

I want to mention Barbara Hein and Martina Anderson, who carefully read and

edited my document, Martina Möetz for assisting with translations, and William

Egginton for introducing me to some of the important linguistics literature.

I would like to thank Gloria Escobar and Esperanza Roncero for allowing me to

conduct my questionnaires in their classes, and certainly, all Hartwick College students

who volunteered their time to answer my questions.

And last, I am grateful to my family, my husband, Alejandro Rutty, and my

babies, Xul and Mora, who patiently waited for their mother to finish her dissertation.

Alejandro with his sharp comments on my topic provided a continuous dialogue that

made me build a coherent discourse to defend my arguments.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS…………………………………………………………….iii

LIST OF EXAMPLES………………………………………………………………….vi

LIST OF TABLES……………………………………………………………………..viii

ABSTRACT………………………………………………………………………….......ix

I. INTRODUCTION. THE POETIC MODE………………………………………..…1

The Poetic Mode of Text Perception……………………………………...8

The Perceptual Process of Text………………………………………..…12

Expressive Potential: Three Types of Poetic Modes…………………….18

Two Contributions from the Study of Prosody…………………………..19

II. MONODY, RECITATIVE AND ART SONG………………………………………...22

Intonation Analysis Terminology………………………………………..22

Low-Level Mimesis: Monody and Recitative…………………………...24

High-Level Mimesis: The 19th Century Lied............................................39

III. FURTHER EXPLORATIONS:

MEREDITH MONK AND LUCIANO BERIO…………………………………63

Volcano Songs: Meredith Monk…………………………………………70

A-Ronne: Luciano Berio…………………………………………………85

IV. THE POPULAR SONG…………………………………………………………..114

The Highly Structured Song: Tin-Pan Alley Tune……………………..120

The Narrative Type: Strophic Form…………………………………….135

Redundancy: Variation on the “Verse-Chorus” Form………………….142

Empirical Data: Questionnaires’ Results………………………………157

V. CONCLUSION…………………………………………………………………….170

BIBLIOGRAPHY……………………………………………………………………………175

LIST OF EXAMPLES

EX. 1.1: From Caccini’s Sfogava con le stelle, mm. 5-16………………………………28

EX. 1.2: Recitative from Scene V, W. A. Mozart’s Don Giovanni……………………...32

EX. 1.3: Alto’s recitative n. 8 from Handel’s The Messiah……………………………...38

EX. 1.4: Reichardt, Kennst du das Land, 1st stanza……………………………………...46

EX. 1.5: Zelter’s Kennst du das Land, mm.1-27………………………………………...47

EX. 1.5 (cont.): Zelter’s Kennst du das Land, mm.28-53………………………………..48

EX. 1.6: Beethoven’s Kennst du das Land, mm.1-17………………………….………...51

EX. 1.6 (cont.): Beethoven’s Kennst du das Land, mm.18-43…………………………..52

EX. 1.7: Rhythmic pattern of Beethoven’s “Kennst du” questions……………………….53

EX. 1.8: Schubert’s Kennst du das Land, mm.1-18. ……………………………………55

EX. 1.8 (cont.): Schubert’s Kennst du das Land, mm.19-40………………………….....56

EX. 1.9: Schumann’s Kennst du das Land, mm. 1-20…………………………………...59

EX. 1.9 (cont.): Schumann’s Kennst du das Land, mm. 21-41……………………….....60

EX. 2.1: Monk, Volcano Songs: Duets, “Walking Song,” min. 0:25 to 0:35…………...74

EX. 2.2: Monk, Volcano Songs: Duets, “Walking Song,” min. 0:05 to 0:14…………...75

EX. 2.3: Monk, Volcano Songs: Duets, “Walking Song,” min. 0:15 to 0:20…………...76

EX. 2.4: Monk, Volcano Songs: Duets, “Walking Song,” min. 0:36 to 0:43…………...77

EX. 2.5: Monk, Volcano Songs: Duets, “Walking Song,” min. 2:27 to 2:46…………...79

EX. 2.6: Fragment from Monk, Volcano Songs: Duets, “Lost Wind”………………….80

EX. 2.7: Fragment from Monk, Volcano Songs: Duets, “Hip Dance”………………….81

EX. 2.8: Fragment from Monk, Volcano Songs: Duets, “Cry # 1”……………………..84

EX. 3.1: Rhythmic riff played by the brass section in Fitzgerald’s version of “All the
Things You Are”……………………………………………………………………132

EX. 3.2: “All the Things You Are”: Two versions and published score of the A section.
Each in the key performed or published. Trans. by L. Guillén……………………..133

EX. 3.3: Transcription of opening three measures of Dylan’s


“Simple Twist of Fate”…………………………………………………………….138

EX. 3.4: Transcription of mm.4 to 6 of Dylan’s “Simple Twist of Fate”………………139

EX. 3.5: Transcription of the refrain of Dylan’s “Simple Twist of Fate”……………...139

EX. 3.6: Transcription of all the sections of Björk’s “Isobel” …………………………144

EX. 3.7: Transcription of the opening eleven measures of Gabriel’s “Sky Blue”…….154

EX. 3.8: Layout of solo and vocals in the first chorus of Gabriel’s “Sky Blue”……….155

EX. 3.9: Layout of solo and vocals in the second chorus of Gabriel’s “Sky Blue”……156

LIST OF TABLES

TABLE 1.1: Intoneme analysis of Caccini’s Sfogava con le stelle……………………...29

TABLE 1.2: Intoneme analysis of recitative from Scene V, W. A. Mozart’s Don


Giovanni…………………………………………………………………………………31

TABLE 1.3: Intoneme analysis of Alto’s recitative n. 8 from Handel’s The Messiah…..37

TABLE 2.1: Classification of performance instructions in Berio’s A-ronne…………....94

TABLE 2.2: Exchange between tenor 1 and baritone 1 in Berio’s A-ronne, 18 to 20…..99

TABLE 3.1: Song format of Björk’s “Isobel”………………………………………….143

TABLE 3.2: Layout of parallel lyrics constructions in the chorus of Björk’s “Isobel”..147

TABLE 3.3: Song format of Gabriel’s “Sky Blue”…………………………………….151

TABLE 3.4: Results from question #1…………………………………………………159

TABLE 3.5: Results from question #2…………………………………………………161

TABLE 3.6: Results from question #3…………………………………………………162

TABLE 3.7: Results from question #4 on Dylan’s “Simple Twist of Fate”……………163

TABLE 3.8: Results from question #4 on Gabriel’s “Sky Blue”………………………163

TABLE 3.9: Results from question #4 on Björk’s “Isobel”……………………………164

TABLE 3.10: Results from question #4 on Fitzgerald’s version of “All the Things You
Are”………………………………………………………………………………....164

TABLE 3.11: Results from question #5………………………………………………..165

ABSTRACT

The pleasure we experience through hearing a song depends largely on musical

gestures—sonorous stimuli that are a complex web of musical and language parameters.

The listener’s emotional experience is mostly independent of the understanding of the

semantic meaning of the lyrics.

This dissertation looks into how people listen to song and how, consciously or

unconsciously, that affects the strategies implemented by composers and songwriters

while facing the task of creating their pieces: What are the diverse compositional tactics

employed to manipulate the focus of the listener’s perception of the text? How can

composers and songwriters emphasize, compensate for, or oppose this sonic connection

of the listener to the song’s text?

Our first tendency as listeners is to connect musically to the popular song or

vocal “art” work. Although there is an intention of deciphering the semantic message of

the lyrics, it is only after repeated listening that the audience is able to apprehend the

piece as a cohesive discourse. Aside from paying attention to the strict musical elements

of the piece, listeners predominantly perceive the sonic or musical aspects of its lyrics:

the colors of its phonemes, the prosodic arch of intonation of its phrases, the sonic quality

of the performing voice, and the specific colors and inflections that the voice adopts at

each phrase.

In order to test the hypothesis previously proposed and then explore further

ramifications, two different research tools were implemented: direct observation of

materials (analysis of vocal pieces through listening to specific recordings and analysis of

scores) and surveys of college students to observe their perception of the four popular

songs analyzed.

I.

INTRODUCTION

THE POETIC MODE

In many non-English speaking countries, Anglo-American popular song is

consumed as frequently and as enthusiastically as indigenous musical repertoires. Even

listeners with no command of the English language engage at some internal level with

their favorite songs. Growing up in Argentina, I was no exception, and many songs in

foreign languages marked my youth. While singing along, my peers and I often

mimicked the sound content of the words, making up nonsense syllables that stood in for

the actual lyrics. This phenomenon has always intrigued me. How did we engage

“emotionally” with these songs while ignoring the meaning of their texts? And what sort

of pleasure resided in the singing of nonsense syllables rather than meaningful words?

Years later, while watching the 2005 Super Bowl, I witnessed a scene that

resonated with my childhood experience. During that year’s game, the halftime

entertainment was provided by Paul McCartney, who performed some of his well-known

hits, including “Hey Jude.” The audience’s participation increased dramatically at this

point of the show, demonstrating the great popularity of this song among the American

public. The crowd sang along enthusiastically with the chorus, and, surprisingly, some

audience members held up signs containing the nonsense syllabic utterance “na na na

nananana.” Was this the most memorable text phrase of this famous song? Was this event

in any way related to the way non-English speakers experience Anglo-American song?

The fundamental questions behind this dissertation, which considers how mainstream

listeners “hear” song (i.e., music-with-text), were generated by my own history but were

cast into a more generalized context by this very American musical moment.

Rethinking these issues, I came to the conclusion that the pleasure that we

experience through hearing a song depends largely on musical gestures—sonorous

stimuli that are a complex web of musical and language parameters. These “sound

patterns” interplay through repetition, contrast, reinforcement and de-emphasis. The

listener’s emotional experience is mostly independent of the understanding of the

semantic meaning of the lyrics.

The hypothesis in this dissertation proposes that our first tendency as listeners is

to connect musically to the popular song or vocal “art” work. Although there is an

intention of deciphering the semantic message of the lyrics, it is only after repeated

listening that the audience is able to apprehend the piece as a cohesive discourse. Aside

from paying attention to the strict musical elements of the piece, listeners predominantly

perceive the sonic or musical aspects of its lyrics: the colors of its phonemes, the prosodic

arch of intonation of its phrases, the sonic quality of the performing voice, and the

specific colors and inflections that the voice adopts at each phrase.

Numerous essays have been devoted to exploring the issue of the music and text

relationship, and many others have proposed insightful analyses of song and other vocal

genres, with special emphasis on observing the structuring of music around a poem.1

1
For specific articles on text and music relationship issues, see Walter Bernhart, Steven Paul Scher and
Werner Wolf, eds., Word and Music Studies 1: Defining the Field (Amsterdam-Atlanta, GA: Rodopi,
1999). This volume gathers essays written by members of the International Association for Word and
Music Studies (WMA). Of particular interest are: Steven Paul Scher, “Melopoetics Revisited. Reflections
on Theorizing Word and Music Studies,” and Suzanne M. Lodato, “Recent Approaches to Text/Music
Analysis in the Lied, A Musicological Perspective.”
Although impossible to name all, here are some texts that contain original song analysis: Charles Rosen,
The Romantic Generation (Cambridge, Massachusetts: Harvard University Press, 1995), particularly his
“Chapter Three: Mountains and Song Cycles”; Kofi Agawu, “Theory and Practice in the Analysis of the

However, very little has been said about how people listen to song and how, consciously

or unconsciously, that affects the strategies implemented by composers and songwriters

while facing the task of creating a song: What are the diverse compositional tactics

employed to manipulate the focus of the listener’s perception of the text? How can

composers and songwriters emphasize, compensate for, or oppose this sonic connection

of the listener to the song’s text?

In an effort to define the nature of the relationship between the two semiotic

realms converging into song, music and language, some poststructuralist analysts and

philosophers have given special attention to the preponderant role of music in this fusion.

Thus, there is the “assimilation model” of Susanne Langer. Although she recognizes the

capacity of the poem to trigger the composer’s imagination, she maintains that music

transforms “the entire verbal material, sound, meaning, and all—into musical elements.”2

She argues:

When words and music come together in song, music swallows words; not only mere
words and literal sentences, but even literary word-structures, poetry. Song is not a
compromise between poetry and music, though the text taken by itself may be a great
poem; song is music.3

On the opposite side of the semiotic analytical arena, Lawrence Kramer conceives

of song as a structure where words and music coexist without losing their individual

Nineteenth-Century ‘Lied’” in Music Analysis 11, no.1 (Blackwell Publishing: March, 1992); Lawrence
Kramer, Music and Poetry: The Nineteenth Century and After (Berkeley: University of California Press,
1984); David B. Lewin, “Figaro’s Mistakes,” Carl Schachter, “Motive and Text in Four Schubert Songs,” in
Engaging Music: Essays in Music Analysis, ed. Deborah Stein (New York-Oxford: Oxford University
Press, 2005); Susan Youens, Schubert’s Poets and the Making of Lieder (Cambridge-New York: Cambridge
University Press, 1996); John Daverio, Robert Schumann: Herald of a “New Poetic Age” (New York-
Oxford: Oxford University Press, 1997).
2
Susanne Langer, Feeling and Form, as quoted by Kofi Agawu in Theory and Practice in the Analysis of
the Nineteenth-Century ‘Lied’, 5.
3
Ibid., 5.

essences: “A poem is never really assimilated into a composition; it is incorporated, and

it retains its own life, its own body, within the body of the music.”4

Kramer describes song as “a regressive form of utterance.”5 He argues that music

alienates the singing of the words as a speech-act:

The style of the classical art song since the Renaissance heightens the tension between
words and music in two fundamental ways: first, by adopting an intonational manner that
presents the voice as a precisely tuned instrument rather than as a source of utterance, and
second, by opening the possibility of a musical response to the poetry that is complex
enough to raise questions of interpretation. Other features—the expressive forcing of high
and low tessitura, where the sound of the words inevitably fades into the effort of
attacking the pitch; the complication of rhythm and the varied movement of the voice
toward and away from speech-like patterns, the repetition, alteration, and syntactic
breakdown of the text—also contribute to alienating the singing of the words...6

According to Kramer, song undergoes deconstructive processes simultaneously in

two ways: “overvocalization” and “songfulness.” These processes create an effacement

of meaning. In technical terms, the “overvocalization” applies a kind of topological

distortion to song, dissolving the employed language into its physical origin, the

vocalization. Kramer defines “overvocalization” as the overstretching and twisting of

text to accommodate a musical setting with melismas or sustained notes. This

“purposeful effacement of text by voice” is also produced by “songfulness.”7 More than

technically, the difference between “overvocalization” and “songfulness” is found in an

almost metaphysical ground:

What separates them is a blend of purpose and circumstance. Overvocalization projects


meaning loss as the outcome of a rupture, a wrenching of song beyond the symbolizing
terrain of language and even conception, and therefore beyond the type of regulated
subjectivity mandated on the terrain by the laws of what Lacan calls the symbolic order.
Songfulness projects meaning loss as the outcome of a relative indifference of meaning, a
4
Lawrence Kramer, Music and Poetry (Berkeley: University of California Press, 1984), 127.
5
Ibid.
6
Ibid.
7
Lawrence Kramer, Musical Meaning: Toward a Critical History (Berkeley and Los Angeles, California:
University of California Press, 2002), 63.

kind of higher carelessness or forgetfulness that simply does not avail itself of the
symbolic, allows the symbolic to lie unused even if its words may still be heard clearly.8

“Songfulness” refers to the transformation of song into a fusion of vocal and musical

utterance judged to be pleasurable independent of verbal content. It depends on the

enveloping effect of the human voice, which at the physical level surrounds the listener

with the fullness of its vibrations and always implies potential meaning.

Kramer purposely avoids any objective definition of “songfulness,” arguing that it

is one of those aesthetic qualities that is immediately recognized but difficult to account

for. Is “songfulness” an attribute of the vocal music itself, the particular performance or

the ears of the listener? This dissertation proposes that the nature of the “songfulness”

attribute may be explained in terms of the linguistic field of cognitive poetics.

According to Reuven Tsur, a linguist in the field of poetics, poetry is considered

more “poetic” wherever it makes full use of disruptive tactics. Such tactics might create

“sound patterns”—webs of phonemes associated by alliteration, assonance and rhyme

across the poem. These sonic webs may trigger new narrative operations parallel to the

“primary narratemes” and explore the potential of phonemes to communicate different

emotional moods. The poetic text is perceived in what he calls “the poetic mode,” where

“some non-speech qualities of the signal seem to become accessible, however faintly, to

consciousness” along with “speech mode” processing.9

8
Ibid.
9
Reuven Tsur, What Makes Sound Patterns Expressive?: The Poetic Mode of Speech Perception (Durham
and London: Duke University Press, 1992), 14.

This dissertation proposes that the natural disruption produced by the

“overvocalization” effect is mounted over the already existent speech disruption of

poems in general. The result is what Kramer calls “songfulness.” But why do we perceive

song in this “songful” way? It is because we hear it in what Tsur calls the “poetic mode.”

Poetry is considered more “poetic” when it makes full use of disruptive tactics. In

a similar manner, music history shows that vocal setting styles have been considered

closer to a “singing quality” when “overvocalizing” takes place, and as a consequence

disruption of the natural flow of the discourse occurs. The degree of discourse

dismembering is directly proportional to the “singing” musicality of that song or vocal

piece. However, vocal setting modalities that aim for a “speech quality” try to keep their

musical elements as plain and unobtrusive as possible in order to guarantee a closer result

to the usual processing of the “straight” syntactic flow.

The type of musical setting determines if the listeners will appreciate a song in (1)

a “poetic mode,” in which the “speech mode” of perception dominates; (2) a “poetic

mode,” in which there is a balance between the “speech mode” and the “auditory mode”;

or (3) a “poetic mode,” in which the “auditory mode” dominates. First, those settings that

aim to express through their semantic content try to preserve the integrity of the discourse

and therefore tend toward a “speech mode” of processing the text. They situate the

affective potential of language in its prosody, as found in its natural speech state. Second,

those settings that aim to express through their “songfulness” accept the new, artificial

inflections of language under the influence of a song’s melodies and a musical

arrangement’s textures and tend toward a balanced “poetic mode.” They situate the

affective potential of language in its musical-poetic qualities—sonic and semantic

elements intermingled. Lastly, those settings that treat the distortion that the text

undergoes as irreversible further fragment language into its minimal components

(phonemes) and paralinguistic gestures and use them as compositional elements. They

situate the affective potential of language in its sonic qualities.

After reviewing central theoretical issues on text-processing in the introduction of

this dissertation, sections two to four are dedicated to the application of those

perspectives in the analysis of musical pieces of diverse musical styles and periods. These

serve to exemplify variations on the “poetic mode” of text perception and the various

compositional tactics that affect the way people listen to them. Thus, the selected

examples of monody and recitative represent the kind of “poetic mode,” in which the

“speech mode” dominates; the nineteenth-century German lieder, as well as the four

analyzed popular song cases, exemplify the “poetic mode,” in which there is a balance

between the “speech mode” and the “auditory mode”; finally, the two twentieth-century

avant-garde vocal selections illustrate the “poetic mode,” in which the “auditory mode”

dominates.

This selection by no means claims to be a systematic historical overview of the

approaches to the relationship between text and music; it serves only to exemplify

different modes of perception derived from the application of Tsur’s concepts. Each

analysis presents a different methodological approach that most clearly articulates those

compositional procedures that reveal the way people perceive those kinds of settings.

In order to test the hypothesis previously proposed and then explore further

ramifications, two different research tools were implemented: direct observation of

materials (analysis of vocal pieces through listening to specific recordings and analysis of

scores) and surveys of college students to observe their perception of the four popular

songs analyzed. Both data collection methods take into account the listeners’ perspective.

This research has exploratory intentions. It introduces a new explanation for the

phenomenon of perceiving text set into music in a “poetic” way and, at the same time,

observes the strategies implemented by composers and songwriters to manipulate the

focus of the listener’s perception. Although preserving as its ultimate goal the creation of

a theory of perception of text in vocal music, its immediate intentions are more modest:

to achieve definitional clarity and to generate an initial hypothesis. This research is a

beginning, a pilot test of a theory imported from the linguistics of poetics—reformulated,

expanded and adapted to explain the perception of vocal music.

The Poetic Mode of Text Perception

The different degrees of linear connection and disruption of a text influence the

cognitive processing of it. Disruptive textual qualities can be produced by a divergence

between linguistic strings of arbitrary verbal signs (words) and repeated phonetic sound

clusters (produced by alliteration, assonance and rhyme), or between syntactic

organization (sentences) and the prosodic units (poetry lines or intonation phrases). These

two organizational levels, the linguistic/syntactic and the phonetic/prosodic, contrast with

one another and pull the reader’s or listener’s attention in opposite directions. Tsur argues:

In connected speech there is a tendency to proceed linearly rather than move in different
directions from the central sequence...to allow for the disruption of the linear sequencing
of speech sounds (that is, for segregating the relevant portions of the auditory stream), the
whole message must be less thoroughly organized on all levels as the linguistic stress
pattern diverges from the conventional metric pattern, and as does the syntactic unit
(clause, sentence) from the prosodic unit (line)...10

10
Ibid., 73.

Poetic texts usually tend to disruption, which triggers a “poetic mode of

listening,” where there is freedom for segregating or grouping portions of the sound

stream and moving back and forth between auditory and phonetic modes of listening. The

resultant “sound patterns” assume the emotive effects of non-referential sound gestures;

they are perceived as music by the brain’s right hemisphere—a process that involves the

identification of overtone structures as individual tone colors, a musical kind of listening.

More specifically, Tsur comments—based on findings by A. M. Liberman and his

colleagues of the Haskins Laboratories at Yale University11—that we have a “speech

mode” and a “non-speech mode” of listening, which follow different paths in the neural

system. The same transitions between phonemes that in the speech continuum are heard

as speech sound (because they appear to carry linguistic information), when isolated are

heard as musical sounds. But, according to Tsur, there is a third mode, the “poetic mode,”

in which “some non-speech qualities of the signal seem to become accessible, however

faintly, to consciousness” along with the speech mode processing.12 These three

“listening modes” depend on the way the acoustic signal is processed. In the “non-speech

mode” (processed by the right hemisphere of the brain), we attend away from the

overtone structure to tone color, as when we hear musical sounds or natural noises. In the

“speech mode” (processed by the left hemisphere of the brain), we process the signal

attending away from the overtone or formant structure to the phoneme, and the tone color

11
Some of the specific articles that summarize these findings are: A. M. Liberman and David Isenberg,
“Duplex Perception of Acoustic Patterns as Speech and Nonspeech” in Status Report on Speech Research
SR-62 (Haskins Laboratories, 1980): 47-57; A. M. Liberman, I. M. Mattingly and M. T. Turvey, “Language
Codes and Memory Codes” in Coding Processes in Human Memory, A. Melton and E. Martin, ed. (New
York: Winston, 1972); A. M. Liberman, F. S. Cooper, D. P. Shankweiler, and M. Studdert-Kennedy,
“Perception of the Speech Code” in Psychological Review 74 (1967): 431-61; B. Repp, C. Milburn and
John Ashkenas, “Duplex Perception: Confirmation of Fusion” in Perception & Psychophysics 33, no. 4
(1983): 333-337.
12
Tsur, What Makes Sound Patterns Expressive?, 13.

is not taken into account or is almost suppressed. In the “poetic mode,” the main processing is

identical with the speech mode except that certain precategorical information (such as

tone color) enters consciousness.13 In this mode, it is possible to switch back and forth

between “auditory” and “phonetic” modes of listening, either simultaneously or in rapid

succession.

The musical setting of text produces disruption to differing degrees, posing an

irresolvable agonistic relation between understandability of text and musicality of the

vocalized word. This disruption is mounted over the disruptive sound groupings produced

by rhyme schemes and phonetic interplays among words already present in any poetic

text as specified in previous paragraphs. The resulting effect could communicate two

kinds of messages: one that, according to Tsur, explores the double-edged expressive

capacity of phonetic sounds in connection with the words that contain them; and another that

concerns itself mainly with the semantic meaning of the words or the “parallel

narratemes” that words make up after being associated by phonetic similarities. These

“parallel narratemes” could oppose or complement the “primary narratemes” made up

from the “straight” syntactic flow of narrative.14

Regarding the double-edged expressive capacity of phonemes, Tsur says that

those sounds have meaning but not a specific one: “they may express vastly different or

13
The term “precategorical” refers to the “categorical perception” phenomena explored at the Haskins
Laboratories at Yale University. The human ear has the tendency to fuse the continuous variation of color
and pitch within phonetic linguistic categories (the repetition of one particular phoneme, for example,
minimal variations in the formants of the [b] phoneme). This is a similar process to the fusion of overtones
in sound stimulus. But, at the same time, the ear perceives the change from
one category to the other (the change from one particular phoneme to a different one, for example, from [b]
to [d]) as a quite distinctive difference. We, as humans, perceive the phonemes that make speech as
individual categories beyond their minimal formant variations. In contrast, natural noises and music are
perceived in a continuous manner, attending to every single formant variation.
14
The term “narrateme” was coined by Didier Coste in his book Narrative as Communication
(Minneapolis: University of Minnesota Press, 1989). Narrateme is the minimal unit of the narrative

even opposing qualities.”15 As its syntactic and semantic context, the “word” influences

the enhancement of different potentialities of those sounds. An abstraction or general

meaning of the combined sounds is grasped and runs parallel to the semantic abstraction

of the words united by sonorous similarities such as alliterations, rhyme schemes, etc.

Regarding the second possible message, Didier Coste, in his book Narrative as

Communication, explains that when poetry carries a narrative type of discourse, there is a

tension between verse and narrative. Besides the (usual) way of processing the straight

syntactic flow of narrative, other messages can be constructed from “narrative

operations” based on phonetic or rhyme connections between two words. In some cases,

the ambiguity of the contrast between phonetic affinity and semantic disjunction points to

the insufficiency of the “primary narratemes” to account fully for the narrative

significance of the poem.16 Poems are usually prized for their incompatibility with

straight narrative.

Narrative is not the only type of discourse used in vocal settings, although it is the

predominant type. Other kinds of non-narrative discourses—both linguistic (such as

description, definition, and injunctive17) and paralinguistic text (phonetic or

onomatopoeic)—are used in other vocal pieces, the latter mostly in 20th- and 21st-century

avant-garde pieces. The poetry or the song lyric format makes use of these types of

discourses.18

discourse; it is “an utterance that contains an actional predicate” (Coste, p. 36); in other words, it represents
an event.
15
Tsur, What Makes Sound Patterns Expressive?, 2.
16
Didier Coste, Narrative as Communication (Minneapolis: University of Minnesota Press, 1989).
17
The injunctive type of discourse is understood as laws, orders to somebody else, or ordering of things.
18
There are vocal pieces that use prose. In some cases, this prose has poetic tendencies (it explores the
sound patterns of the words and phonemes or interplay between words at a semantic level). But, in other
cases, straight prose has been used in chamber vocal pieces like Berio’s Sinfonia, in which some of the
sources are philosophical texts by Lévi-Strauss and political speeches.

The Perceptual Process of Text

Both song lyrics and poems set into popular or art songs are conceived by their

creators and received by their audiences in a different manner than the common speech

text of an everyday conversation. These literary texts are more like oral poetic

storytelling. But they have one characteristic in common with speech: their orality. They are

literary forms that are communicated verbally during the song performance. The

communication process of the text of the song lyrics and poems, whatever its nature,

begins at this aural dimension. At a macro-level, the communication processes of both

literary texts and everyday speech are similar. At a micro-level, however, the specifics of

the processes are different.

At the first stage (macro-level), the auditory perception takes the form of a

process of “analysis and synthesis”:

The listeners “decode” the input speech signal by using their knowledge of the
constraints that are imposed by the human articulatory “output” apparatus.19

A reference to the articulatory gestures that are involved in the production of

speech helps the listener to decode segmental phonemes, intonation and stress. The

acoustic signal may be partially ignored and filled in by the listeners’ own syntactic and

semantic knowledge of language and the social context of the communication act.

The listener apparently comprehends the message by a process of “hypothesis formation”
that involves analysis-by-synthesis where the context guides the recognition routine. The
listener may consider a comparatively large “chunk” of speech, and he is often able to
“guess” what the speech signal should be from the context that is furnished by the
“chunk.”20

19
Philip Lieberman, Intonation, Perception, and Language (Cambridge, Massachusetts: The M.I.T. Press,
1967), 162.
20
Ibid., 163.

At times, when the speaker realizes that the listener may infer the rest of the

message with only certain minimal information, he simplifies his articulatory control over

it. Lieberman says that a speaker may neglect to articulate a word carefully in such a

case. The listener will then create a hypothesis regarding the phonetic character of the

segments that are unrecognizable from the acoustic signal and, applying phonological and

syntactic rules, will form a hypothetical phrase. This hypothetical phrase may be

semantically reasonable and consistent with its context. If that is not the case, the listener

may try another hypothesis or simply not understand the message.

Once this primary or macro level of communication of speech is completed,

another process is triggered in which the listener tries to make sense of the whole

message at a deeper, or micro, level. The literary text and strategies used in it are simply

a starting point, from which the reader, or listener in our case, must construct for himself

the aesthetic object. The communicative act is initiated by the text but depends on the

active involvement of the reader. The texts should stimulate the individual reader’s

faculty of perception and processing.

The decoding proceeds in “chunks” rather than by single words. These chunks

correspond to the syntactic units of a sentence. These individual sentences do not directly

denote objects. Literary text does not denote empirically existing objects; although text

may select objects from the empirical world, they are depragmatized. The literary

aesthetic object is built up in such a way that these intentional “sentence correlates” join

in semantic units. Wolfgang Iser says that “the semantic pointers of individual sentences

always imply an expectation of some kind…As this structure is inherent in all intentional

sentence correlates, it follows that their interplay will lead not so much to the fulfillment

of expectations as to their continual modification.”21

According to Roman Ingarden—as Iser paraphrases—during the flow of thinking

a sentence, after completing the thought of that sentence, we are prepared “to think its

‘continuation’ as a new sentence, especially one that has connection with the previous

one.”22 Each of these “sentence correlates” contains what Iser calls a “hollow section,”

which creates expectation pointers towards the next sentence, and a retrospective section,

which complies with the expectations of the preceding correlate, which at this point is

part of the background. According to Iser, this creates a constant “dialectic of protension

and retention, conveying a future horizon yet to be occupied, along with the past horizon

already filled…”23 What has already been heard undergoes a permanent synthesizing

process. Every sentence shrinks in the memory and becomes some sort of background,

which is constantly restructured by the correlates that evoke it by associative relations.

If the new sentence answers the expectations aroused by the previous correlate, the range

of semantic horizons narrows. Descriptive texts, especially, behave in this way in order to

individualize the particular object. But when the new sentence does not fulfill the

expectations, the resulting frustration retroactively affects what has already been read.

Although connectability is fundamental to the construction of texts in general, literary

fictional texts and pragmatic expository language behave very differently. In order to

guarantee the reception of a specific given fact, the expository text tends to stay as

cohesive as possible.

21
Wolfgang Iser, The Act of Reading: A Theory of Aesthetic Response (Baltimore and London: The Johns
Hopkins University Press, 1978), 111.
22
Ibid., 112.
23
Ibid.

Whenever the expository text unfolds an argument or conveys information, it
presupposes reference to a given object; this, in turn, demands a continuous
individualization of the developing speech act, so that the utterance may gain its intended
precision. Thus, the multiplicity of possible meanings must be constantly narrowed down
by observing the connectability of textual segments, whereas in fictional texts the very
connectability broken up by the blanks tends to become multifarious. It opens up an
increasing number of possibilities, so that the combination of schemata entails selective
decisions on the part of the reader.24

The involvement of the listener is necessary to “activate the interplay of correlates

prestructured by the sequence of sentences.”25 The listener reconstructs the gaps that the

text leaves from what is revealed along its development. In return, the information made

explicit is transformed when what is left open is discovered. These blanks are one of the

fundamental differences between literary language (written or oral, as in the case of

“song”) and everyday speech.

The coherence and connectability of speech also depend upon certain extra-

textual conditions that in pragmatic language are a given and in fiction have to be

recreated every time, such as a “‘non-verbal frame of action…as matrix for utterances’;

the relation between the recipient and ‘the common referential system of experiences

assumed by the speaker,’ as well as ‘the common area of perception’; and the relation

between the recipient and the communication situation, as well as the ‘speaker’s range of

associations.’”26 Certainly, since in the act of listening to songs there is no direct contact

between the songwriter or the speaker’s voice and the listener, these preconditions need

to be recreated in every piece, as in any other literary fictional language.

Either when listening to a recording or in a live concert situation, when the singer

is not the songwriter himself, an indirect communicative situation exists. The performer

24
Ibid., 184.
25
Ibid., 110.

is only interpreting, or creating her own reading of the text (text conceived as the already

combined product of text plus music into the song), and then communicating her version

to the audience. But even in live performances when the performer is the songwriter, an

unavoidable distance exists between the performing artist and her audience that does not

exist in a common dialogue situation. Although in these live settings the performer could

partially provide the “non-verbal frame of action,” the rest of the assumed extra-textual

conditions are a leap of faith that every songwriter makes every time that he composes a

song.

The songwriter, as well as the performer, is alone in this realm since her

knowledge of the audience is limited and general (at least more than in a direct

conversation). The relation between the recipient and “the common referential system of

experiences assumed by the speaker” is probably estimated by the songwriter based on

the audience that she has in mind at the moment of composing. “The common area of

perception” does not apply in this situation because this is not a conversation that is

developing in real time, so it cannot be modified and affected according to the

surrounding environment. In the same way, the relation between the recipient and the

communication situation, as well as the “speaker’s range of associations,” escapes this

kind of communicative situation because of the predetermined nature of the message

being delivered. Beyond the interpretative nuances that any performer may add in

reaction to a participative audience, the song is a pre-composed form not improvised

according to the audience reaction. So, in every case, the live performance as

communicative act comes closer to the way a fictional literary text reaches its recipients

26
Ibid., 183. Iser summarizes a list of factors given by S. J. Schmidt in Texttheorie (UTB 202) (Munich:
Fink, 1973).

than to the way an everyday conversation does. This is even clearer in the case of

listening to a recording of a song where even the visual contact is lacking.

Concerned with the “orality” of the song’s literary medium, Roland Barthes

elaborates on the concept of what he calls “the grain of the voice”:

...we never listen to a voice en soi, in itself, we listen to what it says. The voice has the
very status of language, an object thought to be graspable only through what it transmits;
however, just as we are now learning, thanks to the notion of “text,” to read the linguistic
material itself, we must in the same way learn to listen to the voice’s text, its meaning,
everything in the voice which overflows meaning.27

It is the sound of the voice, its quality of tone and intonation patterns that offer the

context from which the listener departs in his interpretive journey of the conveyed

message. First, the quality depends on the circumstantial performer, her involvement with

the musical piece and her interpretation of what the composer/lyricist wants to transmit—

unless performer and composer are the same person. Second, the intonation patterns

depend mostly on the way in which the lyrics were set to music by the composer. But,

ultimately, it is the receiver or listener that activates the connections suggested by these

pointers.

The literary use of the text gaps challenges the listener or reader by withholding

information that could be a given in normal language, so he “…must reformulate a

formulated text if he is to be able to absorb it.”28 In pragmatic speech, this challenge does

not exist and the imagination of the listener is not tested as it is in the literary medium.

The listener may fill in the blanks in the disconnected discourse by asking the speaker.

Iser points to the natural need of language to leave holes in the continuum of any

discourse to allow real meaning to flourish. He quotes Maurice Merleau-Ponty:

The lack of a sign can itself be a sign; expression does not consist in the fact that there is
an element of language to fit every element of meaning…Speaking does not mean
substituting a word for every thought: if we did that, nothing would ever be said…we
would remain in silence, because the sign would at once be obliterated by
meaning…Language is meaningful when, instead of copying the thought, it allows itself
to be broken up and then reconstituted by the thought.29

Meaning comes out of the unsaid as much as out of the message carried by the spoken

words. Thus, literary text engages the listener in a constant exercise of interpretation.

Expressive Potential: Three Types of Poetic Modes

All vocal setting modalities, from those that aim at a “speech quality” to those

that aim at a “musical-poetic quality,” want to communicate some kind of affect, but they

locate that expressive potential in different aspects of the text. As a consequence, the

way the musical setting interacts with its lyrics or poem will vary.

As mentioned earlier, those settings that rely on the semantic meaning of their

lyrics for the communication of their expressive potential tend to preserve the integrity of

the speech qualities. They locate this affective potential of language in its “prosody,” as

found in its natural speech state. Other kinds of vocal settings accept the new, artificial

inflections of language under the influence of a song’s melodies and a musical

arrangement’s textures and aim to express through their “songfulness.” The

fragmentation of the sequence of the original discourse can vary to a vast degree but

tends toward a balanced “poetic mode,” which situates the affective potential of language

in its musical-poetic qualities—sonic and semantic elements intermingled. Lastly, those

27
Roland Barthes, The Grain of the Voice: Interviews 1962-1980 (Berkeley and Los Angeles: University of
California Press, 1985), 183-184.
28
Iser, The Act of Reading, 185.
29
Maurice Merleau-Ponty, Das Auge und der Geist. Philosophische Essays, trans. Hans Werner Arndt
(Reinbek, 1967), p. 73f, as quoted by Iser in The Act of Reading, 186.

settings that regard the distortion that the text undergoes as irreversible further fragment

language into its minimal components (phonemes) and paralinguistic gestures and use them

as compositional elements. They situate the affective potential of language in its sonic

qualities.

Two Contributions from the Study of Prosody

In addition to the important concepts developed by Tsur in the cognitive poetics

field, linguistic studies in prosody also help to illuminate the intermingling of sonic

and semantic elements of language. “Prosody” is understood as the kind of shape that

each intonation unit—which corresponds to syntactic clauses or units of information in

language—takes according to the variation of the parameters that govern its contour and

dynamic: variation of intensity, segmental duration, temporal organization or rhythm, and

variation of the fundamental frequency, as the primary parameter. This is a higher level

of prosody than the one that is generally applied at the lexical level, where the word is the

unit and all the above-mentioned parameters take place inside its limits. Despite complex

controversies among researchers in this field, most linguists agree that intonation systems

convey, as Dwight Bolinger says, “how we feel about what we say, or how we feel when

we say.”30 Researchers in this area, including B. Shapiro and M. Danly (1985), have

conducted tests that provide neurological evidence for those linguists, among them

Bolinger, who maintain the fundamental affectivity, rather than grammaticality, of

intonation.

30
Dwight Bolinger, Intonation and Its Uses: Melody in Grammar and Discourse (Stanford, California:
Stanford University Press, 1989), 1.

Two phenomena described by some linguists involved in prosody studies are

crucial to the understanding of how different vocal setting modalities act on the

perception process. The first one is found when Bolinger explains that the vocal tones

employed in language are made of overtones produced by the shaping of the different

phonemes, which carry the semantic message, and fundamental pitches that are “mostly

used for mood and punctuating effects”—what is known as prosody.31 In any case, since

the melodic manipulation is essentially the musical arrangement of successive

fundamental pitches, types of song focusing on tune over speech quality affect the

expressive/affective mood of the text. However, these kinds of musical settings do not

modify the overtones produced by the phonemes, and, as a consequence, the strict

message of the text remains intact. By modifying the intonation arches in this manner, we

are essentially presented with a new reading of the text in performative terms.

The second phenomenon is one described by Daniel Hirst. When defining the

difference between speech and song, he says that while normal speech consists of a

continuous sequence of movements from one target-point to the next, in song, the

common prosodic characteristic of these patterns is that the contour is produced as a

sequence of static level tones.32 This description will prove instrumental when analyzing

31
Bolinger, Aspects of Language (New York/Chicago/San Francisco/Atlanta: Harcourt, Brace & World, Inc.,
1968), 31.
32
Daniel Hirst, “Intonation in British English,” in Intonation Systems: A Survey of Twenty Languages
(Cambridge, UK: Cambridge University Press, 1998), 71.

the different degrees in the spectrum of possible vocal tones employed in the following

musical examples.
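
Hirst's contrast between the two kinds of contour can be made concrete with a small computational sketch. The fragment below is only an illustration under stated assumptions: the function names, the representation of a contour as (time, pitch) target points, and the use of straight-line interpolation for the speech-like movement are choices made for this example and are not part of Hirst's description.

```python
from bisect import bisect_right

# Hypothetical representation: a contour is a time-ordered list of
# (time_in_seconds, pitch) target points. Pitch units are arbitrary.

def speech_contour(targets, t):
    """Speech-like reading: the voice glides from one target to the next.

    Linear interpolation is an assumption made for this sketch.
    """
    times = [time for time, _ in targets]
    if len(targets) < 2 or not times[0] <= t <= times[-1]:
        raise ValueError("t must lie within at least two target points")
    # Index of the segment that contains t (the last segment includes its end).
    i = min(bisect_right(times, t) - 1, len(targets) - 2)
    (t0, p0), (t1, p1) = targets[i], targets[i + 1]
    return p0 + (p1 - p0) * (t - t0) / (t1 - t0)

def song_contour(targets, t):
    """Song-like reading: each target is held as a static level tone."""
    times = [time for time, _ in targets]
    if not targets or not times[0] <= t <= times[-1]:
        raise ValueError("t must lie within the span of the target points")
    # Pitch of the most recent target at or before t.
    return targets[bisect_right(times, t) - 1][1]
```

Given the same hypothetical targets, for instance [(0.0, 60.0), (1.0, 64.0), (2.0, 62.0)], the speech-like reading passes through intermediate pitches between the targets, while the song-like reading stays on each level until the next target arrives. The vocal styles examined in the following chapters can be thought of as lying somewhere between these two idealized extremes.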

II.

MONODY, RECITATIVE AND ART SONG

Intonation Analysis Terminology

In order to proceed with the prosodic analysis of monody and recitative in Italian,

it is necessary to provide a brief explanation of the specialized terminology of linguistic

studies as presented by Mario Rossi.1 There is a certain agreement among linguists that

an “intonation unit” is dictated by syntactic rules. Although it does not coincide with the

boundaries of a sentence, it always contains certain specific syntactic elements. The

“intonation unit” is usually a fraction of a sentence composed of a VP (verbal term) and an

NP (noun term), including its modifying adjectives. Whatever its place in the phrase, S-

ADV (sentence adverb) is separated from the other constituents of the intonation unit

during analysis.

The intonation curve at the last syllable of each intonation unit may be either a

“major ConTinuative intoneme” (CT) or a “major ConClusive intoneme” (CC). Each

intonation unit has two “internal Accents” (AC1 and AC2). These accents are at the

lexical, or word, level. As we may observe in the first two examples analyzed, the Italian

language in particular tends not to synchronize its intonemes with these

“internal Accents” (ACs) because most of the time it carries its lexical stress on the

penultimate or antepenultimate syllable, while the intoneme occurs on the final syllable.

As Rossi describes, usually the CT manifests as “a contour whose pitch is equal to or

higher than that of the AC.”2 Actually, “the pitch contour of the ‘continuative intoneme,’

1
Mario Rossi, “Intonation in Italian,” in Intonation Systems: A Survey of Twenty Languages (Cambridge,
UK: Cambridge University Press, 1998).
2
Ibid., p. 225.

after AC1, may vary freely between the two pitch extremes of AC1 and AC2, that is to

say between the Mid and the Mid High levels.”3 In contrast, CCs (“major conclusive

intonemes”) tend to be lower in pitch than the ACs, and in general a falling pitch contour

takes over the utterance, which concludes in low pitch levels.

Rossi also mentions that “the duration of the vowel under AC and the

continuative intoneme is longer than of the unstressed vowels…”4 The last intoneme

syllable is probably significantly longer than all the previous atonic prestressed vowels.

An interesting characteristic of the Italian intonation’s dynamic is that “the loci of

temporal prominence are not synchronized with those of pitch prominence.”5 In an ideal

neutral intonation expression, AC2 has the pitch prominence, higher than the rest, even

than AC1 and CT or CC. AC2 may be somewhere between the Mid High and the Mid

pitch levels, while AC1 is around the Mid to Mid Low. The temporal prominence belongs

to AC1. It is longer than any other element, even than AC2, and this factor indicates that

what follows is an intoneme of any type, a CT or a CC. The difference between the AC2

and its preceding and following unstressed syllables is 3 PUs, while the difference

between the AC1 and its surrounding group of unstressed syllables is 6 PUs.6 That

difference between the two ACs and their surrounding atonic syllables indicates the

difference between a stressed group around AC2 and an intoneme around AC1. Between

AC2 and AC1 there is no intoneme. The unstressed vowels lying in between those two

3
Ibid., p. 227.
4
Ibid., p. 225.
5
Ibid., p. 226.
6
The PU is a duration unit calculated as “the log of the ratio of the duration of a given vowel to that of
the vowel carrying AC in the utterance. This value is then normalized by dividing by log (1.22).” (Rossi, p.
238).

accents adjust their prosodic values to “the temporal and melodic continuums obtained by

linear interpolation between those two points.”7
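
Because the PU values cited above rest on a logarithmic ratio, a brief worked sketch may help the reader translate them back into relative durations. The code follows the definition quoted from Rossi in note 6; the function names and the choice of milliseconds are assumptions made for this illustration, and no measured durations from Rossi's study are reproduced here.

```python
import math

PU_BASE = 1.22  # normalizing ratio given in Rossi's definition (note 6)

def duration_in_pu(vowel_ms, accented_vowel_ms):
    """Duration of a vowel relative to the vowel carrying AC, in PUs.

    Hypothetical helper: log of the duration ratio, divided by log(1.22).
    """
    if vowel_ms <= 0 or accented_vowel_ms <= 0:
        raise ValueError("durations must be positive")
    return math.log(vowel_ms / accented_vowel_ms) / math.log(PU_BASE)

def ratio_from_pu(pu):
    """Inverse reading: the duration ratio that corresponds to a PU value."""
    return PU_BASE ** pu
```

Read in this way, a difference of 3 PUs corresponds to a duration ratio of about 1.8 (1.22 cubed), and a difference of 6 PUs to a ratio of about 3.3. On this reading, the vowel under AC1 would last more than three times as long as its unstressed neighbors, while the vowel under AC2 would last somewhat less than twice as long as its own.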

Low-Level Mimesis: Monody and Recitative

Certain vocal styles establish a “low-level mimetic relation” with their texts, a

relation at the lexical and prosodic level of their literary sources. These styles have as a

principle the preservation of the text integrity as a logical linear discourse as far as is

musically possible. In order to allow the discourse to flow without interruption, its

practitioners look at intonation and inflections of the text as speech. If language as speech

has the ability to communicate ideas, it is by allowing the texts to behave as such that

their musical settings are able to transmit emotions. One might say that these vocal styles

and genres locate the expressive qualities of language in the intonation system. By

attempting to maintain the integrity of the discourse, these pieces trigger a “poetic mode”

of perception in which the “speech mode” dominates. Two vocal setting modalities may

serve as examples of this approach: Italian monody of the late sixteenth and early

seventeenth centuries and recitative in general.

Monody composers, such as Caccini, D’India, Galilei, Peri and Monteverdi (in

his new monody phase), claimed to be imitating nature with the new stile rappresentativo

of their vocal compositions. This was not an original claim since earlier and

contemporary madrigalists, such as Cipriano de Rore, Marenzio, Gesualdo and Zarlino,

maintained the same about the counterpoint and “word painting” devices used in their

compositions. The stile rappresentativo or “representational style” consisted of the

mimetic imitation of speech’s rhythm and intonation. Monody composers wanted to

7
Ibid., p. 227.

avoid the madrigal analogies of “word painting,” which illustrated the meaning of the

words with specific harmonies created by intervals among the voices, runs up and down,

silent parts or notation devices. They based their monodic settings on the homology of

the spoken word as the closest to human nature that they could get.8

The proponents of this seconda prattica put their efforts into following the textual

rhythm, as in direct natural speech.9 They observed intonational accents, elongation or

shortening and pitch variation of morphemes as much as they musically could. But, in the

end, a general musical sense and sensitivity ruled over the strict transcription of those

intonational parameters. They also followed the phrasing of intonational units and order

of the text, only adding some repetitions toward the end of the poems. These repetitions

were meant to emphasize certain concepts and words. The restatement of words allowed

melodic embellishment, such as melismas, trillo and gruppo, without detracting from the

understandability of the semantic content. Once a term was introduced, it could be

stretched and twisted over long and elaborate ornamentation and still be present in the

short-term memory of the listener.

Some embellishments were conceived as imitations of the tones of the voice or

“manners of speaking,” as Caccini explains in his preface to Le nuove musiche. For

example, the esclamazione consists of “a gradual loudening of the voice on long notes

into an outcry, made more artful by first diminishing the volume before beginning the

increase…Clearly the esclamazione is the likeness of a sigh.”10 Other ornamentations

described by Caccini are the gruppo and the trillo, which imitate unsteady speech. The

8
For a detailed discussion of the monodists’ reform, see Richard Taruskin, The Oxford History of Western
Music, vol. 1 (Oxford-New York: Oxford University Press, 2005), pp. 797-847.

former is “the artfully simulated vocal tremble” made of the “rapid alternation of

contiguous notes of the scale.”11 The latter consists of the rapid repetition of a single

pitch.

In their score settings, seventeenth-century Italian monodists closely followed the

prosodic characteristics of the text phrases. On the one hand, they set the text phrases to

melodic patterns that mimicked their intonation shapes pitch-wise. On the other hand,

they also replicated the kinds of prolongation performed over accented syllables and the

shorter values of the syllables in between with similar musical rhythmic patterns.

There are two factors that are decisive in bringing monody closer to natural

speech. First, the shapes and rhythm indicated in the score served only as a point of

departure for the real interpretation of the performer. These composers allowed and

expected flexibility in the performing beat of their monodies, giving the performer the

chance to vary the articulation pace according to her or his dramatic interpretation of the

different phrases. Second, the quasi-parlato nature of the vocal sound—with less vibrato

and more continuous movements between pitches—better recreated the sound of the

speaking voice. It is only by these means that monody achieved the emulation of speech

manners.

Monodies kept the accompanying texture as simple as possible: only chords,

usually played by a string instrument (sometimes a chitarrone) and notated in the

shorthand method known as figured bass. Dissonance was used to highlight certain “rhetorical

9
The “second practice” referred to the stile rappresentativo, as defined by Giulio Cesare Monteverdi in the
postface “Declaration” of Claudio Monteverdi’s score of Scherzi musicali (Venice, 1607). For these
monodists, the madrigalists represented the prima prattica, the stile antico.
10
Taruskin, The Oxford History of Western Music, p. 817.
11
Ibid., p. 817.

effects of vocal inflection and delivery.”12 The focus is still on the speech qualities of the

vocal line; and music is only at its service.

By analyzing some examples of this repertoire, it is possible to observe several of

the features described above. In Sfogava con le stelle, Giulio Caccini sets a poem by

Ottavio Rinuccini:

Sfogava con le stelle un’infermo d’amore


Sotto notturno cielo il suo dolore;
E dicea fisso in loro,
O immagini belle del idol mio ch’adoro,
Sì come a me mostrate
Mentre così splendete
La sua rara beltate,
Così mostrate à lei
Mentre cotanto ardete
I vivi ardori miei;
La fareste col vostro aureo sembiante
Pietosa, sì come me ne fata amante.13
O. Rinuccini

Turning now to the score of Caccini’s Sfogava con le stelle, most of its musical

phrases coincide with the intonation units into which the poem could be divided. Thus,

from the third verse to the tenth verse, there are nine intonation units that correspond with

nine musical phrases.

12
Ibid., p. 829.
13
Translation: “There appeared under the stars a man sick with love, and under the night sky he disclosed
his pain; and he said, his eyes fixed on them, ‘Oh, lovely images of my adored Idol, just as you show me
her rare beauty as you shine so brightly, in the same way show her my keen pangs. Perhaps you might
make her pitiful with your golden aspect, just as you made me loving.’” Carol MacClintock, ed., The Solo
Song 1580-1730 (New York: W.W. Norton & Company, Inc., 1973)

EX. 1.1: From Caccini’s Sfogava con le stelle, mm. 5-16.

Intoneme Units   Measures     Intoneme Analysis        Text
1                m. 5         AC1 CT                   E diCEa,
2                mm. 5-6      AC2 AC1 CT               FIsso in LOro,
3                mm. 6-8      AC2 AC1 ct AC2 AC1 CT    O immagini BElle del Idol mio ch’aDOro,
4                mm. 8-10     AC2 AC1 CT               Sì come a ME mosTRAte,
5                mm. 10-11    AC2 AC1 CT               mentre coSI splenDEte
6                mm. 11-12    AC2 AC1 CT               la sua RAra belTAte,
7                mm. 12-13    AC2 AC1 CT               coSI mostrate à Lei
8                mm. 13-14    AC2 AC1 CT               mentre coTANto arDEte
9                mm. 14-16    AC2 AC1 CC               I VIvi ardori MIei.

TABLE 1.1: Intoneme analysis of Caccini’s Sfogava con le stelle.

As the musical excerpt and table illustrate, all of the continuative intonemes end

on the same or a higher note than their preceding AC1. On the other hand, as expected, in

measure 16, the conclusive intoneme (CC) resolves on a lower note than the AC1: from

A4 to G4 (this last pitch is the modal center of the piece). In most of the phrases, the

highest pitch of each of these units is set on the AC2, while the rest of the morphemes

move downwards and in shorter rhythmic values toward the longest note of the phrase,

on which we find the AC1. This is true at least for the first three intonation units, but

certain exceptions are found in the rest of the units in which the character of the narration

changes.
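
The two pitch regularities just described (a continuative intoneme ending on the same or a higher note than AC1, a conclusive intoneme ending lower) can be stated as a simple check. The sketch below is a hypothetical aid and not part of Rossi's method: the function name is invented for this example, and pitches are encoded as MIDI note numbers (A4 = 69, G4 = 67), which is an assumption of convenience. Only the conclusive intoneme of measure 16 uses pitches named in the text.

```python
def fits_expected_contour(intoneme, ac1_pitch, final_pitch):
    """Check a phrase ending against the pitch tendencies summarized above.

    intoneme: "CT" (continuative) or "CC" (conclusive).
    ac1_pitch, final_pitch: MIDI note numbers (an assumption of this sketch).
    """
    if intoneme == "CT":
        # Continuative: final syllable at the level of AC1 or higher.
        return final_pitch >= ac1_pitch
    if intoneme == "CC":
        # Conclusive: final syllable lower than AC1.
        return final_pitch < ac1_pitch
    raise ValueError("intoneme must be 'CT' or 'CC'")

# Measure 16 of Sfogava con le stelle: AC1 on A4, resolution on G4.
A4, G4 = 69, 67
last_unit_is_conclusive = fits_expected_contour("CC", A4, G4)
```

For the last intonation unit the check succeeds, since the line falls from A4 to G4. A check of this kind describes only the basic pattern; the departures discussed below are precisely the cases in which a composer overrides it for expressive purposes.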

In the phrases in which the narrator evokes what the “infermo d’amore” (the man

sick with love) said, the hierarchy of the pitches is reversed—with the exception of the

two “Mentre…” phrases that are time clauses directly connected to what follows. They

tend to ascend, thus making the AC2 lower than the AC1. This kind of departure from the

basic pattern of rhythmic values and pitch hierarchies is a sign of highlighting pragmatic

content or a specific expressive intention. In cases like this one, when AC1 is pronounced

at the level of AC2 or higher, the speaker is trying to draw the listener’s attention to the

topic. This focus attracts the listener at the same time as it imitates the persuasive tone of

the “sick man” while looking for approval of his “Idol.” Ascending lines are actually

more persuasive and create expectation.

The last intonation unit, “I vivi ardori miei,” is an even further case of focusing.

Although it respects the pitch hierarchy of AC2 (morpheme “vi” on D5) over AC1

(morpheme “mie” on A4), the penultimate syllable of “ardori” is elongated, displaying a

florid run that takes almost the whole of measure 15, except the last sixteenth-note. Thus,

an otherwise unstressed syllable of the intonation unit (although a stressed one at the

word level) gains prominence. This stretching treatment of the syllable “do” certainly

highlights the word “ardori,” enhancing with it the importance of the man’s suffering

with his “keen pangs.”

Another vocal setting style that engages in a “low-level mimetic relation” with its

text is the recitative. The operatic, oratorio or cantata recitative of all musical historical

periods continues the same kind of approach to text setting as monody, which respects

most of the intonational contours and phrasing of the text and gives the performer

flexibility on the beat for the final touch of speech-likeness. The recitative between Don

Giovanni, Donna Elvira and Leporello, before the aria “Madamina! il catalogo è questo,”

(Scene V from W. A. Mozart’s Don Giovanni) shows the same kind of close observation

of the phrasing and intonation of Lorenzo Da Ponte’s words as Caccini’s attention to

Rinuccini’s. From the opening seven measures of their recitative, it is

possible to extract the following seven intonation units.

     Measures   Intoneme Units                                              Intoneme Analysis

1    mm. 1-3    D. Elvira: Sei QUI! / MOStro! / feLON! / NIdo d’inGAnni!    AC1+CT / AC1 CT / AC1+CT / AC2 AC1 CC
2    m. 3       Leporello: Che TItoli crusCANti!                            AC2 AC1 CC
3    mm. 4-5    Leporello: manco MAle che lo conosce BEne!                  AC2 AC1 CC
4    mm. 5-6    D. Giovanni: VIa, cara Donna ElVIra,                        AC2 AC1 CT
5    mm. 6-7    D. Giovanni: calMAte quella COllera;                        AC2 AC1 CT
6    m. 7       D. Giovanni: senTIte…                                       AC1 CT
7    mm. 7-8    D. Giovanni: laSCIAtemi parLAR! 14                          AC2 AC1+CT

(AC1+CT: the accent and the continuative intoneme fall on the same syllable.)

TABLE 1.2: Intoneme analysis of recitative from Scene V, W. A. Mozart’s Don Giovanni.

These intonation units set into music observe most of the prosodic rules

previously summarized. This recitative fragment has a peculiar characteristic; it is a

dialogue between three characters in the middle of an agitated discussion. It shows a wide

range of emotions and circumstances, from Donna Elvira’s insults to Leporello’s aside

comments, which translate into intonational contextual effects such as focusing

techniques on specific fragments of the speech.

14
Translation of this fragment: D.E: You are here! Monster! Traitor! Nest of deceits!; Lep: Such pure
Tuscan titles! So much the better that she knows him!; D.G: Come now, dear Donna Elvira, calm this
anger; listen…let me talk! (translation by L. Guillén)

EX. 1.2: Recitative from Scene V, W. A. Mozart’s Don Giovanni.

The first section of Donna Elvira’s intervention is a direct call for Don Giovanni’s

attention. “Sei qui!” not only has the expected rising contour of a continuative intoneme

(CT), but it also has the coincidence of the AC and the CT on the last syllable “qui!” The

exclamation mark, which indicates the surprise and anger of the character when

discovering Don Giovanni, is represented by the doubling of the rhythmic value of the

quarter-note A4 on which “qui!” is set with respect to the preceding unstressed “Sei” of

only an eighth-note. Another emphasizing factor for “qui!” is that it falls on one of the

strong beats of the measure (the fourth one). “Fellon” is set in the same way as “Sei qui.”

But the previous “mostro,” which has the accent on the penultimate syllable, shows the

splitting of attributes that is typical between AC1-CT. The former is rhythmically longer

and has the same or a higher note than the CT.

The last section of Donna Elvira’s list of insults extends beyond these three short

periods of two syllables. The next insult, “Nido d’inganni,” opens with the AC2 on “ni”

holding the normally expected higher pitch of the whole unit, an F#5. From then on, the

pitch contour mostly descends—with some brief upward deviations—toward the AC1,

which falls on a D5. This whole descending line moves a major third below the AC2’s

F#5. But contrary to the expected lengthening of the AC1 on “ga,” this one stays on the

same eighth-note value as the previous unstressed syllables. In this particular case, this

setting may try to convey the urgency and tripping of the words in an infuriating

situation, such as the one Donna Elvira is experiencing with Don Giovanni. Although in

the score “ga” appears on a D5, if the performer applies the traditional appoggiatura on

an E5 before the resolution on the D5 of “nni!” the descending contour of the AC1-CC

succession takes place, at least in performance practice. On the other hand, if the

performer does not apply this appoggiatura rule, ending the phrase with two equal D5

notes on “-ganni,” the effect produced could be that of an AC1-CT succession. Under

these circumstances, there is a sensation of continuation, as if Donna Elvira were still

speaking while Leporello’s next aside to the public takes place. The audience will

listen to Leporello while Donna Elvira keeps insulting Don Giovanni.

Turning now to the next two intonation units pronounced by Leporello, both

clearly depart from the basic pattern because of the emphasis that the character wants to

put on the sarcastic adjectives modifying the word “titoli.” Donna Elvira’s insults are

now ironically called “titles.” This is immediately followed by the double-meaning

comment of Leporello: “manco male che lo conosce bene!” (So much the better that she

knows him!). Rossi mentions that when AC1 is pronounced “at the top level of AC2 or

higher, the speaker is drawing the listener’s attention to the topic as an effect of the

pragmatic accent (PA).” The AC1 in “Che titoli cruscanti” falls on “-can” and is set on a

D4, a fourth above the AC2, which is on an A3. That setting directs all the attention of

the audience to the adjective attached to it, “cruscanti.”15

The second phrase of Leporello presents the same kind of contour and “pragmatic

accent”—AC1 is higher than the AC2 by a minor third, from F#3 to A3. The same kind

of attention-calling effect is produced by Don Giovanni’s “Via, cara Donna Elvira,”

which is achieved by the same kind of pitch contour that Leporello has just used in

the previous phrase. Of the last three units, one follows the usual intonation profile

and the other two are shaped around the point of focus being highlighted. In “calmate quella

collera,” in order to keep attention on the verb, which embodies the order given by Don

Giovanni to Donna Elvira, Mozart respects the usual hierarchy of pitches. But the last

15
“Cruscanti” literally means “pure Tuscan.” During this period, Tuscan culture and society was
considered the richest and most refined of all the regions of Italy. Even today the Tuscan dialect is
considered the purest Italian. But in this case, Leporello is using it to make a sarcastic contrast with the
insults that D. Elvira has been enumerating, which are not very refined.

two phrases are conceived as intonational units of the same nature as the previous ones,

both presenting ascending melodic lines toward their AC1s.
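
The comparison of accent heights in these units can be checked with a small computation. The following Python sketch is only an illustrative aid and is not part of Rossi's method: the function names are hypothetical, the pitches are those quoted in the analysis above, and the reading of the pragmatic accent as "AC1 at or above the pitch of AC2" is a simplifying assumption drawn from the passage cited from Rossi. Interval names are derived from semitone counts alone, so enharmonic spelling is ignored.

```python
# Illustrative sketch only: function names are hypothetical, and the pitches
# are the ones quoted in the analysis of the Don Giovanni recitative.

NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Common interval names by semitone count (enharmonic spelling is ignored).
INTERVAL_NAMES = {
    0: "unison", 1: "minor second", 2: "major second", 3: "minor third",
    4: "major third", 5: "perfect fourth", 6: "tritone", 7: "perfect fifth",
}


def midi_number(pitch):
    """Convert scientific pitch notation such as 'F#5' to a MIDI note number."""
    letter, rest = pitch[0], pitch[1:]
    accidental = 0
    while rest and rest[0] in "#b":
        accidental += 1 if rest[0] == "#" else -1
        rest = rest[1:]
    return 12 * (int(rest) + 1) + NOTE_OFFSETS[letter] + accidental


def compare_accents(ac2, ac1):
    """Return (interval name, position of AC1 relative to AC2, focus flag)."""
    diff = midi_number(ac1) - midi_number(ac2)
    name = INTERVAL_NAMES.get(abs(diff), f"{abs(diff)} semitones")
    direction = "above" if diff > 0 else "below" if diff < 0 else "level with"
    # Assumption: a pragmatic accent is read as AC1 at or above the AC2 pitch.
    return name, direction, diff >= 0


# (AC2 pitch, AC1 pitch) for three of the units discussed in the text.
units = {
    "Nido d'inganni": ("F#5", "D5"),
    "Che titoli cruscanti": ("A3", "D4"),
    "manco male che lo conosce bene": ("F#3", "A3"),
}

for text, (ac2, ac1) in units.items():
    name, direction, focused = compare_accents(ac2, ac1)
    label = "pragmatic accent" if focused else "basic pitch hierarchy"
    print(f"{text}: AC1 a {name} {direction} AC2 ({label})")
```

Under these assumptions, the sketch reproduces the intervals named in the analysis: Donna Elvira's unit keeps the basic hierarchy, while both of Leporello's units place AC1 above AC2.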

The first two examples analyzed were in Italian. At this point of the argument, it

is necessary to expand these observations to other languages, English in particular. As the

nature of English prosody differs greatly from that of other languages, a general explanation

of usual intonational tendencies will clarify the subsequent analysis of the Alto’s

recitative n. 8 from George Frideric Handel’s oratorio The Messiah. The recitative of the oratorio

also closely follows the intonational nature of its text source.

Daniel Hirst mentions that, according to Jassem (1952), English speech is

organized into two kinds of rhythmic units: the Narrow Rhythm Unit, which consists of a

stressed syllable followed by a sequence of unstressed syllables, and the Anacrusis,

which consists of a sequence of proclitic16 unstressed syllables.17 In the Anacrusis, the

syllables tend to be pronounced rapidly moving toward the subsequent stressed word. In

contrast, in the Narrow Rhythm Unit, the duration of each unstressed syllable tends to

be inversely proportional to the number of them in that unit giving the impression of

isochrony. Tonal Units group Anacrusis with the preceding Narrow Rhythm Unit. At a

higher structural level, these Tonal Units are grouped into Intonation Units.

According to David Crystal, most of the intonation units consist of five to eight

words.18 Utterances longer than this are usually broken into two or more intonation units.

16
Proclitic: adj. In Greek Gram., used of a monosyllabic word that is so closely attached in pronunciation
to the following word as to have no accent of its own; hence, generally, used of a word in any language,
which in pronunciation is attached to the following stressed word, as in an ounce, as soon, at home, for
nobody, to comprehend. (Oxford English Dictionary Online, second edition, 1989).
17
Hirst, “Intonation in British English,” p. 58.
18
David Crystal, Prosodic Systems and Intonation in English. (London: Cambridge University Press,
1969).

Although pragmatic or phonological reasons dominate the final decision, syntactic

criteria define where these breaks may occur.

The final accent of the Intonation Unit is usually referred to as the “nucleus.” In

defined assertions, the stressed syllables form a descending scale until the last stressed

syllable when the pitch of the voice falls abruptly to a lower level. The intermediate

unstressed pitches may actually remain more or less at the same level with some

fluctuations and do not necessarily descend toward the last stressed syllable. As Hirst

comments: “In fairly slow deliberate speech, the ‘down stepping’ effect can be quite

striking.”19 However, in spontaneous speech, the pitch drop is reduced to the point of

being almost imperceptible, “giving rise to the ‘hat’ or ‘bridge’ type of pattern of

sentences that has been described as typical of unemphatic utterances in a number of

different languages,” such as English.20 A second kind of tune, besides this descending or

“hat” one, is used for statements with implications, “Yes-No” questions, requests and

incomplete utterances. In these cases, the descending or “hat” shape is used until the

nucleus (last stressed syllable) is reached. Usually this last accent is on a low note and the

syllables that follow rise from then on. The rise of the last pitch is not a way of

transforming a statement into a syntactic question, “but rather a way of indicating that a

syntactic statement is being used pragmatically as a request for information.”21

“Incompleteness” in a sentence takes the form of rising nuclear tones; and those involve a

pragmatic evocative value. In contrast, “falling nuclear tones have proclamatory value.”22

19
Ibid., p. 61.
20
Ibid., p. 62.
21
Ibid., p. 65.
22
Ibid., p. 66.

Both of these contours, descending and ascending ones, indicate to the listener how the

utterance should be processed.

In emphatic statements, the final nuclear pitch accent rises to a higher level than

usual with respect to the preceding unstressed syllable. It is common practice to switch

the first accent of the intonation unit for a low accent coming from high pitch unstressed

syllables to reinforce the later high final accent and subsequent falling pitch.

The Alto’s recitative n. 8 from Handel’s oratorio The Messiah uses biblical text

from Isaiah 7:14 and Matthew 1:23. The text is divided here into its Intonation Units.

Intoneme Units                     Measures    Intoneme Analysis

1  BeHOLD!                                     NUC
2  a VIRgin shall conCEIVE,                    AC1 NUC
3  and bear a SON,                             NUC
4  and shall call his NAME                     NUC
5  EmMAnuel:                                   NUC
6  GOD with US.                                AC1 NUC

TABLE 1.3: Intoneme analysis of Alto’s recitative n. 8 from Handel’s The Messiah

EX. 1.3: Alto’s recitative n. 8 from Handel’s The Messiah

The phrasing of Handel’s settings falls into the division of the intonation units

that the text could have in normal speech, except for the explicit separation of

“Emmanuel” into an independent unit. This gesture prepares the solemn delivery of the

sacred name of the “son of God,” while also creating expectancy. All nuclei, as well as

most of the first accents of these intonation units, fall on strong beats of the 4/4 meter in

which this recitative evolves. The only exceptions are the first accents on “bear” and

“call.” This may be explained as a way of de-emphasizing the separation of the utterances

that contain them in three independent intonation units, focusing instead on the continuity

of the syntactic unit starting with the subject/noun phrase “a virgin” to the last verbal

unit, “and shall call his name…” The simple chord accompaniment also connects these four

intonation units by holding a double bass pedal on the notes D3 and D2 for three and a

half measures.

All the units end with an ascending interval toward the last note, which carries the

main and last accent, the nucleus. Since these are incomplete statements, the ascending

interval in each of these units also creates a sense of continuity. As an exception, the

abrupt drop of the pitch a fifth below (from A4 to D4) of “his name” may be explained as

preparation for the first note of “Emmanuel,” a B3. The first two syllables of the name,

“Em” and “ma,” open with a solemn ascending perfect fourth interval. The second

syllable “-ma” carries the main accent of the unit, the accent of the nucleus. The closing

intonation unit, “God with us,” is the definite and final statement of this recitative, and as

such, it ends with a conclusive descending interval, a perfect fourth. The syllabic setting

of this recitative especially contrasts with the overvocalization of long melismas in the alto

aria that comes immediately after.

High-Level Mimesis: The 19th Century Lied

The 19th-century Lied serves as an example of the kind of “poetic mode” in which

the listener negotiates through the song a balanced perception between the fragmented

semantic content and the sonic components of the lyrics. Nineteenth-century Lied

composers engaged in a high-level mimetic relationship with the text. In their attempts to

offer a reading of the poetic text by enhancing semantic meaning and poetic structure

through music—harmony, melodic and rhythmic patterns and form—nineteenth-century

Lied composers departed from the strict prosodic features of the text and offered an

interesting but distracting musical setting. This disengagement from the intonational

properties of the poetic text emphasizes the music/poetic aspects of its words and negates

the possibility of understanding the discourse in a linear way—as close as any poetic text

may get to be perceived in the “speech mode.”

Composers at the transition between the eighteenth and nineteenth century

struggled to make their musical settings a natural extension of the text. Johann Reichardt

(1752-1814) alleged that his melodies sprang automatically from repeated readings of the

poem and they were so closely interwoven with the text that they spoke and sang

pleasantly.23 Reichardt, as well as other composers belonging to the “Second Berlin

School” (Carl Zelter and Johann Schulz), committed to the least musical intervention

possible and put their efforts into letting the poetic text speak for itself as conceived by its

author.

The primacy of text over music did not last long. The next generation of Lied

composers—Schubert, Schumann, Brahms—allowed their music to speak and offer a

new perspective on the poem. By doing this, they offered an alternative reading of the

text. This new reading took the form of particular musical phrasings of the text and

treatment of textures employed in the piano accompaniment. These changes challenged

the cohesion of the poem but at the same time highlighted certain text fragments.

However, beyond the difference in approaches, the end result in both cases (earlier and

later Lied composers) is settings that are perceived in a balanced “poetic mode.” The

listener receives a re-elaborated version of the poetic text, stretched, fragmented and

webbed into melodic and harmonic treatments. Thus, new poetic images, rhymes, words,

23
J.F. Reichardt, cited by Jack M. Stein, Poem and Music in the German Lied from Gluck to Hugo Wolf
(Cambridge, Mass.: Harvard University Press, 1971), p. 34.

sounds that may previously have gone unattended are brought to the attention of the listener,

which altogether contribute to the reinterpretation of the poem.

Examining one particular poem, which was set several times by different

composers during the nineteenth century, will allow us to contrast the change in

compositional approaches and aesthetic conception of the relationship between poetic text and music.

Furthermore, this analysis will allow us to understand the compositional decisions that

highlight certain fragments or words of the poetic text and understate others.

From 1795 to 1907, an extensive number of songs were composed using the

poems from Johann Wolfgang von Goethe’s novel Wilhelm Meister, and more than a

hundred of these were settings of “Kennst du das Land?” Of special interest are the

settings of “Kennst du das Land?” by Johann Reichardt (1795), Carl Zelter (1795),

Ludwig van Beethoven (1809), Franz Schubert (1815), and Robert Schumann (1849).

Reichardt and Zelter intended to make their musical settings a natural extension of

the text and limit their musical intervention as much as possible. However, their settings

are songful renditions far from any speech quality of the text. Their melodies are a

genuine example of the simplicity and modesty of the Volkstümlichkeit style—in the

manner of folk melodies.24 These simple melodies—syllabic settings constrained in

register, stripped of embellishments and extended vocalizations—were respectful of

the obvious prosodic characteristics of the text, but followed a strict musical logic. The

lack of text order modification and piano interludes aims to limit the disruption of the

narrative flow. Although the repetitious nature of their strophic settings and simple chord

24
Although until the late eighteenth century folklore was thought to belong only to the peasants and
“assigned a low cultural or intellectual prestige,” in the nineteenth century folklore “was seen as
embodying the essential authentic wisdom of a language community or nation.” (Taruskin, The Oxford
History of Western Music, vol. 3, p. 122).

accompaniment allows listeners to ease their attention from the pure musical elements of

the song, they still apprehend the text in the “poetic mode” of listening. The listeners hear

the original sound patterns of the poem mounted over the “songfulness” of the sustained

tones of a voice singing a melody.

Beethoven’s, Schubert’s and Schumann’s settings of Mignon’s Lied further

emphasize the “poetic mode” of processing by unleashing their compositional creativity

in further elaborated arrangements abounding in piano interludes, more complex and

colorful accompanying textures and text modification. They built musical forms that

direct the attention of the listener to specific words or phrases in their text. Those may

synthesize ideas relevant to the reading that the composer makes of the poem. They

highlight these text fragments by: creating harmonic tension, announcing or delaying this

phrase with piano interludes, repeating it several times, detaining the flow of the piece

rhythmically, etc.

A brief explanation of the insertion of the poem in the novel and Mignon’s

previous story could help to understand some musical decisions made in the settings of

the composers under study. “Kennst du das Land?” is a poem inserted in Wilhelm

Meisters Lehrjahre (Book Three, Chapter One). The poem is actually a song that the

character Mignon sings. She is supposedly an orphan girl, whom Wilhelm rescued from

the abusive master of a circus troupe. She has neither a home nor parents that are known

at that point in the novel. Mignon’s traumatic kidnapping made her become completely

averse to remembering her past in detail. When she sings this song, she is already living

with Wilhelm, and although she addresses him as “father,” a secret passion for him as a

man has started to grow in her. At the beginning of this first chapter in the third book,

when Wilhelm hears her singing from his room accompanying herself with a zither, he

becomes interested in the lyrics of the songs. He asks Mignon to repeat the song; he

writes it down and translates it; however, as Goethe describes in the novel:

He found, however, that he could not even approximate the originality of the phrases, and
the childlike innocence of the style was lost when the broken language was smoothed
over and the disconnection removed. The charm of the melody was also quite unique. She
intoned each verse with a certain solemn grandeur, as if she were drawing attention to
something unusual and imparting something of importance. When she reached the third
line, the melody became more somber; the words “Do you know it, indeed?” were given
weightiness and mystery, the “There!, there” was suffused with longing, and she
modified the phrase “Let us go” each time it was repeated, so that one time it was
entreating and urging, the next time pressing and full of promise.25

This description was closely followed by some of the composers, like Zelter, when

setting this poem into music.

In Mignon’s Lied each of the three stanzas introduces a description from a

different perspective of her lost paradise. The first stanza is the description of an earthly

paradise to which she urges her “beloved” to take her. In the second stanza, the life is

gone; it is an architectural paradise, where everything is glittery and cold as the statues

that pronounce the central question of the song, which reveals Mignon’s suffering (“Was

hat man dir getan?”). Then the term “beloved” is displaced by “protector.” The third

stanza is the description of nature—similar to the first stanza—but this time it is a misty

landscape where everything is confusing and intimidating; therefore, her protector now

becomes her father. Thus, her lost paradise has a warm and voluptuous side, a cold and

glittery side, and, finally, a confusing, misty and intimidating side. For that reason,

although she wishes to go back to her homeland some day, she wants to go in the

company and under the protection of Wilhelm—her lover, protector and father.

25
Johann W. von Goethe, Wilhelm Meister’s Apprenticeship. Edited and translated by E.A.Blackall.
(Suhrkamp Publishers: New York, 1989)

Reichardt, as well as Zelter, gave preference to a strophic setting with regular

phrasing. They sought a style of setting whose clarity was close to the dignified

simplicity that they admired in folk art. As Carl Dahlhaus remarks, “Any composer who

tried to recapture the natural state of folksong had to conceal the exertions of his art.”26

These ideals of the “Second Berlin School” are echoed by Reichardt who states, “For the

artist the supreme art lies not in the ignorance of his art but in its renunciation” (Geist des

Musikalischen Kunstmagazins, 1791).27 This composer’s renunciation of his craft to

accomplish clarity and simplicity in his compositions is in accordance with Goethe’s own

opinion about the degree of interference of the composer’s musical creativity with the

poem. Goethe uses the term “false participation” to describe any musical response to the

poem’s meaning beyond the strict accompaniment of the declamation of the poem. The

composer surrenders his creative space to the art of the poem. The composer’s duty is

only to create the appropriate musical ambiance, which is drawn from the general

meaning of the poem. By creating this musically suggestive context, the composer helps

the audience to appreciate the richness of the meaningful inflections of the text itself.

Thus, Goethe wrote in a letter sent to Carl Zelter on May 2, 1820:

The thing to do is to place the auditor in the mood that the poem suggests, letting the
imagination then create its own figures at the instance of the text, without his knowing
anything of the how of the process...To paint tone with tones, to thunder, crash, paddle
and plash, is detestable.28

While in “Kennst du das Land?” Zelter sets the mood of the poem as Goethe

requires, Reichardt subscribes to a simpler and less intrusive folk-like song

26
Carl Dahlhaus, Nineteenth-Century Music. Translated by J. Bradford Robinson. (University of
California: Berkeley, Los Angeles, 1989).
27
Ibid., p.109.
28
Carl Friedrich Zelter, J.W. Goethe: Briefwechsel (Leipzig, 1987), p. 216. Translation of the quote by
Christopher Gibbs.

style. His setting is strictly strophic. In the accompaniment, he harmonizes with simple

chord support and doubles the melody throughout the song. Melodic phrases are

completely regular and the setting of the words is mainly syllabic. The song has only a

short modulation to its dominant and has no instrumental interludes. Zelter’s song is a

slightly modified strophic setting; the second and third strophes show a slight harmonic

modification in the piano and the voice between mm. 11 and 15. This song presents, as in

Reichardt’s, mainly a chord accompaniment with mm. 6 through 13 arpeggiated in

triplets in each strophe.

Zelter gives careful indications of the expression for each strophe. These

expressive indications closely follow the description of Mignon’s performance in

Goethe’s novel. Thus, Zelter asks for a Pathetisch (pathetic) mood mit Anmut (with

grace), in correspondence with Goethe’s description of the solemn grandeur of the

opening of Mignon’s song. In the third line, Zelter asks for a more getragen (solemn, sustained)

mood as Goethe talks of the melody being more somber. For the words “Kennst du es

wohl?” (“Do you know it, indeed?”), he asks for an anwachsend (crescendo in emotional

intensity), according to the weightiness and mystery of Goethe’s description. The

variation in the expressive mood of this last refrain is also closely followed by Zelter.

Even in Reichardt’s and Zelter’s settings, where most of the obtrusive musical

devices are minimized, the listeners will tend to perceive the poetic text in a “poetic

mode.” The intrinsic musicality of the poem itself mounted over a melody—although

simple, syllabic, unembellished, and constrained register-wise—draws the listeners’

attention away from the linear discourse.

EX. 1.4: Reichardt, Kennst du das Land, 1st stanza.

EX. 1.5: Zelter’s Kennst du das Land, mm.1-27.

EX. 1.5 (cont.): Zelter’s Kennst du das Land, mm.28-53.

Convinced of the power of music to communicate feelings and sensations

otherwise ungraspable in words, the next generation of Lied composers broke free from

the previous word setting’s constraints delineated by the “Second Berlin School” and

unleashed their musical voices into their compositions. They ventured into busier textures

and more elaborate melodies, which certainly fragmented the poetic discourse further. At

the same time as they emphasized musical qualities already existent in the poem, they

focused their expressive forces on the strict musical elements of the setting. Almost as if

they were conscious of the unavoidable distortive effect that their settings inflicted upon

the poem, in the music of the settings they offered their listeners a synthesis of the

feelings at play in the poem: highlighting specific words with repetition, rhythmic and

melodic procedures, or underlying accompanying textures or harmonies.

Edward T. Cone says that a composer cannot set a poem with all its connotations;

some aspects of the poem will always be left out. In a case where the composer wants to

consider all the possible readings of the poem, he should include every point of view

translated into music in order to give the total meaning of the poem.29 Otherwise, what

results is a new creation that does not show the poet’s persona but, rather, the

composer’s. Inevitably, this will be a new set reading from the composer’s point of view.

His or her particular setting will highlight certain words and sounds, which will combine

in a completely new set of images associated with different moods and ideas.

If we turn to Beethoven’s and Schubert’s settings, we are able to appreciate an

aesthetic change in the following terms: the composer’s voice increases its active

participation in the musical result of the song. As Taruskin comments, “The basic vocal

29
Edward T. Cone. “Some Thoughts on ‘Erlkönig.’” In The Composer’s Voice. (University of California
Press: Berkeley, Los Angeles, 1974) p.19.

idiom is always that of Volksweise (folk tune), the ‘natural’ music representing the ‘We,’

inflected by eccentric details of melody, harmony, or accompaniment that at extreme

moments allow the ‘I’ to intrude.”30 Both Beethoven’s and Schubert’s voices translate

into their music the anxiety and urgency of Mignon’s request. They especially focus on

aspects of those concerns that they are interested in emphasizing from the poem. All

those become melodic, rhythmic, harmonic and textural effects at the hands of Beethoven

and Schubert.

Both composers set their song in strophic form with minor variations in the third

stanza. Both songs are in the key of A major and follow a similar tonal plan. Within this

formal frame, the contrasting textures, change of dynamics and rhythmic acceleration of

the second section of each stanza (“Dahin! Dahin!”) show more than a mere transcription

of the song that Mignon could have actually sung. Through their particular musical

treatments, Beethoven and Schubert show a reinterpretation of the unconscious concerns

of the character and her anxiety, once confronted with her critical personal situation: an

exposition of the unconscious feelings of Mignon told in the musical language of these

composers.

30
Earlier in his Chapter 35 “Volkstümlichkeit,” Taruskin explains the kinds of negotiations established
between the “I” and the “We” in previous nineteenth-century Lied, which crystallized the Volkstümlichkeit
ideals. He comments about the “impossibility of a particular ‘I’ without a particular ‘We,’” which may be
explained by a reformulated idea of cultural relativism, “the irreducible human difference”: “A human was
human only in the society of other humans, and the natural definer of societies was language. Since there
could be no thought without language, it followed that human thought, too, was a social or community
product…” In this manner extending language as expressive of all cultural aspects of a society, the concept
of a collective spirit idiosyncratic to each particular society arises. And this one was found in folklore
manifestations. So, the Lied was mainly concerned with the faithful portrayal of that “We.” (Taruskin, The
Oxford History of Western Music, vol. 3, p. 120-123).

EX. 1.6: Beethoven’s Kennst du das Land, mm.1-17.

EX. 1.6 (cont.): Beethoven’s Kennst du das Land, mm.18-43.

In Beethoven’s Lied, the gay anxiety of this adolescent is manifested in a playful

Più mosso in 6/8. This section functions as an answer to the “Kennst du es wohl?”

rhetorical question of Mignon, which together with the other “Kennst du?” questions are

the structural columns of Beethoven’s song. All these questions are set to the same

rhythmic pattern:

EX. 1.7: Rhythmic pattern of Beethoven’s “Kennst du” questions.

This pattern contrasts with the rest of the musical phrases with the use of one long

rhythmic value opening and another closing it: a quarter note at the beginning of the

phrase and a dotted quarter, eventually extended, at the end. This produces a slowing down

of the flow of the song and, as a consequence, the highlighting of these questions in the

poem. The last “Kennst du es wohl?” of each stanza is also preceded by a short piano

interlude, anticipating instrumentally with the same melody and harmony the question

that will arise afterwards. Thus, the attention of the listener is drawn inevitably towards

those questions.

Schubert also directs the flow of the first part of each strophe to the same question

(“Kennst du es wohl?”), this time set in recitative style. This offers a quasi-speech effect

in the middle of a completely “songful” melody. The expectation is built through the two

preceding measures, which serve as preparation to the D# of the French augmented sixth

chord sustained under the question, which in the next measure resolves on the dominant

chord, E Major. These procedures signal the arrival of a phrase to which the composer

wants the listener to pay special attention, “Kennst du es wohl?” (Do you know it

indeed?), which at the same time prolongs the expectation for an answer, musically as

well as lyrically.

In Beethoven’s setting, the answer is found in the next measure; the first “Dahin”

resolves on the tonic of the original key, A Major—after a short deviation to C Major in

the preceding section. Schubert delays the answer by displacing the clear resolution until

the end of the strophe—twenty-two measures later. The harmonic tension built measure

after measure while waiting for the final cadence parallels the frantic searching of

Mignon for the realization of her dream, to go back to her homeland, which always seems

far from fulfillment. While Beethoven portrays a calmer attitude on Mignon’s part, a

kind of contained anxiety, Schubert does the contrary. Mignon’s over-excitation is set

into music with unresolved harmonies. Furthermore, the frantic driving flow of triplets

from m. 8 of the song does not stop until the end of his “Etwas geschwinder” (“A little

faster”) section. The only moment when the triplet texture is suspended is under the

question “Kennst du es wohl?” Finally, the alteration of the text, especially the desperate

repetition of “Dahin,” emphasizes her emotional state. This whole delayed answer section

lasts twenty-two measures in Schubert and fifteen in Beethoven’s setting—and only six

measures in Reichardt’s setting.

EX. 1.8: Schubert’s Kennst du das Land, mm.1-18.

EX. 1.8 (cont.): Schubert’s Kennst du das Land, mm.19-40.

More than any of the other settings, Schumann’s setting of “Kennst du das Land?”

achieves the new synthesis of the “I” and the “We” of Lied in romantic terms. He

approaches this poem with a strict strophic form, which contains a very slight

modification in the interlude between the second and the third stanzas—a deceptive

cadence in place of the original cadence to tonic. This cadence is as deceptive as

Mignon’s answer to the preceding question of the statues, “Was hat man dir, du armes

Kind, gethan?” (“Poor child, what have they done to you?”)—this is the central question

of the poem in meaning and placement. In his setting of Mignon’s Lied, Schumann seems

to have the same intentions as the “Second Berlin School” composers. His choice of a

strophic setting and his own indication at the beginning of the song of “Langsam, die

beiden letzten Verse mit gesteigertem Ausdruck” (“Slowly, the last two verses with

heightened expression”) give that impression. But this is not the case. The melody

sung by Mignon is neither the ideal Volksweise (folk tune) nor the simple and transparent

melody of a fragile adolescent. The accompaniment, with its thick harmony, suspensions,

appoggiaturas and deceptive cadences, builds a musical fabric that neither serves as an

unobtrusive support of the text nor represents the simplicity of Mignon’s zither playing.

Also, the relation between accompaniment and vocal melody with its displaced doubling

in the piano, so characteristic of Schumann’s songs, is far from the clear chord

accompaniment and strict doubling of the melody necessary for the delivery of the “only

possible reading of the poem,” according to the “Second Berlin School.” All these

features are put in place by Schumann to delineate Mignon’s psychological and

emotional state, or his interpretation of it. In the song, Mignon talks with the voice of the

composer.

The difference from Beethoven’s and Schubert’s settings, which are structured

around the “Kennst du?” rhetorical questions of Mignon, is that Schumann seems to drive

the flow of the song toward each of the three forms of address that Mignon uses for Wilhelm.

The thick web of triplets in the two hands of the piano that starts in m. 10 does not stop

until m. 25, coinciding with “o mein Geliebter” (“o my beloved”). The same procedure

is repeated in the following two stanzas where the flow of the triplets ceases upon

arriving at “o mein Beschützer” (“o my protector”) the first time and “o mein Vater” (“o

my father”) the second time. Thus, the flow of the song seems to be organized around

these climactic points, which reflect one of Mignon’s major concerns: her relationship

with Wilhelm. The ambiguity of this relationship drives her to ask herself three reflexive

questions: Are you my lover? Are you my protector? Are you my father?

Furthermore, Schumann’s is the only setting that opens with an idiomatically

pianistic introduction, which hints at the chromatic world that he will develop later on in

the piece. This introduction will become the interlude played in between stanzas. When

the voice starts, this pianistic treatment gives way to a more open texture, which allows

the text to transcend and reach the listener in a relatively clear manner. But once the

audience is introduced to the landscape of each stanza, the thick web of triplets takes over,

starting in m. 10, with its displaced doubling and complex chromatic language until

the next landmark: “o mein Geliebter,” “o mein Beschützer,” “o mein Vater.”

EX. 1.9: Schumann’s Kennst du das Land, mm. 1-20.

EX. 1.9 (cont.): Schumann’s Kennst du das Land, mm. 21-41.

By use of the described musical procedures, these Lied composers permeated the

poem with their musical voices, suggesting, through the accompaniment and its relation

with the vocal line, things that are not said in the words of the poem. They relied on the

music for this task because they conceived of music as a language equal to literature.

Music was capable of transmitting sensations and feelings that the audience would

appreciate only through the direct experience of listening—a type of physical connection.

Rosen says that for nineteenth-century composers, the word is no longer

embellished and imitated by the music.31 Now, the music becomes a language in itself, a

separate symbolic universe with its own logic and communicative-expressive power.

However, although music transmits feelings and sensations captured by the reading of the

composer, it only represents their form and not their content. The listener feels the

movement and impulses of the music conveying those feelings as a physically empty

message, which only his own imagination will fill with a determined content. In this way,

the listener will capture the structure of the composer’s personal reading and complete

the content of it with his own personal reading.

The particular dramatic point of view adopted in a song exerts an enormous

influence on the concrete musical manifestation of the special poetic features of a poem.

And the features that attracted Romantic composers dwelt at a structural level of the

poem. Since they were mainly concerned with content resulting from the elaboration of

several internal layers of meaning of the text, they molded their musical setting to portray

these emotional states or concepts. Neither narration of the dramatic events nor speech

qualities of the text were major concerns at this point in music history. The Lied was

31
Charles Rosen, The Romantic Generation (Cambridge, Massachusetts: Harvard University Press, 1995),
68.

mainly music. The music of its poetic text ran parallel to the highlighted text fragments

and, integrated into those purely musical elements of the song, engaged the listener in a

strictly “poetic mode” of perception.

III.

FURTHER EXPLORATIONS:

MEREDITH MONK AND LUCIANO BERIO

As an academic and a performer of new music, I felt it was important to test my

text perception theories in “post-tonal” repertoire. I was especially interested in certain

composers of the second half of the twentieth century and the beginning of the twenty-first

century, such as Meredith Monk and Luciano Berio, who have produced music that

reflects their concerns about the semiotics of paralingual vocal gestures and intonation. In

exploring these issues, they created music that is the practical representation of the way

we listen in the poetic mode taken to an extreme—a “poetic mode” with a special

emphasis on the “auditory mode.” By employing the sonic elements of language as the

structural components in their pieces, they strived to offer their audiences a direct

experience of the struggle and negotiations that text undergoes once set into music. They

wanted people to attend to those paralingual nuances that we usually disregard when

listening to speech and disregard even more when speech is set into music. In their hands

these paralingual elements become music to our ears.

Monk and Berio share concerns and interests in exploring the tensions between

text and music. They observe the fragmentation and deformation that any text set into

music undergoes and its resulting unavoidable degrees of unintelligibility. As an

outgrowth of these observations, Berio proposes to create a “new kind of relationship

between word and sound, poetry and music...,” the function of which “would not be the

contrasting or mixing up of two separate expressive systems but rather the creation of

complete continuity, so that the shift from one to the other would be imperceptible,

without drawing attention to the difference between a logical-semantic mode of

apprehension (as adopted for the spoken message) and a musical mode...”1 This kind of

word and sound relation would activate in the listeners the “poetic mode” of perception,

as defined by Reuven Tsur.

Monk and Berio are not the only composers who have introduced new

perspectives on the language and music relationship and influenced subsequent

compositional tendencies in the vocal music realm. Composers such as Pierre Boulez and

Karlheinz Stockhausen have also promoted rethinking this relationship with their writings

and compositions.2 On the one hand, Boulez developed his concept of “centre and

absence” in which the text remains at the notional center of the composition process. On

the other hand, Stockhausen envisioned a “sound-word continuum” in a perpetual

transition from listening to comprehension by softening the boundaries between these

two media. As explained later in this chapter, Berio conceives of a similar fluid

transitional process between sound and word.

It is also relevant to mention two developments of the second half of the twentieth

century, “concrete poetry” and “text-sound composition.” Both have predecessors in

movements taking place during the first two decades of the twentieth century. The former

has an earlier direct predecessor in the Dada poetry of Kurt Schwitters, Hugo Ball, and

Tristan Tzara, and the latter in the Italian Futurist experiments of Russolo and Marinetti.

The “concrete poetry” movement aims to create a new artistic reality. Without the

complete suppression of semantic meaning, this movement seeks to eradicate

1
Luciano Berio, “Poesia e musica un’esperienza,” in Incontri Musicali 3 (1959), 99.
2
An explanation of Boulez’s concept of “centre and absence” may be found in Orientations: Collected
Writings (1986). Stockhausen explains his ideas about text and music in his paper “Speech and Music,” read
in 1959 at the Darmstadt Summer School and later published in Die Reihe. Some of the most influential

representation of any external reality. Thus, its focus moves towards the phonetic sounds

of words, shapes of letters, breaking of the formal semantic units and punctuation rules,

etc. “Text-sound compositions” renounce the optic dimension and concentrate on the

relationship of sound and meaning. These works exist only in recording format (sound

pieces without a written version). This branch of electro-acoustic music holds among its

more important examples compositions such as Steve Reich’s Come Out (1966), Nono’s

La fabbrica illuminata (1964) or the first region of Stockhausen’s Hymnen (1966).

The non-semantic internal reality of language or indefinable vocal sounds are of

interest to Monk and Berio, but unlike in “text-sound compositions,” these composers

explore these concepts without any electronic interventions. Their pieces may be

reproduced in live performance by one or several singers without any processing of their

voices. The fact that the full palette of vocal sounds employed by these composers

originates acoustically from the natural resources of the human voice is of special interest

to this dissertation.

As mentioned before, both Berio and Stockhausen have proposed to soften the

boundaries between speech and music, creating a “sound-word continuum.” This

continuum is created when speech approaches music and music approaches speech to the

point of the dissolution of the boundaries of sound and meaning. Berio considers the first

and primordial step in creating this “sound-word continuum” to be the dissolution of the

speech continuity as a logical/semantic discourse. Thus, he proposes to explore beyond the

natural fragmentation that any text undergoes when set to music—by breaking words into

their phonetic elements, stretching them, masking their enunciation, and mixing them

pieces in the realm of explorations of the tensions between language and music were Boulez’s Pli selon pli
(....), Stockhausen’s Gesang der Jünglinge (1955-6) and Momente (1962-4).

with paralingual sounds—to submerge the listener in the deepest nuances of language and

the human phonatory apparatus. In this way, he intends to dissect the elements of

language and observe their relations and tensions from inside out, while at the same time

revealing the communicative power of the sonic aspects of language beyond the

semantic-linguistic content of the syntactic units and system. He makes use of poems,

political speeches, academic texts, literary narrations and other kinds of discursive texts

in their entirety or in fragments. In A-Ronne, as in many of his previous and later pieces,

Berio recreates the semiotic structural manifestation of the agony of language when set to

music. As Berio himself describes, A-Ronne is a dramatization of the sonic aspects of

language in a radiophonic theater.

In A-Ronne, as well as some of his early vocal pieces, Berio extracts the purely

musical elements from his literary textual sources and, as Osmond-Smith comments, uses

them “to explore the borderline where sound as the bearer of linguistic sense dissolves

into sound as the bearer of musical meaning: a territory that…he was to make very much

his own.”3 The words’ musical elements become structural components in his pieces.

Thus, Osmond-Smith describes the process of creating tension between the sonic

elements extracted from the words in Berio’s Thema (Omaggio a Joyce):

…he then proceeds to work in tension with it, juxtaposing and superposing phonetic
elements so as to produce consonant groupings that the human voice would normally find
hard to articulate in rapid succession (such as voiced and plosives)…Out of this
impossible vocalism, comprehensible speech…momentarily emerges, only to be
engulfed: relative comprehensibility has become a compositional parameter to be handled
in much the same way as textural density or, within a pitched context, harmonic
density…It may be achieved by the fragmentation of originally linear texts…by
superposition of texts…by dissolution of texts into their component phonetic materials, or
more usually by a combination of these.4

3
David Osmond-Smith, Berio (Oxford, New York: Oxford University Press, 1991), 62.
4
Ibid., 62-63.

These same kinds of procedures are explored in A-Ronne; only in this piece Berio uses

the natural voices of eight singers instead of electronic techniques.

Departing from similar observations, Monk arrives at different results. While

Berio places a magnifier on the sonic transitions of language but always in the syntactic

context of a real text, which could be stretched and deformed beyond recognition, Monk

steps out of the syntactic/linguistic frame and treats the phonemes as pure sounds. No

linguistic text of any kind precedes her pieces. Although she shares with Berio an interest

in the communicative power of the sonic aspects of vocal sounds, she specifically focuses

on their emotional communicative potential. Monk conceives of the voice as a tool for

“demonstrating primordial/prelogical consciousness...a direct line to emotions [and]

...Feelings that we have no words for.”5 In this way she exploits the potential of the

“songful” nature of the human voice.

For Monk the voice is in complete connection with the body. At the same time,

the physicality of the voice is one of her fundamental concerns: “The body of the voice/

the voice of the body.”6 As a consequence, in the mid-sixties, she began a methodical

exploration of the voice as an instrument that could develop its own idiosyncratic

vocabulary: “I realized that the voice could be as fluid as the spine, that it could have the

flexibility and range of the body.”7 She immersed herself in the study of vocal color,

voice placement and nuances in the articulatory/phonatory apparatus and applied her

discoveries to controlled explorations of vocal pitch, volume, speed, texture, timbre,

breath, and strength.

5
Meredith Monk, “Notes on the Voice,” in Meredith Monk, ed. Deborah Jowitt (Baltimore, Maryland: The
Johns Hopkins University Press, 1997), 56.
6
Ibid.
7
Robert Schwarz, Minimalists (London: Phaidon Press Limited, 1996), 189.

Berio does not ignore the physicality of the voice either and exposes the gestures

of vocal sound production through his music. Although in A-Ronne Berio makes

extensive use of paralingual sounds, such as breathing, sighs and other vocal noises, as

part of the musical process, he explains that he does not conceive of them as mere sound

effects but signs carrying meaning:

I am not interested in sound by itself—and even less in sound effects, whether of vocal or
instrumental origin. I work with words because I find new meaning in them by analyzing
them acoustically and musically, I rediscover the word. As far as breathing and sighing
are concerned, these are not effects but vocal gestures which also carry a meaning; they
must be considered and perceived in their proper context.8

By exploring in detail human vocal nuances, both Monk and Berio create proximity with

their audience. Listeners may directly relate to these tangible sounds because they are

produced by the same gestures that any human being uses in everyday normal speech.

Instead of having a passive audience admiring the virtuosic sound production of

the performer, Berio’s and Monk’s music invites its audience to experience it physically.

Richard Middleton comments that listeners “… identify with the motor structure,

participating in the gestural patterns, either vicariously, or even physically, through dance

or through miming vocal…performances.”9 Ethnomusicologists such as John Baily argue

that the movements that players perform while playing their instruments directly affect

musical structure, so “Music can be viewed as a product of body movement transduced

into music.”10 In this manner, the listener gains a firsthand experience of the gestures that

are structural to Monk’s and Berio’s pieces.

8
Rossana Dalmonte and Bálint András Varga, Two Interviews/Luciano Berio, trans. David Osmond-Smith
(New York: M. Boyars, 1985), 141.
9
Richard Middleton, Studying Popular Music (Philadelphia: Open University Press, 1990), 243.
10
John Baily, “Movement patterns in playing the Herati dutar” as quoted in R. Middleton, Studying
Popular Music (Philadelphia: Open University Press, 1990), 243.

Monk comments that “By working with your own instrument, you actually come

across gestures that are trans-cultural, and in certain ways you become part of the world

vocal family.”11 Her vocabulary involves human sounds that many men and women

could find natural and organic; thus, her music triggers a close emotional connection

between her audiences and her musical idiom. One could feel that those sounds are part

of our essential primal vocabulary: pre-lingual and, at the same time, beyond language.

Monk explores the communicative possibilities of vocal sound devoid of any

form of linguistic meaning—beyond semantic content. Vocal sound is presented as pure

sound. In this way, it opens to the audience the wide spectrum of potential meanings that

any sound usually evokes. The baggage of meanings that any human vocal sound carries

is not always precise and easy to define, allowing all sorts of associations. In a linguistic

communicative setting where human sound is the carrier of language, these multi-

associative meanings are usually overlooked. Monk wants to bring to her audience an

awareness of the potential multi-faceted meaning of vocal sounds.

Seeking to open this kind of associative vocal sound/meaning spectrum, most

of Monk’s pieces are wordless, as she has restricted herself to moaning, shouting,

sighing, breathing, whispering, trilling, sliding, doing glottal breaks, and chanting on

nonsense syllables. This is a conscious aesthetic decision, since she departs from the

conception that it is almost impossible to comprehend text put into music, or at least to

understand it fully as in a normal colloquial situation. As a result, she attempts to direct

our attention directly to the sound of the voice without any obstacle. Even when she

composes pieces such as Three Heavens and Hells (1992), where she uses exclusively and

11
Schwarz, Minimalists, 190.

exactly the four words of the title as the text of the piece, she almost strips the words

from their semantic/linguistic meaning. The quasi-mechanical repetition of these words

over the twenty-one minute and ten second duration of the piece produces a progressive

fading of meaning until these words become empty vessels. These words keep their

pragmatic sense but not their semantic meaning. The effect is finally similar to the pieces

in which language is completely absent; the audience turns its attention toward the vast

spectrum of meanings of vocal sound.

In regard to her piece Atlas, Monk explains that it was meant to bypass discursive

thought to “go directly to the heart.”12 She argues that in any case, she usually is not able

to understand a word in opera. Departing from the idea of language “as a screen in front

of the emotion and the action,” she prefers a direct communication that “bypasses that

step so that you’re really dealing with a very primary and direct emotion.”13

Volcano Songs: Meredith Monk

Monk’s Volcano Songs: Duets (1993) are an interesting example of wordless

songs. They explore the full potential of the “songfulness” of the voice and the pure

musicality of the human vocal gestures. As said before, Monk’s piece is a self-conscious

representation of the way we listen in the “poetic mode” with emphasis on the “auditory”

elements of perception. She chooses to make music from the stripped musical elements of

the voice that we usually unconsciously apprehend and to which we emotionally connect

when listening to any other vocal piece—whether popular song or “art” song.

12
Ibid., 191.
13
William Duckworth, Talking Music (New York: Simon & Schuster Macmillan, 1995), 359.

In an interview given in 1996, Monk commented to the ethnomusicologist

David Gere that these duets were conceived as processes of nature. Each of them only

explores a single particular vocal quality. Monk preferred simplicity over compositional

fanciness; she says: “I was thinking: Why don’t you just take the purest color in each

song and only work with that. Like one brush stroke or a haiku.”14 The creation of a

particular character in each song is central to Monk. In Volcano Songs as in other pieces,

she looks for “the voice” of each piece, the one that creates a world in itself and is not

similar to any previous one.

This kind of restrained canvas that Monk imposes on herself in each Volcano Song is

not unusual for her music in general and appears to be an intentional procedure in other

pieces like Vessel (1971). This restraint manifests itself in two aspects of her music. First,

raw materials tend to be simple, but her controlled delivery—a certain solemnity in her

performance that creates “momentum”—transforms them into music of a universal scope

and stature. Second, she is interested in slowing down musical processes to get a slice of

them. She wants the audience to taste every single moment. The same detailed delivery

that Marcia Siegel and Kenneth Bernard have observed in her theatrical and dance

movements is present in her music.15 Most of her pieces are constructed as a succession

of single episodes, repetitive sequences of slow, sustained notes

or glissandi, or swirling rhythm.

Several of her compositional techniques were revealed to me in a palpable way

through my direct experience in workshops held by The Meredith Monk Ensemble. They

guided participants through processes similar to those they established with Monk during the

14
Meredith Monk, interview by David Gere, Volcano Songs (CD insert), ECM, June 19, 1996.

creative process of some of her ensemble pieces. She usually proposes materials and

processes, and through improvisatory techniques, they mold those materials, each in their

own idiosyncratic ways. She wants to hear through their vocal sounds: their backgrounds,

their experiences, their personalities, their humanity, their imperfections. After long

sessions of experimentation, a final version is put together and fixed. For the most part,

there is a preference for the oral transmission of her pieces—although these versions

finally do get scored, which was the way in which her ensemble members taught

fragments of her repertoire to the workshop participants.

In terms of the musical processes and materials that Monk employs in her pieces,

the musical structures show a predominant horizontal conception: short cells that develop

linearly, “plain chant” style or “folk-flavor” simple melodies that succeed one another.

These sometimes undergo slow processes of gradual transformation. At other times, each

component succeeds another, but in their transition, there is a period of overlapping in

which one theme fades out and the other slips in. Monk calls this process “wash,” and

this is one of the several musical processes that are directly associated with cinematic

editing techniques. Other musical procedures, such as canonic textures, are rooted purely

in the musical tradition.

Monk’s approach to music-theater connects to a general non-narrative conception,

which she applies in her explorations across media: music, dance, theater, and video. In

terms of the structure of her staged pieces, Monk’s preference for non-narrative models

causes her to choose a more fragmented poetic style in which things happen one at a

15
Marcia B. Siegel, “Virgin Vessel” and Kenneth Bernard, “Observations On Recent Ruins” in Meredith
Monk, ed. D. Jowitt.

time, and it is not until the end that the spectators are able to intermingle the separate

episodes or scenes and make sense of them as a whole theater piece.

According to Deborah Jowitt, during the theatrical presentation of Volcano Songs,

Monk walks to a row of three rectangles that lie on the floor and, in a ritualistic manner,

removes the black pieces of cloth that cover these rectangles.16 After each uncovering

action, she lies in a crumpled position on each pallet, while a bright light flashes on and

off. Once she stands up, the light turns the pallet a luminous green, revealing on it a

dark imprint left by her body. Then she proceeds to the next rectangle to perform the

same task and the previous one fades away. This seemingly magical theatrical effect

transforms advanced technology, such as photosensitive paper, into a “poetic and

apocalyptic” memorial of victims of volcanic eruptions or nuclear disasters such as

Pompeii or Hiroshima.17

The volcanic theme brings in the motive of “transformation” that underlies all

these songs. According to the composer, although volcanic activity implies potential for

destruction, it has also been instrumental in the creation of the Earth. Furthermore,

“volcanic land is some of the most fertile land on earth.”18 The tension between death and

destruction and rebirth and growth implies a kind of cyclic transformation, which

translates into musical processes of transformation of the vocal textures and themes that

are used throughout the Volcano Songs: Duets: morphic overlappings between materials.

The first song of the cycle, called “Walking Song,” explores the opposition of

pure vowels against a backdrop of voiceless, breathy vocal sounds. This duet

16
Deborah Jowitt, “Introduction,” in Meredith Monk, ed. Deborah Jowitt (Baltimore, Maryland: The Johns Hopkins
University Press, 1997).
17
Ibid., 15
18
Meredith Monk, interview by David Gere.

concentrates on the vocal color of [a] and [o], connected once in a while by the sonorant

consonants [n] and [l] and the glide [j]. The piece evolves through a restless motif of what

could be called a “galloping” rhythmic nature: a six-eight meter made of a quarter note

followed by an eighth-note, of iambic characteristics. The melody seems to move mostly

in conjunct intervals around an F# minor tonal center. Despite this constrained melodic

beginning, throughout the duet, the pitch content and contour evolve from a very narrow

register to one more than a fifth wide and then to an almost total loss of tonal center, to

later return to the previous, more constrained and defined version of the motif. Departing

from this basic version, the piece explores augmentations and diminutions of the

following melody:

EX. 2.1: Monk, Volcano Songs: Duets, “Walking Song,” min. 0:25 to 0:35

Only the last three notes, on the phonemes [o-a-jo], of this motive remain constant

throughout the piece. They become a sort of refrain that reappears at the end of every phrase

despite any kind of transformation that the beginning of the motive could have

undergone.

The duet may be divided into fourteen sections in which the musical processes

described in previous paragraphs take place.

Section # 1 - (0 to 0:15 minute):

This opening section is the introduction of the first and simplest version of the

theme, as already presented in example # 1. This theme is made of two identical melodic

phrases but with different phonetic material. At this point, each of these phrases lasts four

measures (4 seconds).

EX. 2.2: Monk, Volcano Songs: Duets, “Walking Song,” min. 0:05 to 0:14

Section # 2 - (0:15 to 0:25 minute):

In this second section, the introductory theme is expanded in two dimensions,

pitch and duration (one vertical and the other horizontal). Regarding the former, it adds

two whole steps up, thus reaching a C#5. For the latter, although it keeps the dynamic

of presenting two phrases, it adds a fifth measure to the first phrase, which produces

instability and breaks the balance and regularity that the theme had in its introductory

state. This means a whole second of new music and surprises the listener, refreshing the

perceptual experience. But the second phrase of the motive goes back to the established

four-second, four-measure period.

EX. 2.3: Monk, Volcano Songs: Duets, “Walking Song,” min. 0:15 to 0:20

Section # 3 - (0:25 to 0:35 minute):

In this third section, there is another kind of expansion, this time in the texture:

the addition of a second female voice. Again unpredictability and irregularity are

emphasized by surprising the listener with the unexpected new element not at the

beginning of the phrase but on the second note (at 0:27). But this second singer

only emits breathy, almost voiceless sounds without specific phonemes, which follow the

same rhythmic patterns of the leading female voice, which still carries the tune. Once

introduced, this second voice keeps singing for the two normal melodic phrases of four

measures. The melodic content of these phrases is a variation over the given pitch

spectrum until this point of the piece. In the refrain [o-a-jo], the second voice gains more

presence with faintly defined phonemes but without completely abandoning the

voiceless/breathy quality. The phonetic material is basically the same recomposed in a

different order.

Section # 4 - (0:36 to 0:43 minute):

The second voice keeps singing, and now, interspersed within its voiceless

texture, some phonemes are completely voiced—in addition to those of the refrain. The

two regular phrases are maintained. Interestingly, when the melody reaches the first

“refrain,” the second voice splits from the first one and sings it with a delay of one beat,

thus creating an echo-effect. But by the end of the second phrase, they are in unison

again. The melody expands its pitch range even more. The first phrase opens with a

perfect fifth, A4 to E5.

EX. 2.4: Monk, Volcano Songs: Duets, “Walking Song,” min. 0:36 to 0:43

Section # 5 - (0:44 to 0:53 minute):

Some elements stay constant, such as the two thematic phrases and the

sporadically sung pitches of the second female voice, but the new expansion is harmonic.

During the “refrain,” the second singer adds a second parallel voice a third above.

Section # 6 - (0:54 to 1:03 minute):

Now the expansion is in the variety of vocal colors or timbres employed. Both

singers, but more prominently the second one, shake their voices, giving the impression

of trembling.

Section # 7 - (1:04 to 1:14 minute):

In this section a process of simplification begins. There is a reversal of all the

musical effects applied and compiled up until now in the piece. The second voice drops

back into more breathy sounds. The voices play with delays and anticipations of the

“refrain” both times that it appears at the end of each of the two phrases.

Section # 8 - (1:15 to 1:27 minute):

In this longer section, the second voice starts to drop at times, producing rests

between the breathy sounds. The first voice progressively drops the phonemes and starts

to sing bocca chiusa.

Section # 9 - (1:28 to 1:35 minute):

This section goes back to the same kind of articulation and texture as before

minute 1:14.

Section # 10 - (1:36 to 1:44 minute):

The melody loses its tonal center at the same time as its range expands. It moves

in wondrous ways out of the F# minor center that dominated until now. The trembling

timbre is more prominent and frequent, and by the end of the second phrase, before the

refrain, there is a whole second of pause in both voices—again surprising the listener

with the unexpected.

Section # 11 - (1:45 to 2:13 minute):

The new element is the broken nature of the melody. Unexpected rests interrupt

the melody’s flow. But those rests have more of an effect of stops or suspensions in time

than silence as a product of suppression of sonorously existing material. After each

suspension, the melody resumes its natural flow from the point before the interruption.

Section # 12 - (2:13 to 2:27 minute):

In this section several of the previously used musical tactics and processes are

used all together: delay/echo, harmonization in thirds, rests, etc.

Section # 13 - (2:27 to 2:46 minute):

Completely new melodic material replaces the theme of two phrases. The first

voice alone sings the following pattern based on descending ninth intervals (G#4 to F#3).

The phonetic material is limited to alternations between [a-ε], the [a] as old material and

the newly introduced [ε].

EX. 2.5: Monk, Volcano Songs: Duets, “Walking Song,” min. 2:27 to 2:46

Section # 14 - (2:47 to 2:54 minute):

The second voice returns and sings along in unison with the two complete phrases

of the theme and then a single [a] on F#3, one second rest, and the last single [ε] on G#4,

this time sung only by the first voice. Thus, the piece ends with this open ascending

ninth.

The second duet of Volcano Songs, “Lost Wind,” consists of two voices playing

with the friction of two notes a half step apart and the partials produced by this action.

The motif is introduced by one of the voices and repeated twice by this single voice. It

consists of two short notes on the phonemes [ni - a], on C#5, followed by a long note a

half step up (D5) on the syllable [no]. This last D note immediately decays, sliding down

a third.

EX. 2.6: Fragment from Monk, Volcano Songs: Duets, “Lost Wind”

The duration of the long note before the decay varies in every repetition. The

second voice, which enters the third time that the motive is repeated, reproduces the

same melody but with a delay. It accentuates the friction by desynchronizing the voices

while sustaining and sliding down. Although it is very difficult because of the fine

blending of the voices’ timbres, one may perceive, until a certain point in the piece, that the two

voices take turns starting each repetition. The singer that follows always applies a more

distant timbre in her voice, as if the sound had a veil over it. This is achieved with a

further back placement in the vocal cavity and a certain breathy quality (in contrast to a

more forward and projected full sound). Although less veiled, the leading voice presents

to a certain degree a similar distant sound.

By these means, the piece explores distance in two dimensions: space and time. It

explores the effects of distance between the sound of the two singing voices, the singers

with respect to the performing space, or between the listeners and the singers (their

proximity or remoteness). The piece gives the impression of an echo effect, in which one

singer sings and her voice comes back with a far and diluted color. This illusion is

created by the fine imitation of vocal colors between the two singers. The echo suggests a

vast expanse of space. Thus, the human and her or his voice in solitude face the

immensity of nature. Nevertheless, the proximity is present in the friction produced by

the rubbing partials created by the semitone interval between the voices. At the same

time, “Lost Wind” experiments with the distance in the historical time spectrum. These

could be voices coming from the past, a prehistoric time in the evolution of the earth or

more recent events of natural or nuclear disasters.

The third duet of Volcano Songs, called “Hips Dance,” concentrates on the

proximity and mingling of the two voices. This effect is taken to a point at which the two

voices intertwine and “you can hardly tell that two different people are singing...In ‘Hips

Dance’ we push it even further by creating the illusion of more than two voices

overlapping.”19 Both voices start together from the beginning of the piece. One of them

maintains the following drone on the semi consonant [m] throughout the whole piece:

EX. 2.7: Fragment from Monk, Volcano Songs: Duets, “Hips Dance”

19
Meredith Monk, interview by David Gere.

Some hard exhalations—which sound close to [ho]—are interspersed between the

two-beat long notes. The “[m] drone” is only interrupted once in the middle of the piece

by a whole section of those exhalations. At that point, the two voices exchange hard

breathy [ho]s, which create a rather percussive effect. This section acts like a rhythmic

improvisation or percussion solo in the middle of a piece, in which time stops moving

forward, and a suspended syncopation takes place. This is one of those instances when

the voices are indistinguishable from each other, thus creating the sensation of more than

two singers singing at the same time.

While the “[m] drone” continues, the other voice varies a pattern based on the

following material: a short eighth-note sung on [e] followed by four or more sixteenth

notes sung on [m-a] a minor sixth down. These sixteenth notes are interrupted in

irregular patterns by hard exhalations on [ho]. The rhythmic interventions of the

percussive [ho] in between the melodic patterns of the [m-a] and the hummed drone of

the other voice reinforce the fictional sensation of a third voice intervening. Towards the

end of the piece, the tempo accelerates and the frequency and irregularity of inserted

exhalations in between the sung notes increase, exacerbating the sensation of several

voices singing at the same time.

The last of the duets, as its own title “Cry #1” suggests, is a lament. It is not by

chance that this duet uses a phoneme of intense emotional charge. All through the duet

one of the singers vocalizes on [ηg a – ηgæ] while the other slides over a single [η]. The

[η], like other posterior nasals (or velar nasals), is one of the last phonemes to be acquired

by children in languages that employ them. Its late incorporation into the linguistic

system of arbitrary signs in language implies that the child experiments for a longer time

during her/his infancy with these sounds in onomatopoeia or sound-gestures. As a

consequence, this nasal-velar sound carries a heavy load of emotional connotations, for

which the child has no words. Roman Jakobson, who developed these theories in his

famous Child Language, Aphasia, and Phonological Universals (1968), argues that

“Sound-gestures, which tend to form a layer even apart in the language of the adult,

appear to seek out those sounds which are inadmissible in a given language.”20 These

sounds will coexist with those employed in the vocabulary for a long time and even be

used as expressive sound-gestures in one’s adult life. Thus, these sounds have a playful

and affective charge.

Reuven Tsur makes further observations about the nature of these nasal

phonemes. He comments that “periodic sounds” such as nasals, vowels and liquids—

those having similar structure for their recurrent acoustic signal portions—arouse a

certain relaxed kind of attentiveness, prediction and order, a quasi-hypnotic effect.21 One

may observe the extended use of these kinds of phonemes in Monk’s Volcano Songs,

which induce audiences to experience similar effects to those described above.

The musical development of “Cry #1” consists of the erratic wandering of the

voices as they slide around a half-step up and down from a central note in oblique

interweaving. The first singer’s voice glides around this pattern, which is constantly

transformed: [ηg a – ηgæ]

20
Roman Jakobson, Child Language, Aphasia and Phonological Universals (The Hague: Mouton, 1968), 25.
21
Reuven Tsur, What Makes Sound Patterns Expressive? The Poetic Mode of Speech Perception (Durham,
London: Duke University Press, 1992), 44.

EX. 2.8: Fragment from Monk, Volcano Songs: Duets, “Cry # 1”

The second voice is introduced in minute 0:40 of the piece and slowly slides in a

humming manner on a sustained [η], departing from a different note than the first voice.

Similar to the previous duet, this second voice acts as a drone. From minute 0:43 in the

piece, each voice alternately moves its pivot note, but when it seems that both voices are

going to coincide, the other moves away and again produces dissonant intervals.

Progressively, after minute 1:00, both voices accelerate the frequency of their pitch

fluctuation, becoming an undulation, and the phonetic articulation in the first voice

becomes more blurry and muddy. While the second singer keeps humming on [η], the

first one conserves only a murky [æ] from her phonetic set. A short section follows, from

minute 1:58 to 2:18, in which both voices closely and in parallel movements sing on [ηg

a – ηgæ] up and down a half step in an insistent mourning. From this point to the end,

both voices start a process of simplification and assimilation until they collide in an

undulating unison around the same pivot note on a fluctuating [æ], to finally close on the

same pivot note sustained by both voices.

The employment of the nasal phonemes on top of the musical procedures

previously described gives “Cry # 1” a relentless lament quality, which lends itself to a

variety of interpretations. With an unmistakable primary human emotional quality, it is

vague enough to relate to and make all sorts of associations around.

A-Ronne: Luciano Berio

In 1974 Luciano Berio composed A-Ronne for Radio Hilversum, scored for five

actor-singers, but in 1975 he revised the piece and expanded it for eight singers, in a

double vocal quartet format. The premiere of this second version was given by the group

Swingle II in 1976. Berio’s usual collaborator, the poet Edoardo Sanguinetti, is the author

of the text set in the piece. As Osmond-Smith comments, A-ronne is the product of a

dynamic process of improvisation among the original five actors instigated by Berio,

which resulted in a series of fragmentary sonic dramas derived from Sanguinetti’s text.

He recorded these sessions and, after reworking them, transformed them into the

eight-voice concert version.22 While this original first version profited greatly from the vivid

imagination of these five actors, Swingle II brought their imprint to the second version

through their staple sound: a kind of vocally imitated instrumental fusion of jazz and

“classical” styles.

A-Ronne, conceived as a piece of “theater for the ears,” is one of the pieces resulting

from “his [Berio’s] eight years of work in the Milan radio” which added “a sharp sense of

the extraordinary flexibility of the aural imagination, where images can flow into, or

coexist with one another with an ease denied to the eye.”23 This shows Berio’s special

concern for the human listening process. A-Ronne, like many of his other pieces, acts

directly upon the different ways that we, as listeners, may apprehend text. Again, like

Monk’s Volcano Songs, A-Ronne is the concrete representation of the way we listen to

text in the “poetic mode,” with the exception that, in this case, Berio manipulates words

as well as pure paralingual gestures.

22
David Osmond-Smith, Berio (Oxford, New York: Oxford University Press, 1990), 98.
23
Ibid., 90.

The following analysis focuses particularly on the ways Berio operates on the

literary materials to reproduce a “poetic mode” of listening, which could alert the

listeners to the language elements that we unconsciously relate to in vocal music. This

analysis is based on the published score of the eight-singer version as well as the

recorded version of A-Ronne by Swingle II in 1976, which was conducted by Berio.24

Sanguinetti’s text quotations come from different sources: literary documents,

languages, periods, and cultural circles. Each quote appears in the actual text divided by

colons to denote a change from one literary source to the other.

1. The gospel of John in Latin and Greek, as well as in Luther’s German translation with the modifications of Goethe’s “Faust I”
2. Dante Alighieri’s “Convivio” and “Divina Commedia”
3. The beginning of the Communist Manifesto of Marx and Engels
4. An essay from Roland Barthes about Georges Bataille
5. The old Italian alphabet25

Even before Sanguinetti’s employment and extraction of them, fragments of these

cultural documents became expressions independent from their historical context or

background. Sanguinetti takes advantage of the original fragmentation that these phrases

have gone through in the real world and deepens their fractures, presenting sections or

even single words. The colon marks clarify further the fractures between each element.

He takes these new wholes and works them through to show their original alienation.

Besides setting the quotes in disturbing contexts, the use of six different languages

highlights the split. Sanguinetti organizes the text of A-Ronne in sections, which are

themed “beginning,” “middle” and “end.” Time is also thematized in words like “run” or

24
Luciano Berio, A-ronne: documentary for 8 singers on a poem by E. Sanguinetti (Wien: Universal
Edition, 1975); Swingle II, Luciano Berio, conductor, A-ronne (London: Decca, 1976).
25
List extracted from Norbert Dreßen, Sprache und Musik bei Luciano Berio: Untersuchungen zu seinen
Vokalkompositionen (Regensburg: Bosse, 1982), 159.

“beginning,” as well as space in the parts of the body: bocca, labbro, annus, pied, etc.

Among the cultural and historical quotes, Sanguinetti interpolates an extra

literary/musical allusion to Guillaume de Machaut’s rondeau Ma fin est mon

commencement, which fits with the thematization of space and time and at a certain point

almost articulates the dissolution of time “in my end...is my beginning.” The title A-ronne

is related to the conceptualized idea of time and the sections of the piece. “A” was the

first letter of the old Italian alphabet and “ronne” the last one. All material in between

represents the rest of the alphabet. This is the poem as Sanguinetti presents it:

1.
a:ah:ha:hamm:anfang:
in:in principio:nel mio
principio:
am anfang:in my beginning:
das wort:en arché en:
verbum:am anfang war:in principio:o lògos:è la mia
carne:
am anfang war:in principio:die kraft:
die tat:
nel mio principio:

2.
nel mezzo:in medio:
nel mio mezzo:où commence?:nel mio corpo:
où commence le corps humain?
nel mezzo del cammino:nel mezzo
della mia carne:
car la bouche est le commencement:
nel mio principio
è la mia bocca:parce qu’il y a opposition:paradigme:
la bouche:
l’annus:
in my beginning:aleph:is my end:
ein gespenst geht um:

3.
l’uomo ha un centro:qui est le sexe:
en méso en:le phallus:
nel mio centro è il mio corpo:
nel mio principio è la mia parola:nel mio

centro è la mia bocca:nella mia fin:am ende:
in my end:is my
beginning:
l’âme du mort sort par le pied:
par l’anus:nella mia fine
war das wort:
in my end is my music:
ette, conne, ronne:

According to Norbert Dreßen, although A-ronne continues a similar kind of

confrontation with the human voice as Sequenza III, the former “enriches the work with

the possibilities that are offered by a double quartet and through a referring system that

extends the concert frame of his Sequenza.”26 Regarding the first point, instead of the

isolation of the vocal actions of Sequenza27 performed by the solo singer, A-ronne knits

the individual vocal expressions of each of the eight voices into the full structure of the

piece: “the interpretation of each voice horizontally constantly refers to the collective

connection vertically.”28

Departing from Sanguinetti’s poem, Berio recomposes a new text for the score,

which adds sounds in phonetic writing, syllables, words, sound gestures, and a

completely new language. These new elements may or may not be related to words or

fractions of those that already are part of the poem’s text. Although not precisely in its

original order, the whole poem is repeated about twenty times in its entirety. As Dreßen

comments, in this manner Berio unravels the “meaningless sign chain” nature of this text

specifically, and any text in general, within the general state of crisis of the language.29

26
Dreßen, Sprache und Musik bei Luciano Berio, trans. L. Guillén, 157.
27
In the instructions for performing Sequenza III, Berio requests that attention be paid to the timing
indications on the score for each section to maintain the rapid succession of vocal events. He aims to create
the illusion of one voice polyphony, such a rapid articulation of diverse vocal sounds that the listener could
perceive as several voices.
28
Dreßen, Sprache und Musik, 158.
29
Dreßen, Sprache und Musik, 162.

The rather negative approach of Boulez to text is transformed into a more positive

conception of the same issues in Berio’s hands. Boulez asserts that any text set to music

will be inevitably destroyed. Text may be at the “centre” molding the musical piece but at

the same time “absent” or completely unintelligible as a logical discourse. In contrast,

Berio proposes a new combined and fluent media as previously commented on in this

chapter. For Berio, certain fragmented text elements may be perceived as music and still

retain some affective communicative potential. These incomplete phrases, words,

syllables and phonemes inhabit the transitional realm of his new “combined and fluent

media.” Berio enacts the language crisis and transforms it into a new communicative

experience without dwelling on the impossibility of keeping the integrity of the

discourse.

Thema (Omaggio a Joyce) (1958) was Berio’s compositional turning point with

regard to text treatment. Departing from an impressionistic reading of Sanguinetti’s

poem, which breaks the logical/semantic continuous discourse, Berio first set down only

those features of the text that could be perceived during a first reading: text fragments,

words, syllables, phonemes, paralingual vocal gestures, intonational inflections and

contour, vocal timbre modifications. His phonological studies and observations led him

to purposely take into account the perceptual experience from the listener’s point of view.

According to his beliefs, in the already disturbing context of a musical setting, only those

features immediately perceived have a chance of being captured by the listener. This

same kind of approach is found in other pieces by Berio, such as Laborintus II and A-

ronne.

Laborintus II not only uses texts of the same poet, Sanguinetti, it also sets a

precedent in developing a multilayered counterpoint in a small vocal ensemble. The

theatrical element is present in the human interrelations represented in the corresponding

layers of eight singers. The following comments of the composer about his Laborintus II

(1965) find further realization in A-ronne:

The first step...was to set down some features of the text spontaneously so realizing the
polyphony attempted on the page...Nor should it be forgotten that only those features
immediately perceptible on a simple reading of the text have been taken into
consideration...30

This rather spontaneous approach materializes Berio’s conception of communication and

expressive connection between the text and its receptor through its more rudimentary and

firsthand apprehensible sonic elements. With respect to Laborintus II, A-ronne refines

and deepens the exploration of the phonetic and paralingual aspects of its text.

In A-ronne, Berio continues the process of transforming the prose into a poem

already initiated by Sanguinetti.31 By increasing the isolation—decontextualization and

recontextualization of the material—the text is opened to a variety of interpretations and

ambiguity of meaning.

Verbum caro in principio die Kraft die tat der sinn (alto 2, p. 3)

In my beginning am anfang o lògos in principio erat (tenor 2, p. 3)

Nel mezzo della mia carne: la bouche: ou commence le corps humain:[o] in my

beginning:l’anus:in medio (alto 1, p. 7)

The repetition of words or partial phrases adds to the recontextualization of the material,

assisting the transformation of prose into poetic form.

Nel mezzo nel mio mezzo nel mio corpo oh nel mezzo: nel mezzo del camino

nel mezzo della mia carne la bouche è la mia bocca la bouche la bouche la

bouche (alto 2, p. 22)

Neither Sanguinetti’s text source nor Berio’s final scored text preserves cohesive

blocks of discourse with patterns of ordinary speech—formed sentences that make

paragraphs. A-ronne uses two formal types of speech: disjointed and enumerative. Most

of the passages are made up of short phrases that never develop into a more fluent kind of

speech. Both types explore contrast and repetition as a source of musical/sonic interplay.

The repetition of words and phrases, especially accumulative repetition,

creates emphasis, climax and expectancy. This kind of repetition often creates points of

assonance and alliteration inside a word or among words. In several cases, in cataloguing

and enumerative passages, the words are bound together more by their similarity in sound

than by their logical connection. Berio comments, in an interview with Bálint András

Varga, that “the grammar of A-Ronne is focused on a single technique: alliteration. It is

with the help of alliteration that I musically reorganize this rather complex text.”32

The repetition of words or phrases is not an exclusive prerequisite to achieve

alliteration and assonance. The percussive effect created by the vertical superposition of

the following words is achieved by their similar phonetic content. Stop-plosives such as

[t], [d], and [k] (the last in its two typographic representations, “c” or “k”), the fricatives [v]-[f], and

the flipped and rolled [r] dominate the phonetic field in this passage. The following

30
As quoted in Peter Stacey, Contemporary Tendencies in the Relation of Music and Text with Special
Reference to Pli selon pli (Boulez) and Laborintus II (Berio) (New York/London: Garland Publishing,
Inc., 1989), 156.
31
Sanguinetti’s poem is made, for the most part, from quotes coming from famous texts in prose form.
32
Dalmonte and András Varga, Two Interviews, 142.

example comes from the first three coincidences between some of the eight singers on

page 5 of the published score:

S1) verbum ach ach

S2) ach die Tat die Tat

A1) das Wort das Wort ach

A2) die Kraft die Kraft die Kraft

B1) caro caro caro

In some instances, the phonetic material employed in the piece is textually

derived from surrounding words, but in other instances, phonemes seem not to have any

textual connection and have a purely musical function. On page 4, the sung notes in the

baritone 1 on the phoneme [o] are clearly derived from the preceding “caro” of the alto 2

and soprano 2 or from “principio” and “wort” of baritone 2 or soprano 1. Another case of

direct correspondence is found on page 6 where the baritone 2, while “stuttering,

coughing, suffocated by words and saliva” (as the composer indicates on the score),

anticipates the consonants [d]-[s]-[z]-[d] of the words he is trying to deliver: “der Sinn.”

But other episodes, such as the succession of phonemes [u]-[i]-[Λ]-[o]-[a] on pages 16 and

17 in the baritone 2, are for the most part unconnected to any surrounding word. The case

of the counterpoint between tenor 1 and baritone 1—starting at the end of page 20 and

running into the first half of page 21—is a purely musical event, which focuses on the

colors and sonic quality of these phonemes. In rapid succession tenor 1 delivers a set of

consonants: [l]-[f]-[s]-[m]-[n]-[d] alternating in a kind of timbric counterpoint with a set

of vowels spoken by the baritone 1: [a]-[e]-[i]-[ai]-[e]. Both singers receive the

composer’s indication to perform this passage as if they were “teaching vowels” or

“teaching consonants.”

Besides the transformation of the literary text, Berio “performs a second

transformation process, by interpreting the meaning of the fragment in literature as the

‘representation of the invisible in the visible’ as the relation between the written picture

and its acoustical realization.”33 For this realization, Berio resorts to a vast and

meticulous musical vocabulary: twelve forms of notation, three procedures for time

organization of the piece, seven dynamic steps, and ninety-six performance instructions.

Dreßen makes an exhaustive list of these procedures.34 The following list shows the

range of vocal sounds that Berio requires from the performers: from the spoken to the

sung, going through a variety of physical modifications that affect the sound result.

1. written text: spoken
2. sounds in phonetic writing, each according to sounding rules
3. scratching through throat
4. breathed, almost whispered
5. singing or speaking with closed mouth
6. as high or as low as possible
7. moving the voice in that register
8. ornamentations that are easily articulated
9. bridge over an interval (by sliding of the voice)
10. spoken/singing (with the given intervals)
11. sung pitch that is to be kept to the end of the time unit
12. sung materials35

The ninety-six performance instructions gather terminology from the area of

prosody (including paralingual sounds), emotional moods, sounds produced by different

33
Dreßen, Sprache und Musik bei Luciano Berio, 162.
34
For a complete list of notation forms, time organization, dynamics, and performance instructions in
A-ronne, please refer to pages 162 to 167 of Dreßen, Sprache und Musik bei Luciano Berio, Chapter V.
35
List reproduced from Dreßen, Sprache und Musik, 162-163.

body parts, and musical or theatrical/acting situations—interpreting different characters

in various settings. The following table shows examples of each of those areas:

Performance Instructions    Examples                                                   Page

Prosody                     “highly inflected”                                         3
                            “in an explanatory manner”                                 3
                            “discontinuous with occasional questions”                  3
                            “violent and quick”                                        4, 5
                            “Stuttering, coughing, suffocated by words and saliva”     6
                            “gasping out the words”                                    6
                            “tense murmuring”                                          7

Emotional Moods             “outgoing and happy”                                       3
                            “cold”                                                     3
                            “angry”                                                    4
                            “sadly”                                                    15

Sounds Produced by          “flickering tongue against the upper lip”                  6
Body Parts                  “‘Pop’ sliding finger inside-out of mouth”                 6
                            “‘Chewing’ quickly-mike against the mouth”                 7

Musical Situations          “Singing unrelated pitches”                                1, 2
                            “like a bumpkin’s marching song”                           16, 17
                            “Vocalizing”                                               27

Theatrical/Acting           “like a dictator’s harangue”                               4, 6
Situations                  “like two priests murmuring a prayer”                      7
                            “like a drill sergeant’s questioning”                      16, 18
                            “intimidated by the sergeant”                              16, 18

TABLE 2.1: Classification of performance instructions in Berio’s A-ronne.

A variety of vocal sound production styles dominate the performance spectrum of

A-ronne:

1. Unnotated recitation (p. 7-8 every singer; p. 9-11 baritone 1 and 2; p. 34 tenor 1

and baritone 1; p. 35 tenor 2; p. 41 tenor 1)

2. Spoken short phrases, words or syllables with notated rhythms and no pitch

indication, (p. 1, p. 18-19, p.23-24)

3. Spoken short phrases with notated rhythms on a single line denoting the central

point of the speaker’s register: p. 1-3; sporadic use between p. 44–46; the scat-like

unspecified melodic contours on the syllables “de” and “den” at the beginning of p.

16, 17, 18, 20, and the longest episode on p. 21.

4. Singing with an unstriated pitch, notated around a central line (usually small

vocalises on one of the pure vowels): p. 1 alto 1, p. 2 alto 1 and tenor 1; all

these have the following performance indication on the score: “(singing unrelated

pitches).” Unless this is indicated, the rest of the syllables notated in this way are

spoken.

5. Singing with musical parameters (melismatic and fragmented). This piece

presents several instances of singing, which evoke different musical styles.

Among the passages sung with lyrics: p. 8-9 a Marenzio-like-madrigal among

most of the singers; p. 28 soprano 1 and alto 1 indicated “(as a folk song)”; p.12-

13 and p.39 baritone 2 sings like a double bass detached monosyllabic mixed text

“(Expressivo like an accompanying DB)”; tenor 2 starts with two fragments on p.

15, but the whole melody appears on p.16 “(like a bumpkin’s marching song)”

6. The other singing passages are sung on single phonetic sounds that do not combine

to make any word: “(dreamy and distant)” tenor 2 melody; chords sustained by

several voices on transitional vowel phonemes p. 25; p.27-28 “(vocalizing

independently from other voices)” alto 2 sings arpeggiated tonic-dominant

seventh chord sequences on [a]; p. 29-31 singers take turns on a melismatic

vocalization with specific rhythmic notation to serve as background of folk-like

singing of mainly alto 1 and then soprano 1 in p.31; a pseudo-romantic

transformation of a Bach-oratorio-like counterpoint sung among all voices except

tenor 2 who is reciting on vowel phonemes; from p. 42 to the end, staccatos at

different rhythmic patterns superposed among all the voices, which progressively

transform into more sustained longer chords.

7. Syllabic singing (nonsense syllables), the most prominent instances are the

passages on “de-de-den” as on p. 23, 25 and then the whole p. 27-28; melodically

and rhythmically these passages imitate the jazzy vocal style of commercial jingles.

8. Singing “bouche fermée” (bocca chiusa or closed mouth) appears in short

passages or as a transitional articulation mixed with phonetic or syllabic singing:

p. 13-14 in tenor 2 and baritone 1; p.17 baritone 2; p. 23-24 sustained chord by

altos and sopranos; p.31 sopranos, alto 2 and baritone 1; p. 38 everybody; p. 40

tenors and baritones.

The spoken passages, such as the “unnotated recitation” and the “spoken words

and short phrases” expose a wide range of performance styles according to the mood

indicated by the composer on the score. Since these are not musically notated, all

parameters are open to the performers’ interpretation. These passages have neither pitch

indication nor rhythmic notation; their performance intonation arches and speed are only

determined by the mood indicated by Berio over each event. On pages 7 and 8, the

indication of “tense murmuring” and the whisper sign, o--------, turn to a very fast

masked delivery of the text that, added to the polytextuality among the singers, makes

understanding almost impossible. The pitch results in the medium-high register of each

normal speaking voice. Perhaps to avoid the natural performance tendency of associating

fast delivery with high pitch, on page 9, the indication for baritones 1 and 2 is “Like two priests

murmuring a prayer: fast and low tones.” Superposing the other voices singing the

“quasi-motet” passage over the low pitch and volume of the baritone 2’s singing, the text

is masked, and its understanding obscured. The low pitch speaking quality is kept

through p. 10 when alto 2 is introduced in a counterpoint with baritone 2; but the pace

slows down to create the sensual and intimate ambiance suggested by the indication

“Like an intimate dialogue: with a husky and hesitating tone.” Sighs are incorporated

between these hesitant-toned phrases. Intelligibility is regained because of the slower pace

and overarticulation of the words as a mode of sensualizing their sound, their phonetic

coloring.

A similar situation is encountered on page 22, this time between tenor 2 and alto

2, “intimate and sensual with occasional o-------.” In this passage, all the voices, except

baritone 2, engage in an overlapped multitextual recitation “colloquial, gigglish and o-----

--.” The gigglish quality makes every voice explore the high end of its speaking register

and move in a fast manner. Right after this passage, tenor 2 and baritone 1 engage in a

process of “Stuttering, gradually faster” recitation for tenor 2 and “Very fast, gradually

stuttering” for baritone 1. The “faster” indication for the tenor towards the end of this

section, added to the maintained “very fast” of the baritone, builds excitement toward a

climax via a natural crescendo and acceleration of the delivery pace. Certain words

become clear at moments because of the thinner texture—only baritone on a syllabic

monotonous line and the two tenors. But the overlapping of two different texts and the

fragmentation of words produced by the stuttering divert the listener’s attention from the

meaning of the words to the phonetic colors that constitute them.

The paralingual realm of A-ronne is highly developed, suggesting with this sonic

material a “radiophonic” drama. Nevertheless, it is rather difficult to categorize or

interpret the exact dramatic meaning of these sonic gestures. They are ambiguous and

simultaneously evocative of different scenarios. Some of them may have only a musical,

quasi-instrumental, function. The last event of page 21 is a 25-second long section where

the eight singers are instructed to perform alternately the following paralingual sounds:

gasping, cough, grunting, snorting, straining, groaning, exhaling loudly, moaning. All

these—with the exception of the two baritones—conclude in a general “breath” which

precedes the sensually charged exchange—indicated by the composer as “intimate and

sensual”—between tenor 2 and alto 2. These last two singers alternate phrases about the

human body with breathy sounds, such as “nel mezzo…nel mio corpo…nel mezzo della

mia carne…nel mezzo della mia carne…nel principio…la bouche…l’anus.”

On the one hand, altogether, this paralingual superposition does not make sense as

a cohesive action or a chain of reactions in relation to each other. On the other hand, this

episode may be seen as the deconstructing turning point of the previous authoritarian

situation in which baritone 1 “like a drill sergeant’s questioning” intimidates the

answering tenor 1. At the beginning of this section (page 16), tenor 1 seems to answer

imitating exactly what soprano 2 indicates to him in a whispering mode. Later, on the

second half of page 18, soprano 2 drops her indications and for the next two pages,

baritone 1 and tenor 1 continue their “angry and hysterical” (as indicated by the

composer) counterpoint. Throughout this authoritarian exchange, the delivered text is

very similar to the already quoted text of the seduction section between alto 2 and the

same tenor 2:

Tenor 1 ha un centro le sexe le phallus

Baritone 1 L’uomo? Qui est? en meso en?

Tenor 1 è la mio corpo è la mia parola etc….

Baritone 1 nel mio centro? Nel mio principio? Etc…

TABLE 2.2: Exchange between tenor 1 and baritone 1 in Berio’s A-ronne, p. 18 to 20

The meaning, however, is completely different. The semantic content of the

words is almost obliterated by the harsh delivery. And the only thing that a listener

perceives is the aggressive tone of the baritone and the intimidated tone of the tenor, who

is progressively taught (or brainwashed) by the other. To a certain degree, the text being

used could have been this one or any other because what we apprehend is the aggressive

tone and procedure of forcefully imposing answers at any cost. This effect becomes

progressively clearer when, in the second half of score page 18, their phrases start to

collapse with each other. On page 19, phrases or even words are completed between each

other in a kind of “hocket,” while in other cases different words or complete phrases are

overlapped. By page 21, the only material exchanged between these two singers is a set

of vowels and consonants that they pretend to teach each other. Thus, we arrive at the

paralingual episode of score page 21—described above—which should be thought of as a

cathartic preparation for the sensual section in which the physicality is exacerbated and

certain body parts become the focus of sexual desire. The chaotic superposition is the

release and final step in this gradual relaxation and disintegration of language and vocal

sound expression: from phrases to phonemes, from vocal expressive intonation to the

isolated vocal gesture (paralingual gestures). Finally, although this chaos is difficult to

explain as a coherent act, each paralingual action preserves its everyday concrete

meaning.

A completely different effect is produced by the paralingual episode on page 6,

which has a purely musical function. It offers a sonic mattress, a texture, against which

the baritone 2 stutters through the phrases that he is trying to enunciate: “der sinn…o

logos…Am anfang war…die kraft…die tat.” While he fights with these words, the other

seven singers engage in a loop of spoken and whispered single phonemes and

paralingual sounds, such as: “inhaling and exhaling through teeth,” “flickering tongue

against upper lip,” “whistle,” “‘Pop’ sliding finger inside-out of mouth,” “squeak.” Most

of these sounds have no immediate reference to any usual human action—with the

exception of the whistle—and are mere playful effects with a purely musical textural

function.

The three main factors that affect the intelligibility of the text in Laborintus II

may be observed operating under similar conditions in A-ronne: 1) vocal or

performance style; 2) text condition; and 3) masking. The optimal case of intelligibility is

achieved when the text is spoken, well articulated and more or less intact, in its “prime

condition.” When the same text or different texts are recited by two or more voices, the

intelligibility depends on how much the words or short units of text are superimposed. In

some cases, two singers alternate in a rapid, moderate, or slow paced delivery. In this

case, the text could be clearly understood unless another masking factor, such as

underarticulation (babbling, whispering, low volume combined with rapid articulation, or

monotone reciting in low volume), blurs its intelligibility. Superposition of words or

phrases, plus any of these masking factors, only intensifies the ambiguity of the text.

Whenever a melody with lyrics is superimposed on a spoken text, the listener’s

attention will be drawn to the message of the latter, while the melody with text becomes

the background. This is especially true when the sung melodic passages are soft in

volume and melismatic in their lyric setting, while the spoken ones are clear and

overarticulated. Only in certain circumstances, when stylistic performance effects mask the

spoken text, could the sung melody take a leading role and directly grasp the listener’s

attention. Whenever the sung passages with lyrics are the center of attention, the text

tends to be less clear in melismatic settings than in syllabic ones. Any vocal or

performance masking effect could obviously reinforce, or obstruct, the mentioned

perceptual tendencies.

The following is a list of passages in A-ronne in which one, or more than one, of

these intelligibility affecting factors intervene, modifying the perception and

understanding of the text:

1. Page 1–3: In the first three pages of the piece, although the “prime condition” of

the text is not conserved, its fragmentation is counteracted by the repetition of the

same words, “in mio principio.” In this way, the listener has several opportunities

to grasp them.

2. Page 3: The first example of complete unintelligibility caused by superimposition

is found on this page. For twenty seconds, the eight singers recite different texts

simultaneously. On top of the already obstructed text, the individual recitations

are fragmented by interspersed sung pitches on the last phoneme of the previous

word. A similar situation is also encountered on page 7. Page 8 is an expanded

version of page 3.

3. Page 4: Although all the voices articulate words or short phrases at the same time,

the repetition of some of those in the vertical axis (the same word or phrase

pronounced at the same time by two singers) or in the horizontal axis (successive

repetition of the same material by the same singer or a different one) brings a

certain accessibility to the text.

4. Page 6: The vocal style employed by baritone 2, “stuttering,” offers a clear

instance of text obstruction.

5. Page 22: The “intimate and sensual” dialogue between tenor 2 and alto 2 on page

22 offers almost no intelligibility because of the masking effect of the whispering

mode and the low volume of the voices.

6. Page 9: The low volume and pitch performance style mask the two different texts

that baritone 1 and 2 speak. This effect is especially emphasized by the upfront

sounding presence of the female quartet singing a quasi-Marenzio madrigal in

German created by Berio.

7. Page 13: Again, Berio further alters the “prime condition” of the text. The

composer takes, from the already fragmented text compiled by Sanguinetti, only

monosyllables and mixes them in a completely incoherent succession.

8. Page 16: Baritone 1, soprano 2 and tenor 2 offer an instance of alternating phrase

delivery in the manner of a dialogue. Although certain masking factors, such as

whispering in soprano 2, are intervening, the relaxed pace, alternation with the

baritone who pronounces different text, plus the superposition of the tenor 2

repeating the same text in a different vocal style, contribute to the understanding

of the text without major difficulties. At the same time, on this same page, the

composer adds an extra textural layer, a sung melody with lyrics by tenor 2. The

melismatic and soft quality of his singing, plus the natural tendency of the listener

to be attracted by the strong shouted delivery of baritone 2, places it as a

melodious background texture.

9. Page 18: The multilingual and extreme fragmentation of the text does not

contribute to the otherwise potentially understandable text of the dialogue

between tenor 1 and baritone 1.

10. Page 21: The composer pushes the fragmentation of the text of Sanguinetti to an

extreme. On page 21, Berio breaks the text into its phonetic components,

grouping vowels on one side and consonants on the other. The effort of

reconstruction is almost impossible at the time of listening; only after several

hearings or after looking at the score does one realize that there are phonetic

fragments of the words immediately preceding this passage.

11. Page 23-26: In these pages, the process that the text undergoes transforms it from

a clear understandable dialogue between tenor 1 and baritone 1 to a broken

phonetic extraction of only the vowels of the preceding words. This again

alienates its comprehension. Only the hocket between the two singers of

consonants and vowels in their original order at the end of page 26 offers the

listener the opportunity of reconstructing the message, “in my end is my music.”

12. Page 26: In this passage, not only are tenor 1’s and baritone 1’s recitations

superimposed, but the “stuttering, gradually faster” and “very fast, gradually

stuttering” indication obliterates the comprehension of the delivered text.

13. Page 29: Another instance of sung text is found on page 29, but this time—in

contrast to what happened on page 16—the syllabic setting, successive repetition

of the same melody and lyrics by different female voices and the medium high

volume of the performance contribute to a better understanding of the text.

14. Page 35: Tenor 2’s recitation is acoustically set to the background and buried in

the thick vocal polyphonic texture of the rest of the voices. The intimate

monotone and low volume delivery of tenor 2’s recitation also does not contribute

to a clear understanding of the text, relegating it to an almost percussive

background effect: a percussive texture in a melismatic context.

15. Page 41: At the end of this page, the inaudible recitation of tenor 1 is

overpowered by the forced and overarticulated whispering of alto 2, which takes

the sonic front stage over the tenor’s textural background. While alto 2’s text is

clearly understood, tenor 1’s becomes a mere incomprehensible babbling.

As mentioned before, in A-Ronne, Berio recreates the semiotic structural

manifestation or dramatization of the elements of language and their agony when set to

music. In terms of its overall structure, this piece explores all the possible degrees in the

music-text continuum proposed by Berio. While traveling across the whole spectrum of

possibilities, A-Ronne contrasts and overlaps in a single texture the pure extremes, word

or music, or any intermediate scale degree of the synthesis of the two mediums.

One may divide A-Ronne into several sections according to the types of processes,

vocal styles and text materials employed.

a) Section 1: The section comprising score pages 1 to 6 focuses especially on

the contrast between fragmented text—words that are repeated—and free vocalizations

on vowels extracted from preceding words. The spoken material is mostly set to notated

rhythms and shouted in different moods or whispered in unrelated undetermined pitches.

Among the briefly sung material, besides the vocalizations, there are also short sustained

pitches on single vowel phonemes. There are occasional paralingual vocal gestures, such

as sighs and belches, and sound effects such as bocca chiusa. But the climax of the

paralingual realm is not reached until page 6.

Most of these performing styles—part of the continuum—are mixed by the eight

voices vertically as well as horizontally, creating a chaotic textural mix of short phrases

delivered in different manners, which overlap and succeed one another. Page 3 starts

clarifying this notated sonic multi-texture and proceeds into a section of 20 seconds in

which the eight singers deliver straight spoken text in different moods and insert short

sung phonemes—extracted from preceding words—in determined pitches. Next, page 4

again concentrates on coordinated short phrases of the text, which are rushed and shouted

by all voices except tenor 1—who inserts his phrases in between with a prescribed

rhythm and later adds a specific pitch contour. Page 6 is mostly devoted to the

paralingual realm. All the voices—except baritone 2 who stutters over the consonants of

mixed text—loop over a delirious combination of non-linguistic sounds and vocal

gestures.

In this way, this first section opens the palette to most of the vocal styles and

textures that will be developed in the rest of the piece.

b) Section 2: Score pages 7 to 12 focus, for the most part, on the contrast between the two

vocal styles that constitute the extremes of Berio’s continuum: speaking and singing. After

the previous introduction to a big portion of the universe of vocal possibilities, now, in a

more economical manner, this section overlaps—in vertical contrast—the extremes, the

opposites: straight spoken text against sung melody—rhythm and pitch notated—in a

parody of Renaissance counterpoint.

Page 7 opens with all the voices superposing spoken complete sections of the text

of the second part of Sanguinetti’s poem. Then it proceeds to the insertion of short sung

phrases of what on page 8 will become a full contrapuntal melody. So far the text of the

sung phrases is clear and understandable.

The transitional page 8 gives way to page 9 on which four of the singers

alternately take the singing role; baritone 2 and a second singer—who changes from

voice to voice—recite passages from the first and third parts of the original poem.

c) Section 3: Score pages 12 to 18 are devoted to the contrast of different singing styles

with sporadic interventions of paralingual sounds—imitation of animals—and some

nonsense spoken syllables on notated rhythmic patterns—scat-like style. Besides short

“desperate” shouted phrases by alto 2, the spoken text in free rhythm is not contrasted

with singing until page 16.

This section opens at the end of page 12 with baritone 2 singing in a “basso

continuo” manner, on syllables that, although extracted from words of the poem’s

original text, are decontextualized to such a degree that the result seems a mix without

any syntactic sense. On page 13, tenor 2 performs a distant and soft contrapuntal melody,

which articulates the vowel phoneme of the immediately preceding syllable sung by

baritone 2. These two voices offer an instance of complete unintelligibility of the text,

resulting from the extreme fragmentation of the discourse. This procedure transforms the

two voices, resulting in an instrumental effect in which the vowels become pure sound

with a delay or reverberant effect created by the tenor’s post-articulation.

By page 14, some of the other voices intervene in the background with singing

lines in “ppp” on half-notes and slow rhythms on vowels also derived from the syllables

of the baritone. The rest of the voices perform animal sounds, scat-like quick short

phrases, or short bits of melodious popular chants—which make use of more organic

phrases of the text. All of them add to the purely musical or instrumental multiple texture.

By score page 16, the spoken word becomes dominant over any background

singing. The predominant baritone 1 shouts authoritarian commands “like a drill

sergeant’s questioning” and intimidates tenor 1 who answers with whichever phrase is

prompted by soprano 2. But this brief spoken passage is shortly invaded again by most of

the singing multiple textures of the previous pages of this section.

d) Section 4: This section explores the medium zone of the music-text “continuum.” Two

types of spoken words are set to musical parameters: a quasi-“Sprechstimme” and

fragmented spoken words articulated in a hocket with precise notated rhythms. Regarding

the quasi-“Sprechstimme” vocal sound production style, in these pages there are several

instances in which most of the singers pronounce the syllables “den” or “my” in a manner

in between speaking and singing. This may be described as speech with a melodic

contour of unspecified pitches.

The second procedure articulates plain spoken words or short fragments of a

sentence in hocket. Usually, it starts to alternate freely but soon the composer assigns a

very well-structured rhythmic pattern to each singer. It is at that moment that the text

becomes fragmented word by word or syllable by syllable, and the phrases are put

together only by the interplay between the two singers. The two times that the text is

broken into syllables or phonemes, those particles derive directly from the immediately

previous clearly pronounced material: page 19 “beginning” and page 20 “aleph is my

end.” Showing the word immediately before its disintegration allows the listener to

appreciate and understand its meaning at the same time that it challenges and involves

her/his musical perceptive capabilities in reconstructing the words.

This section closes with a 25-second episode of decontextualized paralingual

sounds and high pitches “imitating the ‘call’ of Algerian women” produced by the four

female vocalists. This sonic conglomerate becomes almost a mechanical symphony of

indistinguishable sounds or the turning of a gigantic squeaky wheel. Again the lingual, or

paralingual in this case, becomes pure music.

e) Section 5: After the isolated transitional page 22, the section between pages 23 and 28

further explores elements and procedures similar to those of the 4th section. But now these text

hockets and “den” passages, on the one hand, are subject to better-defined musical

parameters, and on the other hand, can be perceived more as either text or music, the

extremes of Berio’s text-sound continuum.

The short “den” motifs now are rhythmically and melodically notated. From page

23 to 26, every time these motifs appear, they are used as the trigger for a dialogue between

tenor 1 and baritone 1, who declaim text over a sustained chord by the other voices.

The first time (p. 23-24) this chord is sung in bocca chiusa, the second time (p. 25-26) on

vowels derived from the text declaimed by tenor 1 and baritone 1 on pages 23 and 24.

This purely musical texture serves as background to the dialogue made of short fractions

of sentences. Tenor 1 and baritone 1 each complete the last syllable of the last word left

incomplete by the other. This is one of the few instances in which a large section of

Sanguinetti’s poem (the whole Part 3) is delivered in its original order; from “L’uomo ha

un centro...” to “...ette, conne, ronne.”

Although in sections 3 and 4 there are fractions of the text in its original state, it

has never been as clear as in this passage; this happens for several reasons. First, the

text is not as fragmented as before; now it is delivered in bigger chunks by each singer.

Second, the rhythmic patterns, to which the text is set, allow the logical, natural flow of

intonation—not too frantic not too slow—and do not disturb or distort its understanding.

Berio allows the performers to take those rhythmic patterns with flexibility as he

indicates in a footnote: “Suggested rhythm and speed; minor modifications and

adaptations are possible.” Third, the sustained chord texture accompanying this dialogue

is completely unobtrusive.

Finally, the whole section, with its harmonized “den”s and declamations, creates

the illusion of listening to a radio commercial. The melodic, rhythmic, and harmonic content, the voice

tone, and the performance style of the vocal ensemble resemble commercial jingles, which

generally introduce and draw the attention of the listener to the commercial selling

speech that follows. Then, usually two announcers declaim that selling speech,

alternating their voices with overarticulated inflections of their speaking voices. This is

certainly the fourth and most definitive reason why this passage is as intelligible as it

is.

Thus, A-ronne reaches its maximum intelligibility and textual integrity exactly in

these two pages, which fall in the middle of the whole score—pages 23 and 24 of a total

of 48.

f) Section 6: This section—pages 28 to 33—is completely devoted to the singing realm.

Soprano 1 proposes the first half of a singable melody, which then is completed by alto 2.

On page 30, soprano 1 restates the same first part of the melody, but this time alto 1

repeats this first section after her and completes the rest of the melody. While these

singers perform this theme, the other voices accompany with vocalizations on vowels.

The accompanying texture thickens progressively toward page 31, on which not only all

the other voices have been introduced, but also their rhythmic activity has increased. By pages

32 and 33, the eight singers vocalize on frantic scales in triplets, quintuplets, and

sixteenth-notes. This overlapping creates polyrhythm and cacophony, since now the

vocalists perform a succession of quick repetitious syllables: “lo-go-lo-go-lo-go-etc;” “ra-

ga- ra-ga- ra-ga-etc;” “ca-ro- ca-ro- ca-ro-etc;” etc. By the end of this section—page 33—

all collide on a unison D on the [o] phoneme.

The syllabic setting and simplicity of the melody proposed by soprano 1 and the

alti allow the listener to understand the words. The fragment set to this melodic passage

is extracted in its original layout from the first half of Sanguinetti’s part 1 of the poem

without any further editing by Berio. But shortly after its introduction, the accompanying

texture gets thicker and busier, progressively obstructing the previous clarity of this text

fragment.

g) Section 7: This section—score pages 34 to 40—goes back to the contrast of the two

pure extremes of the text-music continuum. Although several of the voices have spoken

passages,36 those recitations are perceived as a percussive texture. In the beginning of

page 34, this effect is created because all the singers’ overlapped polytextuality makes the

literal understanding of the text impossible and at the same time highlights the richness of

the consonants’ phonetic colors. The rest of the recitations are either overlapped and

delivered in a fast stuttering manner, or become like a murmuring praying background to

the sung parts. In the first case, the stuttering again emphasizes the percussive effect of

the consonants of the words. In the second case, the whole text melts in a muddy and

monotonous “sonic mattress.” The text employed in these passages is drawn from the

36
At the beginning, all except baritone 2 have a spoken passage, then tenor 1 and baritone 1 and finally
only tenor 2.

three parts of Sanguinetti’s poem; in some cases only fragments are extracted and in

others the text is incorporated in its original order or phrase by phrase backwards.

The sung parts consist of an ornamented vocal line, usually carried by one voice,

harmonized by the others. All these lines are textless and performed on isolated vowel

phonemes. But harmonic richness and dramatic melodic content dominate this section.

h) Section 8: The last section—pages 41 to 48—opens with a similar “den” jingle-like

ensemble. But in contrast with the one in the 5th section, this one is decomposed:

rhythmically fragmented and harmonically dense (more dissonant). Also, this time, the

announcers, although they use the same text from Part 3 of the poem, do not establish the

dialogue game as they did before. Tenor 1 is almost inaudible, murmuring in the

background, and alto 2, who loudly whispers “in my beginning Aleph is my end” (from

the poem’s Part 2), dominates on page 41 against the murmuring and a sustained sung C4

of the other voices. By the next page, that sustained C4 breaks into quarter notes that

immediately start to unlock from their homophonic layout into a slightly displaced

polyrhythm, which creates an echo or delay effect. In the following pages, the eight

voices also progressively expand their harmonic spectrum. They mostly sing on isolated

phonemes or syllables derived from different parts of the text. The phonetic material also

increases in complexity, both horizontally—in the successive articulation of a single

voice—and vertically—in the overlapping of several voices.

In this section, the singers interrupt the singing with isolated phrases from

different parts of the poem. These phrases are mostly whispered in Sprechstimme style

and performed according to the different character indications: “dreamy,” “solemn,”

“urgent,” “ecstatic,” “frantic,” “sensual,” etc.

On score page 47, the eight voices lock into a dissonant homophony that

crescendos until colliding into a held open perfect fifth, which soon after dissolves into a

fading dissonance until the closing spoken letters of the Italian alphabet: “ette, conne,

ronne.”

These detailed analyses of Volcano Songs: Duets and A-ronne reveal the

procedures through which Monk and Berio transform the sonic elements of language into

structural components of their pieces. Through these means, they explore the

communicative potential of the non-linguistic aspects of language and human vocal

gestures that are usually disregarded in text set to music and make a conscious

representation of the “poetic mode of listening.”

IV.

THE POPULAR SONG

The cases analyzed so far have been pieces from the “art-music” realm: art-song,

opera and oratorio repertoire. Examining this dissertation’s hypothesis in the context of

the repertoire of the “popular-music” realm—jazz tunes, folk/bluegrass and pop songs—

may provide further insights. This section, therefore, will proceed with an analysis of four

popular songs from the English-speaking repertoire, which serve as examples of different

approaches to the articulation between words and music: first, a highly structured Tin

Pan Alley tune, Jerome Kern and Oscar Hammerstein’s “All the Things You Are;”

second, the narrative type in strophic form, Bob Dylan’s “Simple Twist of Fate;” and

finally, two with “redundancy variation” in the “verse-chorus” format, Björk’s “Isobel”

and Peter Gabriel’s “Sky Blue.”

Popular song confronts us with new issues that lead us to rethink even the way we

approach “art song” analysis. From the debate over popular music studies, three

particular issues are relevant to the approach that this chapter seeks in analyzing these

four songs. These perspectives are fundamental to understanding the way in which

audiences listen to these songs and how they are conceived by their songwriters.

The first issue addresses the fact that an analysis of the musical text—the song

and its constituent parts, words and music—is not sufficient in isolation since its elements

only gain significance in relation to their context. Contexts directly influence the way

text is perceived. As David Brackett summarizes, in accordance with similar positions of

Richard Middleton and Simon Frith in the musicological debate over analysis of popular

music, “one of the most important aspects of context is that it establishes the codes that

listeners are most likely to apply in certain listening situations.”1 Style conventions are

indicators of which musical elements are valued in each specific popular genre. These

elements are the main focus of artists at the creative moment and of the audience in the

listening situation. The analysis of those conventions reveals the code under which

popular songs are read as “texts.” It is necessary, however, to note that more than one

perspective comes into play in the formulation of that code and that the code may be

different according to the agents interpreting the musical object. This coding is actually

the result of a diversity of discourses converging in a dialectical manner into the object of

study, the song. Cultural studies theorists of the 1970s assumed that “the meaning of

music could be deduced from its users’ characteristics…ignoring lyrical analysis

altogether,” as Simon Frith points out in a criticism of his own analyses in The Sociology

of Rock (1978).2 These “consumptionism” theories—mainly concerned with the values

that each “subcultural group” assigns to the song styles with which they identify—gave

way to studies that also took into account the “changing modes of lyrical production” in

the record industries.3

The second relevant issue is concerned with the problem that traditional score

analysis presents when dealing with popular song. Most traditional formalistic analysis

based on the visual information provided by scores may ignore important musical

aspects. This “visual” approach—concerned with the kind of musical development

known as “extensional” or “syntactical,” in terminology coined by Charles Keil and John

1
David Brackett, Interpreting Popular Music (Berkeley, Los Angeles, London: University of California
Press, 2000), p. 18. For further details on these issues, see Richard Middleton, Studying Popular Music
(Philadelphia: Open University Press, 1990), and Simon Frith, Music for Pleasure: Essays in the Sociology
of Pop (Cambridge, Oxford: Polity Press, 1988).
2
Frith, Music for Pleasure, 119.
3
Ibid. “Lyrical production” refers not only to the content and format of the lyrics but also the musical
structure of the song.

Shepherd—departs from scores and concentrates only on the musical syntax (mainly

harmony and melody), ignoring rhythmic nuances, texture, vocal and instrumental

arrangement, timbre nuances of the sound inflections, and sound mixing (in recording

and amplified sound in concerts).4 In contrast, a reformulated listening approach, which

focuses on the “processual” or “intensional” musical development as identified by

Charles Keil and John Shepherd, allows the previously disregarded rhythmic and sound

nuances to be observed alongside the melodic and harmonic aspects.

Certainly recording technology has been partly responsible for the reconsideration

of several of these analytical perspectives. First of all, recording technology provided a

new way of registering particular performances of musical pieces: this is the third

relevant issue that contributes to the analytical approach used in this chapter.

The possibility of listening to recorded versions of performances opened a vast

corpus of new questions such as the ones mentioned in the previous paragraph.

Furthermore, Albin J. Zak III proposes that “records are not reproductions of anything;

they are ‘realities in themselves.’”5 He is appealing to the rock band leaders’ conception

of songs as only the starting point; “for them…the sound of the recording represented the

ultimate form of the artwork, and their compositional intention was to have a hand in

shaping the sonic relationships that made their identity.”6 Rock historian Carl Belz stated

as early as 1969 that although rock was not the first genre to use records and radio as its

4
For details on the definitions of these terms, see Charles Keil, “Motion and Feeling through Music,” The Journal
of Aesthetics and Art Criticism, 24 (Spring 1966), and John Shepherd, “Media, Social Process and Music”
in Whose Music? A Sociology of Musical Languages, John Shepherd, Phil Virden, Graham Vulliamy,
Trevor Wishart, ed. (London: Latimer, 1977) and “A Theoretical Model for Sociomusicological Analysis
of Popular Musics,” in Popular Music 2, David Horn and Richard Middleton, ed. (Cambridge: Cambridge
University Press, 1982) as quoted by D. Brackett in Interpreting Popular Music, 21.
5
Albin J. Zak, The Poetics of Rock: Cutting Tracks, Making Records (Berkeley, Los Angeles, London:
University of California Press, 2001), 21.
6
Ibid.

primary media of expression, for rock, “records became the primary, common bond

among artists and listeners.”7 Rock recording is not to be assumed as a mere “‘acoustic

presentation’ of a written text (the score). It is itself a text, a sonic one; ‘what it sounds

like’ is precisely ‘what it is.’”8 Records as well as scores are semiotically mediated texts

open to interpretation. But we must not overlook that by their own nature, records have a

material content, sound directly experienced by the listeners and that “in addition to

whatever we make them to be, they insist as well on being exactly what they are.”9

These “acoustic publications” or “electric prints,” as Richard Middleton came to

call the recordings, “represent a reified abstraction,” which include more than “musical

thought.”10 They “encompass musical utterances and sonic relationships—material—

whose particularity is immutable and thus essential to the work’s identity.”11 As musical

ideas are not only expressed in sound but also become sound, we must take into

consideration new elements that are integral to the final artistic product, such as recording

tools, space and dynamics among the members involved in the actual recording and

mixing. What primarily concerns this dissertation is that the recording tracks bring

awareness that we are hearing song words in somebody’s voice and that voice delivers

linguistic meaning filtered through a particular expressive interpretation—a particular

intonation, timbre and rhythmic inflection.12

Taking into consideration the discussion of these three issues, this chapter

concentrates on the analysis of the text itself—the recorded track, the song. My work falls

among the textually oriented studies of popular music that approach lyric analysis in

7
Carl Belz, The Story of Rock (New York: Oxford University Press, 1969) as quoted in A. Zak, 13.
8
Zak, 41. Comments in brackets and italic type by L. Guillen.
9
Ibid.
10
Middleton, Studying Popular Music, 83.

particular “with awareness of their function not as verbal texts but as sung words,

linguistically marked vocal sound-sequences mediated by musical conventions.”13

Popular song often offers the chance of finding the composer of the music and the writer

of the lyrics in the same person. Sometimes the songwriter is even the performer herself,

as in three of the cases under study in this chapter. This particular circumstance provides

the opportunity of observing the manipulation of the lyrics—as Middleton calls them,

those “linguistically marked vocal sound-sequences”—as well as the musical elements

(including the sound of the recordings) as part of the materials that songwriters count on

to reach and influence their audiences.

Although “people may not listen to pop songs as ‘messages,’” it is obvious that

they take them into account.14 As Simon Frith says, “So the question remains: why and

how do song words…work?” And he answers himself by saying:

In songs, words are the sign of a voice…Singers use non-verbal as well as verbal devices
to make their points—emphases, sighs, hesitations, changes of tone…(which is why some
singers, such as the Beatles and Bob Dylan in Europe in the sixties, can have profound
significance for listeners who do not understand a word they are singing).15

In approaching the task of analyzing songs to find out “why and how do song

words…work?” several points must be taken into account:16

1. The way a singer performs a song determines what the singer means to
us and our relationship to him/her as the audience.

2. “Different pop forms engage their listeners in different narratives of


desire.”17 In the process of identifying themselves with different

11
Zak, The Poetics of Rock, 42.
12
Simon Frith, “Try to Dig What We All Say,” The Listener (June 26, 1980), as cited by A. Zak, 43.
13
Middleton, ed., Reading Pop: Approaches to Textual Analysis in Popular Music (Oxford, New York:
Oxford University Press, 2000), 7.
14
Frith, Music for Pleasure, 120.
15
Ibid. Frith’s remarks resonate with my comment about consumers of Anglo-American popular song in
non-English speaking countries in the introduction of this dissertation.
16
The following three points have been adapted from Simon Frith, Music for Pleasure, 121.
17
Ibid., 121.

musical genres, listeners engage in fantasizing about belonging to
different sorts of communities.

3. Songs put ordinary language—common speech—into a refreshed poetic


form. “Songwriters give them a new sort of resonance” finding “the
pressure points of language…the syllables that locked a phrase up and
were begging to be prodded.”18

As already mentioned in Section 4 of this dissertation, “Expressivity Location:

Speech Mode vs. Poetic Mode,” we may find a wide range of vocal setting modalities,

from those that aim at a “speech quality” to those that aim at a “musical-poetic quality.”

In a similar classification, Richard Middleton describes the extremes of this

spectrum as “one characterized by verbal predominance over relatively vague musical

meanings, the other by the ‘musicalization’ of the words, often through paralinguistic

techniques.”19 Furthermore, he identifies three different approaches to setting lyrics into

music: “affect,” “story,” and “gesture.”

In the first case, the “affect” mode of setting lyrics absorbs words as expression,

merging them with the melody. Middleton explains that in this case “voice tends towards

song…intoned feeling.”20 This brings us back to Lawrence Kramer’s “songfulness”

concept, as explained in the introductory section of this dissertation. In these kinds of

settings, words are mainly perceived in the “poetic mode” of listening, in which certain

denotative aspects of words are captured parallel to their sonic qualities. This is the way

we listen to songs such as Kern/Hammerstein’s “All the Things You Are” and certain

sections of Björk’s “Isobel” and Gabriel’s “Sky Blue.”

In the second case, the “story” mode of setting lyrics retains the focus on the

denotative effect of words over the rhythmic and harmonic flow. In this case, words are

18
Clive James, “The Beatles,” Cream (October 1972), as quoted by S. Frith, Music for Pleasure, 122.
19
Middleton, Studying Popular Music, 228.

perceived in a “poetic mode” in which there is a preponderance of the “speech mode”—

the listener hears more of the speech qualities of the words than the sonic ones. The

straightforward discourse keeps its integrity by relegating rhythm, melody and harmony

to the background. We may find an example of this in Dylan’s “A Simple Twist of Fate.”

In the third case, the “gesture” mode, words tend to be absorbed into music to the

point of becoming sound while the voice becomes almost an instrument. In some

instances, “verbal denotations can be almost completely subordinated to musical

effects—through rhythmic ‘non-sense’ language …and the organization of

inconsequential verbal phrases into rhyming musical parallelisms.”21 In this kind of

setting, words are perceived in a “poetic mode” in which there is a preponderance of the

“acoustic mode” of listening—the listener hears more of the sonic qualities of the words

than their speech denotative content. We may find an example of this in certain sections

of Björk’s “Isobel” and Gabriel’s “Sky Blue.”

The Highly Structured Song: Tin Pan Alley Tune

During the “golden years of the Tin Pan Alley”—1910s to 1950s in the United

States—one of the song formats most often employed by composers such as George

Gershwin, Cole Porter, Irving Berlin, and Jerome Kern was a format that opened with an

introductory section—sometimes with a quasi-recitative flavor in a “story” mode word

setting—called the “verse,” followed by what they called the “refrain,” which was the

“real” tune, in an “affect” mode setting. This is the case in “All the Things You Are,”

composed by Jerome Kern with lyrics by Oscar Hammerstein in 1939 as part of the now

20
Ibid., 231.
21
Ibid., 228.

rarely performed musical Very Warm For May. Larry Starr and Christopher Waterman22

indicate that this verse-refrain form originated in the fusion of the

nineteenth-century AABA structure and the verse-and-chorus form influenced by “the

craze of ragtime and jazz music” of the early twentieth century. After the introductory

“verse,” the “refrain” follows in AA’BA form. The A section introduces the main

melody, which is repeated with new lyrics and some slight melodic changes (A’). Then

the B section or bridge immediately follows with new musical material and lyrics. It then

finishes with the return of the A melody, usually with new lyrics and some melodic

alteration, which may include an addition or “tag,” becoming A’’.

Several talented composers explored variations on this format, but what made it

especially successful was its predictability. Peterson and Berger comment that the

oligopoly in airwaves and recording studios before the 1950s demanded standardization of

the Tin Pan Alley tune formula in the market.23 Once this formula proved to be widely

accepted among audiences, its reproduction meant a guaranteed commercial success.

Before turning to the analysis of “All the Things You Are,” it is necessary to point

out that this is the only example among the popular songs considered in this section that

presents different actors in the role of songwriter and performer. Although the focus of

this section is the diversity of compositional procedures used by songwriters to achieve

their desired effects on listeners, in this specific song, it may prove productive to compare

the published score—as the only document giving testimony to Kern and Hammerstein’s

compositional intentions—with two radically different renditions of the same song: the

22
Larry Starr and Christopher Waterman, American Popular Music: from Minstrelsy to MTV (New York,
Oxford: Oxford University Press, 2003), 62, 64.
23
R. A. Peterson and D. G. Berger, “Three Eras in the Manufacture of Popular Music Lyrics,” in The
Sounds of Social Change, eds. Denisoff and Peterson, as quoted in Simon Frith, Music for Pleasure, 119.

first one sung by Ella Fitzgerald (the version used during the listening experience) and

the second one performed by Barbra Streisand. These “metteurs en scène,” as David

Laing calls these performers, approach “a song as an actor does his part—as something to

be expressed, something to get across.24 His aim is to render the lyric faithfully. The

vocal style of the singer is determined almost entirely by the emotional connotations of

the words.”25 Working through her interpretation, each singer brings her own

idiosyncratic vocal rhythmic articulations and vocal timbre nuances to the phrases of “All

the Things You Are.” By contrasting these differences, we may observe the very moment

of “songfulness” as the personal creation of each artist.

The lyrics of this song are highly structured and abound in redundancy devices.

Mark W. Booth comments that the “repetition of phrasing in successive stanzas, where

small modifications adapt the words to a new use or effect, is the signature of the

ballad.”26 Booth considers this device not a mere stylistic convention but a mnemonic

resource related to the oral nature of the primitive ballad. Although here we are not

dealing with a traditional “oral ballad,” which resorted to redundancy in order to help the

creator and later to help singers to remember the lyrics, the internal repetition certainly

contributes to this song’s popular ballad flavor, creating a dent in its audience’s memory.

Tin Pan Alley lyrics show a prominent concern for “privacy” and “romance.” The

rapidly growing American middle-class of the first quarter of the twentieth century had

elite aspirations and cared about property ownership and privacy. These kinds of interests

are reflected in some of the lyrics of this period: romantic love, a wife, a home to share.

The third person narration of the old European ballads gave way to the first-person stories

24
David Laing, as quoted by Simon Frith, Music for Pleasure, 122.
25
Frith, 122.

of Tin Pan Alley. “This first-person mode of address was reminiscent of elite poetic

forms such as sonnet, but Tin Pan Alley songwriters avoided the flowery

language…opting instead for a more down-to-earth manner of speech,” which “allowed

the listener to identify his or her personal experience more directly with that of the

singer.”27

“All the Things You Are” talks about romantic love. In the first section of “the

verse,” the three lines of the first stanza tell us of longing for something still unknown;

the next three lines of the second stanza give us the answer for each of the three needs in

the same order that they were introduced. With the conflict resolved in this introductory

section, the “refrain” of the song proceeds into a more luxurious melody that almost

incorporates the lyrics as an additional colorful instrumental element.

Observing the published score, in the “verse,” Kern creates a colloquial sensation

by rhythmically moving with the intonation inflections of the text in a “quasi-recitative”

style.28 Each verse is set to the same rhythmic pattern. Its pace is rather fast, especially in

comparison with the way the lyrics in the second half of the song (the “refrain”) are set.

Each line of the first two stanzas lasts two measures of 2/2 and is ten syllables long, while

at least the first two lines of the third stanza set in the “refrain” consist of nine syllables

stretched over four measures. Melodically, each line of the “verse” opens with an

ascending perfect fourth (sometimes transformed into a perfect fifth), which will become

a motif later in the “refrain” of the song, followed by a simple arching melody. The

26
Mark W. Booth, The Experience of Songs (New Haven and London: Yale University Press, 1981), 59.
27
Starr and Waterman, American Popular Music, 67.
28
Oscar Hammerstein II and Jerome Kern, All the Things You Are (Polygram International Publishing, Inc.,
1939).

repetitive rhythm and unattractive melodic contour allow the lyrics to take the foreground

in this “quasi-recitative” section.

The clear and stable tonality of G major—harmonizing with a simple I–V–I on G

major without too many deviations—does not distract the listener’s attention from the

lyrics either, especially once it is compared with the sequential and modulatory nature of

the “refrain” that follows. Once the action is set and resolved, the song may indulge

in a busier harmony over a static descriptive text such as the one found in the “refrain.” In

contrast, in the first section of the song, the “verse,” the text is highly structured, with

repetition of phrasing as explained in previous paragraphs. The predictable syntactic

structure offers the audience a grid to follow to comprehend the lyrics.

We know from several accounts of songwriter teams of this period that the music

usually came first, followed by the lyrics. The text was written to fit a previously

composed melody or as a parallel process. This is particularly evident in the “refrain” of

“All the Things You Are.” The music takes over in the second half of the song. It even

dictates the structure of the text, which molds around the musical phrasing and reinforces

certain harmonic procedures. The number of syllables changes from one verse to the

next. Also, the rhyme is loosely structured, which controls the natural tendency of

engaging with the musicality of the words in combination with the flow of the melody

and attracts more attention to a linear reading of the text. By using this tactic, the

songwriters guarantee a certain attention of the listener to the denotative content of the

lyrics.

The text remains simple and does not try to address multiple semantic levels; it

consists of a straightforward enumeration of images that reminds the song’s “persona” of

his or her “object of love.” The lyrics’ structure signals musical events such as the

beginning of harmonic sequences, the end of musical sections, the similarity between

sections, or the return of a section.

The “refrain” has four sections: A–A’–B–A’’. Both A and A’ sections are made

of two phrases of sequences in fourths (dominant-tonic type): Fm–Bbm–Eb7–Ab7–Db7–

G7–C7, and Cm–Fm–Bb7–Eb–Ab–D7–G. The B section, or bridge, follows, taking

over the G major but breaking this sequence and still keeping the two-phrase structure,

although this time the phrases are much shorter, only four measures compared to the

seven measures of the previous ones. The first phrase of the B section stays on G major,

developing a typical cadential progression of I–ii–V–I, and then immediately modulates in the

second phrase. Although this second phrase of B parallels exactly the previous cadential

progression, now everything is in E major. The A’’ section opens with the same sequence

as the one employed at the beginning of the refrain in the A section. It even starts on F

minor, but stops halfway on the Ab major of the sequence in fourths, collapsing the

previous two phrases into one of seven measures followed by a cadential coda on the new

and final tonality of Ab major.
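
The root motion of the two sequences quoted in this paragraph can be checked with a little arithmetic. What follows is a minimal sketch in Python, added for illustration and not part of the original analysis. It assumes the chord symbols exactly as listed above, reduces each chord to the pitch class of its root, and measures the ascending interval between successive roots in semitones (5 for a perfect fourth). The names PITCH_CLASS, root_of and root_motion are hypothetical.

```python
# Pitch classes of the roots that occur in the two sequences (C = 0).
PITCH_CLASS = {"C": 0, "Db": 1, "D": 2, "Eb": 3, "F": 5, "G": 7, "Ab": 8, "Bb": 10}

def root_of(chord):
    """Return the root of a chord symbol such as 'Bbm' or 'Eb7'."""
    # A root is one letter, optionally followed by a flat sign written 'b'.
    return chord[:2] if len(chord) > 1 and chord[1] == "b" else chord[:1]

def root_motion(chords):
    """Ascending interval, in semitones, between each pair of successive roots."""
    roots = [PITCH_CLASS[root_of(c)] for c in chords]
    return [(b - a) % 12 for a, b in zip(roots, roots[1:])]

# Chord lists transcribed from the paragraph above (an assumption of this sketch).
SEQ_A = ["Fm", "Bbm", "Eb7", "Ab7", "Db7", "G7", "C7"]
SEQ_A_PRIME = ["Cm", "Fm", "Bb7", "Eb", "Ab", "D7", "G"]

for name, seq in (("A", SEQ_A), ("A'", SEQ_A_PRIME)):
    print(name, root_motion(seq))
```

Under these assumptions both lists give the same pattern of intervals: every step is a perfect fourth except the fifth one (Db to G in A, Ab to D in A’), which spans six semitones, a tritone. The chain of fourths is therefore interrupted once in each phrase.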

The text of the “refrain” signals the beginning of each sequence as well as the

parallelism between them by using the same phrase “you are” both times (mm. 1 and 9

of the published score). Kern achieves this focalization by carefully setting “you

are” on two long notes, a whole and a dotted half-note, which stop the flow of the tempo

in the music. The rest of the text immediately following “you are” gains a quicker pace

by fitting more syllables into each measure. Although the harmonic rhythm of the

sequence is constant—one chord per measure—the layout of the text varies in density.

While the two syllables of “you are” are spread over two measures, the remaining sixteen

syllables of the text (eighteen in A’), where the explanation of what “you are” takes

place, are crammed into the next five measures. In the context of a syllabic setting such

as this one, the pace and density over the measures will have a direct influence on the

perception of the text in general. The slower rhythmic pace allows a better understanding

of the contained lyrics. In contrast, a tighter layout produces a blending of syllable into

syllable and of syllable into melodic line.

The result is a generalized idea of what has been said in these phrases set into the

A and A’ sections. While the listener attends to the words “you are,” the specifics of the

description of her or his “object of affection” are overlooked, and she only remembers

that this person is a series of things. The listener attends away from the precise meaning

of this description and is satisfied with the assurance that it has one without its mattering

what it is.

In this same description in the A and A’ sections, Hammerstein explores most of

the alliterative devices at hand in his lyrics. The associations between similar

phonemes that are placed close to each other create a certain sustained musicality in the

lyrics. Thus, the listener tends to attend away from the meaning of the words and listen to

their instrumental cacophony.

(A section)
You are the promised kiss of springtime
That makes the lonely winter seem long.

(A’ section)
You are the breathless hush of evening
That trembles on the brink of a lovely song.

The first line explores the alliteration between sibilant [s] phonemes that by the

second line make a counterpoint against the liquid [l]. The third line connects the

unvoiced fricative phonemes [θ] – [s] – [ʃ] in a backward movement of the point of

articulation of the tongue (upper teeth, teeth ridge, and hard palate).

All the devices described so far contribute to a fragmented grasping of isolated

words and phrases. In fact, listeners tend to grab chunks of lyrics in an imprecise non-

linear way. The usual fragmented nature of a song’s lyrics contributes to the broken way

the listener tends to grasp the text. According to Booth, these fragments behave like

“standing patterns as opposed to linear sequences of growth, evolution, discovery,

catharsis,” which is a common procedure in most of the other types of discourses.29 This

does not mean that song text is shapeless but that its elaborated patterns connect to each

other in a different manner. Booth comments on the particulars of this relation through a

quote from Edward Doughtie’s book Elizabethan Air:

In song lyric, although the images and ideas may be related to a central theme or an
obvious central conceit, they tend to be isolated from each other; they accumulate
rather than develop. Rarely, in fact, does an image or thought extend beyond two
lines…the listener is rarely able to make connections of much complexity over a
longer space of time. 30

Returning to the setting of the song’s lyrics, after twice establishing comparisons

of the “object of love” with certain pieces of nature—“You are the promised kiss of

springtime” and “You are the breathless hush of evening”—Kern and Hammerstein allow

the next “you are” to move in quarter-notes over the arpeggio of G Major. This is one of

the modulatory turns that the song takes from m. 15 to m. 20. Although this seems

to break with the device of focusing attention on the phrase “You are,” a couple of

measures later (mm. 22-23), the song returns to a modified version of the emphatic tactic.

This time, the B section closes with a palindromic effect that brings back the phrase “you

are,” although this time as a closure of the lyrics’ statement. This happens at this point of

the song for two reasons. First, because the phrase “you are” has already been well

established during its two previous appearances. By now it is only a reminder—empty of

specific meaning—that a new enumeration is starting. There are several meditational

practices, as Booth comments, which “buil[d] on the fact that any word sheds its sense

upon a small number of consecutive repetitions.”31 Nevertheless, these two words still

produce their denting effect on the listener; they keep their pragmatic value while their

semantic one is almost extinguished. Second, the change in the way “you are” is

introduced is a sign that announces the beginning of a completely different musical

section.

This section behaves harmonically differently. Instead of a sequence, a

progression modulates the second time around. Melodically it is also different; instead of

a seven-measure melodic phrase, there is a four-measure one. The fact that m. 25

reintroduces the same melody of the first five measures of the refrain, together with the

fact that by the fifth measure the same “some day” of the beginning of this A’’ section is

repeated, indicates that things seem simultaneously similar but different. This is a return

of the A section but not under the same conditions as before.

A’’ establishes the same game of two words as a “motto” heading a melodic

phrase that, at least in section A, is repeated a fourth down the second time around. But

new words are now used: “some day.” This not only breaks the monotony, but also puts

29
Booth, 25.
30
Ibid., 24.

the listener on alert. Contrary to his or her expectations, the listener is surprised by the

sudden return of the “motto,” “some day,” set into what seems to be a variation of the

opening melody. This acceleration of events propels the listener toward the end of the

song, expressing the hopeful wish that “some day” all the things that represent him in the

song become hers: “all the things you are, are mine.” The title of the song occupies this

strategic place and has the function of summarizing the song.

(verse)
Time and again I’ve longed for adventure, (10) a
Something to make my heart beat the faster. (10) b
What did I long for? I never really knew. (10) c

Finding your love I’ve found my adventure, (10) a
Touching your hand, my heart beats the faster, (10) b
All that I want in all of this world is you. (10) c

(refrain)
You are the promised kiss of springtime (9) a
That makes the lonely winter seem long. (9) b
You are the breathless hush of evening (9) c
That trembles on the brink of a lovely song. (11) b

You are the angel glow that lights a star, (10) d
The dearest things I know are what you are. (10) d
Some day my happy arms will hold you, (9) e
And some day I’ll know that moment divine, (10) f
When all the things you are, are mine. (8) f

Before proceeding to the analysis of the two recordings, there is a final

observation to make on the choice of lyrics employed as the “motto” or heading. Both

“you are” and “some day” are made up of what is known in linguistics as deictics.

Phrases made of words like “you are” and “some day” are semantically empty and

depend totally on the context of the utterance. They have the “function of situating the

31
Ibid., 39.

speaker’s utterance in a specific time and place. They do not characterize or qualify

someone or something, but ‘point to’ a person, an object, a time.”32 Deictics are used

more frequently in spoken language than in written. As Mauro Calcagno says, theater

scholars regard the high incidence of deictics in dramatic texts as one of the main factors

that distinguishes the language of theater from that of narrative or poetry. I argue in

addition that the colloquial flavor of several song lyrics is created through the extensive

use of this kind of deictic phrase.

The employment of the deictic phrases emphasizes the performative and oral

nature of the song’s texts. Deictics contribute to creating a kind of direct immediacy to

the audience during the act of communication, regardless of the context: whether a live

concert or a recording. In theater or opera the audience identifies with specific characters

on stage. However, the general public approaches song by identifying themselves with

different “personas” coexisting in it. Whether the singing voice is male or female, the

listener never identifies with the person being addressed, in this song with the “you” of

“you are.” On the contrary, the listener assumes the place of “I.” In the case of a

narration, the listener tends to assume the perspective of the narrator of the story. If the

perspective and opinion on the topic is not shared, the process of identification does not

take place. In contrast, if the song embodies bits of the ideals of the group to which the

audience belongs, the communion takes place. In both cases, when the identification is

with “I” or when it is with the narrator, it is the power of the human voice that invites the

listener to put him or herself in the place of the singing voice.33

32
Mauro Calcagno, “‘Imitar col canto chi parla’: Monteverdi and the Creation of a Language for Musical
Theater,” in Journal of the American Musicological Society (Vol. 55, n. 3, Fall 2002), p. 390.
33
See Booth, pp. 16-17 for further details of these arguments.

Turning now to the two recorded versions of “All the Things You Are,” these

renditions represent two very different approaches to the same song, which in turn

provoke distinctive reactions in their listeners. Of course, what is known of both artists’

careers and styles is read into these versions too. Ella Fitzgerald, diva of the “big-band-

era,” recorded “All the Things You Are” in 1963 for her album Ella Fitzgerald Sings

Jerome Kern Songbook—the seventh in a series of “Songbook” albums dedicated to big

Tin Pan Alley and Jazz songwriters such as Cole Porter, Duke Ellington, and Harold

Arlen. She initiated this series of recordings under the guidance of her manager and

producer, the owner of Verve Records, Norman Granz. Especially with this “Kern” album,

Fitzgerald ventures outside the emblematic raw, energetic, jazzy vocal style of her

performances into the more well-polished sound of these Broadway musical tunes, which

may appeal to a broader audience. Recorded only four years later, Barbra Streisand’s

track represents the late 1960s-early 1970s style identified as “adult contemporary,”

which was an extension of the old crooner tradition.34 Streisand recorded “All the Things

You Are” on her album Simply Streisand, which was released in 1967 by Columbia

Records, the label for which she had been recording since 1962. This album was produced by

Jack Gold and Howard Roberts with orchestral arrangements by Ray Ellis, her long-term

partner in this business.

Both versions may be classified as “vocal with orchestra,” meaning a vocal

soloist, who is clearly the leading figure, and a “more or less anonymous” orchestra just

accompanying.35 These tracks also share a moderate tempo: 104 quarter notes per minute

in Fitzgerald’s performance and 92 in Streisand’s. But the arrangements and

34
Starr and Waterman, p. 307.
35
David Brackett, Interpreting Popular Music, p. 58.

sound of the orchestras are quite different in these two recordings. First of all, Fitzgerald

does not sing the “verse” at all. She delivers the entire “refrain” and then goes back and

repeats sections B and A’’. Streisand opens with the “verse” followed by the complete

“refrain” ending with the repetition of only A’’.

Fitzgerald’s version is backed by an arrangement that recreates the “Big Band”

sound. This is achieved by the prominent use of the brass section playing a swing

rhythmic riff in block (with sforzati) as an introduction and reappearing during

interludes, or otherwise, punctuating certain beats with brief chords under the vocals. The

rhythm of that riff has a strong sense of swing in its ternary subdivided pattern, here

transcribed:

EX. 3.1: Rhythmic riff played by the brass section in Fitzgerald’s version of “All the Things You Are”

The arrangement is held together by the rhythm section: ride cymbal and bass. There is

only a touch of strings that comes briefly in the B section.

Streisand’s version recreates a Latin soft jazz ballad sound by molding the piece

around a slow bossa pattern with rim shots and triangle—the latter reminiscent of

Northeastern Brazilian music. The main difference is that there are no strong swing brass

interventions in this version. The softer timbres of the string sections (duplicated at times

by backing vocals) and woodwinds are used in textures that privilege lyrical

countermelodies over the smoother bossa beat.

In comparing the vocal renditions of Fitzgerald and Streisand against the

published score, we find some jazz improvisational elements. The following are the

transcriptions of the eight opening measures of the “refrain.” These have been transcribed

in the original keys in which each singer performs them. Rhythmic and melodic details

have been transcribed as faithfully as possible to show the different nuances of each

singer.

EX. 3.2: “All the Things You Are”: Two versions and published score of the A section. Each in the key
performed or published. Transcriptions by Lorena Guillen.

While Fitzgerald sings mostly on the beat—almost parallel to the score version—

with slight delays on “that” and “the” of “the lonely” and anticipations on both syllables

of “winter,” by the fourth measure Streisand is already two beats behind and most of her

rhythms are slightly modified from the score. However, it is Streisand who sings the

pitches straightforwardly and will for the most part stay faithful to the melody until the

end, with minor improvised vocalizations in the last phrases. Fitzgerald’s rendition

abounds in pitch bendings in between notes and scooping in almost every attack. She also

later introduces major melodic changes, such as the three repeated G4 natural pitches,

which take the place of the upwards arpeggio of G3-C4-G4 of the original in “You are

the angel glow” and the subsequent G4-Ab4-G4-E4 on “that lights a star.”

While Fitzgerald clearly articulates every word sound in a soft crooning vocal

timbre (recorded close to the microphone), Streisand deliberately emphasizes and

prolongs certain consonants. She brings up those phonemes that are associated by

alliteration inside each verse, such as the [s] in “kiss” and “springtime” and [θ] and [s] in

“breathless,” or prolongs notes on the final [m] of certain words, such as “time” and

“seem.” These stretched out sounds are reinforced by a more prominent reverb effect

applied to the mix of her voice.

These particular vocal effects, together with the orchestral sound in each case,

contribute to conveying a specific type of sound, which triggers different emotional states

in the audiences. Fitzgerald keeps the swing big band style with an on-the-beat

articulation combined with a serene and pleasing vocal tone. Streisand emphasizes the

lounge bossa style with a simple and relaxed vocal tone that floats freely over the beat

getting behind in a lazy manner. So, while the former calls the listener to a comfortable

but punctuated and energizing sound experience, the expansive sustained sonorities of the

latter invite the audience to relax.

The Narrative Type: Strophic Form

American songwriter Bob Dylan provides several good examples of pure

narrative texts. Among them, cast in his typical “strophic song form” (A, A’, A’’, A’’’, etc.), is “A Simple

Twist of Fate,” a track on his 1975 album Blood on the Tracks. As an American urban

folk icon, he dragged this genre into the modern era of rock by introducing electric band

sound to his recordings and live performances, starting with his 1965 album Bringing It

All Back Home and following in July of that same year with his performance at the

Newport Folk Festival. Throughout his career, however, his songwriting style has

remained faithful to the early American folk tradition. Some of his songs have been

modeled “implicitly or explicitly, on the musical and poetic content of preexisting folk

material.” 36 Furthermore, his performing style has “demonstrated strong affinities to rural

models in blues and earlier country music” favoring “a rough-hewn, occasionally

aggressive vocal, guitar and harmonica style.”37

The object analyzed in this case is the recorded track itself, conceived as the

auteur version (following Laing’s distinction) in its full dimension. Since Dylan is both

the songwriter and performer of this track, its value and interest lie not in the features of

the song but in the unique way he sings it, his personal inflections. “The appeal of

auteurs is that their meaning is not organized around the words…in the situations his

songs portray, but in the exceptional nature of his singing style and its instrumental

accompaniment.”38

As Albin Zak points out in talking about Dylan’s John Wesley Harding (1967),

“against the contemporary trends in recording, which tended in varying degrees towards

36
Starr and Waterman, 281.
37
Ibid., 278.

the sonic opulence exemplified by Sgt. Pepper’s, and in contrast even to the ‘thin wild

mercury sound’ of Dylan’s own Blonde on Blonde album, it strips things down to an

elemental level—bass, drums, acoustic guitar, voice, harmonica, three chords, and no

obvious sonic manipulations.”39 Eight years later, “Simple Twist of Fate” returns to that

stripped sonority with his strummed acoustic guitar, bass, harmonica and a quasi-spoken

singing quality. Blood on the Tracks is a mixture of some recordings that feature this bare

sonority and others that have a more stylish band sound with arpeggios and

countermelodies between two guitars—one steel guitar played by Buddy Cage, Tony

Brown on electric bass, Paul Griffin on organ, drums, and Dylan’s own harmonica and

voice.

The lyrics of this song comprise six stanzas, each set to the same melody with a

“quasi-refrain” at the end. Although these last verses of the stanzas are set to the

same music and finish with the same words—which are not surprisingly the title of the

song, “A Simple Twist of Fate”—they open with a different heading each time. These

heading words state the action or verbal phrase that will affect this “simple twist of fate”:

“And watched out for a simple twist of fate;” “Moving with a simple twist of fate;” “And

forgot about a simple twist of fate;” etc.

This song employs a narrative type of discourse, which unfolds events in a linear

way. In the first four stanzas, the action takes place in the past. A third-person,

omniscient narrator tells of the first encounter between a man and a woman.

The second part of the song—the remaining two stanzas coming after a harmonica solo

that, in a way, marks the passage of time—takes place in the present. In the last stanza,

38
Frith, Music for Pleasure, 122.
39
Zak, The Poetics of Rock, 48.

the narrator reveals himself as the male protagonist of the amorous encounter. A short

harmonica solo closes the song.

The strophic setting that Dylan chooses goes well with the folkish story-telling

style and allows the audience to concentrate on the details of the story. The songwriter

wants the listener to focus on the lyrics without big musical distractions or fragmentation

of the linear development of the story. The traditional heavy rhyming of the verses seems

to aim toward having the same effect on the audience.

(1st Stanza)
They sat together in the park (8) a
As the evening sky grew dark, (7) a
She looked at him and he felt a spark (9) a
Tingle to his bones. (5) b
’Twas then he felt alone (6) b
And wished that he’d gone straight (6) c
And watched out for a simple twist of fate. (12) c

The rhyme scheme of the first three verses and the next two consecutive pairs,

although strong and attractive, does not distract from the main point of the story; instead,

it contributes to the narration. This rhyme scheme points to words that are key to the

story and creates a parallel narration that synthesizes and contributes to the essential

thread of the theme:

• The place is the “park.”

• The opposition “dark”/“spark”: first it was “dark” but then there was a
“spark” of hope in a new relationship.

• “Bones”/“Alone” proposes the extreme loneliness and bareness
represented by the “bones,” especially after this casual relationship ends.

• “Straight”/“Fate” gives the unavoidable sense or direction of destiny.

This rhyme scheme, as well as the repetition and placement of the song title at the

end of each stanza, clarifies the poetic structural frame. This kind of predictable form

liberates the mind of the listener, who in this way can trust and concentrate on the linear

succession of events of the story being told. The rhyme scheme is also very regular and

its placement predictable (at the end of each verse). There are no further strong

alliterations or internal vowel rhymings that could deviate or offer alternative webs of

phonemes or morphemes. The rhyme moves the lines ahead, propelling the rhythm of the

stanza in a straightforward motion.

The melodic development also contributes to this sense. The melody that is

repeated for every stanza is fifteen bars long. In contrast with the way Kern and

Hammerstein approached making “All the Things You Are,” “Simple Twist of Fate”

shows evidence that Dylan may have written the lyrics first and then set them to music.

In this case, it is the music that follows the lyrics’ structure and not the other way around.

The first three verses, which are assonantly rhymed (vowel rhyme), are set to the same

musical phrase that repeats three times.

EX. 3.3: Transcription of opening three measures of Dylan’s “Simple Twist of Fate”

The next two rhymed verses—verse four and five—are set to a second melodic

phrase, which is also repeated twice to fit each one of the mentioned verses.

EX. 3.4: Transcription of mm.4 to 6 of Dylan’s “Simple Twist of Fate”

Again, each verse is set to a two measure melodic phrase. And the stanza will

actually keep this regular pace for the next verse, slowing down only in the last one, which

is the refrain. The regular structure of the melodic phrases is evidence of the lyrics

preceding the music.

EX. 3.5: Transcription of the refrain of Dylan’s “Simple Twist of Fate”

If song lyrics resemble poetry in some way, it is in their rhyme and metric

schemes. Lyrics, as well as poetry, are created by feeling the feet—the number of accents

per verse—regardless of the number of syllables in between those accents. Dylan’s

melody tries to fit and follow the feet and accents preexisting in his lyrics. This results in

the subdivision of beats into their proportional rhythmic values to fit the extra-syllables

of the irregular verses. Otherwise, as happens in “All the Things You Are,” the lyrics

would have had to be created as well-proportioned parts to fit the music exactly. For

example, the three first verses in stanza one have, respectively, 8-7-9 syllables; the first

three verses in stanza two have 8-10-8 syllables; and stanza three has 9-9-9 syllables. All

these verses are set to the same melody, as is usual in any strophic song setting, whether

popular or art song.
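
The syllable counts quoted above can be tabulated in a short sketch. The Python below is only an illustrative aid and not part of the original analysis: the names `OPENING_VERSE_SYLLABLES` and `extra_syllables` are hypothetical, and the figures are the ones given in the text for the first three verses of stanzas one to three, not a fresh count from the recording. It shows how many syllables each verse carries beyond the shortest one, which is the surplus the repeated melody has to absorb by subdividing beats.

```python
# Syllable counts for the first three verses of stanzas 1-3,
# copied from the text (an assumption of this sketch, not recounted).
OPENING_VERSE_SYLLABLES = {
    1: [8, 7, 9],
    2: [8, 10, 8],
    3: [9, 9, 9],
}


def extra_syllables(counts_by_stanza):
    """Return each verse's surplus over the shortest verse in the whole table."""
    shortest = min(min(counts) for counts in counts_by_stanza.values())
    return {
        stanza: [count - shortest for count in counts]
        for stanza, counts in counts_by_stanza.items()
    }


surplus = extra_syllables(OPENING_VERSE_SYLLABLES)
for stanza, extras in surplus.items():
    print(f"stanza {stanza}: extra syllables per verse = {extras}")
```

With the shortest verse at seven syllables, the same melodic phrase has to carry between zero and three additional syllables, depending on the verse.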

Fitting lyrics to music in this particular way is an indication of where the

expressive value of the song is located. In this case, it is in the semantic meaning of the

lyrics and in the communicational value of text as carrier of denotative content. This

discourse needs to be uninterrupted and fluid to make any sense. Text as acoustical

phenomenon is almost ignored.

This musical setting follows only the lyrics’ main accents and shape to make it

understandable without further prosodical details. Here the strophic song tries to solve the

inconvenience of not molding exactly to the intonational arch of the text with the

repetition of the melody. This procedure is taken to the point that the melody and its

arrangement almost completely lose their ability to surprise the listener. In this way, they

release the listener’s attention to focus on the story and its logical sequence away from

the melodic swirls of the music. In sum, the strophic setting eases the ears and mind of

the listener.

As a counterpart, Dylan’s vocal delivery of the lyrics is almost spoken at times.

He breaks his sustained tone into a non-determinate pitch sound contour. This quality

gains over the singing, especially toward the end of each melodic phrase of the stanzas

and each time the refrain appears. By manipulating his voice in this way, Dylan

counteracts the lack of prosodical observance of his melodies to the speech intonemes of

his lyrics. The strophic repetition does not allow the flexibility of following speech

intonation arches. When Dylan speaks the lyrics, the words break free into quasi-speech.

In terms of the lyrics themselves, Dylan intends to counteract the natural tendency

of a song’s text to be processed in the “poetic mode” by: first, using predictable rhyme

schemes at the end of verses; second, avoiding further alliteration inside verses, which

could deviate the attention of the listener from the linear succession of concepts; third,

avoiding repetitive semantic schemata (beyond the repetitive refrain). Resorting to these

tactics, the songwriter minimizes the musical structures, Tsur’s “sound patterns,” of his

lyrics, and assures a propositional temporal processing similar to the one speech follows.

The left hemisphere of the brain composes speech by retrieving from memory

“several morpheme units…according to grammatical rules” and ordering them “into a

specified temporal arrangement.”40 The left side of the brain is usually associated with

propositional thought: speaking, reading, and writing. In contrast, songs or phrases of

their lyrics are remembered as wholes. As Booth says, “The parts of these units are not

pieced together tone by tone, word by word, but rather are recalled all at once as a

complete unit.”41 The appositional capacity of the right hemisphere of the brain is the

ability of “comparing perceptions, schemas, engrams…,” which are remembered and

produced as intact wholes.42 This is the way listeners retrieve fragments of a song’s lyrics.

But Dylan counteracts this appositional tendency by controlling the “sound patterns” of

his lyrics and keeping the flow of the narration. He delivers the story in his usual quasi-

spoken vocal tone and idiosyncratic story-telling style.

40
Booth, 68.
41
Ibid.

Redundancy: Variation on the “Verse-Chorus” Form

Relying on other effects, the imprint that Björk’s “Isobel” leaves on the listener is

quite different from that of Dylan’s song. “Isobel” was written by Björk, Nellee Hooper

and Marius De Vries, with lyrics by Sjón. Nellee Hooper and Björk produced it together

and released it in 1995 on Björk’s second album Post. This song offers the unusual

opportunity of comparing the sound properties between two differently mixed versions:

the first in Post, where Björk herself participated in the mixing process; and the second

made by Eumir Deodato for Björk’s 1996 CD Telegram. The latter is a remix album made up

largely of songs from Björk’s album Post. Björk personally commissioned nine artists

and gave them complete freedom to remix her tracks. After receiving the mixes, she went

back to the studio and re-recorded the vocals to complement these artists’ versions.

Deodato’s spin on “Isobel” opens up the texture with a straightforward pop sound

version. Actually, Björk comments on her website:

For me Telegram is really Post as well but all the elements of the songs are just
exaggerated. It’s like the core of Post. That’s why it’s funny to call it a remix album, it’s
like the opposite. It’s like the-cover-of-Post-me like this [she smiles beatifically] in pink
and orange and big ribbon and it’s like a pressie for you. But Telegram is more stark,
naked. Not trying to make it pretty or peaceable for the ear. Just a record I would buy
myself. (Like a letter to yourself?) Yeah, more, sort of...fuck what people think. It’s a
truth thing. Which is maybe a contradiction because it’s other people’s remixes. (Blah
Blah Blah, December 1996)43
In her Post version of “Isobel,” Björk and De Vries play keyboard over a

rhythmic base of “ethnic” percussion also programmed by De Vries. Deodato and Björk

add a string arrangement. What differs radically between this original version of

“Isobel” in Post and the one remixed by Deodato in Telegram are the volume levels in the

mix and the inclusion or suppression of certain recorded instrumental tracks. From the

42
Ibid., 69.

opening sustained harmonic string sequence with trumpet solo in Post, Deodato only

keeps his own arrangement of strings. The softer and diluted “ethnic” percussion is

replaced by a pop drum-set pattern that is brought quite prominently into the mix. The

original bass, which was muffled and back in the Post mix, is replaced by a “funkier”

bass that is also up front in the Telegram mix. All the programmed sequences and

keyboard sounds of Post are stripped out in Deodato’s mix. This now clean cut pop track

directly affects the way in which Björk herself interpreted her vocals when she

rerecorded them after the new mix. She goes for a less affected vocal inflection of the

lyrics. Her voice also is mixed with less processing, a more “in-your-face” sound. The

hermetic and mysterious aura of “Isobel” in Post is replaced by a banal and

straightforward sound, which contrasts, like an ironic comment, with the still hermetic

lyrics.

“Isobel” is a variation on the “verse-chorus” form.44 The song unfolds

as follows:

Part   Section
1st    A (verse with refrain)
2nd    Chorus (1)
3rd    A’
4th    Chorus (2)
5th    Bridge (with climb at the end)
6th    Chorus (3)
7th    A’’
8th    Chorus (4)
9th    Bridge (with climb)
10th   Chorus (5)

TABLE 3.1: Song format of Björk’s “Isobel”

43
See http://unit.bjork.com/specials/gh/SUB-12/index.htm. Copyright © 1995-2007 by Björk Overseas
Ltd.
44
Here the term “verse” is used to denote a complete strophe of a song. This is the most commonly used
term among popular music songwriters to refer to this part of a song. In this section of the dissertation, it
will be introduced in quotation marks each time it is employed in this way.

EX. 3.6: Transcription of all the sections of Björk’s “Isobel”

Although the song has three strophes or “verses,” it repeats the chorus five times.

After its third reappearance, instead of being sung to the same lyrics, the chorus’ melody

is performed on babbling: “na, na, na…” Another device to add variation to the

traditional succession of “verse-chorus” is the insertion of “bridges” and “climbs.”45

Between the second and the third chorus the first “bridge” is introduced, which brings a

“climb” of two “verses” attached at the end. The bridge itself is three “verses.” The second

45
The “bridge” is a fresh new section inserted to offset the predictable verse/chorus pattern: “…a bridge
works to provide contrast in lyrical content, meter, and melody.…Lyrically speaking, your bridge doesn’t
justify its existence if it merely restates a fact we’ve been told. Ideally, a bridge adds dimension to a lyric
by expanding the content of the verse or chorus, or by giving new insight into the singer’s feelings.” Sheila
Davis. The Craft of Lyric Writing. (Cincinnati, Ohio: Writer’s Digest Books, 1985), 57.
A “climb” gives the “verse/chorus” song a fresh contour. Although it is also new material, it is usually
shorter than a “bridge” section: “…a climb is a couplet (two rhymed lines in the same meter) which pull
away from the verse both verbally and musically, and reach up toward the chorus. A climb functions as
aural foreplay, to extend and increase the song’s emotional tension by delaying the arrival of its climatic
section.” Davis, 55.

time that the bridge is introduced is in between the fourth and fifth chorus. The song

unfolds under a well-planned structure full of musical repetition and text redundancy.
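
The section counts implied by Table 3.1 can be checked with a short sketch. The Python below is an illustrative aid and not part of the original analysis: the list `ISOBEL_FORM` and both helper functions are hypothetical names, and the section order is simply transcribed from the table.

```python
from collections import Counter

# Section order transcribed from Table 3.1 (labels are this sketch's shorthand).
ISOBEL_FORM = [
    "A", "Chorus", "A'", "Chorus", "Bridge",
    "Chorus", "A''", "Chorus", "Bridge", "Chorus",
]


def section_kind(label):
    """Collapse A, A', A'' into a single 'Verse' category."""
    return "Verse" if label.startswith("A") else label


def choruses_heard_before_each_bridge(form):
    """For each bridge, report how many choruses have already sounded."""
    heard = 0
    positions = []
    for label in form:
        if label == "Chorus":
            heard += 1
        elif label == "Bridge":
            positions.append(heard)
    return positions


form_counts = Counter(section_kind(label) for label in ISOBEL_FORM)
bridge_positions = choruses_heard_before_each_bridge(ISOBEL_FORM)

print(dict(form_counts))
print(bridge_positions)
```

Under these assumptions the tally gives three “verses,” five choruses and two bridges, with the bridges falling after the second and the fourth chorus, which agrees with the description in the text.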

The lyrics, although rather hermetic, describe the character of Isobel. But the

obscure images depicting her personality contrast with the colorful and well-planned use

of rhyme, alliteration, parallelism and word repetition. This gives the verse a quite

attractive musicality. This attraction does not overlook the semantic content of the words

in favor of their musical possibilities. If something detracts from this otherwise inevitable

effect, it is the repetition of whole phrases or “verses” in an insistent manner. For

example, the refrain at the end of each strophe, chorus, and bridge:

In a forest pitch-dark a (6)


Glowed the tiniest spark a (6)
It burst into flame b (6)
Like me, c (2)
Like me. c (2)

(Chorus)
My name Isobel d (5)
Married to myself d (5)
My love Isobel d (5)
Living by herself. d (5)

In a heart full of dust e (6)


Lives a creature called lust e (6)
It surprises and scares b (6)
Like me, c (2)
Like me. c (2)

(Chorus)

(Bridge)
When she does it she means to f (7)
Moth delivers her message g (7)
Unexplained on your collar h (7)
Crawling in silence i (5)
A simple excuse. j (5)

(Chorus on “Na, na, na…”)

In her tower of steel k (6)


Nature forges a deal k (6)
To raise wonderful hell b (6)
Like me c (2)
Like me. c (2)
(Chorus with lyrics)

(Bridge)

(Chorus on “Na, na, na…”)

Each of the three “verses” opens with an assonant rhyme between the first two

lines, while the third line rhymes with the third line of the other “verses.” These three words

connected at a distance form an interesting imaginary scheme throughout the song. This

scheme becomes a kind of parallel or alternative sub-line plot that describes a possible

scenario for the description of Isobel: flame—scares—hell. The “verse” closes with one

of the first instances of word repetition, “like me, like me,” which is the refrain that not

only repeats twice there, but appears at the end of each “verse” in the same couplet form.

The chorus takes to an extreme all the versification devices at hand. Its four

“verses” are metrically regular: five syllables each. This gives regularity, balance and a

perfect rhythmic pattern. On top of that, the rhyme scheme is also extremely regular. Not

only do the four lines share an assonant rhyme among them, but line one has perfect

rhyme with line three in the same manner as line two with four. In addition to these

rhyming schemes, the lines present parallelism in their grammatical construction.

Similarities in ideas are brought to the surface by the similarities in sound and

grammatical construction of the two parallel phrases that make up the chorus.

                Subject          Verb      Object (direct or indirect)
Lines 1 and 2   My name Isobel   Married   to myself
Lines 3 and 4   My love Isobel   Living    by herself

TABLE 3.2: Layout of parallel lyrics constructions in the chorus of Björk’s “Isobel”

The first and third lines also open with an anaphora46 that starts only with the repetition

of the opening possessive pronoun “My” and, although the noun is different, is followed

by the same proper name “Isobel.”

The second “verse” follows immediately after this chorus, but the perfectly

balanced sound of the words of the chorus keeps resonating in the ears of the listener.

This is remembered as an instrumental cacophony. The parallel constructions of the lyrics

in the first and second lines and third and fourth lines are set over a melody that repeats

twice for each of these couplets.

By the time the song moves into the second “verse,” the ears of the listener are

already attuned to a musical way of listening to the lyrics. The alliterations of consonants

between the first two lines and inside the third one are more prominent than any other

effect. “In a heart full of dust/ lives a creature called lust” encloses the alliteration of

the following sounds, now quoted in their phonetic symbols: [ɪ]-[æ]-[rt]-[u]-[l]-[ʌst].

Inside the third line, the reiteration of [s] and [r] creates a sensation of smooth continuity

in these sounds and over those that are in between them: “It surprises and scares.”

46
Repetition of a word (or like-sounding words) or a short phrase at the start of successive lines or verses
(Davis, p. 143).

The refrain “Like me” holds our attention because of its redundancy and exact

repetition after its previous appearance. It intends to remark on the correspondence

between the persona of Isobel and the other, which may be one and the same.

The second time, the chorus comes around with its features and effects intact, but

is followed by the unexpected “bridge.” After two rounds of “verse-chorus,” fresh

musical and textual material is more than welcome. Usually the function of the “bridge”

is to introduce further information still not provided in the “verses,” a new perspective

and/or to offset the musical predictability. The “bridge” should also offer contrast in its

meter and versification compared to the verses and chorus. Certainly this is the case in

“Isobel.” The lines are now seven syllables long and have free rhyme. There are no new

parallel grammatical constructions or anaphoras. The previous highly structured lyrics

give way to a looser text approach.

Björk also proposes a musical change in this “bridge.” The melodic setting of

each line consists of a first phrase that carries part of the lyrics, followed by a second

period that is a vocalization of its melodic contour on a babbling “uh, uh, uh…” Attached

at the end of this bridge is a couplet that functions as a “climb” toward the next chorus.

Although climbs normally “pull away from the verse both verbally and musically,

and reach up toward the chorus,” this one is not pulling from known material, but from

the newly introduced bridge’s music.47 This couplet anticipates the five-syllable length of

the chorus lines in an open melodic phrase that does not find its harmonic and melodic

closure until the first notes of the following chorus. Thus, it creates the expectation and

tension usually ascribed to all climbs.

47
Sheila Davis, The Craft of Lyric Writing (Cincinnati, Ohio: Writer’s Digest Books, 1985), p. 55.

This third chorus is sung on a babbling “nana na nana, nana na nana…” By this

time the lyrics of the chorus, which have been repeated twice, have probably been

imprinted on our memory, and the playful musicality of the text has contributed to this

sense. But more than the tangled semantic meaning, what the listener remembers is the

musicality. The babbling is an extension of that musicality, an iconic reminder of the

almost musico/instrumental characteristics of the chorus. The “nana-na-nana” represents

what is remembered of this section.

The conscious or unconscious choice of the babbling syllable “na” also has

certain emotional implications. Part of the delight that any listener feels, and even the

performer experiences while singing, is situated in the regressive value and childhood

connotations that those phonemes evoke. According to Roman Jakobson, the emotional

charge of each phoneme is proportional to the amount of time that the infant has used it

during the prelanguage babbling stage of language development.48 So, any association with

sounds out of their syntagmatic or referential relation to a linguistic sign (a word) refers

us to that period. Nasal phonemes are among the latest acquisitions of children into the

vocabulary, but are part of the mass of sounds used in their onomatopoeia and emotional

manifestations. The “nana-na-nana” certainly brings back those associations in a

pleasurable regression to an earlier age that is compelling and attractive.

Only after a close listening and analysis of the recorded material were Björk’s

comments about the story line of Isobel taken into account. And, surprisingly, as I came

to believe after listening to it, the “na na na nana” proved to be the expression of an

instinctive impulse. In an interview for MTV’s Eurotrash in 1995, Björk tells the story in

this way:

This is the story of Isobel; she was born in a forest by a spark, and as she grew up, she
realized that the pebbles on the forest floor were actually skyscrapers. And by the time
she was a grown-up woman and the skyscrapers had taken over the forest. She found
herself in a city, and she didn’t like all the people there so much, because they were a bit
too clever for her.

She decided to send to the world, all these moths, that she had trained to go and fly all
over the world and go inside windows of people's houses—the ones that were too
clever—and they’d sit on their shoulder and remind them to stop being clever and start to
function by their instincts. They do that by saying “Nah-nah-nan-nah-nah!” to them...
(Björk waves a finger in front of her face)

...and then they’d say “Oh! Sorry! I was being all clever there!” and start functioning on
instinct.49
The rest of the song proceeds with the same tactics described previously. Finally,

what the song offers musically—the rich realm of its text webs and musical events—to its

listener overwhelms any pretension of perceiving its discourse linearly without

distortions or fragmentations. Björk is certainly looking for this kind of musical sonic

experience.

Peter Gabriel’s “Sky Blue,” from his 2003 album Up, shares certain similarities

with Björk’s “Isobel.” Both rely on redundancy and repetition as a way of creating a grid,

which the listener can use to guide him/herself along the fragmented lyrics. But this is

approached differently in each song. Although Gabriel’s lyrics seem more accessible than

Björk’s, they are long and stretched over lengthy periods. This accessibility is due to the

fact that each fragment of Gabriel’s lyrics corresponds to a line of the strophe and a

single musical phrase. Each of these fragments carries a single idea or concept. This

procedure reestablishes some of the integrity missing in the lyrics as a whole. Björk’s

ideas and images stretch over more than one musical phrase or longer musical phrases

and that makes them less graspable.

48
Roman Jakobson, Child Language, Aphasia, and Phonological Universals (The Hague: Mouton, 1968).
49
See http://unit.bjork.com/specials/gh/SUB-12/index.htm. Copyright © 1995-2007 by Björk Overseas Ltd,
(accessed April 16, 2007).

The song is almost seven minutes long. A skillfully crafted handling of the long-

term musical and lyric structure is what makes this song work. I argue that what holds

this song together are mainly its musical elements and not so much the linearity of its

narration—which, as stated before, is rather fragmentary. In a way, the personal story

being told becomes secondary. The listener probably ends up singing the vocal riff of the

chorus (described below) and the “group answer” on “sky blue.” The following diagram

shows the form of the song.

Part      1st   2nd   3rd                4th   5th                6th      7th
Section   A     A’    Chorus (1)         A’’   Chorus (2)         Bridge   Chorus (3)
Layers                Two layers:              Two layers:                 Only riff.
                      lyrics and riff.         lyrics and riff.

TABLE 3.3: Song format of Gabriel’s “Sky Blue”

At the level of the lyrics, Gabriel offers less repetition of whole sections than

Björk, with the obvious exception of the two words of the title, “sky blue,” which are

introduced every other line of the “verses.” The simple instrumentation of the band and

bare chord accompaniment of the very first “verse” provide a chance to understand the

opening lines clearly. The first and fifth lines resort to another highlighting literary

device, parallel grammatical constructions such as “Lost my time/lost my place in” and “I

know how to fly/I know how to drown in.” The first, second and third lines have internal

and final assonant rhyme.

(verse 1)
Lost my time lost my place in
Sky blue
Those two blue eyes light your face in
Sky blue
I know how to fly, I know how to drown in
Sky blue.

These kinds of structures keep recurring in the next verses. The second “verse”

holds a literary palindrome construction at a semantic level: “I sing through the land, the

land sings through me.” It also contains the alliteration of the phonemes [w], [m] and [n]

on “warm wind blowing.” The third verse ends each of its lines with an assonant final

rhyme: “goodbyes/ sky/denies.”

(verse 2)
Warm wind blowing over the earth
Sky blue
I sing through the land, the land sings through me
Sky blue
Reaching into the deepest shade of
Sky blue

(verse 3)
Train pulled out said my goodbyes
Sky blue
Back on the road alone with the sky
Sky blue
There’s a presence here no one denies
Sky blue

By the second “verse,” however, all of these literary and versification devices are

almost overlooked. The only structural repetition that appeals to the listener’s attention is

the group vocalization of “sky blue.” These two words are sung by a choir that answers

the opening melody proposed by the soloist, Gabriel. The first time around it is actually

Gabriel who answers his own “calling.” From the second “verse” on, a group of voices

takes over the “sky blue” response. This procedure reproduces the typical “call-and-

response” form of several Afro musical styles. This “sky blue” response also acts as the

“hook” in the song. The hook is the repeated section of the song that more often than not

contains the song’s title, but it could also be a melodic phrase. In this case, the song

makes use of both things: a particular melodic snippet that answers the previous “call,”

the melodic opening phrase, the lyrics of which are the song’s title.

The hook serves the function of snagging the listener into the song and grabbing

his/her attention. The hook remains in memory even after the song is over. Booth says

“self-reference is often visible in the verbal form of…[a] hook, returning upon itself as a

paradox, or as a repetitive regression, or as an absurd phrase refusing to connect to the

expected context.”50 The lyrics of this particular hook, “sky blue,” do not hold any

paradox or absurdity but only a certain innocent redundancy. Although at the beginning

of the song, “sky blue” is the grammatical continuation of a phrase started in the

preceding line, by the second or third verse this discursive connection is discontinued.

“Sky blue” holds a loose and indirect relation to the previous phrase: “warm wind

blowing over the earth/sky blue/I sing through the land, the land sings through me/sky

blue;” “Train pulled out said my goodbyes/sky blue.” Musically speaking, however, the

little melodic phrase of “sky blue” keeps connecting as the closing answer to the

“calling.” The following score of the first verse shows this musical procedure:

50
Booth, 179.

EX. 3.7: Transcription of the opening eleven measures of Gabriel’s “Sky Blue”

The chorus, a typical Western musical element of the song form, actually makes

use of another Afro musical device. This time, the group of voices sings a four-measure

riff over which the soloist performs a quasi-parlato melody that gives the impression of

being improvised. The fragmented and scattered nature of its melodic contour provides

its improvised character. Of course, the force of tradition has some influence on what is

perceived and the fact that improvisation is what is expected in this kind of musical genre

—or at least the style that the song is trying to evoke—reinforces the way we listen to it.

The following score shows the progressive overlapping of these vocals in chorus

1 and 2.

EX. 3.8: Layout of solo and vocals in the first chorus of Gabriel’s “Sky Blue”

EX. 3.9: Layout of solo and vocals in the second chorus of Gabriel’s “Sky Blue”

As can be observed in the score above, the riff is a four-measure pattern in 3/4

made of two phrases built on the same harmonic progression that repeats

throughout the chorus: C# minor–A major–B major–G# minor. These two phrases are

identical with the exception of their resolution, alternately the last note on G4 or F4. By

the second chorus, the increase in repetitions of the complete four-measure version and

earlier introduction of the riff contribute to the shift of the listener’s attention from the

solo voice to this group vocalization.

(Music) surely provides the shortest, the least arduous, perhaps even the most natural
solvent of artificial boundaries between the self and others…The words of folk song…are
not directed by one person to another or by many persons to many others; the voice is
that of the group…there is no “other being,” no mere listeners…If one member happens
to lead the chorus, his words are certainly not addressed to the others…He does not tell
them anything they don’t know; he does not speak to the others but for them.51

After the first three times that the soloist proposes the “sky blue” answer, we are

invited to join and participate in this collective act, vocalizing these words. From then on,

it becomes a habit to participate in the game through the riff of the chorus until the end,

when only the collective voice is left.

Empirical Data: Questionnaires’ Results

To supplement the observations obtained from the previous popular song

analyses, I undertook an experimental project, in which questionnaires were distributed to

a group of forty-six college students, the “listeners” referred to throughout.

Although I intend to apply my hypothesis to listeners in general, it was

necessary for practical reasons to limit my exploratory subjects to college students. These

undergraduate students of Hartwick College, a small private liberal arts college in upstate

New York, were between eighteen and twenty-three years of age (with the exception of

three older students between thirty-four and forty-five years old). Some of these students

were music majors (twenty-four people) and others were from other degree programs such as

visual art (twelve people) and modern languages (ten people). Thus, the analytical group

embraces a more comprehensive universe of people and interests beyond music, and the

possibility of obtaining a misleading result is reduced. Otherwise, the exclusive

51
Booth, 18-19. From Victor Zuckerkandl, Man The Musician, trans., Norbert Gutterman. Bollingen Series

selection of music students could be questioned because of the possible conditioning by

their musical training. Interestingly enough, the results proved to be similar among all the

groups. For this reason, the data was processed and presented all together, and not

separated by group.

The stimulus was systematic and consistent in each administration of the

questionnaire. Every group listened to the same recorded songs and same versions of

them. The recordings used in this experimental project were the same songs analyzed in

the previous section: Fitzgerald’s version of “All the Things You Are;” Bob Dylan’s

“Simple Twist of Fate;” Björk’s “Isobel” and Peter Gabriel’s “Sky Blue.”

The experience took place in a formal non-structured environment. Formal

because the listening action was not observed in its ordinary place but an artificially set

one, a classroom; and non-structured because all the variables intervening in the act of

listening were not controlled during the experience. Faced with the impossibility of

observing people in their natural environment where they perform the listening as an

everyday activity (concert, home, ambient music in cafes or any other social situation in

which music is encountered), the experience took place in classrooms where the subjects

were asked to listen to the songs without any specific instruction. Only after the pieces

were played once were they asked to open the page in front of them and read the

questions.

At all times the intention was to minimize the artificiality of the situation and

reproduce as much as possible the casual listening that people experience in everyday

life. Having a blind first exposure to the songs and ignoring the actual goal of the

experience by hiding the particular questions from the listeners provided a chance for an

44.2 (Princeton: Princeton University Press, 1973), 51, 24-25, 26-27.

objective result. Although a certain kind of general attentive listening may have been

taking place, at least the subjects were not guided from the beginning of the experience to

observe and retain individual musical events in an “analyzing” or “deceptive” listening

way.

Finally, the same five questions were presented to each group. These questions

were designed in a non-systematic and open manner. The listeners did not have

pre-formulated answers to choose from. This helped ensure that their responses were not

influenced or narrowed by any outside instruction. Each listener volunteered free-form

responses, which I later grouped, noting the frequency of each. Thus, the

variables found in the following tables are not a mere listing of the people’s answers, but

a grouping in categories, which embrace and conceptualize their spontaneous responses.
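
The grouping step described above can be sketched in code. The following Python fragment is only an illustration under stated assumptions: the keyword codebook (`CODEBOOK`), the helper names and the sample answers are all hypothetical, and in the study itself the responses were grouped by reading them, not by keyword matching.

```python
from collections import Counter

# Hypothetical codebook: category -> keywords. Illustrative only; the
# questionnaire answers in this study were grouped by hand.
CODEBOOK = {
    "lyrics' fragments": ("lyric", "words", "phrase"),
    "voice sound": ("voice", "vocal tone"),
    "band sound": ("band", "instrument", "mix"),
    "rhythm or beat": ("rhythm", "beat"),
}


def code_response(text):
    """Return every category whose keywords appear in a free-form answer."""
    lowered = text.lower()
    matches = [
        category
        for category, keywords in CODEBOOK.items()
        if any(keyword in lowered for keyword in keywords)
    ]
    # Answers that match no category fall into a catch-all group.
    return matches or ["Other"]


def tally(responses):
    """Count category frequencies per song from (song, answer) pairs."""
    counts = {}
    for song, answer in responses:
        counts.setdefault(song, Counter()).update(code_response(answer))
    return counts


# Invented sample answers, not data from the questionnaires.
sample = [
    ("Sky Blue", "The voice and the beat"),
    ("Sky Blue", "I remember the words 'sky blue'"),
    ("Isobel", "Her voice"),
    ("Isobel", "It reminded me of a film"),
]
result = tally(sample)
```

A single answer may count toward several categories, which is why the figures in a column of the tables need not add up to the number of listeners.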

Question 1: What do you remember from the song?

SKY BLUE   SIMPLE TWIST OF FATE   ISOBEL   ALL THE THINGS YOU ARE

lyrics’ fragments: 6 6 1 4
instrumental solos: 1 3 1
melodies or harmonization: 2 4 4
rhythm or beat: 3 2 3 1
back up vocals or other non-word vocalizations: 8 (It does not apply) 2 (It does not apply)
band sound (instrumentation, mix and effects): 8 7 4 3
voice sound: 8 8 5 5
performance style: 1 4 2 4
character, mood: 1
Other: 2 (dark tone of piece; overall sound) 1 (everything, I know the song) 1 (Lion King?) 2 (enjoyable, familiar; I knew the song)

TABLE 3.4: Results from question #1

The following is a list of the specific fragments of lyrics that the listeners

reported remembering:

For Dylan’s “Simple Twist of Fate”:


• “Simple twist of fate” (3 answers)
• “twist of fate” (2 answers)
• “fate” (2 answers)
• all the lyrics, she knows the song (1 answer)

For Gabriel’s “Sky Blue”:


• “Sky blue” (4 answers)
• “Blue” (1 answer)

For Björk’s “Isobel”:


• “My name is...” (1 answer)

For Fitzgerald’s version of “All the Things You Are”:


• “You are” (x)
• “All the things you are, are mine.” (x)
• “springtime” (x)
• lyrics in general without specification (x)

During this first exposure to the four songs, the listeners showed a special interest

in the particularities of the performers’ voices and the sound of the bands of each track.

From the few phrases that they could remember, we gather that these were in strategic

places of the songs. They were either part of the songs’ refrains or part of the choruses.

These are sections that usually work through the song by melodic and harmonic

repetition, creating a dent in the listeners’ memory with their recurrent musical schemata.

Furthermore, when we cross-reference the information obtained from the analysis of each

song with these results, it is possible to observe that the words or phrases that the

listeners remembered were highlighted by additional procedures such as parallel lyric

constructions, manipulation of the rhythmic pace of the piece over those words, or simple

lyric redundancy (recurrence of the same word). In the particular case of Gabriel’s “Sky

Blue,” the hypothesis proposed during the analysis was confirmed: the back up vocals

performed alongside Gabriel’s solo in the chorus were a main focus of the listeners’ attention.

Question 2: Why do you think you were able to remember specifically that?

SKY BLUE   SIMPLE TWIST OF FATE   ISOBEL   ALL THE THINGS YOU ARE

Focused on the voice’s sound because of its prominence in the mix: 2 1
Focused on vocal or instrumental sounds because the richness of colors makes them memorable: 2 1 2 6
Focused on instrumentation because gave a particular feeling and character to the song: 3 1 1
Focused on that word or phrase because was repeated several times: 5 1
Focused on chorus because of its strategic placement and contrasting quality: 7 3 3
Focused on instrumental intro, interlude and end because were playing by themselves for a while: 2
Focused on that line because sticks out: 1
Focused on voice quality because of its compelling, different and evocative sound: 2 3 1
Focused on lyrics because I knew the song: 1 4
Focused on melody because the song is catchy, repetitive: 1 2
Focused on rhythm because of danceability: 2
Focused on melody because was repeated constantly: 1 1
Focused on harmony and rhythm because unexpected, interesting: 1 3 1 2
I liked those things, because I liked the song, or the complete opposite, it was annoying: 3 1 (annoying)
Other: 1 2 (folk make you focus on male voice, guitar and harmonica) 1

TABLE 3.5: Results from question #2

The second question proved to be the most difficult to categorize and organize in

separate variables: first, because it is completely linked to the first question, and second,

because the many alternative combinations of factors presented a challenge at the

moment of synthesizing them into more embracing categories. But it proved useful in

confirming tendencies already marked in the answers given to the first question. The

reasons given by the listeners coincide with the information obtained in the songs’

analyses section. Thus in the particular case of “All the Things You Are,” the listeners

found themselves attracted to Fitzgerald’s vocal tone because of the richness of its color.

Dylan, Gabriel and Björk directed the attention of the listeners towards the chorus or

refrain of their songs by a crafted handling of the form which creates momentum and

expectancy, either by contrast or repetition of words and music.

Question 3: Is the song telling some kind of story? Briefly describe what it is about.

SKY BLUE   SIMPLE TWIST OF FATE   ISOBEL   ALL THE THINGS YOU ARE

General description of topic in one word: 1 4
Not sure, I paid attention to the sound in general: 2 1
I do not remember; I do not know: 4 5 7 1
There is no story that I remember: 2 1
In a new song I listen to the music and not pay attention to words: 1 1
Too abstract to grasp meaning: 1
I believe there was one, but I do not remember: 2 2
More detailed description, but still general, no specifics: 1 5 9
Too busy listening to the music: 2
It does not tell a story, it paints a picture, shows emotions: 1 2
Blank answer, no idea: 4

TABLE 3.6: Results from question #3

The answers to the third question show the difficulty of grasping the meaning of

the lyrics after only one listening. Most of the listeners said that they could not remember

what the song was about or they did not know. The only two songs that seem to be more

accessible in a first time listening situation were “Simple Twist of Fate” and “All the

Things You Are.”

Question 4: How does the song make you feel? Why? Which elements of the song put
you in that mood? Musical elements, voice quality, words?

Simple content/ think/ Being


relaxed/ pensive sad Nostalgia fun Curious good reflect outdoors,
Twist of restful mood like riding
Fate on the road
not connected 1 1 1 1 1 3
to any music
element
Simplicity 1 1
catchy and 1 3
friendly beat
Interesting 1 2
chord
progression
serene tone, 2 1 1
awesome vocal
quality
soft and 2 1
smooth guitar
part
because of the 3 1
lyrics, story-
telling

TABLE 3.7: Results from question #4 on Dylan’s “Simple Twist of Fate”

Sky Blue no answer reflective/ peaceful/ deep/ melanc good Yearning problematic/ uplifting
spiritual relaxed intense -olic mood melodrama-
tic
no musical 2 1 1 1 1 1 1
element related
no answer 1
soothing vocal 2
quality
beat and tempo 1 1 2
Chorus 1 1
because of the 2 2
accompany-
ment

TABLE 3.8: Results from question #4 on Gabriel’s “Sky Blue”

Isobel relaxing involved / attracted sad / dramatic good sleepy/ in trance angry

band/ 6 1
accompani-
ment
pulsating beat 2 1
her voice/ her 2 1 1
vocal quality
general pace of 1 1
the music
repetition of 1
sections and
melodic
material

TABLE 3.9: Results from question #4 on Björk’s “Isobel”

All the light / happy like dancing Good like singing tunes in
Things a bar
You Are
no musical 1 1 1 1
element related
upbeat / 1 2 2
danceable
major chords 2
singer’s vocal 1
interpretation
musical style 1

TABLE 3.10: Results from question #4 on Fitzgerald’s version of “All the Things You Are”

Question four required a separate table for each song. The answers regarding

moods and sensations that arose while listening to the songs were far too many and

idiosyncratic to each song. The cause-effect relationship among these variables is

individual to each piece. This question also received the most ambiguous and personal

answers.

This question had an ulterior purpose. Asking about the mood or emotional state

was only a way of obtaining the information actually sought: which elements of the song

were the listeners paying attention to? The results of this question complement those

from the first and second questions, where listeners were requested to tell what they

remembered from the heard songs and why. Seeking the same information from a new

angle confirmed tendencies already marked in those two previous questions. Except for

“Simple Twist of Fate,” where the lyrics seemed to have some influence,

listeners indicated that their moods arose from the vocal

tone of the performers, the band sound and the accompaniment, or the beat of the song.

For the most part there was no mention of the lyrics.

Question 5: Does the story of the song have a protagonist? Who is speaking to you in
this song? Who is narrating or describing the situation?

5. Does the story of the song have a protagonist? Who is speaking to you in this song?

SKY BLUE   SIMPLE TWIST OF FATE   ISOBEL   ALL THE THINGS YOU ARE

do not know: 3 3 7 1
Yes: 1 4 2
No: 2 2 2
the narrator of the song: 1
the writer of the song: 2
fictional character: 1
the singer: 3 3 1 4
blank answer: 2 2
the orchestra: 1

TABLE 3.11: Results from question #5

Question number five is directly related to how much of the story or description

the listener was able to grasp. It was too complex for listeners to determine the nature and

identity of each song’s protagonist or narrator in a first listening. The fragmentary story

gathered in this brief exposure did not provide sufficient information. The answer could

be established only after a careful attentive listening to the lyrics of each song. In the act

of reading or even oral story-telling there is enough information in other sentences

around, time to go back mentally, and link previous statements to arrive at a satisfactory

conclusion.

Because of the nature of the question and the answers received, tables do not

clearly translate the information obtained. In this specific case, it is more appropriate to

proceed to describe and group the results in the following paragraphs and then show the

bare data in a table format.

The formulation of the question itself is vague and imprecise, but that decision

could be justified by the need to not influence or direct the answers of the listeners. Its

outcome was of special interest for this exploratory project. It actually proved how much

further attentive listening is needed to comprehend text at the level required to resolve

this issue: who is the “persona” singing through the song?

The question is ambiguous in itself. It may have several correct answers. It

engenders in itself two enigmas: first, the difference between the voice singing and the

first person in the narration; second, the difference between the protagonist of the story

and narrator. These could coexist in one person or they could be three different people.

Of the people questioned, all but one failed to notice the switch of

narrator in Bob Dylan’s “Simple Twist of Fate.” The first five verses are narrated in third

person as if the story of these two lovers was told by somebody else. The last verse

switches to the first person; the narrator becomes the protagonist (the male lover). Only

one of the listeners made a special comment about this.

Some subjects established a difference between when the singer was talking from

a personal experience and when she was interpreting and voicing a fictional character.

One could argue that such a distinction results from associating authenticity values

with certain musical styles more than others. For example, folk or grassroots influenced

musical styles such as Dylan’s song are expected to be sincere, personal and intimate

story-telling of the singer’s past experiences, while pop singers could take different

masks and become different characters.

Of the songs used in the experiment, “All the Things You Are” is the only

one in which the composer and author were different from the singer: the composer is

Jerome Kern and the singer Ella Fitzgerald. But that does not mean that the author of the

lyrics and the narrator (the “I” first person of the story or protagonist) are the same

person. In the other three cases, the singers are also the composers: Bob Dylan’s “Simple

Twist of Fate,” Björk’s “Isobel,” and Peter Gabriel’s “Sky Blue.” And again they may or

may not be the protagonists of their own stories.

The most problematic and unexpected answers were the straight “yes” and “no.”

The “yes,” besides not providing any specification of who they think is the protagonist,

does not give a precise idea if they really understood something from the story. They

could be assuming—a generalized idea—that any story has a protagonist as default. But

on the contrary, that is not the only option. The lyrics of a song could be unconnected

ideas, a description, loose words, or the perspective of who is talking in this song could

be very vague, unpredictable or completely absent. The “no” answers did not provide any

insight about the listeners’ understanding of the song.

“Isobel” was, according to the results, the most confusing song of all. Most of the

people did not know who was speaking or who was the protagonist, and the others

directly said that there was none. Observing the answers to some of the previous

questions about this same song, it appears that the listeners did not grasp the lyrics of this

particular song and their attention was mainly devoted to other aspects of the piece.

In order to arrive at a satisfactory answer in question five as well as in question

three, the listener needed at least a second listening. The second time the listeners were

prepared to pay attention to certain aspects of the lyrics. Text was listened to in an

attentive manner guided by the formulated questions.

Question 6: After listening for a second time to the same songs, do you feel you grasped
more of the meaning of the lyrics? Why? Only because you have a second chance or
because you pay more attention guided by the questions? What is the story about in every
song?

In the answers given for question number six, the last one of this experience, the

listeners confirmed that only after this second time could they start to understand the

song’s content. They also admitted that this time they paid more attention guided by the

questions they already knew. Some of them specifically pointed out that they usually do

not pay attention to the lyrics the first time they listen to a song. They immerse

themselves in the music: the singing, the band, the melody, the instrumental solos, the

harmonies. Only after repeated listening do they feel they concentrate on the lyrics.

As the end result of this experiment, we can conclude that although listeners do

not ignore completely the songs’ lyrics, they tend to remember only certain isolated

words or short phrases. These lyrics’ fragments are usually part of choruses, refrains or

short motives that work through the song by melodic and harmonic repetition—on top of

the repetition of the words themselves. Only after listening several times in an attentive

manner may people grasp the meaning of the lyrics. But otherwise, their attention is

diverted towards mostly sonic aspects of the performer’s voice, the band or the

arrangements of the songs.

V.

CONCLUSION

In their compositional approach to song, songwriters and composers evidence a

conscious or instinctive knowledge of how people tend to listen to vocal music. They

manipulate their textual and musical materials either to compensate, reinforce or oppose

the usual “poetic mode” of listening.

Faced with the challenge of the unavoidable fragmentation of text under any kind

of musical setting, songwriters and composers emphasize words from their lyrics or

poetic texts that they hope will help listeners to create their own narratives. Although the

songwriters and composers mold this emphasis on certain words according to their own

readings of their texts, the listener will reinterpret the text through his or her own reading,

to create a possible meaning for the song to which they are listening.

The vocal examples analyzed in this dissertation make the case for how different

compositional approaches act on the way people listen to text set to music. I start from

the idea that any musical setting of text produces a natural disruption of the discourse.

Even monody and recitative, with their bare settings, produce a certain degree of

disruption. The fact that they mount the words to sung tones already invites the listener to

shift her or his attention to the “songfulness” of the performing voice—paying attention

to the timbral color of its sustained sound.

Caccini’s monody and Handel’s and Mozart’s recitatives operate in a “poetic

mode” of perception, which privileges the “speech qualities” of the text. In their score

settings, they follow the prosodic characteristics of the text phrases as closely as possible.

On the one hand, they set the text phrases to melodic patterns that mimic their intonation

shapes pitch-wise. On the other hand, they also replicate the kinds of prolongation

performed over accented syllables and shorter values of the syllables in between with

similar musical rhythmic patterns. But the shapes and rhythm indicated in the score only

serve as points of departure for the real interpretation of the performer. It is in this

instance that both monody and recitative become quasi-speech. Any seventeenth-century

composer of monodies—as well as composers of recitative in any period in history—

expects flexibility in the tempo, without a steady beat in the performance of their settings.

This beat fluctuation gives the performer the chance to vary the articulation pace

according to her or his dramatic interpretation of the different phrases, creating the

definitive sense of speech so characteristic of monodies and recitatives.

Nineteenth-century Lied serves as an example of the way the balanced “poetic

mode” operates. Although Reichardt and Zelter proclaimed their intentions of making

their musical settings a natural extension of the text—allowing it to speak for itself—and

of limiting their musical intervention as much as possible, their settings are “songful”

renditions far from any speech quality of the text. Their simple melodies—constrained in

register, syllabic, and stripped of embellishments and extended vocalizations—did not

have any speech quality but those of any other sung melody. The lack of modification in

text order and piano interludes aims to limit the disruption of the narrative flow.

Although the repetitious nature of their strophic settings and unobtrusive chord

accompanying textures allow listeners to ease their attention from the pure musical

elements of the song, they still apprehend the text in the “poetic mode” of listening. The

listeners hear the original sound patterns of the poem mounted over the “songfulness” of

the sustained tones of a voice singing a melody. Dylan’s “Simple Twist of Fate” operates

in a similar manner to Reichardt’s and Zelter’s settings with the strophic form holding his

own lyrics.

Beethoven’s, Schubert’s and Schumann’s settings of Mignon’s Lied further

emphasize the “poetic mode” of processing by unleashing their compositional creativity

in further elaborated arrangements abounding in piano interludes, more complex and

colorful accompanying textures and harmonies and text modification. By the same

means, they also built musical forms that directed the attention of the listener to certain

specific words or phrases in their texts that synthesized the meaning behind the narration.

They highlighted these text fragments by creating harmonic tension, announcing or

delaying this phrase with piano interludes, repeating it several times, detaining the

rhythmic flow of the piece, etc.

Monk’s Volcano Songs as well as Berio’s A-Ronne constitute a conscious

representation of the way listeners perceive text in the “poetic mode.” By

overemphasizing the sonic aspects of language, they concretely manifest what we hear

and how we hear it. In the hands of Monk, this process explores the human voice’s

timbral and gestural possibilities as deployed by any kind of syntactic text. Berio departs

from literary sources subjected to fragmentation and masking processes, which interfere

with their intelligibility. By breaking words into their phonetic components, overlapping

multi-texts, exploring different masking vocal gestures and paralingual sounds, he brings

awareness of speech elements that for the most part we unconsciously hear but do not pay

attention to in musical settings.

The four popular songs analyzed in the final section introduce four different

procedures of manipulating song structure to overcome or emphasize the “songfulness”

effect. What do songwriters have to do when they want to give a place to the narrative of

the lyrics? And what do they have to do when they want to indulge in this “songfulness”

effect and engage their audience in an experience of emotional and physical involvement

with their songs?

Observing the questionnaire’s results, we see that listeners do not completely

ignore text in songs, but they tend to remember only certain isolated words, mainly for

musical reasons. These words tend to occupy a prominent place in the songs, either by 1)

repetition of the word itself, 2) highlighting techniques such as slowing down the

rhythmic pace over the words, 3) repetition of the melodic motive over which the words

are set, or 4) over-articulation of the words’ phonetic components in performance.

The songwriters of the four songs used in the listening experiment—the same

songs analyzed in the previous section—show awareness of these highlighting procedures

and a crafted use of them. The highly structured song form of the Tin Pan Alley “All the

Things You Are” gives the listener a certain musical predictability but otherwise, like any

other song, resorts to musical manipulation to direct the attention of the listener toward

certain memorable phrases of the lyrics. Additionally, as we were able to observe,

Fitzgerald’s and Streisand’s versions bring out different qualities in the same song. In

“Simple Twist of Fate,” Dylan acts over the “songfulness” effect and musicality of the

abundant sound patterns of his lyrics by using a repetitive strophic setting and a “quasi-

speech” vocal quality in his performance. Björk and Gabriel anchor their songs, “Isobel”

and “Sky Blue,” on certain phrases, such as refrains, or on engaging vocal riffs and

choruses, which for the most part are sung on nonsense syllables. The listeners are taken

through structures that build expectancy and redundancy devices.

Aside from paying attention to the strict musical elements of the piece, listeners

predominantly perceive the sonic or musical aspects of its lyrics: the colors of its

phonemes; the prosodic arch of its phrases’ intonation; the sonic quality of the

performing voice; the specific colors and inflections that the voice adopts at each phrase.

Composers know that audiences engage with their songs through these musical

gestures resulting from the alchemy of music and words. Even Goethe, despite his

caution against oversensitive and overcomplicated settings of his poems, had to admit

that only when words are set to music “is the poetic inspiration, whether nascent or fixed,

sublimated (or rather fused) into the free and beautiful element of sensory experience.

Then we think and feel at the same time, and are enraptured thereby.”1 Then we hear the

music of the words.

1
Goethe to Zelter, 21 December 1809; quoted in Eric Sams and Graham Johnson, “Lied (IV),” in New
Grove Dictionary of Music and Musicians, Vol. XIV (2nd ed. New York: Grove, 2001), 672.

BIBLIOGRAPHY

Aiello, Rita and John Sloboda, eds. Music Perception. New York, Oxford: Oxford
University Press, 1994.

Agawu, Kofi. “Theory and Practice in the Analysis of the Nineteenth-Century ‘Lied.’” In
Music Analysis 11, no. 1 (March 1992): 3-36.

Barthes, Roland. The Grain of the Voice: Interviews 1962-1980. Berkeley and Los
Angeles: University of California Press, 1985.

Bauman, Richard, ed. Folklore, Cultural Performances, and Popular Entertainments: A
Communications-Centered Handbook. New York: Oxford University Press, 1992.

Berger, Karol. A Theory of Art. New York, Oxford: Oxford University Press, 2000.

Berio, Luciano. A-ronne: documentary for 8 singers on a poem by E. Sanguinetti. Wien:
Universal Edition, 1975.

———. “Poesia e musica un’esperienza.” In Incontri Musicali 3 (1959): 98-110.

——— and Swingle II. A-ronne. London: Decca, HEAD 15, 1976.

Beethoven, Ludwig V. Lieder und Gesänge mit Klavier. München: G. Henle, 1992.

Björk. Post. Elektra Entertainment Group. 61740-2, 1995.

Björk. Telegram. Elektra Entertainment Group. 61897-2, 1996.

Bolinger, Dwight. Intonation and Its Uses: Melody in Grammar and Discourse. Stanford,
California: Stanford University Press, 1989.

———. Aspects of Language. New York/Chicago/San Francisco/Atlanta: Harcourt, Brace
& World, Inc., 1968.

Boulez, Pierre. Orientations: Collected Writings. Ed. Jean-Jacques Nattiez. Cambridge,
Mass.: Harvard University Press, 1986.

Booth, Mark W. The Experience of Songs. New Haven and London: Yale University
Press, 1981.

Brackett, David. Interpreting Popular Music. Berkeley, Los Angeles, London: University
of California Press, 2000.

Calcagno, Mauro. “Signifying Nothing: On the Aesthetics of Pure Voice in Early
Venetian Opera.” In The Journal of Musicology 20, no. 4 (Autumn 2003): 461-497.

———. “‘Imitar col canto chi parla’: Monteverdi and the Creation of a Language for
Musical Theater.” In Journal of the American Musicological Society 55, no. 3 (Fall
2002): 383-433.

———. “Monteverdi’s parole sceniche.” In Journal of Seventeenth-Century Music, vol.
9, no. 1 (2004). Http://www.sscm-jscm.org/jscm/v9/no1/Calcagno.html

Cone, Edward. The Composer’s Voice. Berkeley, Los Angeles, London: University of
California Press, 1974.

Cone, Edward. “Words into Music: The Composer’s Approach to the Text.” In Sound
and Poetry. New York, London: Columbia University Press, 1957.

Coste, Didier. Narrative as Communication. Minneapolis: University of Minnesota Press,
1989.

Crystal, David. Prosodic Systems and Intonation in English. London: Cambridge
University Press, 1969.

Dahlhaus, Carl. Nineteenth-Century Music, trans. J. Bradford Robinson. Berkeley, Los
Angeles: University of California Press, 1989.

Dalmonte, Rossana and Bálint András Varga, Two Interviews/Luciano Berio, trans.
David Osmond-Smith. New York: M. Boyars, 1985.

Dame, Joke. “Voices Within the Voice: Geno-text and Pheno-text in Berio’s Sequenza
III.” In Music/Ideology: Resisting the Aesthetic, ed. Adam Krims. Amsterdam: G&B
Arts International, 1998.

Daverio, John. Robert Schumann: Herald of a “New Poetic Age.” New York-Oxford:
Oxford University Press, 1997.

Davis, Sheila. The Craft of Lyric Writing. Cincinnati, Ohio: Writer’s Digest Books, 1985.

Dreßen, Norbert. Sprache und Musik bei Luciano Berio: Untersuchungen zu seinen
Vokalkompositionen. Regensburg: Bosse, 1982.

Duckworth, William. Talking Music. New York: Simon & Schuster Macmillan, 1995.

Dylan, Bob. Blood on the Tracks. CBS CDBS 69097, 1975.

Fitzgerald, Ella. Ella Fitzgerald Sings the Jerome Kern Song Book. Verve Records 314
519 847-2, 1993.

Forte, Allen. The American Popular Ballad of the Golden Era, 1924-1950. Princeton,
N.J.: Princeton University Press, 1995.

Frith, Simon. Music for Pleasure: Essays in the Sociology of Pop. Cambridge, Oxford:
Polity Press, 1988.

———. Performing Rites: On the Value of Popular Music. Cambridge, Mass.: Harvard
University Press, 1996.

Fubini, Enrico. A History Of Music Aesthetics. London: The Macmillan Press Limited, 1990.

Gabriel, Peter. Up. Geffen Records 0694933882, 2002.

Goethe, Johann W. von. Wilhelm Meister’s Apprenticeship, ed. and trans. E. A. Blackall.
New York: Suhrkamp Publishers, 1989.

Hill, Walter J. “Beyond Isomorphism Towards a Better Theory of Recitative.” In Journal
of Seventeenth-Century Music, vol. 9, no. 1 (2004).
Http://www.sscm-jscm.org/jscm/v9/no1/Hill.html

Hirst, Daniel. “Intonation in British English.” In Intonation Systems: A Survey of Twenty
Languages. Cambridge, UK: Cambridge University Press, 1998.

Iser, Wolfgang. The Act of Reading: A Theory of Aesthetic Response. Baltimore and
London: The Johns Hopkins University Press, 1978.

Jakobson, Roman. Child Language, Aphasia and Phonological Universals. The Hague:
Mouton, 1968.

Jakobson, Roman. Language in Literature. Cambridge, London: The Belknap Press of
Harvard University Press, 1987.

Jowitt, Deborah, ed. Meredith Monk. Baltimore: The Johns Hopkins University Press,
1997.

Kern, Jerome and Oscar Hammerstein II. All the Things You Are. Polygram International
Publishing, Inc., 1939.

Kramer, Lawrence. Music and Poetry: The Nineteenth Century and After. Berkeley:
University of California Press, 1984.

———. Musical Meaning: Towards a Critical History. Berkeley and Los Angeles,
California: University of California Press, 2002.

Lewin, David B. “Figaro’s Mistakes.” In Engaging Music: Essays in Music Analysis, ed.
Deborah Stein. New York-Oxford: Oxford University Press, 2005.

Liberman, A. M., and David Isenberg. “Duplex Perception of Acoustic Patterns as Speech
and Nonspeech.” In Status Report on Speech Research SR-62. Haskins Laboratories
(1980): 47-57.

———, I. M. Mattingly and M. T. Turvey. “Language Codes and Memory Codes.” In
Coding Processes in Human Memory, ed. A. Melton and E. Martin. New York: Winston,
1972.

———, F. S. Cooper, D. P. Shankweiler and M. Studdert-Kennedy. “Perception of the
Speech Code.” In Psychological Review 74 (1967): 431-61.

Lieberman, Philip. Intonation, Perception, and Language. Cambridge, Massachusetts:
The M.I.T. Press, 1967.

Lodato, Suzanne M. “Recent Approaches to Text/Music Analysis in the Lied: A
Musicological Perspective.” In Word and Music Studies 1: Defining the Field, ed.
Walter Bernhart, Steven Paul Scher and Werner Wolf. Amsterdam-Atlanta, GA:
Rodopi, 1999.

Lyotard, Jean-François. “A Few Words to Sing.” In Music/Ideology: Resisting the
Aesthetic, ed. Adam Krims. Amsterdam: G&B Arts International, 1998.

MacClintock, Carol, ed. The Solo Song 1580-1730. New York: W.W. Norton &
Company, Inc., 1973.

Menezes, Flo. Luciano Berio et la Phonologie: Une Approche Jakobsonienne de son
Oeuvre. Frankfurt, Berlin, Bern, New York, Paris, Wien: Peter Lang, 1993.

Middleton, Richard. Studying Popular Music. Philadelphia: Open University Press, 1990.

———, ed., Reading Pop: Approaches to Textual Analysis in Popular Music. Oxford,
New York: Oxford University Press, 2000.

Minsky, Marvin. “Music, Mind and Meaning.” In Music, Mind and the Brain: The
Neuropsychology of Music, ed. Manfred Clynes. New York, London: Plenum Press,
1982.

Monelle, Raymond. The Sense of Music: Semiotic Essays. Princeton and Oxford:
Princeton University Press, 2000.

Monk, Meredith. Volcano Songs. ECM 1589 453 539-2, 1997.

Mozart, W.A. Don Giovanni. Opera completa per canto e pianoforte. Milano: Ricordi,
1946.

Nattiez, Jean-Jacques. Music and Discourse: Toward a Semiology of Music. Princeton,
New Jersey: Princeton University Press.

Neubauer, John. The Emancipation of Music from Language: Departure from Mimesis in
Eighteenth-Century Aesthetics. New Haven, London: Yale University Press, 1986.

Osmond-Smith, David. Berio. Oxford, New York: Oxford University Press, 1991.

———. Playing with Words: A Guide to Luciano Berio’s Sinfonia. Cambridge: B.
Jordon Music Books, 1985.

Palisca, Claude V. Music and Ideas in the Sixteenth and Seventeenth Centuries. Chicago:
University of Illinois Press, 2006.

Repp, B., C. Milburn and J. Ashkenas. “Duplex Perception: Confirmation of Fusion.” In
Perception & Psychophysics 33, no. 4 (1983): 333-337.

Schwarz, Robert. Minimalists. London: Phaidon Press Limited, 1996.

Tsur, Reuven. What Makes Sound Patterns Expressive? The Poetic Mode of Speech
Perception. Durham and London: Duke University Press, 1992.
Reichardt, Johann F. 31 Lieder, Oden, Balladen und Romanzen. Huntsville, Tex.: Recital
Publications, 2000.

Rosen, Charles. The Romantic Generation. Cambridge, Massachusetts: Harvard
University Press, 1995.

Rossi, Mario. “Intonation in Italian.” In Intonation Systems: A Survey of Twenty
Languages. Cambridge, UK: Cambridge University Press, 1998.

Schachter, Carl. “Motive and Text in Four Schubert Songs.” In Engaging Music: Essays in
Music Analysis, ed. Deborah Stein. New York-Oxford: Oxford University Press, 2005.

Sams, Eric and Graham Johnson. “Lied (IV).” In New Grove Dictionary of Music and
Musicians, vol. XIV, 2nd edition. New York: Grove, 2001.

Schwarz, David. Listening Subjects: Music, Psychoanalysis, Culture. Durham, London:
Duke University Press, 1997.

Scher, Steven Paul. “Melopoetics Revisited. Reflections on Theorizing Word and Music
Studies.” In Word and Music Studies 1: Defining the Field, ed. Walter Bernhart, Steven
Paul Scher and Werner Wolf. Amsterdam-Atlanta, GA: Rodopi, 1999.

Schumann, Robert. Selected Songs for Solo Voice and Piano from the Complete Works
Edition. New York: Dover, 1981.

Stacey, Peter. Contemporary Tendencies in the Relation of Music and Text with Special
Reference to Pli selon pli (Boulez) and Laborintus II (Berio). New York/London:
Garland Publishing, Inc., 1989.

Stacey, Peter. “Towards the Analysis of the Relationship of Music and Text in
Contemporary Composition.” In Contemporary Music Review. United Kingdom:
Harwood Academic Publishers GmbH, 1989.

Starr, Larry and Christopher Waterman, American Popular Music: from Minstrelsy to
MTV. New York, Oxford: Oxford University Press, 2003.

Stein, Jack M. Poem and Music in the German Lied from Gluck to Hugo Wolf.
Cambridge, Mass.: Harvard University Press, 1971.

Stockhausen, Karlheinz and Herbert Eimert. Die Reihe: Speech and Music. Bryn Mawr,
Pennsylvania: Theodore Presser Company, 1968.

Streisand, Barbra. Simply Barbra. Sony B0000024TI, 1990.

Taruskin, Richard. The Oxford History of Western Music. Oxford-New York: Oxford
University Press, 2005.

Youens, Susan. Schubert’s Poets and the Making of Lieder. Cambridge-New York: Cambridge
University Press, 1996.

Zak, Albin J. The Poetics of Rock: Cutting Tracks, Making Records. Berkeley, Los
Angeles, London: University of California Press, 2001.

Zelter, Carl F. Lieder. München: G. Henle, 1995.
