9 Jazz Improvisation: A Theory at the Computational Level

P. N. JOHNSON-LAIRD
Department of Psychology, Princeton University, Princeton, NJ 08544, USA

Representing Musical Structure. Copyright © 1991 Academic Press Limited. ISBN 0-12-357171-5. All rights of reproduction in any form reserved.

INTRODUCTION

There are two principal reasons for cognitive psychologists to study the improvisations of jazz musicians. First, the process of improvisation is an unusual form of expertise, and all forms of expertise are a proper concern for anyone interested in how the mind works. A better understanding of the skill might even be pedagogically useful one day. Second, jazz improvisation depends on imagination and its study may help psychologists to understand the nature of creative mental processes.

My aim in this chapter is to outline a psychological theory of improvisation in modern jazz, the idiom that was developed in the 1940s by Charlie Parker, Dizzy Gillespie and their colleagues, and that has continued to be the dominant style to this day. The theory concerns what the mind has to compute in order to produce an acceptable improvisation. A theory of what is computed is not, of course, a theory of how the computation is carried out. Indeed, the distinction between the two, which was emphasized by the late David Marr (1982) in his studies of vision, is crucial to advancing knowledge: to understand how the mind functions, we need first a good account of what it is doing.
However, I will also discuss two general approaches to how the mind may generate improvisations: one is based on the manipulation of explicitly structured symbols, which is the traditional method of modelling mental processes in computer programs; and the other is based on the manipulation of distributed representations in which explicit structure plays no part in processing, and which forms the basis of ‘connectionist’ theories (Rumelhart and McClelland, 1986; and for an introduction, Ch. 10 of Johnson-Laird, 1988a). Finally, I will spell out how jazz improvisation relates to the general nature of creative mental processes.

The essential psychological feature of musical improvisation, whether it be modern jazz, classical music, Indian, African, or music of any other sort, is that the musicians themselves do not have conscious access to the processes underlying their production of music. A lay person may find this claim surprising, perhaps even incredible; a cognitive psychologist will find it prosaic. The fact is that human beings have conscious access to only a small part of the contents of their minds, and hardly any access whatsoever to mental processes. This point was known to Helmholtz (1897) and it is corroborated by the very existence of cognitive psychology. A direct way to convince those who may doubt it is to ask them to devise a computer program that produces a musical improvisation, or, for those who are musically naive, to devise a computer program that tells stories. If one had conscious access to the complete processes underlying such skills, the demand would be trivial. Existing programs for improvisation and telling stories, however, have only the most rudimentary abilities because programmers, even if they are competent musicians or authors, cannot discern the basis of their abilities merely by introspection. The ethnomethodologist, David Sudnow (1978), has written an engaging account of the phenomenology of learning to play jazz piano.
The very title of his book, Ways of the Hand, indicates that as one develops the ability to improvise the skill seems to come out of the end of one’s fingertips. Its main components are profoundly unconscious.

A common misconception about improvisation is that it depends on acquiring a repertoire of motifs (‘licks’, as they used to be called by musicians), which are then strung together one after the other to form an improvisation, suitably modified to meet the exigencies of the harmonic sequence. There are even books containing sets of ‘licks’ to be committed to memory to aid the process. Surprisingly, the error has also been perpetrated by theorists. In characterizing jazz improvisations, Ulrich (1977) writes: ‘Sequences of motifs are woven together to form a melody. Rather than constantly inventing new motifs, the musician modifies old ones to fit new harmonic situations.’ A similar idea has been implemented in a program devised by Levitt (1981) for improvising jazz melodies. The program takes as input a chord sequence and an existing melody. It divides the melody up into units of two bars, and then re-uses these elements in a different order and in variant forms. The program is entirely deterministic, i.e. given the same input melody and chord sequence, it produces the same improvisation. This characteristic, as I shall argue, is also inappropriate for human improvisations.

Why can one be confident that the ‘motif’ theory is wrong? There are three reasons. First, someone has to invent the motifs. If a musician is the first to play a particular motif, then he or she cannot merely be regurgitating it from memory. Second, although most musicians have certain phrases (often a rhythmic pattern rather than a melodic motif) to which they are addicted, an analysis of corpora of the musician’s improvisations yields many phrases that occur only once.
A sceptic might say that an analysis of every single improvisation made by a musician would falsify this claim. Yet, there would still be many possible phrases characteristic of the musician’s style even if they never occurred in the corpus. Third, the labour of committing to memory a sufficient number of motifs to guarantee the improvisation of complete solos is altogether too large to be practicable. Of course, there are jazz musicians who are not good at improvisation, and there were some in traditional jazz who never improvised a solo or else merely replayed one that they had worked out in the past. But any competent jazz musician will tell you that it is far easier to make up new phrases than to try to learn a vast repertoire of them for use in solos. An apt analogy is speech: discourse would be intolerably difficult if it consisted solely in stringing together remarks that one had committed to memory. It is this sort of stilted jumble of phrases that one is forced to produce in a foreign language where one’s only guide is indeed a book of ‘licks’, i.e. a phrase book.

RHYTHM AND TIMING

In modern jazz, an improvisation consists of an extemporized melody that fits a tonal chord sequence of, say, 32 bars in length, which is repeated as many times as necessary. The rhythm section, which typically consists of piano, double bass and drums, provides the accompaniment. The drums state the basic metrical pulse, usually four beats to the bar, emphasizing the weak second and fourth beats, and playing rhythmic figures to accompany and to stimulate the improvising soloist. The bass player improvises a bass line to the chord sequence, and also helps to maintain the metrical pulse at a fixed tempo. The pianist improvises a statement of the chord sequence, varying the choice of chords and their voicings, and again providing rhythmic figures to help to create a feeling of relaxed ‘swing’.
Although there are big bands that play modern jazz, the genre was originally created by small groups containing two or three brass instruments (trumpet and saxophone) and the rhythm section. All the musicians in the group may take turns to improvise solos of several choruses, i.e. several cycles of the 32-bar chord sequence. A performance of a particular piece usually begins and ends with an ensemble statement of the melodic theme, which the front line instruments may play in unison or else in a simple arrangement. The improvised choruses occur between these two statements. The chord sequences favoured by modern jazz musicians derive from popular songs, typically by such composers as George Gershwin and Cole Porter, from compositions of their own, or from the ubiquitous ‘twelve-bar blues’.

A fragment from a typical improvisation is shown in Figure 1. It is a transcription that I made of a melody improvised by the late Bill Evans, an outstanding modern jazz pianist, on his record, Explorations (Riverside RLP 351). He was improvising to a chord sequence based on the harmonies of the popular song, ‘How Deep is the Ocean’, and the figure displays these chords in a conventional notation, which I will explain below.

[Figure 1. An improvised melody by the late Bill Evans on a chord sequence based on ‘How Deep is the Ocean’.]

What the transcription fails to make explicit is the particular rhythmical quality of modern jazz. This failure is, in part, because there is no precise account of this style: musicians acquire it by listening to virtuosos and seeking to emulate them, and, though they develop a discriminating ear for what ‘swings’ and what does not, they are unable to explain the underlying rhythmic principles. Even if such an account existed, it is an open question
whether its description in conventional European musical notation would be particularly informative.

The problem of characterizing the rhythmic style of modern jazz can be illustrated by some empirical observations that I have made concerning the performance of the rhythm shown in Figure 2. Jazz enthusiasts will recognize it as that of Charlie Parker’s twelve-bar blues theme, ‘Now’s the Time’, which repeats the phrase shown in the figure several times.

[Figure 2. The rhythm of the repeated phrase in Charlie Parker’s theme, ‘Now’s the Time’.]

If a classically trained musician plays the phrase as it is notated, it will lack its essential jazz flavour. One approach to pinning down the nature of the rhythmic component is to measure the actual onsets and offsets of the notes from a recording of them by a jazz musician. Here, however, one runs into the problem of determining the precise points to measure. The start of a musical note turns out to be a somewhat indeterminate notion: should one measure from the point on the oscillograph at which the first vibration occurs? Presumably not, because this point will certainly not correspond to the perceived onset of the note. Rather than struggle with this problem, I used two complementary procedures. With a simple computer system for generating music, I manipulated the durations of notes until the output began to resemble the sound of an authentic performance. Figure 2 displays one set of durations that produced a satisfactory performance. Another set of observations was collected through the good offices of Carol Krumhansl at Cornell University: I played a simple blues theme by Milt Jackson, ‘Bag’s Groove’, and her computational set-up recorded the onsets and offsets of notes from the keyboard. An example of the resulting durations is shown along with the conventional notation of the theme in Figure 3.
What is striking in both of these cases is that there appear to be no simple durational principles that are responsible for producing the timings of jazz performance. Readers who have never consciously listened to modern jazz are advised to listen to, say, the original Bill Evans recording, if they wish to get a feel for these tacit conventions of modern jazz. Although there is no explicit account of them, they undoubtedly exist, and they take some years of assiduous practice to acquire.

[Figure 3. The timing of a rendition of the opening phrase of ‘Bag’s Groove’.]

Lacking more comprehensive information, I will turn from the temporal conventions governing performance to the conception of the rhythms used in improvising melodic phrases, and I will assume that if and when the temporal conventions are satisfactorily described, they can be treated as a kind of ‘filter’ into which are fed rhythmic patterns, as conceived by the musician, to emerge with actual durations specified in real time. A similar conception of performance can be adopted for classical music: the score captures the conception of the rhythmic structure of the piece. Its realization in an actual performance depends on further tacit interpretative conventions that have been acquired by performers (Longuet-Higgins and Lee, 1984).

A melodic jazz improvisation is made up from phrases, which can vary in length from half-bar interpolations (see the third phrase in Figure 1) to lengthy phrases that spread over several bars. A phrase resembles the utterance of a sentence in a natural language except that, in music, a phrase does not refer to a state of affairs. It has no meaning other than its intrinsic musical meaning. At its highest level of organization, an improvised solo normally consists of a sequence of phrases. The sequence may itself have an intrinsic organization.
Certainly, musicians aim for a variety of phrases, but what probably holds a lengthy improvisation together is not a precisely articulated musical structure, such as sonata form in classical music, but the repeating harmonic sequence on which the improvisation is based. No sophisticated musical plan appears to govern the structure of an improvisation above the level of individual phrases.

In order to devise a computer program that improvises musical phrases, it is sensible to divide the task into several relatively independent modules corresponding to different elements of performance. A note in a musical phrase has five main components:

1. a pitch, which in jazz may be bent, i.e. changed slightly during its performance;
2. an onset time with respect to the metrical structure of the bar;
3. a duration;
4. an intensity, i.e. a volume, which again changes during its performance;
5. a manner of articulation: it may be staccato, legato, slurred, ghosted, and so on, depending on the particular musical instrument.

A phrase may also contain rests, i.e. silences that play a particular role in the musical shape of the phrase (see the second phrase in Figure 1). Phrases themselves are generally separated by rests, which in jazz are typically longer than the mean duration of the notes in the phrase. A rest has two components:

1. an onset time with respect to the metrical structure of the bar;
2. a duration defined by the onset of the next note in the phrase.

The specification of a phrase is complete when every note and rest in the phrase has been defined for all of these components. Undoubtedly, however, most of the work has been done when for each element in the phrase one has specified its onset time and, if it is a note, its pitch. A computer program, like a musician, is therefore principally concerned with two tasks: the generation of a rhythmic pattern for the phrase, i.e.
a sequence of onsets of notes and rests, and the generation of a correlated sequence of pitches for the notes in the phrase. These two tasks are not completely independent of one another, but they can be separately analysed to some extent. A reasonable strategy for a computer program (and one that I have adopted in the programs to be presently described) is to generate the next onset in a phrase, and then to select its pitch (if it is a note).

In the rhythmic pattern of a jazz phrase, the duration of the notes is not as important as the sequence of their onset times. The rhythm of the second phrase in Figure 1 can accordingly be represented thus:

[Notated rhythm of the second phrase in Figure 1, showing onsets only.]

If you clap this rhythm, you convey its essentials. Clearly, the mind of a jazz musician must contain a tacit procedure that can generate a large variety of different rhythms. One way in which to specify what this procedure computes is to define a grammar that will generate all the possible rhythmic phrases within the musician’s competence. Before I can sketch the sorts of rules likely to be needed in such a grammar, I need to say a little about different sorts of formal grammar.

GRAMMAR AND COMPUTATION

A grammar is a set of rules, which, in themselves, can do nothing. Procedures can be devised, however, which can use a grammar to produce an actual output. Grammars differ in their power, and in particular in their ‘weak’ generative power, i.e. in what sentences they can be used to generate. Hence, grammars restricted to rules of a certain form are unable to generate certain sorts of sentences. There is a well-known ordering, the Chomsky hierarchy, from the most powerful grammars to the weakest. The most powerful are unrestricted transformational grammars, which have rules that in effect allow one sequence of symbols to be transformed into another. As various constraints are placed on the form of grammatical rules, so the generative power of the grammar is reduced.
For example, if, instead of transforming a sequence of symbols, rules can rewrite only one symbol at a time, then the power of the grammar is severely curtailed.

The crucial point about the power of a grammar is that it has direct consequences for the demands on working memory made by the use of the grammar to generate or to parse symbols. This fact is highly pertinent to the psychology of improvisation. But, to demonstrate its relevance, I will first use some examples from mental arithmetic. Suppose I write two numbers on a blackboard:

123
948

and I ask you to make a mental addition of them. You can say aloud each of the relevant digits as you proceed from right to left. Hence, you could perform as follows:

Add 3 and 8 together, which equals 11. Say aloud the far right digit: ‘1’. Make a mental note of the carry of 1.
Add 2 and 4 and the current carry together, which equals 7. Say aloud the digit: ‘7’. Make a mental note of the carry of 0.
Add 1 and 9 and the current carry together, which equals 10. Say aloud the digit: ‘0’. Make a mental note of the carry of 1.
There are no more columns to be added, but you have a carry. Say aloud the current value of the carry: ‘1’.

All you have to hold in working memory as you do the calculation is the current value of the carry, i.e. whether it is 1 or 0.

Now, suppose I ask you to multiply the two numbers together. The task is significantly more demanding because of the load it places on working memory:

Multiply 100 times 948: 94800. Make a mental note of 94800.
Multiply 20 times 948: Multiply 2 times 8, and so on.

Although there exists an alternative algorithm for multiplication that allows you to speak out aloud each of the resulting digits from right to left, it nevertheless calls for an arbitrarily large amount of information to be stored in working memory depending on the particular numbers to be multiplied.
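The contrast between the two arithmetic tasks can be made concrete with a minimal sketch. The function names `add_right_to_left` and `partial_products` are illustrative assumptions, not part of any program described in the chapter. The addition routine announces each digit as soon as it is computed and carries forward a single value, whereas the multiplication routine has to hold a list of intermediate results whose size depends on the operands.

```python
from typing import Iterator, List


def add_right_to_left(a: str, b: str) -> Iterator[str]:
    """Yield the digits of a + b from right to left, remembering only the carry."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0  # the only item held in "working memory"
    for digit_a, digit_b in zip(reversed(a), reversed(b)):
        carry, digit = divmod(int(digit_a) + int(digit_b) + carry, 10)
        yield str(digit)  # "say aloud" the digit immediately
    if carry:
        yield str(carry)  # no more columns, but a carry remains


def partial_products(a: str, b: str) -> List[int]:
    """Return the intermediate results that long multiplication must retain."""
    return [int(d) * 10 ** place * int(b) for place, d in enumerate(reversed(a))]


spoken = list(add_right_to_left("123", "948"))
print(spoken)                      # ['1', '7', '0', '1']
print("".join(reversed(spoken)))   # 1071

notes = partial_products("123", "948")
print(notes)                       # [2844, 18960, 94800]
print(sum(notes))                  # 116604
```

The addition generator never stores more than one carry value, however long the numbers are; the list returned by `partial_products` grows with the number of digits in the multiplier.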
The use of the weakest possible grammar, a so-called ‘regular’ grammar, resembles mental addition. It places a minimal load on working memory for the results of intermediate calculations. The use of the strongest possible grammar, an ‘unrestricted transformational’ grammar, resembles mental multiplication. There is no limit to the amount of working memory that may have to be used during the course of intermediate calculations. Between these two extremes in the Chomsky hierarchy lie several other sorts of grammar. As we shall see, a set of simple, but plausible, assumptions will enable us to make a motivated choice of different sorts of grammar for characterizing what is computed by the different sorts of processes underlying musical improvisation. I emphasize again that this enterprise is concerned with what is being computed by the mind rather than with how the process is computed. Hence, the use of grammars does not necessarily imply that musicians themselves rely on grammars in order to improvise. They may do, or they may use an entirely different sort of algorithm. I will take up this issue later in the chapter.

Grammars contain two main sorts of symbols: the terminal symbols that occur in the actual output that the grammar is used to generate (e.g. the symbol ‘the’ is a terminal symbol for a grammar of everyday English); and the non-terminal symbols that occur only in the course of generating sentences and that are not part of the actual language that the grammar can be used to generate (e.g. the symbol ‘NP’, which denotes a noun phrase, is not part of everyday English). Each rule in a regular grammar, the least powerful sort of grammar, has one of only two possible forms. The first form is exemplified by:

1. NP → John

which states that a non-terminal symbol, NP, can be rewritten as a terminal symbol, John. The second form is exemplified by:

2.
NP → the N

which states that a non-terminal symbol NP can be rewritten as a terminal, the, followed by a non-terminal, N (for noun). Only these two forms of rule are allowed, and as a consequence the structures that can be generated by a regular grammar are very simple. They consist solely of binary branchings of the following sort:

[Tree diagram: NP branches into the terminal ‘the’ and the non-terminal N, which is rewritten as ‘man’.]

Regular grammars can also be based on the convention that the terminal symbols follow the non-terminals on the right-hand side of rules.

A so-called ‘context-free’ grammar is more powerful than a regular grammar, but less powerful than a transformational grammar. It can have rules of the following sort:

S → NP VP
NP → ART N
VP → V NP
V → loves
ART → the
N → woman

Hence, unlike a regular grammar, a rule may have more than one non-terminal on its right-hand side. As a consequence, the grammar can generate richer branching structures than those of a regular grammar. Indeed, some linguists have argued that the grammar of natural language may not be much more powerful than a context-free grammar. The grammar above, for example, can be used to generate the following structure:

[Tree diagram: S dominates NP and VP; the first NP dominates ART (‘The’) and N (‘woman’); the VP dominates V (‘loves’) and a second NP made up of ART (‘the’) and N (‘child’).]

This ‘tree’ diagram is merely a graphical way of illustrating a labelled bracketing of a string of symbols:

(S (NP (ART the) (N woman)) (VP (V loves) (NP (ART the) (N child))))

that specifies how the symbols should be grouped together. The grouping may play a crucial role in the semantic interpretation of sentences.

A context-free grammar can contain recursive rules, which have the same symbol on both sides of the rule. For example, a rule for possessive noun phrases might have the form:

NP → NP-POS NP

where the non-terminal, NP, occurs on both sides of the rule. It can be used to generate such structures as:

[Tree diagram: nested NP structure, built from NP-POS and NP, for ‘woman’s child’s friend’s toy’.]

which in theory could be indefinitely long.
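As an illustration of how a procedure can use such rules to produce an output, here is a minimal sketch of a generator for the small context-free grammar given above. It is an assumption-laden toy rather than any program from the chapter: the names `GRAMMAR` and `generate` are hypothetical, the noun ‘child’ is added because it appears in the tree diagram, and the choice of keeping pending symbols on a Python list is one possible implementation among many.

```python
import random
from typing import Dict, List

# Rules of the toy grammar; each non-terminal maps to its possible expansions.
# "child" is taken from the tree diagram rather than from the printed rule list.
GRAMMAR: Dict[str, List[List[str]]] = {
    "S": [["NP", "VP"]],
    "NP": [["ART", "N"]],
    "VP": [["V", "NP"]],
    "V": [["loves"]],
    "ART": [["the"]],
    "N": [["woman"], ["child"]],
}


def generate(rng: random.Random, start: str = "S") -> List[str]:
    """Expand `start` left to right, returning the terminal symbols produced."""
    pending = [start]          # non-terminals still waiting to be rewritten
    words: List[str] = []
    while pending:
        symbol = pending.pop()                 # take the most recent symbol
        if symbol in GRAMMAR:                  # non-terminal: rewrite it
            expansion = rng.choice(GRAMMAR[symbol])
            pending.extend(reversed(expansion))  # leftmost symbol ends on top
        else:                                  # terminal: part of the output
            words.append(symbol)
    return words


print(" ".join(generate(random.Random())))
```

Every sentence produced has the shape ‘the N loves the N’. Note that while VP is being expanded the generator must remember that an NP is still owed, which is exactly the kind of intermediate bookkeeping that a regular grammar avoids.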
The use of a grammar to generate sentences calls for a modicum of working memory in order to retain the non-terminal symbols that have yet to be re-written as terminals in a sentence. A working memory that suffices for context-free languages takes the form of a stack (rather like a pile of plates) in which each input item goes on the top of the stack, and each time an item is recalled it is taken from the top of the stack. With access only to the item at the top of the stack, the system lacks the power of random access memory in which any item anywhere in memory can be freely accessed.

In a musical improvisation, a musician has to generate notes in real time, and has no opportunity to go back to revise them. Hence, an optimal system will be one that operates highly efficiently and without the need for complex intermediate computations (Johnson-Laird, 1988b). It will place a minimal demand on the processing capacity of working memory. Such a system, of course, corresponds, in terms of characterizing its output, to a regular grammar. But what evidence could in principle determine whether a regular grammar suffices to characterize the members of some corpus? There are, in fact, two principal considerations.

The first concerns the weak generative capacity of the grammar. Imagine an abstract language that contains two symbols, say, a left bracket and a right bracket, along with a number of other symbols. Imagine further that the only well-formed expressions in this language are like those of arithmetic, and so the brackets must match one another. For example, the expression:

((a + b) × c)

is well-formed, but the following string:

(at+bxo)

is not well-formed. There is no way in which all and only the well-formed expressions of this language can be captured by a regular grammar: such a grammar lacks the generative power to ensure that the brackets match.
The rules that are needed, such as:

S → var
S → (S operator S)
operator → +
operator → ×
var → a, b, c, ...

must be recursive with more than one non-terminal on the right-hand side. In short, the grammar has to be context-free.

The second consideration is the structure of expressions in the language, particularly if it is to guide the process of semantic interpretation. A string of symbols can be grouped together in various ways, e.g.:

(the (man (laughed)))

or:

((the man) (laughed))

Plainly, the second grouping is likely to be necessary for the proper interpretation of English sentences. I will take into account both of these considerations in developing a grammar for characterizing jazz improvisations.

THE RHYTHMIC STRUCTURE OF IMPROVISED PHRASES

The development of a grammar for the rhythms of musical phrases calls for the analysis of a large body of data. The nature of the exercise can be illustrated by considering the grammar for the phrases of Christmas carols, a genre that rhythmically speaking is much simpler than modern jazz. I analysed a corpus of Christmas carols, which were all in common time, with a view to writing a grammar that would generate their rhythms. Table 1 presents a grammar that captures all the rhythms of the phrases in the corpus.

[Table 1. A regular grammar for the rhythms of the first two bars of a corpus of Christmas carols (in common time). Each rule rewrites a beat position in Bar 1 or Bar 2 (e.g. Beat-1, Beat-2.5) as a note value followed by the next beat position. The table ends with an example of a rhythm generated by the grammar.]

The reader will observe that many possible combinations did not occur in the corpus.
At this point, however, we run into the main problems of developing grammars (and computer programs) to characterize corpora. A theory may make errors of two sorts. On the one hand, it may fail to generate sequences that do in fact occur: it may undergenerate. On the other hand, it may generate sequences that would never in fact occur: it may overgenerate. A regular grammar can generate a potentially infinite number of sequences: it merely has to incorporate at least one recursive rule of the form:

A → a A

which can be used to generate strings of any arbitrary length. Musical phrases, however, are always finite in length, and characteristically are seldom more than a few bars long. There are therefore only a finite number of possible rhythms for them, but that number is vast. The corpus of carols contains certain phrases, but it does not include all possible rhythms for carols. Hence, a grammar based only on a corpus will undergenerate. The theorist’s task is to go beyond the data, and to base a grammar on a plausible extrapolation from them. The concomitant risks are to overgenerate if the grammar is too bold, or to undergenerate if it is too close to the data.

The principles in an individual jazz musician’s head at a particular point in time are determinate, and so there is a correct grammar for characterizing the rhythmic patterns improvised by the musician. Unfortunately, no method for checking a grammatical account of these mental principles exists at present. What we can assess, however, is whether there is any evidence for rules that are more powerful than those of a regular grammar. Figure 4 presents part of a transition diagram that generates the rhythms improvised by Charlie Parker.

[Figure 4. Part of a transition diagram, equivalent to a regular grammar, for generating the rhythms of improvised phrases.]

Such a diagram is equivalent to a regular grammar: the different transitions from
a node correspond to different rules for rewriting the same symbol. Thus, the two alternative transitions from the initial node correspond to the rules:

[Two rewrite rules for the initial symbol S0: one emits a symbol and moves to S1, the other emits a symbol and returns to S0.]

If a probability is assigned to each link, then the device, which is a finite-state automaton, generates a Markovian sequence of symbols. In the program, each link is assumed for simplicity to be equi-probable.

A striking feature is the great variety of rhythmic patterns to be found even in a relatively small corpus of Parker’s work. A more appropriate description might take a more abstract form. For example, one can distinguish between phrases that start just after the first beat of a bar, e.g.

[Notated rhythm: a phrase starting just after the first beat of the bar.]

and phrases that start just after the fourth beat of the bar, e.g.

[Notated rhythm: the same phrase starting just after the fourth beat of the bar.]

A simplification of the overall transitions could be made if it were the case that these two sorts of phrases were treated in the same way: they have the same rhythm, but start at different points in the bar. Another feature of the grammar, which it has in common with the grammar for carols, is that it generates many possible combinations that did not occur in the corpus. No doubt had I examined a larger corpus some of these possibilities would have occurred. A theorist is bound to exercise intuition (a tacit knowledge of the style of the musician) and to use such judgements to flesh out the grammar in a more complete form.

Whether the grammar under- or overgenerates is not as critical as whether the general claim, that the rhythms of improvised phrases can be characterized by regular grammars, is correct. An examination of the corpus provides no evidence either of strong constraints of one part of a phrase on another, analogous to the matching of parentheses, or of a need for a complex internal structure.
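The idea of a transition diagram with equi-probable links can be illustrated with a minimal sketch. It does not reproduce the transitions of Figure 4: the states, the three symbols in `DURATIONS`, and the function names are all hypothetical, chosen only to show a finite-state device filling one bar of common time.

```python
import random
from typing import Dict, List, Tuple

BAR_END = 8  # a bar of common time counted in quavers (eighth notes)

# Hypothetical symbol inventory with durations in quavers.
DURATIONS: Dict[str, int] = {"quaver": 1, "quaver-rest": 1, "crotchet": 2}


def build_transitions() -> Dict[int, List[Tuple[str, int]]]:
    """Map each position in the bar to its outgoing links (symbol, next position)."""
    links: Dict[int, List[Tuple[str, int]]] = {}
    for position in range(BAR_END):
        links[position] = [
            (symbol, position + length)
            for symbol, length in DURATIONS.items()
            if position + length <= BAR_END  # a link may not overrun the bar
        ]
    return links


def generate_bar(rng: random.Random) -> List[str]:
    """Walk the diagram from the start of the bar, choosing links equi-probably."""
    links = build_transitions()
    state = 0            # the current node is the only thing remembered
    output: List[str] = []
    while state != BAR_END:
        symbol, state = rng.choice(links[state])
        output.append(symbol)
    return output


print(generate_bar(random.Random()))
```

Each link corresponds to a rule of the form ‘position → symbol next-position’, so the table returned by `build_transitions` is a regular grammar in disguise. Nothing in the generator requires more than the current state, which is the sense in which such a device places a minimal load on working memory.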
Hence, the conjecture that modern jazz rhythms are generated by processes that place a minimal load on working memory appears to be borne out, and it should be possible to characterize the complete set of such phrases using a regular grammar.

Only a drummer is likely to be satisfied with an improvised rhythmic pattern devoid of melody, and even drum solos introduce certain differences in pitch and timbre. The extemporization of a melody in modern jazz, however, is strongly constrained by the chord sequence, and so I will now turn to an analysis of jazz harmonies.

MODERN JAZZ CHORD SEQUENCES

Jazz is a tonal music, and the constraints on melody include as a major component the particular chords in the harmonic sequence. A knowledge of harmony can be divided into those principles that are accessible to consciousness and that can be described verbally, and those further principles, particularly important to the creation of chord sequences, that lie outside conscious awareness. If composers had introspective access to all the principles that guide the sequence of chords in their compositions, then the nature of harmony would not be controversial.

Most jazz musicians have a conscious knowledge of the particular notes comprising the main types of chords, and of the sequences of chords comprising the themes in their repertoire. Any type of chord can be realized in many different voicings: there are many ways in which to play the chord of F dominant 7th, but most of them will include at least one occurrence of the notes F, A, C, Eb. In modern jazz, the pianist may decorate this chord with many additional notes, and might even play it as an inversion in which Eb is the root, omitting F altogether. Taking such variations into account, there are six principal chords used in modern jazz, which are described here with roots of C, though they may occur in any key.
The parenthesized symbols are the conventional abbreviations for the chords:

C major 7th (mj7): C, E, G, B
C dominant 7th (7): C, E, G, Bb
C minor 7th (m7): C, Eb, G, Bb
C minor 7th with 5b (m7.5b): C, Eb, Gb, Bb
C minor perfect 7th (m.per7): C, Eb, G, B
C diminished 7th (dm7): C, Eb, Gb, A

These chords may be played with added 6ths and 9ths of various sorts, and in certain inversions. Figure 5 shows a typical realization of a modern jazz harmonic accompaniment. The chord sequence for an improvisation will be familiar to all musicians in its conceptual form, which can be symbolized by designating the root and type of chord that occurs in each bar. Here, for example, is the chord sequence of a variant of the twelve-bar blues popularized by Charlie Parker. The Roman numerals designate the roots of the chords, where I is the keynote, V its dominant, and so on:

| Imj7 | VIIm7 III7 | VIm7 II7 | Vm7 Vb7 |
| IVmj7 | IVm7 VIIb7 | IIIm7 VI7 | IIIbm7 VIb7 |
| IIm7 | VIbm7 IIb7 | Imj7 IIIb7 | VIbmj7 IIb7 |

Of course, the rhythm section may depart from this sequence in various ways, and the actual choice of voicings for the chord is extemporized by the instrumentalists providing the accompaniment. The most important point about an underlying chord sequence is that it is not improvised. It is composed by whoever was responsible for the original theme, though jazz chord sequences are often modified by other musicians as they evolve during the development of the music. The original twelve-bar blues, for example, goes back to the early history of jazz and in comparison to the variant above is a rudimentary affair:

[twelve-bar chord sequence of the original blues; not recoverable from this copy]

Figure 5. The harmonic voicings of a typical modern jazz accompaniment.

The fact that chord sequences are composed rather than improvised has a crucial corollary: there is no need to minimize the complexity of intermediate computations in producing them.
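The six chord types listed above, given there on a root of C, can be spelled on any root by treating each type as a set of semitone intervals above the root. The following is a minimal sketch; the names `NOTE_NAMES`, `CHORD_INTERVALS` and `spell` are illustrative, and the use of flat spellings only (so that the diminished seventh on C ends on A rather than B double flat) is a simplifying assumption.

```python
# Pitch classes, spelled with flats only (a simplifying assumption).
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]

# Semitone intervals above the root for the six principal chord types,
# keyed by the abbreviations used in the text.
CHORD_INTERVALS = {
    "mj7": (0, 4, 7, 11),     # major 7th
    "7": (0, 4, 7, 10),       # dominant 7th
    "m7": (0, 3, 7, 10),      # minor 7th
    "m7.5b": (0, 3, 6, 10),   # minor 7th with flattened 5th
    "m.per7": (0, 3, 7, 11),  # minor with perfect (major) 7th
    "dm7": (0, 3, 6, 9),      # diminished 7th
}


def spell(root, kind):
    """Return the note names of chord type `kind` built on `root`."""
    start = NOTE_NAMES.index(root)
    return [NOTE_NAMES[(start + step) % 12] for step in CHORD_INTERVALS[kind]]


print(spell("C", "mj7"))  # ['C', 'E', 'G', 'B']
print(spell("F", "7"))    # ['F', 'A', 'C', 'Eb']
```

The sketch captures only the conceptual form of a chord. The voicing, inversion and added notes remain choices made by the accompanist at the moment of playing.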
The composer can try one idea and then another, and other musicians can modify the sequence. Notation makes the process possible because a good notation is a substitute for working memory. Hence, the computational power that goes into the making of chord sequences is likely to be far greater than the computational power that goes into the improvisations based on them (Johnson-Laird, 1988b).

To test this conjecture, I have examined a corpus of modern jazz chord sequences in order to devise a grammar that could be used to generate such sequences. This project was inspired in part by an intriguing paper by Mark Steedman (1982). He developed a set of rules that took as their input a simple chord sequence, such as the traditional blues sequence above, and generated modern elaborations of it, such as the variant played by Parker. It was an open question whether a grammar that could generate the initial sequences for itself would still need the sort of rules postulated by Steedman for interpolating new chords, and for substituting one sort of chord for another. More recently, Conrad Cork’s (1988) pedagogical notion of teaching beginners a repertoire of harmonic ‘building blocks’ has also contributed to the present theory.

The first assumption of the theory is that underlying the superficial variants of a modern jazz chord sequence there is a tonal chord sequence. These sequences, in fact, are often remarkably similar to those of European classical music. Here, for example, are two tonal sequences that are very similar:

[first chord sequence; not recoverable from this copy]

and:

[second chord sequence; not recoverable from this copy]

The first example is the opening of Mozart’s Clarinet Quintet in A major (K. 581); the second example is a perennial favourite of jazz musicians, George Gershwin’s ‘I got rhythm’.

Many theories of tonal harmony are stated informally in terms that imply that acceptable chord sequences could be generated by a regular grammar (Forte, 1979).
Such grammars, as we have seen, are not powerful enough to capture any internal structure other than binary divisions. They would assign to the opening bars in the examples above the following structure:

Opening   Middle   False-end   Opening
I         II       V           I

This structure does not accord with musical intuition, which suggests that the first two bars are a cadence from tonic to dominant. The structure is more accurately represented as:

[tree diagram: Opening-sequence divides into First Cadence and Second Cadence; the node labels that survive beneath them are Tonic, Prepare-Dominant and Tonic]

Such a structure, of course, calls for the power of a context-free grammar.

In at least one respect, modern jazz chord sequences differ from the sequences of classical music: modern jazz makes much greater use of modulation. Modulations occur in two principal forms, either from one major section of a chord sequence to another or else within such sections. A typical example of modulation between sections occurs in the theme ‘Joyspring’ by Clifford Brown. The first seven bars are as follows:

| Imj7 | IIm7 V7 | Imj IIm7 | IVm7 bVII7 |
| IIIm7 bIII7 | IIm7 bII7 | Imj7 |

but at this point the theme modulates to a key one semitone up, e.g. from F to Gb, and the modulation is effected by a typical manoeuvre in the eighth bar:

| bIIIm7 bVI7 |

which prepares the way for a chord of bIImj7, which is the new tonic. The next eight bars begin with exactly the same chords but based on the new tonic. This sort of modulation is also very common for the so-called ‘bridge’ in pieces based on the standard 32-bar sequence of the form AABA, i.e. the first eight bars, A, are repeated and then lead into the bridge of eight bars, B, prior to the final reprise of A. There appear to be few, if any, constraints on the nature of the modulation: it can be to any new key, though modulations to a key a flattened fifth away from the original are rare (Cork, 1988).

Modulations also occur within sections.
The ‘locus classicus’ for such effects, combined with a modulation between sections, is Jerome Kern’s song, ‘All the things you are’, which contains chords based on all 12 possible roots. The opening sequence of the theme begins in a modern jazz variant as follows:

| VIm7 | IIm7 | bVIm7 bII7 | Imj7 |

and immediately proceeds to modulate to a tonic a major third above using as an ambiguous pivot a chord that is IVmj7 of the original key and a substitute dominant (bIImj7) of the new key:

| bIImj7 | IIm7 V7 | Imj7 | …m7 |

The next eight bars repeat the sequence but modulated upwards by a minor third. Hence, the first eight bars modulate from Ab to C, and the second eight bars modulate to Eb and thence to G.

Modulations in modern jazz are effected by two main devices, either an immediate transition to a major seventh on the tonic of the new key or else, as in the examples above, by interpolating chords backwards from the new tonic according to the cycle of fifths: ... IIm7 V7 Imj7. Because these interpolations can be handled by the sorts of rules invoked by Steedman (see below), the only rule that is needed for modulation is one that signifies a change in tonic, e.g. Imj → IIImj, which, as I have mentioned, can be to any new key.

The basic building blocks of tonal chord sequences fall into three main categories, which can be exemplified as follows:

1. Cadences from tonic to dominant
2. Cadences from dominant to tonic
3. Cadences from tonic to dominant and back to tonic

In the case of modern jazz, however, the use of the word ‘dominant’ here is slightly misleading because the relevant chord need not be based on V or on substitutions for it. For example, a chord based on IV, or even a chord based on III, can serve the function of a dominant as in the following case:

| Imj7 | VIIm7 | III7 |

where the VIIm7 is derived by interpolation according to the cycle of fifths.
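The second modulation device mentioned above, approaching a new tonic through IIm7 V7 Imj7, amounts to simple arithmetic on semitone offsets. The following sketch is an illustration only: the function name `approach` and the representation of roots as semitone offsets from the original keynote are assumptions, and the flats are written as prefixes to match the examples in this passage.

```python
# Scale degrees of the original key, indexed by semitones above its keynote.
ROMAN = ["I", "bII", "II", "bIII", "III", "IV",
         "bV", "V", "bVI", "VI", "bVII", "VII"]


def approach(new_tonic):
    """Return the IIm7 V7 Imj7 of the key whose tonic lies `new_tonic`
    semitones above the original keynote, named in the original key."""
    two = (new_tonic + 2) % 12   # supertonic of the new key
    five = (new_tonic + 7) % 12  # dominant of the new key
    return [ROMAN[two] + "m7", ROMAN[five] + "7", ROMAN[new_tonic] + "mj7"]


# A modulation up a semitone, as in the 'Joyspring' example:
print(approach(1))  # ['bIIIm7', 'bVI7', 'bIImj7']
```

For a new tonic one semitone up, the sketch yields the bIIIm7 bVI7 of the eighth bar followed by the new tonic bIImj7, which agrees with the example given in the text.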
Indeed, in the case of the third cadence above, many alternative roots, including bVI7, can function as the temporary resting point of the dominant.

Table 2 presents a context-free grammar for generating simple tonal chord sequences, and Table 3 presents some typical examples of its output. A grammar for modern jazz would allow alternatives to V to serve as a dominant. Given an output from such a grammar, then the sorts of sequences that actually occur in modern jazz can be derived, as Steedman argued, by rules that act as transducers. They take a chord sequence as input and produce an enriched sequence as their output. A major manipulation in jazz is the interpolation of chords according to the so-called ‘cycle of fifths’. For example, given the opening of ‘I got rhythm’:

| I | II V |

an interpolation of this sort would lead to the VI chord prior to the II chord:

| I VI | II V |

The step from VI down to II is indeed a fifth.

Table 2. A context-free grammar for generating eight-bar tonal chord sequences. The rules all concern variants on opening cadences (from tonic to dominant). Other rules in the complete grammar generate closing cadences (dominant to tonic) and complete cadences (tonic to dominant to tonic)

Eight-bars        →  First-four Second-four
First-four        →  Opening-cadence Opening-cadence
                  →  Opening-cadence’ Opening-cadence
Second-four       →  Middle-cadence Opening-cadence
Opening-cadence   →  [bar-level expansions not recoverable from this copy]
Opening-cadence’  →  [bar-level expansions not recoverable from this copy]
Middle-cadence    →  | I | IV |
                  →  [second expansion not recoverable from this copy]

Table 3. Some sample outputs generated by a program using the grammar in Table 2

[sample eight-bar sequences; not recoverable from this copy]
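A grammar of the kind shown in Table 2 can be run as a generator by expanding symbols recursively. The sketch below keeps the symbol names of Table 2, but the bar-level expansions of the three cadence symbols are assumptions chosen only so that opening cadences run from tonic to dominant; they are not the rules of the original table, and the function names are likewise illustrative.

```python
import random

# Rules for the upper levels use the symbol names of Table 2. The
# bar-level expansions of the cadence symbols are illustrative
# assumptions: each cadence is taken to span two bars.
GRAMMAR = {
    "Eight-bars": [["First-four", "Second-four"]],
    "First-four": [["Opening-cadence", "Opening-cadence"],
                   ["Opening-cadence'", "Opening-cadence"]],
    "Second-four": [["Middle-cadence", "Opening-cadence"]],
    "Opening-cadence": [["I", "V"], ["I", "II V"]],
    "Opening-cadence'": [["I", "IV"]],
    "Middle-cadence": [["I", "IV"], ["IV", "I"]],
}


def expand(symbol, rng):
    """Recursively rewrite `symbol`; anything without a rule is a bar."""
    if symbol not in GRAMMAR:
        return [symbol]
    bars = []
    for child in rng.choice(GRAMMAR[symbol]):
        bars.extend(expand(child, rng))
    return bars


def eight_bars(rng=None):
    """Generate one eight-bar sequence as a list of bars."""
    return expand("Eight-bars", rng or random.Random())


print(" | ".join(eight_bars(random.Random(1))))
```

The recursion relies on a stack of pending symbols, which is the additional memory that separates a context-free generator from the finite-state device proposed for rhythm.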
Likewise, the development of Parker’s blues sequence from the simple underlying blues:

[first four bars of the underlying blues; not recoverable from this copy]

calls for a series of such interpolations, working backwards from the dominant seventh at the end of the fourth bar:

| Vm7 I7 |

then:

II7 | Vm7 I7 |

then:

VIm7 II7 | Vm7 I7 |

then:

III7 | VIm7 II7 | Vm7 I7 |

and so on:

| I | VIIm7 III7 | VIm7 II7 | Vm7 I7 |

It only remains to substitute a more complex chord for the opening tonic, and to substitute for the final chord one that is a flattened fifth away:

| Imj7 | VIIm7 III7 | VIm7 II7 | Vm7 Vb7 |

The grammatical rules for making the interpolations and substitutions cannot be exercised with complete freedom. The substitution of one dominant seventh by another a flattened fifth away, for instance, is permissible if the next bar in the sequence begins with a chord that is a fifth away (as in the substitution above). Likewise, the interpolation of chords according to the ‘cycle of fifths’ should not continue to the point where the opening tonic is eliminated. The rules are therefore sensitive to the context of the symbols to be modified. Instead of context-free rules like those used to generate the underlying sequence, it is necessary to use ‘context-sensitive’ rules, such as:

I → IIm7 / __ V

which specifies that I can be rewritten as IIm7 provided that it occurs in the context specified after the slash, i.e. prior to V. In fact, one would need a large number of such specific rules to do justice to all the possible interpolations that can be made according to the cycle of fifths. A simpler solution, which I adopted in the program for testing the grammar, is to use meta-rules that capture a whole set of such rules (see Gazdar et al., 1985, for an account of meta-rules).
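The backward interpolation and the final substitution described above can be sketched as arithmetic on roots. This is an illustration under stated assumptions rather than the program mentioned in the text: roots are semitone offsets from the keynote, the function names are hypothetical, and the condition on the substitution is read as requiring the next chord to lie a fifth below the dominant seventh being replaced.

```python
# Scale degrees indexed by semitones above the keynote, with flats
# written after the numeral as in the Parker example.
ROMAN = ["I", "IIb", "II", "IIIb", "III", "IV",
         "Vb", "V", "VIb", "VI", "VIIb", "VII"]


def name(root, kind):
    return ROMAN[root % 12] + kind


def fifth_above(root):
    return (root + 7) % 12


def backward_chain(final_root, n_bars):
    """Build `n_bars` bars of minor-seventh/dominant-seventh pairs that
    descend by fifths into the dominant seventh on `final_root`."""
    bars, root = [], final_root
    for _ in range(n_bars):
        minor = fifth_above(root)
        bars.insert(0, [name(minor, "m7"), name(root, "7")])
        root = fifth_above(minor)  # dominant of the preceding bar
    return bars


def flattened_fifth_substitute(root, next_root):
    """Replace a dominant root by the one a flattened fifth away, but
    only if the next chord lies a fifth below it (an assumed reading)."""
    return (root + 6) % 12 if next_root == (root + 5) % 12 else root


def parker_opening():
    bars = backward_chain(0, 3)                      # work back from I7
    last = flattened_fifth_substitute(0, next_root=5)  # bar 5 is on IV
    bars[-1][1] = name(last, "7")
    return [["Imj7"]] + bars                         # keep the opening tonic


print(" | ".join(" ".join(bar) for bar in parker_opening()))
```

Stopping the chain after three bars leaves the first bar free for the tonic, which reflects the constraint that interpolation should not eliminate the opening tonic.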
In the program, the context is specified as follows:

Current   Previous   Next
I         not I      xdom

where I is the symbol to be rewritten, the previous chord must not be I, and the next chord can be any chord that will be ultimately realized as a dominant seventh. Where a chord satisfies this context, then the value of x, which is the root of the next chord, is bound to an expression that generates the chord that can be interpolated according to the cycle of fifths. In fact, there are several alternative rewritings, including:

Imj (Fifth x)m

In this case, given an input sequence:

Previous   Current   Next
Vdom       I         Vdom ...

the rule yields the output:

Vdom Imj IIm Vdom ...

The chord symbols in this output do not yet specify sevenths. The reason is that there remains a third stage in the generation of a final chord sequence. One reason for the third stage is that there is a form of interpolation that can occur after interpolations according to the cycle of fifths. A sequence produced by the second stage can have the following form:

| Idom | VIdom | IIdom | Vdom |

into which can then be interpolated the following minor sevenths:

| I7 | IIIm7 VI7 | VIm7 II7 | IIm7 V7 |

The third stage also depends on context-sensitive rules.

In characterizing what has to be computed in an improvisation, I have argued that the production of suitable chord sequences calls for considerable computational power. The three-stage program requires a memory for the results of a large number of intermediate computations, because the first stage requires at least context-free power, and the interpolations according to the cycle of fifths must be made one at a time in an interdependent way. If rules sensitive to the context of a symbol are used to gen-
