Article

Reading and working memory in adults with or without formal musical training: Musical and lexical tone

Psychology of Music
2016, Vol. 44(3) 369–387
© The Author(s) 2015
DOI: 10.1177/0305735614568881

Ching-I Lu and Margaret Greenwald

Abstract
Studies of working memory for musical tone are seldom reported, and verbal working memory
experiments have not focused on the tonal aspects of a language such as Mandarin Chinese. We
examined the relationships among musical experience, tonal language processing, and working
memory in adult reading of musical notation and Mandarin Chinese. We hypothesized that 30 adults
with formal musical training, experienced in translating print to sound during sight-reading, would
have an advantage over 30 adults without formal musical training in converting print to lexical tone in
reading a tonal language. Using n-back reading tasks, we found that the adults with formal musical
training were better able than the adults without formal musical training to extract lexical tone
information from print and to maintain it in working memory. Even in a Mandarin homophone
task, requiring phonological judgments of print, adults with formal musical training demonstrated
superior performance. We discuss possible reasons why musical experience facilitates processing of
phonology and lexical tone in reading tasks.

Keywords
lexical tone, musical notation, n-back paradigm, reading, sight-reading, working memory

In studies of music and language, visual language has received relatively little attention (e.g.,
Schendel & Palmer, 2007; Sloboda, 1976). Auditory similarities between music and language
have been described, including complex sound analyses (e.g., Delogu, Lampis, & Belardinelli,
2010) and brain regions that support them (e.g., Zatorre, Belin, & Penhune, 2002; Wong et al.,
2007). Participation in musical activities may enhance performance in multiple cognitive
domains (e.g., Bugos, Perlstein, McCrae, Brophy, & Bedenbaugh, 2007), which may reflect
brain plasticity and cross-modal transfer of association learning (Wan & Schlaug, 2010).

Department of Communication Sciences and Disorders, Wayne State University, Detroit, MI, USA

Corresponding author:
Ching-I Lu, Department of Communication Sciences and Disorders, Wayne State University, 207 Rackham Bldg, 60
Farnsworth Street, Detroit, MI 48202, USA.
Email: chingilu@gmail.com

Sight reading of musical notation and oral reading of words are similar in that they require
translation of print to sound. Bialystok and DePape (2009) suggested that the greatest effect of
musical experience on other cognitive domains will likely be in tasks similar to the activity
involved in the experience itself. Based on this and the similarity of musical reading to word
reading, we studied classically trained musicians with extensive experience in reading musical
notation. Given their experience with musical tone, we hypothesized that musicians (defined
here as adults with formal musical training) may have an advantage over non-musicians
(defined here as adults without formal musical training) in converting print to lexical tone in
reading a tonal language.
Working memory for musical tone is rarely studied (e.g., Berz, 1995; Ockelford, 2007;
Williamson, Baddeley, & Hitch, 2010), and there is a lack of information on how working mem-
ory for lexical tone operates in a tonal language. Therefore, we selected reading tasks that allowed
us to assess musical, verbal, and tonal working memory in musicians and non-musicians.

The Chinese reading and writing system


The modern Chinese writing system is logographic, in that the basic unit (i.e., the character) is
associated with a unit of meaning (i.e., morpheme) in the spoken language (Weekes, Chen, &
Gang, 1997). The square shape of the written character conveys two parts of phonological
information: phonetic segments (consonant and vowel) and a suprasegmental phonological
feature (tone) (Leong, 2002; Tong, Francis, & Gandour, 2007). Each character corresponds to
one syllable and one tone, making up a Chinese monosyllable. The typical monosyllable con-
sists of three elements: 1) the onset (initial consonant preceding the vowel), 2) the rime (at least
one vowel and any consonant sounds coming after the vowel), and 3) the tone (Siok & Fletcher,
2001).
Word meanings in tonal languages vary by the tone associated with each syllable. Lexical
tone includes many phenomena determining patterns of pitch rises and falls. In Mandarin
Chinese, four tones are used as suprasegmental phonological features that change the syllable
pitch and provide lexical contrast (Siok & Fletcher, 2001): 1) high-level tone; 2) mid-rising tone;
3) mid-falling-rising tone; and 4) high-falling tone (Lin, Wu, Ting, & Wang, 1996).
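To make the role of lexical tone concrete, the four tone categories above can be paired with the classic "ma" minimal set, in which tone alone distinguishes four different words. This is a minimal sketch; the "ma" examples are a standard textbook illustration, not drawn from the present study's stimuli:

```python
# The four Mandarin tones described above, keyed by tone number, illustrated
# with the "ma" minimal set in which tone alone changes word meaning.
MANDARIN_TONES = {
    1: "high-level",
    2: "mid-rising",
    3: "mid-falling-rising",
    4: "high-falling",
}

MA_BY_TONE = {
    1: ("mā", "mother"),
    2: ("má", "hemp"),
    3: ("mǎ", "horse"),
    4: ("mà", "to scold"),
}

for tone, (syllable, gloss) in sorted(MA_BY_TONE.items()):
    print(f"Tone {tone} ({MANDARIN_TONES[tone]}): {syllable} = {gloss}")
```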
Weekes and colleagues (1997) proposed a Chinese reading model based on the English dual-
route model (i.e., lexical whole-word or sublexical grapheme-to-phoneme translation of print
to sound). Chinese radicals (i.e., components of Chinese characters) are represented in this
model as independent orthographic units in the Chinese word recognition system. Accurate
tone processing eliminates confusion between phonological and morphological representations
(Li & Ho, 2011).
In oral reading of Chinese or English, working memory is used to hold phonological or tonal
information in an active state temporarily to support pronunciation or other tasks such as judg-
ing word rhymes. Reading musical notation, like oral word reading, requires activation of mul-
tiple cognitive components as the visual stimulus is converted to sound codes (pitch or note
name) and held in working memory during vocal and/or instrumental sound production.

A theoretical model of reading music


Reading Western musical notation involves figure-ground integration: the stave serves as the
ground, whereas the figures are the other symbols, including clefs, time signatures, note types,
rest types, other markings (e.g., sharp, fermata, repeat), loudness levels (e.g., ff), articulation
terms (e.g., legato), and tempo terms (e.g., rit.). Most of
these symbols fall in either the time-based domain or the pitch-based domain, with spatial
representation indicating pitch height. The pitch-based domain maps onto vertically organized
spatial locations, whereas the time-based domain maps horizontally onto a sequence of musical
events from left to right on a stave (Stewart, Walsh, & Frith, 2004). Deriving pitch from reading
musical notes requires translation of the visual stimulus into pitch (i.e., musical tone). The Western
tonal music system is highly structured, containing 12 semitones in an octave. Pitches sepa-
rated by an octave are heard as very similar and are typically given the same name, referred to
as the pitch class (e.g., all the notes called ‘A’ on a piano keyboard) (Krumhansl & Toiviainen,
2003; Patel, 2008).
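The octave-equivalence idea behind pitch class can be sketched with MIDI note numbers, where notes a whole number of 12-semitone octaves apart share a pitch class. This is an illustrative sketch only; the study's stimuli were staff notation, not MIDI:

```python
# Pitch class by octave equivalence: MIDI note numbers that differ by a
# whole number of 12-semitone octaves map to the same pitch class.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_class(midi_note):
    return midi_note % 12

def same_pitch_class(a, b):
    return pitch_class(a) == pitch_class(b)

# Every 'A' on a piano keyboard shares pitch class 9: A3 = 57, A4 = 69, A5 = 81
print(NOTE_NAMES[pitch_class(69)])   # → A
print(same_pitch_class(57, 81))      # → True
print(same_pitch_class(69, 70))      # → False
```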
Schön, Anton, Roth, and Besson (2002) offered a model of music sight-reading that includes
three phases: visual encoding, transcoding, and production. However, this model does not
include detailed description of abstract cognitive representations involved in transcoding print
to sound. Gudmundsdottir (2010) stated that sight-reading required general mental capacities
such as working memory and mental speed. Lehman (2007) also suggested that music reading
is demanding of working memory. For singing or instrumental production, sound information
would need to be held in working memory.

Working memory for musical tone


Berz (1995) presented a theoretical model of musical working memory including a central
executive controller with ‘loops’ of information interacting with it: multiple sensory loops, a
phonological loop (verbal component), a visual-spatial sketchpad (visual component), and a
music memory loop (musical component). Berz assumed the central executive is a key compo-
nent of musical working memory, and that two different loops support language and music: a
phonological loop and a musical loop. The musical memory components from Berz (1995)
were very similar to the phonological loop of Baddeley’s verbal working memory model (e.g.,
Baddeley & Hitch, 1974) with a musical store (i.e., similar to the phonological store) and an
articulatory mechanism based on musical inner speech. However, Berz did not provide empiri-
cal evidence to support this model.
There are some parallels in how performance patterns observed experimentally may relate
to musical or verbal working memory. The effects of phonological similarity, serial position,
irrelevant sound and concurrent articulation in verbal working memory are similar to the
effects of pitch similarity, serial position, irrelevant sound, and concurrent articulation in musi-
cal working memory. In contrast to Berz (1995), some authors have described musical working
memory as involving similar working memory processes as many other types of auditory infor-
mation (e.g., Salamé & Baddeley, 1989).

Central executive function in musical working memory


Because musical working memory appears similar to verbal working memory in having a stor-
age component and an articulatory mechanism, one can hypothesize that musical working
memory also includes a central executive. The central executive is thought to be a cognitive
control function that regulates encoding, retrieval, and integration or manipulation of infor-
mation entering working memory from different sensory storage systems or from long-term
memory (Miller & Kupfermann, 2009). Unlike previous studies (Berz, 1995), Ockelford (2007)
described a ‘musical executive’ component of musical working memory. He hypothesized that
this musical executive could be related to the central executive (Baddeley, 2003). On this
account, the musical executive processes perception and strategic encoding of notes in
memory. However, Ockelford (2007) described one individual able to ‘listen and play’ chromatic
blues. Due to the structure and improvisation of blues music, there may be little we can infer
about the Western musical system from this case observation.

Table 1.  Demographic data of adults with versus without formal musical training.

                           With musical training (n = 30)   Without musical training (n = 30)
                           Mean      SD                     Mean      SD
Age                        22.27     3.81                   22.97     4.86
Years of education         15.63     2.28                   15.58     2.87
Years of learning music*   16.53     3.64                   0.97      1.90
MMSE                       29.07     1.62                   29.53     0.78
Digit span                 25.20     3.07                   24.70     3.50

*Significant at the p < .05 level.
Musicians have exhibited superior performance in verbal tasks compared to non-musicians
(e.g., Schellenberg, 2011). The reason for these differences is not clear, but possible explana-
tions include: 1) musical training results in expanded working memory capacity; 2) musical
training provides long-term memory (LTM) support for normal working memory; and 3) musi-
cal training results in improved executive function. Each of these is discussed below.
There has been little examination of how working memory operates for musical tone, and
studies of verbal working memory have not focused on tonal aspects of a language such as
Mandarin Chinese. In assessing reading in musicians and non-musicians, we compared the
effects of high versus low working memory demands on reading of Chinese and musical nota-
tion. We also examined whether musicians perform better than non-musicians in translating
tonal information from print to sound in Chinese reading, or in translating phonological infor-
mation from print to sound in Chinese reading.

Method
Participants
A total of 60 participants from Taipei, Taiwan, able to read the Western musical notation sys-
tem, completed the study voluntarily. All participants were native speakers of Mandarin
Chinese, had normal vision (with or without correction), and normal hearing, motor and cog-
nitive abilities. Thirty participants (6 male, 24 female; 26 right-handed) self-reported no musi-
cal instrumental lessons. The other 30 participants (30 females) were professional, classically
trained musicians with at least 11 years of musical instrumental training including reading of
standard musical notation. They reported a mean of 16.53 years of music instrumental train-
ing (range = 11–23 years). Screening measures included a questionnaire for self-report of
demographic information. For musicians, the questionnaire included additional self-report
about years learning music and major instruments. All participants completed vision screen-
ing, speech discrimination screening, Mini-Mental State Examination (MMSE; Folstein, Folstein,
& McHugh, 1975), which screened for cognitive impairment, and digit span task (Wechsler
Adult Intelligence Scale-III; Wechsler, 1997), which measured working memory storage capac-
ity. There were no significant differences between the two groups in MMSE and digit span. The
demographic data and screening measures for both groups are listed in Table 1.

Table 2.  Response patterns of 1-back and 2-back tasks.

                1-back ‘YES’ response               2-back ‘YES’ response                ‘NO’ response
Homophone       If a word is homophonic with        If a word is homophonic with         Others
                the one that came before it         the one that came 2 before it
Mandarin tone   If a word has the same tone as      If a word has the same tone as       Others
                the one that came before it         the one that came 2 before it
Music           If a note has the same pitch        If a note has the same pitch         Others
                class as the one that came          class as the one that came 2
                before it                           before it

N-back task design


The goal of the current study was to examine translation from print to sound for both the tonal
language Mandarin and musical notation stimuli. The effects of increasing working memory
load on reading performance across tasks were also examined. The n-back task (Kim et al.,
2002) was used to measure working memory and executive function. In this task,
participants are presented with a series of stimuli and instructed to indicate whether the cur-
rent stimulus matches the stimulus presented n stimuli back in the series, where n equals a
number between 0 and 3 (Simmons, 2000). The n-back task requires on-line monitoring,
continuous updating, and maintenance of the temporal order of remembered information, and is assumed to
place great demands on executive processing (Beneventi, Tonnessen, Ersland, & Hugdahl,
2010). In the current study, each reading task was presented in both the 1-back and 2-back
paradigms.
In both the 1-back and 2-back versions of all three experimental reading tasks (musical
notation / Mandarin tone / homophone), participants were instructed to press the ‘yes’ button
if the target stimulus had the same pitch name or pitch class, the same Mandarin tone, or the
same pronunciation (homophone) as the probe stimulus. Otherwise, participants were instructed
to press the ‘no’ button. Descriptions of the response patterns for the 1-back and 2-back tasks
are given in Table 2.
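The yes/no response rule in Table 2 can be sketched as follows. This is a minimal illustration of the scoring logic; the function name and the tone-coded sequence are hypothetical and not the experiment's software:

```python
def expected_responses(stimuli, n):
    """Expected response on each trial of an n-back task: 'yes' when the
    current stimulus matches the one presented n positions earlier,
    'no' otherwise; the first n trials have no comparison stimulus."""
    responses = []
    for i, item in enumerate(stimuli):
        if i < n:
            responses.append(None)  # no stimulus n positions back yet
        else:
            responses.append("yes" if item == stimuli[i - n] else "no")
    return responses

# Mandarin tones coded 1-4 for a short hypothetical trial sequence
tones = [1, 4, 4, 2, 4, 2]
print(expected_responses(tones, 1))  # 1-back comparisons
print(expected_responses(tones, 2))  # 2-back comparisons
```

Note that the same rule applies to all three tasks; only the matching criterion (homophone, tone, or pitch class) changes.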

Stimuli
Musical notation task.  All musical notation stimuli (n = 280; 40 practice and 240 experimental
trials) were created by the first author based on the Western musical system (12 keys) and
presented visually on the G-clef and the F-clef. Stimuli were divided into 1) notes in the same
location with the same pitch name, 2) notes in different locations on different clefs but with
the same pitch name (a pitch class), 3) notes in the same location with different pitch names,
and 4) notes in different locations with different pitch names. Participants were instructed to
judge whether the two notes belonged to the same pitch class. Examples and abbreviations of
the musical notation stimuli are presented in Figure 1.

Mandarin tone task.  Mandarin tone judgment stimuli (n = 280) were chosen to control for pho-
nological similarity (homophone versus non-homophone) and type of linguistic tone: the 1st tone
(level), the 2nd tone (rising), the 3rd tone (dipping), or the 4th tone (falling). Participants were
instructed to judge tone similarity. Stimuli were divided into homophones with the same tone,
homophones with different tones, non-homophones with the same tone, and non-homophones
with different tones.

Figure 1.  Example stimuli for the Musical Notation task. Left to right: notes in the same location with
the same pitch name (LSPS), in different locations (i.e., different clefs) with the same pitch name (LDPS;
i.e., the same pitch class, the set of all pitches a whole number of octaves apart), in the same location
with different pitch names (i.e., different clefs) (LSPD), and in different locations with different pitch
names (LDPD).

Mandarin homophone task.  Stimuli in the Mandarin homophone judgment task (n = 280)
were carefully controlled for two variables: phonological similarity and visual-orthographic
similarity. Participants were instructed to judge phonological similarity. Four different cate-
gories of target stimuli were included: orthographically similar/phonologically similar;
orthographically similar/phonologically dissimilar; orthographically dissimilar/phonologi-
cally similar; and orthographically dissimilar/phonologically dissimilar. This manipulation
of experimental stimuli was included so that the effect of visual-orthographic similarity on
task performance could be assessed. Examples and abbreviations of Mandarin stimuli are
presented in Figure 2.

Procedure
Each participant was tested individually in a quiet room. The screening tasks and experimental
tasks were administered to each participant in two sessions of approximately 1.5 hours each
(3 hours total) within 1 month. To ensure that participants understood how
to do each experimental task, all participants were asked to complete three preliminary reading
tasks of Mandarin homophone discrimination, Mandarin tone discrimination, and note dis-
crimination (musical notation task). In each of these tasks, two visual stimuli were presented
simultaneously, one above the other on the computer screen for 1500 msec with 500 msec
blank screen between each pair. Participants had to decide whether the two stimuli sounded the
same (homophone, tone, or pitch, respectively). The musicians performed significantly better
than non-musicians in discriminating Mandarin tone (95.7% vs. 87.3% accurate), t(58) = 2.62,
p = .011, and in the music task, t(58) = 5.88, p < .001, but not in the homophone task.
Figure 2.  Example stimuli for the two Chinese tasks. Left: the Mandarin homophone task included four
categories: orthographically similar/phonologically similar (OSPS), orthographically similar/phonologically
dissimilar (OSPD), orthographically dissimilar/phonologically similar (ODPS), and orthographically
dissimilar/phonologically dissimilar (ODPD). Right: the Mandarin tone task also included four categories:
homophones with the same tone (HTS), homophones with different tones (HTD), non-homophones with
the same tone (nHTS), and non-homophones with different tones (nHTD).

A total of six experimental tasks were administered (i.e., each of the three experimental
tasks presented in the 1-back and the 2-back paradigms; examples of the 1-back vs. 2-back
paradigms are shown in Figures 3A and 3B). The duration of each task was 8 minutes (1500
msec with 500 msec blank screen × 240 items). To reduce fatigue, each task was divided into
2 blocks (4 minutes each) with a short (30 second) break in between. Thus, each reading task
took a total of 8 minutes, 30 seconds. Stimuli were presented electronically using E-Prime
Professional 2.0 software (Psychology Software Tools, Pittsburgh, PA) on an IBM ThinkPad
R60e laptop with a 15-inch (13.1" × 10.6") monitor. An external number pad was connected
to the laptop, providing ‘Y’ as the ‘yes’ and ‘N’ as the ‘no’ response key.
Participants sat facing the screen and were instructed to use only the right index finger to
press the buttons. Participants were instructed to press any key when they were ready to start
a trial and to respond within 2 seconds of each stimulus or their response for the particular
trial would not be recorded. They were instructed that if they were unable to respond within
2 seconds, they should skip the immediate stimulus and focus on the next one.

Analysis
This study used a three-factor mixed design: 2 (between-subjects; group: adults with versus
without formal musical training) × 3 (within-subjects; task: homophone versus Mandarin tone
versus musical notation) × 2 (within-subjects; difficulty: easier 1-back versus more difficult
2-back), with accuracy rate (AR) and reaction
time (RT) as the dependent variables. The number of correct acceptance, correct rejection,
incorrect acceptance, and incorrect rejection responses was calculated. To fulfill the
requirements of proportion data (Correct N / Trial N) and equality of variances for a repeated-
measures General Linear Model (GLM), accuracy data for each task were transformed with
2 × arcsin(sqrt(x)), the angular transformation. This transformation is most often
employed in analysis of a dependent variable in General Linear Modeling, when the raw
values are proportions (Kirk, 1995). Thus, for accuracy data, we ran repeated-measures
GLM on the angular-transformed scores. Speed (RT) was analyzed in the same way as accuracy
rate. According to the Shapiro-Wilk test, reaction time violated the normality assumption
of the general linear model/ANOVA (p < .05). However, after logarithmic transformation,
the RT data were normally distributed. For RT, we ran repeated-measures GLM on Log10-
transformed values (Kirk, 1995). Since transformed variables are harder to interpret, we used
transformed data only for inferential statistics and non-transformed data for descriptive
statistics. We also controlled the family-wise Type I error rate by setting the significance
threshold at p < .017 (Bonferroni).

Figure 3.  Examples of the n-back paradigm shown in a graphic format. (A) 1-back paradigm. (B) 2-back
paradigm.
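The two transformations and the corrected threshold described in the Analysis section can be sketched as follows. This is a hedged illustration using Python's standard library; the function names are ours, not the authors' analysis code:

```python
import math

def angular_transform(p):
    """Angular (arcsine) transform for proportion-correct data: 2 * arcsin(sqrt(p))."""
    return 2 * math.asin(math.sqrt(p))

def log_transform(rt_ms):
    """Log10 transform used to normalize positively skewed reaction times."""
    return math.log10(rt_ms)

# Bonferroni correction across the three reading tasks: .05 / 3, reported as p < .017
alpha = 0.05 / 3

# Example: transform a proportion correct of .94 and an RT of 735 ms
print(round(angular_transform(0.94), 3))
print(round(log_transform(735), 3))
print(round(alpha, 3))
```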

Results
The mean accuracy scores, mean RT, and standard deviation in each 1-back and 2-back task
were computed for each group. See Table 3 for descriptive statistics.

Correlation analyses
To assess correlations between the standard assessments and the experimental tasks, digit
span and MMSE data were examined. There were no significant differences between participant

Table 3.  The mean accuracy scores, mean reaction time, and standard deviation on all tasks for musicians
and non-musicians.

Tasks                              Accuracy                           Reaction time (ms)

                                   Musicians^a    Non-musicians^b     Musicians^a    Non-musicians^b

1-back Mandarin homophone .96 (.02) .94 (.04) 735 (83) 779 (94)
1-back Mandarin tone .96 (.02) .92 (.04) 876 (131) 934 (133)
1-back music .94 (.05) .70 (.10) 901 (134) 943 (159)
2-back Mandarin homophone .91 (.06) .88 (.09) 835 (128) 891 (120)
2-back Mandarin tone .81 (.09) .76 (.10) 972 (193) 1076 (154)
2-back music* .79 (.10) .50c (.09) 1013 (178) 888c (188)
^a n = 30; ^b n = 30; ^c n = 27.
*We adopted performance criteria of 80% accuracy in each 1-back task and 75% accuracy in each 2-back task for partic-
ipants to have their data included in the analyses. Three non-musician participants exceeded 25% missing response rate
in 2-back music task (missing 18 to 21 items) and were removed from subsequent analysis of this task; thus, analyses of
the 2-back music task were conducted on 30 musicians and 27 non-musicians.

Table 4.  Correlations among demographic variables, digit span task, MMSE, and accuracy of
experimental tasks in all participants.

Variables 1 2 3 4 5 6 7 8 9 10
1. Years of learning music –  
2. Years of education .19 –  
3. Digit span −.04 −.05 –  
4. MMSE −.15 −.10 −.20 –  
5. 1-back homophone .31* .15 .15 −.06 –  
6. 1-back Mandarin tone .48** −.07 .08 −.02 .52** –  
7. 1-back music .85** −.01 −.05 −.16 .33* .54** –  
8. 2-back homophone .18 .08 .15 .19 .35** .52** .25 –  
9. 2-back Mandarin tone .27* .17 .07 −.04 .33* .32* .31* .40** –  
10. 2-back music .84** .04 −.05 −.13 .32* .51** .87** .22 .41* –

*Correlation is significant at the .05 level (2-tailed). **Correlation is significant at the .01 level (2-tailed).

groups in the MMSE, t(58) = −1.43, p = .160, and digit span, t(58) = .59, p = .558. Table 4
includes Pearson’s correlations among years of learning music, years of education, digit span,
MMSE, and accuracy scores for all experimental tasks in all participants. Number of years of
learning music was significantly positively correlated with accuracy scores for 1-back Mandarin
homophone (r = .31), 1-back Mandarin tone (r = .48), 1-back music (r = .85), 2-back Mandarin
tone (r = .27), and 2-back music (r = .84) tasks.
Table 5 indicates Pearson’s correlations among years of learning music, years of education,
digit span, MMSE, and RT for all experimental tasks in all participants. Number of years of
learning music was significantly positively correlated with speed for the 2-back music task (r =
.43); and years of education was positively correlated with speed in the 1-back homophone
task (r = .29). Digit span scores were negatively correlated with RT in the 1-back Mandarin tone
(r = −.26) and 2-back Mandarin tone (r = −.34) tasks.

Table 5.  Correlations among demographic variables, digit span task, MMSE, and speed of experimental
tasks in all participants.

Variables 1 2 3 4 5 6 7 8 9 10
1. Years of learning music –  
2. Years of education .19 –  
3. Digit span −.04 −.05 –  
4. MMSE −.15 −.10 −.20 –  
5. 1-back homophone −.21 .29* −.16 .12 –  
6. 1-back Mandarin tone −.20 .24 −.26* .07 .66** –  
7. 1-back music −.03 .08 −.16 .08 .41** .39** –  
8. 2-back homophone −.16 .16 −.15 .23 .60** .52** .41** –  
9. 2-back Mandarin tone −.17 .22 −.34** .15 .47** .57** .46** .72** –  
10. 2-back music .43* .22 −.15 −.16 .30* .28* .63** .40** .51** –

*Correlation is significant at the .05 level (2-tailed). **Correlation is significant at the .01 level (2-tailed).

Table 6.  Results of three-way mixed model ANOVA: Accuracy for group by task by difficulty level.

Source                        df     F        η2     p
Group^a                       1      72.06    .57    <.001***
Difficulty^c                  1      370.41   .87    <.001***
Group × difficulty            1      .23      .00    .631
Within-group error            55
Tasks^b                       2      200.46   .79    <.001***
Group × tasks                 2      74.93    .58    <.001***
Tasks × difficulty            2      27.93    .34    <.001***
Group × tasks × difficulty    2      .40      .01    .674
Within-group error            110

Significant at the ***p < .001 level.
a. Group included musicians versus non-musicians; b. Task included Mandarin homophone versus Mandarin tone versus Music; c. Difficulty included 1-back versus 2-back paradigm.

Analysis of accuracy and reaction time data by group, task, and difficulty level
A three-way ANOVA on the arcsine-transformed data (AR) revealed significant effects of group,
F(1, 55) = 72.60, p < .001, task, F(2, 110) = 200.46, p < .001, and difficulty level, F(1, 55) =
370.41, p < .001. There were two significant two-way interactions: between task and difficulty
level, F(2, 110) = 27.93, p < .001, and between task and group, F(2, 110) = 74.93, p < .001. A
simple main effects analysis showed that, at both the 1-back and the 2-back levels of difficulty,
performance on the homophone task was better than on the tone task, which in turn was better
than on the music task. The same task ordering (homophone > tone > music) held within both
the musicians’ and the non-musicians’ groups. There was no two-way interaction between group and difficulty
level, F(1, 55) = .23, p = .631. There was no three-way interaction, F(2, 110) = .40, p = .674
(see Table 6). Post hoc comparisons were undertaken to compare the AR for both groups. The
musicians scored higher than non-musicians on five experimental tasks: the 1-back homo-
phone, the 1-back tone, the 1-back music, the 2-back tone, and the 2-back music tasks. Accuracy
results for the two groups in the 1-back vs. 2-back tasks are depicted in Figure 4A.
A three-way ANOVA on the Log10 data (RT) revealed significant effects of difficulty level,
F(1, 55) = 38.58, p < .001, and task, F(2, 110) = 55.40, p < .001. There was no main effect of
group, F(1, 55) = .89, p = .348. There were two significant two-way interactions: between task
and difficulty level, F(2, 110) = 11.83, p < .001, and between task and group, F(2, 110) =
9.84, p < .001. There was no two-way interaction between group and difficulty level,
F(1, 55) = 2.48, p = .121. There was a three-way interaction effect, F(2, 110) = 13.04,
p < .001, in RT (see Table 7). Post hoc comparisons were undertaken to compare RTs for both
groups. The musicians had faster RTs than non-musicians in the 2-back tone task, t(58) =
−2.31, p = .024, whereas the musicians had slower RTs than non-musicians in the 2-back
music task, t(55) = 2.58, p = .013. RT results for all participants in the 1-back vs. 2-back tasks
are depicted in Figure 5.

The influence of visual similarity on print to sound conversion


1-back Mandarin homophone task.  A 2 (group: musicians versus non-musicians) × 4 (category:
ODPS versus OSPD versus OSPS versus ODPD) mixed ANOVA on the arcsine-transformed
data (AR) revealed a significant effect of category, F(3, 174) = 79.77, p < .001, but there was
no effect of group, F(1, 58) = 3.77, p = .057. There was an interaction effect, F(3, 174) = 2.98,
p = .033. Post hoc comparisons showed that all participants had the most accurate perfor-
mance in the ODPD and OSPD, followed by the ODPS, and OSPS categories. Thus, they made
correct rejections of PD stimuli at a similar rate for orthographically similar or dissimilar stim-
uli, and they correctly accepted PS stimuli at a higher rate when they were orthographically
dissimilar compared to when they were orthographically similar. These results for PD and for
PS stimuli suggest that the participants were not basing their homophone decisions on the
visual similarity of the stimuli. Additional planned comparisons were undertaken to compare
AR of both groups for four stimulus categories. There was no significant difference in perfor-
mance across the two groups except that the musicians scored higher than the non-musicians
for the ODPS category, t(58) = 4.22, p = .018; this result alone does not suggest that visual
similarity influenced performance in the 1-back homophone task.

1-back musical notation task.  A 2 (group: musicians versus non-musicians) × 4 (category: LDPS
versus LSPD versus LSPS versus LDPD) mixed ANOVA on the arcsine-transformed data
(AR) revealed significant main effects of group, F(1, 58) = 132.51, p < .001, and category, F(3,
174) = 73.88, p < .001. There was an interaction effect, F(3, 174) = 30.26, p < .001. Post hoc
comparisons showed that all participants had the most accurate performance in the LDPD and
LSPD, followed by the LSPS and LDPS categories. Among musicians, there was no significant
difference in accuracy for the LDPS and LSPS categories, t(29) = −.19, p = .851. Performance
in the LDPD category was significantly more accurate than for the LSPD category, t(29) =
−4.40, p < .001, for musicians. Thus, the same location did not influence performance of musi-
cians given the same pitch (correct ‘yes’) targets, but it did influence their performance given
the different pitches (correct ‘no’) targets in that they were more likely to incorrectly accept a
different pitch target if it was visually in the same location as the probe. Among non-musicians,
performance in the LSPS category was significantly more accurate than in the LDPS
category, t(29) = 9.00, p < .001. In other words, non-musicians were more likely to accept a
correct match as being the same pitch if it looked similar (i.e., same location) as compared to
when it looked different from the probe. However, there was no significant difference in accuracy
for the LDPD and the LSPD categories. Thus, overall in the 1-back music task the influence
of visual similarity on performance accuracy was inconsistent. See Figure 5 and Table 8 for
descriptive statistics of the four categories.

Figure 4.  Performance for the two groups in the 1-back vs. 2-back tasks. (A) Accuracy rates (proportion
correct) across all reading tasks. (B) Reaction times across all reading tasks.
*Significant at the p < .05 level.

Discussion
Studies of music and language have focused primarily on auditory language and not visual
language. Comparing music and Mandarin Chinese, we presented visual tasks requiring print
to sound conversion (Schendel & Palmer, 2007), with print exposure sufficient to allow this
conversion (Sloboda, 1976).

Table 7.  Results of three-way mixed-model ANOVA: Reaction times for group by task by difficulty level.

Source                        df     F        η2     p
Group(a)                      1      .89      .01    .348
Difficulty(c)                 1      38.58    .41    <.001***
Group × difficulty            1      2.48     .04    .121
Within-group error            55
Tasks(b)                      2      55.40    .50    <.001***
Group × tasks                 2      9.84     .15    <.001***
Tasks × difficulty            2      11.83    .18    <.001***
Group × task × difficulty     2      13.04    .19    <.001***
Within-group error            110

***Significant at the p < .001 level.
a. Group: musicians versus non-musicians; b. Task: Mandarin homophone versus Mandarin tone versus
Music; c. Difficulty: 1-back versus 2-back.
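The η2 values in Table 7 are consistent with partial eta-squared recovered from each F statistic and its degrees of freedom, η2p = F·df1 / (F·df1 + df2). A quick check (this reconstruction assumes the reported values are partial eta-squared, which the table does not state explicitly):

```python
def partial_eta_squared(f, df_effect, df_error):
    """Recover partial eta-squared from an F statistic and its degrees of freedom."""
    return (f * df_effect) / (f * df_effect + df_error)

# Difficulty main effect in Table 7: F(1, 55) = 38.58, reported eta^2 = .41
print(round(partial_eta_squared(38.58, 1, 55), 2))   # 0.41
# Tasks main effect: F(2, 110) = 55.40, reported eta^2 = .50
print(round(partial_eta_squared(55.40, 2, 110), 2))  # 0.5
```

The interaction rows reproduce the same way, which supports reading the table's η2 column as an effect-size estimate tied to each within-group error term.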
Direct comparison of phonological and tonal recoding of Mandarin script in the same partici-
pants revealed greater difficulty extracting tone information than phonological information, both
in n-back tasks and preliminary reading tasks. There is evidence that tone processing is harder
than phonological processing for dyslexic children reading Chinese (Li & Ho, 2011). A supraseg-
mental phonological feature, such as linguistic tone, may be harder to retrieve than phonological
segments because tone cues are embedded in phonological information (Leong, 2002).
For musicians in our study, lexical tone processing in Chinese reading was easier than for
non-musicians. Delogu and colleagues (2010) found that musical ability enhances auditory
discrimination of lexical tone in Mandarin Chinese and noted the potential for musical training
to facilitate learning of a tonal language.

Musical ability and language


Based on the literature, there are three explanations for superior performance of musicians in
verbal tasks as compared to non-musicians. Musical training may: 1) result in expanded work-
ing memory capacity; 2) provide long-term memory (LTM) support for normal working mem-
ory; and/or, 3) result in improved executive function.

Expanded working memory capacity.  Musical experience may result in increased working memory
capacity, making musicians less vulnerable to high working memory demands. Music reading
is demanding of working memory (Lehman, 2007; Gudmundsdottir, 2010), and it has been
suggested that skilled performers such as musicians and chess players may have an expanded
working memory capacity (Chase & Simon, 1973; Stigler, 1984). If working memory is
expanded in musicians, the mechanism underlying this is unclear. Sloboda (1976) noted that
practice in sight-reading could lead to automaticity and decreased attentional demands during
sight-reading in musicians, and that the resulting sound codes could be maintained over lim-
ited periods with little or no additional load on what he termed ‘short term memory.’
The notion that working memory capacity can be expanded by musical experience may
involve the operation of a ‘single acoustic loop’ (Schendel, 2006). Irrelevant sounds (i.e., listen-
ing to music or spoken words) and articulatory suppression procedures (i.e., singing or saying
irrelevant sounds while trying to perform a serial recall task) both interfere with accuracy in
serial list recall. By examining interference from different types of irrelevant sounds in healthy
adults, Schendel (2006) found that music interferes with speech and speech interferes with
music. Schendel (2006) suggested that music and language have access to the same acoustic
store or loop, with a single rehearsal mechanism for both language and music.

Figure 5.  Accuracy rates for each stimulus category for Mandarin homophone and music notation tasks
in both groups. (A) Mandarin homophone task. (B) Musical notation task.
Schendel and Palmer (2007) compared verbal and music memory in serial list recall under
conditions of no interference versus intermittent interference from musical (‘la’) or verbal
(‘the’) articulatory suppression. They found when musicians were required to convert printed
notes to sound, memory for printed notes in serial recall was similar to the pattern observed
with verbal material. Articulatory suppression caused a greater decrement in memory for vis-
ual notes when participants were required to convert the visual notes into sound, supporting
the idea that the articulatory suppression effect is greatest when there is more similarity
between the to-be-remembered and the to-be-ignored stimuli. Schendel and Palmer (2007)
suggested that the same mechanisms are responsible for the storage and rehearsal of auditorily
encoded music and verbal information. If this is true, possibly musical experience could result
in greater efficiency of this common rehearsal mechanism, but these effects remain unspecified
at this time.
There is some evidence against the hypothesis that musicians have expanded working
memory capacity. In an auditory discrimination study, Delogu and colleagues (2010) found
musical ability enhances auditory discrimination of lexical tone but not phonology in
Mandarin Chinese. They found the length of auditory sequences did not interact with the
musical effect and so they asserted that the positive transfer of musical ability to tone dis-
crimination was not dependent on working memory capacity. Further research is needed to
examine these effects.

LTM support for normal working memory.  The second explanation for superior performance of
musicians in verbal tasks as compared to non-musicians involves interaction of long-term
memory for musical information with working memory. As described by Vallar (2006), verbal
LTM supports the phonological short-term store and the rehearsal process in aspects of imme-
diate retention. Presumably, LTM for musical information could function in the same way as
verbal LTM to support working memory. The musicians in the current study have expertise in
reading musical tone and this may be why they performed significantly more accurately and
faster than non-musicians in recoding print into linguistic tone in the Mandarin tone task. In
this study, the number of years learning music was positively correlated with accuracy scores
in the Mandarin homophone task (1-back) and the Mandarin tone task (1-back and 2-back).
Thus, long-term knowledge of music may have influenced working memory.
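The years-of-training correlation reported above is a standard Pearson analysis; a self-contained sketch with hypothetical data (the study's raw scores are not reproduced here, so the numbers below are illustrative only):

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical illustration: years of musical training vs. tone-task accuracy.
years = [2, 5, 8, 10, 15]
accuracy = [0.70, 0.78, 0.82, 0.85, 0.93]
r = pearson_r(years, accuracy)  # positive, as the study reports
```

A positive r of this kind is what motivates the LTM-support interpretation, though correlation alone cannot separate training effects from pre-existing differences.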
Amount of musical experience determined the degree of impairment by musical suppres-
sion on visually presented music in the Schendel and Palmer (2007) study described above.
That is, individuals with more musical experience were less impaired by musical suppression in
a task requiring them to convert visual notes to sound. A possible reason for these results is that
increasing musical experience results in greater LTM support for normal working memory.
Alternatively, greater musical experience could make the rehearsal process more efficient,
expanding overall working memory capacity as described above.
Interestingly, if LTM did facilitate working memory in the current study, it affected both tone
judgments and phonological judgments. Although both groups performed better in the
Mandarin homophone task than the Mandarin tone task, the non-musicians were less accurate
than the musicians in the 1-back homophone and both tone tasks. The homophone task
requires phonological judgments (i.e., determining similarities and differences between
Mandarin homophone word pairs) and does not require tone judgments, yet it was easier for
the musicians to perform than the non-musicians.
Overall, the possibility that LTM support for working memory can explain the superior per-
formance of musicians in our Mandarin reading tasks is weakened by two factors. First, as
noted above, there is some evidence that the effects of musical ability on lexical tasks may not
relate to working memory function (Delogu et al., 2010). Second, the non-musicians in our
study were as expert in reading Mandarin as the musicians (i.e., had as much LTM support for
the Mandarin homophone task as the musicians). On the other hand, the possibility that LTM
support accounts for differences between the two groups is strengthened by the fact that the
number of years learning music was positively correlated with accuracy scores in Mandarin
reading tasks. Further research is needed to disentangle the effects of musical ability versus
formal musical training. Delogu and colleagues (2010) reported superior performance in
discrimination of lexical tone in individuals with musical ability (i.e., high melodic competences)
with or without formal musical training.

Table 8.  Accuracy for four categories of target stimuli by group.

Task        Category    Accuracy                           Reaction time
                        Musicians(a)   Non-musicians(b)    Musicians(a)   Non-musicians(b)
Homophone   OSPS        .91 (.05)      .87 (.10)           751 (88)       797 (86)
            ODPS        .95 (.06)      .91 (.07)           723 (78)       778 (108)
            OSPD        .99 (.02)      .98 (.04)           775 (95)       821 (112)
            ODPD        .99 (.02)      .99 (.01)           691 (95)       726 (117)
Music       LSPS        .91 (.09)      .77 (.15)           859 (113)      905 (131)
            LDPS        .90 (.07)      .28 (.26)           945 (142)      1093 (189)
            LSPD        .95 (.06)      .83 (.24)           952 (159)      960 (168)
            LDPD        .99 (.04)      .91 (.07)           855 (156)      946 (193)

a. n = 30; b. n = 30.

Improved executive function.  The third explanation noted above for superior performance of musi-
cians in verbal tasks as compared to non-musicians is that musicians have better executive
function abilities, which could be decomposed into three processes: shifting among multiple
tasks or mental sets, updating and monitoring of representations in working memory, and inhi-
bition of responses (Hedden & Yoon, 2006). Studies addressing the relationship between execu-
tive function and musical working memory have focused on the musical training effect (Degé,
Kubicek, & Schwarzer, 2011; Hargreaves & Aksentijevic, 2011; Moreno, Bialystok, Barac,
Schellenberg, Cepeda, & Chau, 2011; Schellenberg, 2011). Moreno and colleagues (2011)
found short-term musical training enhanced verbal intelligence and executive function. They
concluded that musical training improved musical skills but also transferred to improved verbal
ability because cognitive processing of music overlaps with cognitive mechanisms used in lan-
guage (specifically, executive function).
The n-back tasks used in the current study are demanding of executive function abilities and
are similar to reading of musical notation in that they require shifting among multiple compo-
nents of the task, updating and monitoring of representations in working memory, and inhibi-
tion of responses. The significantly higher accuracy of the musicians in the 1-back homophone
task and the 1-back and 2-back tone tasks is evidence that they may have better executive func-
tion skills than the non-musicians.
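The decision rule the n-back tasks implement is simple to state: for each stimulus after the first n, respond 'match' if it shares the judged property with the stimulus presented n positions earlier. A minimal scoring sketch (the stimulus codes and helper name are hypothetical, not the study's actual materials or software):

```python
def score_nback(stimuli, responses, n):
    """Score an n-back run: each response (True = 'match') is compared against
    whether the stimulus actually matches the one presented n positions earlier."""
    correct = 0
    total = 0
    for i in range(n, len(stimuli)):
        is_match = stimuli[i] == stimuli[i - n]
        total += 1
        if responses[i] == is_match:
            correct += 1
    return correct / total  # proportion correct

# 1-back over a toy Mandarin tone sequence (tone numbers, hypothetical):
tones = [1, 1, 3, 3, 2]
answers = [None, True, False, True, False]  # first item has no 1-back probe
print(score_nback(tones, answers, 1))  # 1.0
```

Note how the working memory load scales with n: each trial requires holding, updating, and comparing the last n items, which is why the 2-back conditions are more demanding than the 1-back conditions.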
The musical working memory model proposed by Berz (1995) includes a central executive
controller, and Ockelford (2007) described a ‘musical executive’ component that could relate to
the central executive (Baddeley, 2003), as noted above. In this context, one could hypothesize
that expert-level practice in complex visual tracking and integration of elements in musical
scores and their timely conversion to sound could facilitate operation of the central executive in
other reading tasks such as our experimental n-back tasks.

Other characteristics of musicians.  Another explanation for the superior performance of musi-
cians in this study as compared to non-musicians is that the musician group may have pos-
sessed personal and cognitive advantages for success prior to their musical training. Our
musician and non-musician groups did not differ in educational level, years of learning Eng-
lish, handedness, digit span and Mini-Mental State Exam performance, and participants in
both groups were motivated individuals who had achieved success academically and in a vari-
ety of hobbies and interests. Corrigall, Schellenberg, and Misura (2013) found that individual
differences in personality affected who takes music lessons and for how long; in their results,
the personality dimension with the best predictive power was openness-to-experience. Thus,
this explanation cannot be ruled out without direct examination in future studies. We suggest
that cultural factors be considered in studies of who takes music lessons. For example, anecdo-
tally it appears that cultural values placed on careers for males in science and engineering in
Taiwan result in more females than males taking music lessons.
Delogu and colleagues (2010) invoked a right hemisphere account to explain the relationship
between musical ability and auditory tone discrimination superiority in musicians. However, the
superior performance of musicians in verbal tasks in the current experiment, which included
both superior homophone (phonological) judgments and tone judgments, cannot be explained
by operation of the right hemisphere in lexical tone processing because phonological processing
is often described as involving left hemisphere function. Also, further suggestions by Delogu and
colleagues for how musically competent participants naïve to Mandarin Chinese were able to
perceive Chinese lexical tones do not apply to the current study because the participants in our
study were all native speakers of Mandarin Chinese. More research is needed to examine the
specific effects of musical ability on verbal task performance.

Conclusion
More accurate performance by musicians as compared to non-musicians in verbal working
memory tasks may reflect several processes including an expanded working memory capacity,
the influence of long-term memory on working memory, improved executive function, and/or
other characteristics of musicians. The results of the current study do not rule out any of
these hypotheses. However, the current study provides evidence that music and sight-reading
experience facilitates processing of phonology and lexical tone in reading tasks.
Our task design was the same for musicians and non-musicians across all 1-back and 2-back
tasks, allowing for the differential effects of language and music processing to be observed in
the two groups. The differences between musicians and non-musicians in this study were often
significant but fairly small in terms of proportion correct. The effect of musical experience on
language may be modest in healthy adults with normal reading ability. However, for cognitively
vulnerable individuals, such as dyslexic children or persons with aphasia, even a fairly small
facilitation effect of musical training could have a strong real-world impact in acquisition of
reading.
In summary, we examined the relationships among musical experience, tonal language pro-
cessing, and working memory using written Mandarin Chinese in musicians and non-musicians.
Evidence from the current study suggests that musical experience can facilitate language process-
ing. The cognitive benefits of musical experience can be specified further in future studies.

Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-
profit sectors.

References
Baddeley, A. D. (2003). Working memory: Looking back and looking forward. Nature Reviews Neuroscience,
4(10), 829–839.
Baddeley, A. D., & Hitch, G. J. (1974). Working memory. In G. A. Bower (Ed.), Recent advances in learning
and motivation (Vol. 8, pp. 47–90). New York, NY: Academic Press.
Beneventi, H., Tonnessen, F. E., Ersland, L., & Hugdahl, K. (2010). Executive working memory processes
in dyslexia: Behavioral and fMRI evidence. Scandinavian Journal of Psychology, 51, 192–202.
Berz, W. L. (1995). Working memory in music: A theoretical model. Music Perception, 12(3), 353–364.
Bialystok, E., & DePape, A. M. (2009). Musical expertise, bilingualism, and executive functioning. Journal
of Experimental Psychology: Human Perception and Performance, 35(2), 565–574.
Bugos, J. A., Perlstein, W. M., McCrae, C.S., Brophy, T. S., & Bedenbaugh, P.H. (2007). Individualized
piano instruction enhances executive functioning and working memory in older adults. Aging &
Mental Health, 11(4), 464–471.
Chase, W. G., & Simon, H. A. (1973). The mind’s eye in chess. In W. G. Chase (Ed.), Visual information
processing (pp. 215–281). New York, NY: Academic Press.
Corrigall, K. A., Schellenberg, E. G., & Misura, N. M. (2013). Music training, cognition, and personality.
Frontiers in Psychology, 4, 222–231.
Degé, F., Kubicek, C., & Schwarzer, G. (2011). Music lessons and intelligence: A relation mediated by
executive functions. Music Perception, 29(2), 195–201.
Delogu, F., Lampis, G., & Belardinelli, M. O. (2010). From melody to lexical tone: Musical ability enhances
specific aspects of foreign language perception. European Journal of Cognitive Psychology, 22(1),
46–61.
Folstein, M. F., Folstein, S. E., & McHugh, P. R. (1975). Mini-mental state: A practical method for grading
the cognitive state of patients for the clinician. Journal of Psychiatric Research, 12(3), 189–198.
Gudmundsdottir, H. R. (2010). Pitch error analysis of young piano students’ music reading performance.
International Journal of Music Education, 28(1), 61–70.
Hargreaves, D. J., & Aksentijevic, A. (2011). Music, IQ, and executive function. British Journal of
Psychology, 102(3), 306–308.
Hedden, T., & Yoon, C. (2006). Individual differences in executive processing predict susceptibility to
interference in verbal working memory. Neuropsychology, 20(5), 511–528.
Kim, J. J., Kim, M. S., Lee, J. S., Lee, D. S., Lee, M. C., & Kwon, J. S. (2002). Dissociation of working mem-
ory processing associated with native and second language: PET investigation. NeuroImage, 15(4),
879–891.
Kirk, R. E. (1995). Experimental design: Procedures for the behavioral sciences. Belmont, CA: Brooks/
Cole Pub. Co.
Krumhansl, C. K., & Toiviainen, P. (2003). Tonal cognition. In I. Peretz & R. Zatorre (Eds.), The cognitive
neuroscience of music (pp. 95–108). New York, NY: Oxford University Press.
Lehman, A. C. (2007). Book review of Component skills involved in sight reading music. Psychomusicology:
A Journal of Research in Music Cognition, 19(2), 91–94.
Leong, C. K. (2002). Segmental analysis and reading in Chinese. In H. S. R. Kao, C. K. Leong, & D. G. Gao
(Eds.), Cognitive neuroscience studies of the Chinese language (Vol. 8, pp. 227–246). Hong Kong: Hong
Kong University Press.
Li, W. S., & Ho, C. S. H. (2011). Lexical tone awareness among Chinese children with developmental dys-
lexia. Journal of Child Language, 38(4), 793–808.
Lin, C. H., Wu, C. H., Ting, P. Y., & Wang, H. M. (1996). Frameworks for recognition of Mandarin syllables
with tones using sub-syllabic units. Speech Communication, 18(2), 175–190.
Miller, P., & Kupfermann, A. (2009). The role of visual and phonological representations in the processing
of written words by readers with diagnosed dyslexia: Evidence from a working memory task. Annals
of Dyslexia, 59, 12–33.
Moreno, S., Bialystok, E., Barac, R., Schellenberg, E. G., Cepeda, N. J., & Chau, T. (2011). Short-term
music training enhances verbal intelligence and executive function. Psychological Science, 22(11),
1425–1433.
Ockelford, A. (2007). A music module in working memory? Evidence from the performance of a prodi-
gious musical savant. Musicae Scientiae, Special Issue, 5–36.
Patel, A. D. (2008). Sound elements: Pitch and timbre. In A. D. Patel (Ed.), Music, language, and the brain
(pp. 9–94). New York, NY: Oxford University Press.
Salamé, P., & Baddeley, A. D. (1989). Effects of background music on phonological short-term memory.
The Quarterly Journal of Experimental Psychology: Human Experimental Psychology, 41(1-A), 107–122.
Schellenberg, E. G. (2011). Examining the association between music lessons and intelligence. British
Journal of Psychology, 102(3), 283–302.
Schendel, Z. A. (2006). The irrelevant sound effect: Similarity of content or similarity of process?
Dissertation Abstracts International: Section B. The Sciences and Engineering, 67(6-B), 3477.
Schendel, Z. A., & Palmer, C. (2007). Suppression effect on musical and verbal memory. Memory &
Cognition, 35(4), 640–650.
Schön, D., Anton, J. L., Roth, M., & Besson, M. (2002). An fMRI study of music sight-reading. NeuroReport,
13(17), 2285–2289.
Simmons, M. R. (2000). The central executive and working memory: A dual-task investigation of the n-back
task. Gainesville: University of Florida.
Siok, W. T., & Fletcher, P. (2001). The role of phonological awareness and visual-orthographic skills in
Chinese reading acquisition. Developmental Psychology, 37(6), 886–899.
Sloboda, J. (1976). Visual perception of musical notation: Registering pitch symbols in memory. The
Quarterly Journal of Experimental Psychology, 28(1), 1–16.
Stewart, L., Walsh, V., & Frith, U. (2004). Reading music modifies spatial mapping in pianists. Perception
& Psychophysics, 66(2), 183–195.
Stigler, J. W. (1984). ‘Mental abacus’: The effect of abacus training on Chinese children’s mental calcula-
tion. Cognitive Psychology, 16(2), 145–176.
Tong, Y., Francis, A. L., & Gandour, J. T. (2007). Processing dependencies between segmental and
suprasegmental features in Mandarin Chinese. Language and Cognitive Processes, 23(5), 689–708.
Vallar, G. (2006). Memory systems: The case of phonological short-term memory. A festschrift for cogni-
tive neuropsychology. Cognitive Neuropsychology, 23, 135–155.
Wan, C. Y., & Schlaug, G. (2010). Music making as a tool for promoting brain plasticity across the life
span. The Neuroscientist, 16(5), 566–577.
Wechsler, D. (1997). Wechsler Adult Intelligence Scale-Third edition. San Antonio, TX: The Psychological
Corporation.
Weekes, B. S., Chen, M. J., & Gang, Y. W. (1997). Anomia without dyslexia in Chinese. Neurocase, 3,
51–60.
Williamson, V. J., Baddeley, A. D., & Hitch, G. J. (2010). Musicians' and nonmusicians' short-term
memory for verbal and musical sequences: Comparing phonological similarity and pitch proximity.
Memory & Cognition, 38(2), 163–175.
Wong, P. C. M., Skoe, E., Russo, N. M., Dees, T., & Kraus, N. (2007). Musical experience shapes human
brainstem encoding of linguistic pitch patterns. Nature Neuroscience, 10, 420–422.
Zatorre, R. J., Belin, P., & Penhune, V. B. (2002). Structure and function of auditory cortex: Music and
speech. Trends in Cognitive Sciences, 6, 37–46.