
Research Report

Intervention for children with severe speech disorder: A comparison of two approaches
Sharon Crosbie, Alison Holm and Barbara Dodd
School of Education, Communication and Language Sciences, University of Newcastle upon Tyne, Newcastle upon Tyne, UK
(Received 22 September 2004; accepted 17 March 2005)

Abstract
Background: Children with speech disorder are a heterogeneous group (e.g. in terms of severity, types of errors and underlying causal factors). Much research has ignored this heterogeneity, giving rise to contradictory intervention study findings. This situation provides clinical motivation to identify the deficits in the speech-processing chain that underlie different subgroups of developmental speech disorder. Intervention targeting different deficits should result in a differential response to intervention across these subgroups.

Aims: To evaluate the effect of two different types of therapy on speech accuracy and consistency of word production of children with consistent and inconsistent speech disorder.

Methods & Procedures: Eighteen children (aged 4;08–6;05 years) with severe speech disorder participated in an intervention study comparing phonological contrast and core vocabulary therapy. All children received two 8-week blocks of each intervention. Changes in consistency of production and accuracy (per cent consonants correct) were used to measure the effect of each intervention.

Outcomes & Results: All of the children increased their consonant accuracy during intervention. Core vocabulary therapy resulted in greater change in children with inconsistent speech disorder and phonological contrast therapy resulted in greater change in children with consistent speech disorder.

Conclusions: The results provide evidence that treatment targeting the speech-processing deficit underlying a child's speech disorder will result in efficient system-wide change. Differential response to intervention across subgroups provides evidence supporting theoretical perspectives regarding the nature of speech disorders: it reinforces the concept of different underlying deficits resulting in different types of speech disorder.

Keywords: phonological disorder, therapy, inconsistency, phonological contrast, core vocabulary.
International Journal of Language & Communication Disorders
ISSN 1368-2822 print/ISSN 1460-6984 online © 2005 Royal College of Speech & Language Therapists
http://www.tandf.co.uk/journals
DOI: 10.1080/13682820500126049

Address correspondence to: Dr Sharon Crosbie, Perinatal Research Centre, Royal Brisbane and Women's Hospital, Herston, Brisbane, Queensland, 4029, Australia. e-mail: scrosbie@somc.uq.edu.au
INT. J. LANG. COMM. DIS., OCTOBER–DECEMBER 2005, VOL. 40, NO. 4, 467–491

Introduction

Speech–language pathologists (SLPs) have many options when deciding how to treat children with speech disorder. These intervention options include the unit to target (e.g. sound, error patterns, whole word, whole language); target selection (e.g. the specific sounds or pattern to target first); the number of contrasts to target; the approach to delivery of intervention; and service delivery options. The literature contains descriptions of many intervention approaches, reflecting different analysis procedures and theoretical perspectives of speech disorder. SLPs can choose to use or adapt these approaches in their clinical practice.

SLPs aim to implement the most efficient treatment programmes to resolve children's spoken difficulties and prevent later literacy problems. However, it is sometimes difficult to ascertain from the research literature exactly what the programmes involve and how to implement them. It can be difficult to ascertain what the intervention aims to change and which children will receive most benefit from its use. Determining the subtle or not-so-subtle ways in which the intervention differs from other programmes can be problematic. The plethora of conflicting results reported in the literature also makes it difficult for SLPs to determine the evidence-base on which to select their intervention options. Many intervention approaches report mixed success (e.g. Forrest and Elbert 2001) and contradictory findings are common (e.g. Gierut et al. 1996, Rvachew and Nowak 2001). These findings reflect the complexity of the population.

Few research studies have compared the efficacy and efficiency of different specific intervention approaches. Most recent research has examined the effect of manipulating one variable within a given parameter rather than attempting to determine the most effective approach. Intervention programmes differ within three broad parameters: the target selected; the approach selected; and the implementation structure selected.

Target selection

One parameter in which interventions differ, target selection, has received significant research attention. Intervention targets are usually selected on superordinate properties (e.g. markedness or implicational relationships, productive phonological knowledge, complexity) of a sound system (Gierut et al. 1996) or the function of sounds within a child's system (Williams 2000). Gierut et al. (1996) compared the effect of targeting early versus later developing sounds in two groups of children. Their results indicated that both targets resulted in phonological change; however, greater system-wide change occurred when the targets were later-developing sounds. Rvachew and Nowak (2001) provided counterevidence.
Their group study of 48 children with moderate or severe speech delay showed greater local generalization for early developing rather than later developing targets.

Stimulability is another target selection variable examined in the literature. Miccio and colleagues reported that stimulable sounds experience change without direct intervention (Powell and Miccio 1996, Miccio et al. 1999). Miccio (2002) suggested that if a sound is stimulable it is being acquired naturally and may not require intervention. In addition, stimulable sounds may be added to the phonetic inventory even when not chosen specifically as therapy targets (Powell and Miccio 1996).

In contrast, Rvachew and Nowak (2001) found differences in the rate of treatment progress and generalization when they targeted stimulable and non-stimulable sounds. Children who received therapy for early developing phonemes of which they had productive phonological knowledge made more progress than children who received therapy for late developing phonemes of which they had little productive knowledge. Generalization occurred to untreated stimulable phonemes but not untreated unstimulable phonemes. Rvachew and Nowak questioned the efficacy of treatment response when targeting non-stimulable sounds: 'Unless the treatment of unstimulable phonemes boosts the rate of progress for stimulable phonemes beyond that due to maturation, it is difficult to see how the selective treatment of unstimulable phonemes could be the most efficient procedure' (p. 621).

Error consistency is a third target selection variable evaluated in the literature. The consistency of articulation error substitutes and the effect of this (in)consistency on intervention outcomes have been examined with differing findings. Forrest et al. (1997, 2000) and Forrest and Elbert (2001) investigated children with articulation disorders and divided them into children with consistent sound substitutes (same substitute for the omitted sound in all instances) and those with variable substitutes (substitute varied both within and across word positions). Using traditional articulation therapy techniques, they found that children with consistent error substitutes were able to learn and generalize intervention targets effectively. The children with inconsistent substitutes did not benefit from the intervention. These findings have been taken as evidence that it is most effective to target consistent error patterns using traditional techniques.

Tyler et al. (2003) examined the predictive value of error consistency to change in accuracy following intervention. In contrast to Forrest et al., they found that a highly inconsistent system (measured by the total number of different sound substitutions/omissions made across word positions) was more likely to change than a consistent system. However, this study involved very different intervention techniques (morphosyntactic) to those used by Forrest and Elbert, which might account for the contradictory findings.

Methods of phonological contrast intervention

The second parameter in which interventions differ involves the decision of how to target the selected target. A range of phonological intervention methods have been developed and described. Five currently used methods follow (Baker and McLeod 2004):

• Minimal pairs: the approach contrasts the child's error with the target sound using minimal pairs of words (e.g. Ferrier and Davis 1973, Blache and Parsons 1980, Weiner 1981, Gierut 1991). Two words that differ by one sound only form a minimal pair. A set of minimal pairs focuses on the contrast being targeted (e.g. f–b: fun–bun, fin–bin, fill–bill, fit–bit). The minimal pair method is often implemented when error pattern analysis has been used and clear patterns are evident. It is considered a conceptual form of sound teaching and is frequently used in the treatment of phonological disorders stemming from cognitive or linguistic difficulties (Gierut 1998, p. S89). The minimal pair method has been used across different frameworks including phonological process analysis, distinctive feature analysis and generative analysis. It assumes that there are patterns (e.g. stopping all fricatives) that are the basis for the child's error and sound organization.

• Maximal oppositions: Gierut (1990) described a variation on the minimal pair method. Instead of contrasting the target sound with the child's error, the intervention uses an independent comparison sound. The contrast to the target needs to be a sound that the child can produce correctly and one that is maximally different to the target sound. Gierut claimed that targeting maximal oppositions is more effective than minimal pairs.

• Empty set: Gierut (1991) developed another method of intervention: a variation on maximal oppositions known as contrasts within an empty set. This method involves single contrastive pairings of two target sounds. The target sounds are unknown, independent, and maximally different from each other.

• Multiple oppositions: the intervention method targets more than a single contrast pair (Williams 2000, 2003). This method involves multiple contrastive pairings of the child's error with several target sounds. It uses the child's functional system as the basis for target selection. It is based on the system as a whole rather than on phonological error patterns that describe components of the systems (e.g. [t] for /k/ is fronting, [t] for /s/ is stopping).

• Metaphon: Dean et al. (1995) described Metaphon, another intervention method. It is based on contrasting speech sounds and properties. However, unlike other contrast methods, Metaphon aims to increase metalinguistic awareness. It emphasizes similarities and differences in sounds, recognizing, matching and classifying sounds according to their features.

Intervention structure

The third parameter in which interventions differ is the structure of the intervention. After choosing an approach to reorganize the child's speech system the clinician must consider how to implement the approach (the structure of the treatment). Again, the clinician is faced with choices. Fey (1986) described two treatment structures: vertical versus horizontal presentation. A vertical structure chooses a single target (sound or pattern) and works with this target to a set criterion of mastery. An alternative structure of intervention is a horizontal approach. The horizontal approach teaches several targets (sounds or patterns) simultaneously for a predetermined period of time. A third approach incorporates elements of the vertical and the horizontal structure. This is the cyclical approach (Hodson and Paden 1991). In a cyclical approach, the clinician selects several targets that they change at weekly intervals (e.g. targeting stopping 1 week, cluster reduction the next, gliding the next). The targets are then cycled (e.g. stopping targeted again in the fourth week). The approaches differ in two main ways: the number of targets selected for treatment, and the criterion used for progression (i.e. performance versus time based).

Williams (2000) is one of the few researchers who have examined the effect of models and structures of intervention on outcome.
Williams examined word versus naturalistic speech intelligibility models of intervention and horizontal, vertical and cyclical structures of intervention in ten longitudinal case studies of children with moderate to profound phonological impairments. All children in the study progressed through the models of intervention so, initially, they experienced a high degree of focus on a target (e.g. vertical intervention structure with a word level model). This changed to a low degree of focus to facilitate generalization (e.g. combined structure at a conversational level model). Williams (2000, p. 27) suggested that one treatment model or structure may not fit all children or may not fit a child throughout the course of intervention. Models and structures may need to change as the child's needs change.

The question of efficacy and efficiency of intervention is under-examined in the literature (e.g. comparison of rate of progress between groups of children using different target selection criteria measured in clinical sessions and weeks/months involved). For example, it might be possible to show that selecting a later developing, non-stimulable, consistently in error target sound results in acquisition of the target sound with spontaneous generalization to a number of sounds not targeted directly. However, it is also necessary to show that this process is more efficient (i.e. takes less time) than directly targeting each of those sounds in a developmental order.

Few studies directly compare different intervention approaches for children with speech disorder. Hesketh et al. (2000) compared the effects of metaphonological and articulation-based therapy on the phonological ability of 61 children with developmental phonological disorders. The children were allocated to a treatment approach and received ten weekly sessions of individual therapy. Children receiving metaphonological therapy completed general phonological awareness (PA) tasks and specific PA tasks involving their target error pattern. Children in the articulation group practised the production of problematic phonemes. Hesketh et al. found that both groups significantly improved on phonological awareness and output measures but with no effect of therapy type. They found no evidence that working on metaphonological skills was necessary for children with phonological disorder.

Dodd and Gillon (2001) criticized Hesketh et al.'s (2000) study, suggesting that their results could reflect the heterogeneity of the participants and the content of the metaphonological therapy. Children with speech disorder are a heterogeneous group in terms of severity (number of errors), type of errors (surface speech pattern), underlying causal factors, and maintenance factors. Many research studies, however, continue to ignore heterogeneity, giving rise to contradictory findings.

Subgroups of children with speech disorder

Different deficits in the speech-processing chain underlie the subgroups of developmental speech disorder (Dodd and McCormack 1995). Research that considered the subgroups of children with speech disorder found that children respond differently to therapy approaches that target different aspects of the speech-processing chain (Alcorn et al. 1995, Holm et al. 1997, Dodd and Bradford 2000). Previous research indicates four subgroups of children with speech disorder (Brierly 1987, Bradford and Dodd 1994, Dodd and McCormack 1995). A psycholinguistic perspective has allowed the testing of hypotheses regarding the factor/s or deficit/s underlying the different types of disorder.
The level of breakdown in the speech-processing chain for each of four subgroups has been identified:

• Articulation impairment: inability to produce a perceptually acceptable version of particular phonemes, either in isolation or in any phonetic context. Children may consistently produce a specific distortion (e.g. lateral lisp) or substitute another phoneme (e.g. [w] for /r/) (Grundy 1989). Articulation errors are due to a peripheral problem where the wrong motor programme for the production of specific speech sounds has been learned (Fey 1992).

• Delayed phonological skills: speech characterized by the use of regular error patterns that occur in normal development but at a chronological age when the patterns should not be evident. Little is known about the cause of phonological delay. Children with phonological delay have not been found to have a specific deficit (Dodd and McCormack 1995). However, studies of the natural history of delay suggest that some delayed children remain delayed, others achieve age-appropriate speech, and some typically developing children become delayed (Dodd et al. 2000).

• Consistent deviant disorder: systematic use of atypical (non-developmental) phonological patterns (e.g. deleting all syllable initial consonants) (Leonard 1985, Ingram 1989). An impaired ability to abstract and/or organize knowledge about the nature of the phonological system causes these errors (Dodd and McCormack 1995). For example, Brierly (1987) found that children with consistent deviant phonology performed more poorly than other speech impaired children on tasks of phonological awareness, such as recognition of alliteration and rhyme. These children have poor understanding of the phonemic rules of the language when assessed on a legality awareness task (Dodd et al. 1989). This cognitive deficit arises at the internal organizational level of the speech-processing chain (Grundy 1989).

• Inconsistent speech disorder: speech characterized by variable productions of the same lexical items or phonological features not only from context to context, but also within the same context. Inconsistency characterized by multiple error types (unpredictable variation between a relatively large number of phones) suggests the lack of a stable phonological system because of a deficit in phonological planning. Phonological planning refers to the process of phoneme selection and sequencing. Dodd and McCormack (1995) argued that children with speech characterized by inconsistency generate under-specified or degraded phonological plans for word production. This leads to phonetic programmes with articulatory parameters that are too broad, leading to additional phonetic variability even when the correct phoneme is selected.

Inconsistent speech disorder is distinct from childhood apraxia of speech (CAS) although inconsistency characterizes both disorders (Ozanne 1995). Children with CAS are unlike children with inconsistent disorder in a number of important ways: (1) they are worse in imitation than in spontaneous production (whereas children with inconsistent disorder are better in imitation than in spontaneous production); (2) cues to elicit production of words differ; and (3) they have oro-motor difficulties.

Broomfield and Dodd (2004) report the following prevalence rates for the four subgroups: 12.5% articulation impairment, 57.5% delayed phonological skills, 20.6% consistent deviant phonological disorder and 9.4% inconsistent phonological disorder.
Researchers broadly agree on the prevalence rates cross-linguistically for Dodd's classification of functional speech disorder subgroups, with the subgroups identified in Cantonese (So and Dodd 1994), Putonghua (Zhu and Dodd 2000), German (Fox and Dodd 2001), Turkish (Topbas and Konrot 1996) and Spanish (Goldstein 1995). The cross-linguistic similarities of the types of speech disorders suggest that the deficits underlying disorder are independent of the phonological system per se. The surface speech errors reflect underlying deficit/s in the speech-processing mechanism regardless of the phonological system of the language being acquired.

Most previous research on intervention efficacy has focused on a single heterogeneous group of children with speech disorder rather than comparing the relative effects of differing approaches with different children. Dodd and Bradford (2000) compared three therapy approaches for children with different types of phonological disorder. They presented three detailed case studies: two children with inconsistent speech disorder and a child with consistent speech disorder (speech output characterized by consistent use of developmental and non-developmental error patterns). The study trialled three therapies with each child: phonological contrast, core vocabulary, and PROMPT (Hayden 1999). The results indicated that children who were making inconsistent errors received the most benefit from core vocabulary that focused on consistency of whole-word production. One child with inconsistent speech disorder also benefited from phonological contrast therapy after consistency had been established. The child with consistent speech disorder received the most benefit from phonologically based intervention.

Dodd and Bradford's results provide evidence that aspects of a child's speech system (phonetic, phonological) may respond to different types of therapy approaches that target different aspects of the speech-processing system. The results of their study also suggested that the sequence of therapy might be important. For example, a child with inconsistent speech disorder may benefit from phonological contrast therapy after consistency has been established. The study described in this paper compares the effect of two therapy approaches with two subgroups of children with speech disorder. The interventions differed in terms of the underlying deficit targeted and the speech unit targeted.

Phonological contrast therapy targeting a cognitive–linguistic deficit

Phonological contrast approaches target speech error patterns. The aim of therapy is to reorganize a child's linguistic system. Most phonological intervention approaches rely on a communicative need for phonological reorganization. For example, words are contrasted to confront the child's system with communicative breakdown ('I don't know whether you mean sun, fun or bun because they all sound like bun to me'). Intervention, therefore, aims to develop these meaningful contrasts of words. The clinician shows the child that phonemes contrast a difference in meaning (key–tea, shoe–two) and that these contrasts need to be made to avoid misunderstanding. This process requires recognition of similarities and differences of sounds and how these mark differences in meaning. The process allows the child to organize sounds into classes and sequences into structures. Active participation in this process results in new hypotheses and patterns (Grunwell 1997).

The resulting reorganization should be evident in the pattern of generalization. The approach predicts that the target contrast will generalize to other sounds affected by the child's error pattern (e.g. f–b will generalize to other fricatives affected by stopping). Alternatively, a range of contrasts within an error pattern can be targeted simultaneously (e.g. a child who stops all fricatives might be given pairs including: sun–bun, shin–pin, shoe–two, thick–tick). Intervention should aim to facilitate within and across class generalization, not just local generalization (Gierut 2001).

Core vocabulary therapy targeting phonological planning

Inconsistency across words and within the same linguistic context indicates a pervasive speech-processing difficulty (Grunwell 1981, Forrest et al. 1997, 2000, Williams and Stackhouse 2000). Children with inconsistent speech disorder are resistant to phonological contrast or traditional therapy (Forrest et al. 1997, 2000). Intervention target selection is difficult as a child with inconsistent speech disorder may use a range of sound substitutions that differ in manner of production, place of production, or voicing. Taking an articulatory approach that targets a single sound is ineffective when a child with adequate oromotor control sometimes produces the target accurately, or is stimulable for the sound.

The core vocabulary approach effectively improves consistency of word production (Holm and Dodd 1999, Dodd and Bradford 2000). Core vocabulary therapy does not target surface error patterns or specific sound features; it targets whole word production. Learning to say a set of high frequency, functional words consistently targets the underlying deficit in phonological planning. Providing detailed specific information about a limited number of words and drilling the use of that information with continued systematic practice improves the ability to create a phonological plan on-line.

Aims of the investigation

Few studies have examined how children with different speech disorders respond to interventions that target different underlying speech-processing deficits. The primary aim of this study was to investigate the effect of two treatment approaches on the consistency of word production and speech accuracy of children with either an inconsistent speech disorder or a consistent disorder. It was hypothesized that children with inconsistent speech disorder would best respond to core vocabulary therapy that targets the ability to form phonological plans (templates) on-line. Children with consistent non-developmental speech error patterns were hypothesized to respond best to phonological contrast therapy targeting the reorganization of cognitive–linguistic information. A third hypothesis was that children with inconsistent speech disorder who received core vocabulary therapy first would benefit more from the phonological contrast therapy than the inconsistent children receiving the phonological contrast therapy first. A final hypothesis was that intervention targeting the contrastive use of phonemes would be more effective for children with inconsistent speech disorder once consistency was established.

Methods

Participants

Speech–language pathologists from Education Queensland (Australian state government service provider) were asked to refer children aged between 4;6 and 7 years with moderate to severe phonological disorder. Twenty children were recruited who met the following inclusion criteria:

• Severity: standard score of 3 on the per cent consonants correct (PCC) measure of the Phonology Assessment (DEAP: Diagnostic Evaluation of Articulation and Phonology [standard score mean of 10, normal range of 7–13], Dodd et al. 2002).

• Subgroup classification: to be included in this study, children were required to have either an inconsistent speech disorder or a consistent speech disorder. Children were considered to have an inconsistent speech disorder if they had a score of 40% or more on the Inconsistency Assessment. Children were considered to have a consistent speech disorder if they scored below 40% on the Inconsistency Assessment and used at least two atypical error patterns on the Phonology Assessment (cf. Dodd et al. 2002).

• Oromotor structure and skills: no structural problems apparent on oral examination. Within the normal range on one or more of the components of the oromotor assessment of the DEAP examining isolated movements of the lips and tongue, sequenced volitional movements and diadochokinetic skill (mean of three standard scores above 6).

• Receptive language: within the normal range on the Clinical Evaluation of Language Fundamentals Preschool (Wiig et al. 1992).

• Non-verbal skills: within the normal range on the Visual-Motor Integration Assessment (Beery and Buktenica 1997).

• Hearing: normal hearing as shown by the child's last hearing test.

• Language background: monolingual speaker of English.
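
The subgroup classification criterion above amounts to a two-threshold decision rule. The following is a minimal sketch in Python, assuming the inconsistency score and the count of atypical error patterns have already been obtained from the assessment. The function name, parameter names and return values are hypothetical and are not part of the DEAP; the sketch ignores the other inclusion criteria (severity, oromotor, language, hearing).

```python
from typing import Optional

INCONSISTENCY_CUTOFF = 40.0  # per cent, as stated in the inclusion criteria
MIN_ATYPICAL_PATTERNS = 2    # atypical error patterns needed for 'consistent'


def classify_subgroup(inconsistency_pct: float, atypical_patterns: int) -> Optional[str]:
    """Return the study subgroup, or None if neither criterion is met.

    Hypothetical helper: it encodes only the two thresholds quoted in the
    subgroup classification criterion.
    """
    if inconsistency_pct >= INCONSISTENCY_CUTOFF:
        return "inconsistent"
    if atypical_patterns >= MIN_ATYPICAL_PATTERNS:
        return "consistent"
    return None


print(classify_subgroup(56, 0))  # inconsistent
print(classify_subgroup(36, 2))  # consistent
print(classify_subgroup(36, 1))  # None
```

A score of exactly 40% falls in the inconsistent subgroup under this rule, because the criterion reads '40% or more'.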

Two children withdrew from the study for reasons unrelated to the research. The results of 18 children are presented here. The group included 11 boys and seven girls, ranging in age from 4;08 to 6;05 years, with a mean age of 5;02 years. Table 1 reports the details of the children included in the study.

Table 1. Participant details and pre-intervention assessment data

Child  CA (months)  Gender  Rec Lang SS  VMI SS  Oromotor mean SS†  Initial PCC  Initial inconsistency (%)  Subgroup      Phonetic inventory*
1      77           M       112          111     8                  43           56                         inconsistent  all sounds
2      57           F       120          112     10                 62           60                         inconsistent  all sounds
3      60           M       108          79      7.5                51           40                         inconsistent  all sounds except /g, z/
4      60           M       120          90      8.7                51           60                         inconsistent  all sounds
5      67           F       110          112     9                  46           46                         inconsistent  all sounds except /k, g/
6      66           M       91           87      6.7                25           44                         inconsistent  all sounds except /t, d, s, z/
7      56           F       102          107     10                 50           56                         inconsistent  all sounds
8      57           M       79           79      7                  34           56                         inconsistent  all sounds
9      60           M       110          124     10                 53           40                         inconsistent  all sounds except /ts/
10     67           M       85           110     10                 31           56                         inconsistent  all sounds except /s, z/
11     60           M       118          102     6                  55           36                         consistent    all sounds
12     65           M       110          107     7.7                48           36                         consistent    all sounds
13     67           M       110          82      7.5                76           36                         consistent    all sounds except /k, g/
14     58           M       96           96      9                  45           12                         consistent    all sounds
15     60           F       100          102     10                 32           32                         consistent    all sounds except /ts, dz/
16     62           F       102          90      9.3                36           28                         consistent    all sounds except /ts, dz/
17     59           F       104          85      10                 56           24                         consistent    all sounds except /s, t, g/
18     62           F       108          116     10                 49           24                         consistent    all sounds

†Mean of the three oromotor SS from DEAP; *as appropriate for chronological age (Dodd et al., 2003).

Pre-treatment assessment: differential diagnosis of speech disorder

An experienced paediatric speech–language pathologist assessed each child in a quiet room at their school or preschool. Parents were invited to be present at the assessment. Each child's speech, oromotor and receptive language skills were assessed to allow for differential diagnosis of their speech disorder. The Articulation, Inconsistency and Phonology Assessments of the Diagnostic Evaluation of Articulation and Phonology (DEAP; Dodd et al. 2002) were used to measure speech skills. The DEAP provided standard scores with a mean of 10 and normal range of 7–13 for each assessment. The assessing SLP made on-line transcriptions of the speech data. All productions were recorded using a Marantz CP130. All on-line transcriptions were checked against the audio recording following the assessment to ensure accuracy.

The Articulation Assessment established the child's phonetic inventory by examining the child's ability to produce phonemes in words or in isolation. Thirty tokens (mostly CVC) were elicited in a picture-naming task. All consonant sounds (except /ʒ/) were sampled in syllable-initial and -final positions. If a child failed to produce a sound in the picture-naming task, the examiner asked the child to imitate the sound in an open syllable or in isolation.

The Phonology Assessment data was used to examine phonological ability by identifying and classifying error patterns in a child's speech. The assessment consisted of two parts: picture naming, eliciting 50 tokens covering all consonants in syllable-initial and -final position; and picture description, eliciting 14 tokens from the naming task in a connected speech context. The PCC was calculated from the phonology data in accordance with the assessment manual instructions. Consistent speech error patterns (five examples of error pattern) were identified and classified according to the assessment manual as typical or atypical of normal development. Children identified as having at least two non-developmental patterns were categorized as having consistent speech disorder.

The Inconsistency Assessment was administered to establish consistency of word production. Each child named a set of 25 pictures three times within the assessment session. Each trial was separated by an activity or different speech task. The three realizations of the same lexical item from the same context were compared to calculate an inconsistency score. For example, if the child produced ten items differently across the three trials they would obtain a score of 40%. Four categories of response were differentiated (Grunwell 1992): three productions of the same lexical item correct and consistent; three productions consistent but incorrect (e.g. zebra: [debwe], [debwe], [debwe]); variation between correct and incorrect realizations (e.g. zebra: [zebre], [debre], [zebre]); and all three productions incorrect and inconsistent (e.g. zebra: [debwe], [vebe], [zebe]). Children who produced at least 40% of words variably were considered to have an inconsistent disorder.

Ten of the 18 children in the study had an inconsistent speech disorder (seven boys and three girls). Eight children had a consistent speech disorder (four boys and four girls). The two groups were comparable for age and severity of speech impairment. An analysis of variance confirmed no significant differences between the inconsistent and consistent subgroups in chronological age (F(1,17) = 0.18, p = 0.68) or PCC (F(1,17) = 0.73, p = 0.41). There was a significant difference between the two groups on the inconsistency score at initial assessment (F(1,17) = 34.84, p < 0.001).
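
The inconsistency score and the group comparison described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' analysis script: the function names are hypothetical, an item is counted as variable when its three transcriptions are not all identical, and the F statistic is a plain one-way analysis of variance computed on the initial inconsistency scores listed in Table 1.

```python
from statistics import mean


def inconsistency_score(trials):
    """Percentage of items whose three productions are not all identical.

    `trials` holds one (trial1, trial2, trial3) tuple of transcriptions
    per picture (25 pictures in the assessment).
    """
    variable = sum(1 for productions in trials if len(set(productions)) > 1)
    return 100.0 * variable / len(trials)


def one_way_f(*groups):
    """F statistic of a one-way ANOVA (between-group MS / within-group MS)."""
    scores = [x for g in groups for x in g]
    grand = mean(scores)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1           # k - 1
    df_within = len(scores) - len(groups)  # N - k
    return (ss_between / df_between) / (ss_within / df_within)


# Ten variable items out of 25 give the 40% example quoted in the text.
sample = [("debwe", "vebe", "zebe")] * 10 + [("debwe", "debwe", "debwe")] * 15
print(inconsistency_score(sample))  # 40.0

# Initial inconsistency scores (%) copied from Table 1.
inconsistent = [56, 60, 40, 60, 46, 44, 56, 56, 40, 56]
consistent = [36, 36, 36, 12, 32, 28, 24, 24]
print(round(one_way_f(inconsistent, consistent), 2))  # 34.84
```

With the Table 1 values the sketch gives an F of about 34.84 for the inconsistency score, which matches the reported group difference. It does not compute a p value, which would need the F distribution from a statistics library.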
Table 1 presents the pre-intervention assessment data for the two subgroups of children with speech disorder.

Reliability

Inter-rater reliability measures were taken for the phonemic transcriptions, the inconsistency score and the child's differential diagnosis (i.e. inconsistent speech disorder, consistent speech disorder).

Phonemic transcriptions

Broad transcriptions (phonemic) were made on-line during assessment sessions. The assessors checked their own on-line transcription with reference to the audio-recording following the assessment. To determine inter-judge reliability, an independent experienced SLP re-transcribed ten of the children's assessment transcriptions (phonology and inconsistency assessments) from the audio-recordings (equivalent to 11% of all assessment data). Point-to-point reliability was calculated based on each judge's transcription of each phoneme. Identical segmental transcriptions (excluding diacritics) were coded as agreements. The overall mean for broad transcription agreement was 93.7%, range 87.4–98.2%. The original assessor's transcription was used for all analyses.

Inconsistency score

Each assessor determined an inconsistency score for each child's transcription. The samples re-transcribed to examine transcription reliability were also used to examine the reliability of the Inconsistency Scores. The reliability transcriber calculated an Inconsistency Score for each of the transcribed samples. Point-to-point reliability was calculated based on each judge's score for each of the 25 items (inconsistent versus consistent production). Identical scores were coded as agreements. The overall mean for inconsistency item agreement was 94.7%, range

84–100%.

Diagnosis of speech disorder

Each assessor provided a diagnosis of speech disorder for each child based on all of the data collected at the initial assessment. Identical diagnoses were coded as agreements. The overall agreement on the differential diagnosis was 100%.

Consistency of intervention approach

Three measures were undertaken to ensure an appropriate consistency of approach across the two clinicians: (1) target and goal selections were planned jointly; (2) the same resources, when applicable, were used; and (3) videotapes of sessions conducted by each clinician were shared. To ensure that there were no differences in intervention outcomes for each clinician, a two-factor analysis of variance with repeated measures (difference scores × clinician) was calculated. There was a significant effect of difference scores (F(1,3) = 4.18, p = 0.01) but not clinician (F(1,1) = 0.11, p = 0.75). Results showed no significant interaction of difference scores and clinician (F(1,1) = 0.03, p = 0.87).

Baseline data

To establish the stability of the children's phonological systems, baseline data was collected before intervention. The initial speech assessment was repeated with a 3-week interval. A paired samples t-test compared the measures and revealed no significant change (t = 0.11, d.f. = 17, p = 0.92). The Pearson correlation coefficient, r = 0.82 (p < 0.001), confirms the high agreement between the two measures. The children's phonological systems were considered to be stable before intervention.

Project design

A multiple baseline design with alternating treatments was used. Once eligibility was confirmed, children were allocated to one of the two therapies by order of referral. Treatment 1 was implemented after the baseline period followed by a 4-week withdrawal period, followed by treatment 2.
The method of allocation to treatment ensured children in both subgroups of speech disorder received the blocks of therapy in both possible orders (core vocabulary followed by phonological contrast; phonological contrast followed by core vocabulary). Each child participated in 16 (30-minute) individual therapy sessions in each 8–9-week treatment block. Two experienced paediatric speech language pathologists (the first two authors) administered the intervention. All children received both intervention blocks from the same SLP. In most cases one intervention session each week was provided at home and one session at school to allow the SLP to liaise with both parents and teachers. Parents were asked to complete daily practice activities at home during the treatment blocks. There was no revision during the withdrawal periods. The Phonology and Inconsistency assessments from the DEAP were elicited at the end of each treatment block and again 8 weeks after the final assessment. Two treatment approaches were provided to each participant:

(1) Phonological contrast therapy (targeting error patterns): error patterns were identified from analysis of the phonological assessment data. An error pattern was selected for intervention according to the following criteria: targeting non-developmental patterns before developmental; consistency and frequency of the use of the error pattern; effect on intelligibility of successful remediation; and stimulability of the speech sounds required. Children with highly inconsistent speech rarely have clearly identifiable error patterns. This makes intervention target selection very difficult. A child with

inconsistent speech disorder may use a range of sound substitutions that differ in manner of production, place of production or voicing. For example, one of the children with inconsistent speech in this study marked /s/ with [b, f, v, t, d, s] or deleted the sound. It is difficult to select the appropriate error to contrast given the range of substitutions and lack of identifiable patterns (i.e. there were no identifiable patterns to the substitutions in terms of word position, surrounding phonemes etc. and the inconsistency was occurring in the same lexical item in the same linguistic context so could not be attributable to factors such as differences in stress or prosody). The children with inconsistent speech who received phonological contrast therapy first therefore received therapy generally targeting structural error patterns (e.g. final consonant deletion, cluster reduction) evident in their speech.

Each error pattern was targeted in four stages: auditory discrimination; production in single words; production in phrases (set and then spontaneous); and production in sentences within conversation. A 90% accuracy-training criterion (based on the final 20 productions of target items elicited in the session) was required to move from word to phrase to sentence stage. When an error pattern moved to phrase stage a new error pattern was introduced. Ten non-treated probe words were elicited at the end of every second session to monitor generalization (three times throughout treatment block).

A minimal pair approach (sometimes with multiple oppositions) was used to reorganize the child's phonological system. The homonymy in the child's system was directly exposed to show the children that they were failing to contrast meaning adequately, that is, the comparison sound to the target was the child's error. The minimal pairs were selected to target specific error patterns.
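The 90% accuracy-training criterion that governed movement between production stages can be sketched in Python as follows. This is a minimal illustration, assuming each production in a session is recorded as correct or incorrect, and assuming (this is not stated in the text) that a session with fewer than 20 productions cannot meet the criterion. The function and constant names are hypothetical.

```python
from typing import Sequence

WINDOW = 20         # final productions considered (stated in the text)
CRITERION_PCT = 90  # accuracy criterion in per cent (stated in the text)
STAGES = ["word", "phrase", "sentence"]  # production stages gated by the criterion


def meets_criterion(outcomes: Sequence[bool]) -> bool:
    """True if at least 90% of the final 20 productions were correct.

    Assumption: fewer than 20 productions in the session means the
    criterion is not met.
    """
    if len(outcomes) < WINDOW:
        return False
    recent = outcomes[-WINDOW:]
    # Integer comparison avoids floating-point rounding at exactly 90%.
    return sum(recent) * 100 >= CRITERION_PCT * WINDOW


def next_stage(current: str, outcomes: Sequence[bool]) -> str:
    """Advance one production stage when the criterion is met, else stay."""
    idx = STAGES.index(current)
    if meets_criterion(outcomes) and idx < len(STAGES) - 1:
        return STAGES[idx + 1]
    return current
```

Only the last 20 productions are inspected, so early errors in a session do not prevent progression once the child's accuracy has stabilized.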
A multiple oppositions approach was used where possible. Pairs of words were included simultaneously targeting a range of sounds affected by the error pattern (e.g. final consonant deletion: bee–beep–beak–bead–beef–bees–beam–beach–bean–beat; backing: tea–key, tar–car, dough–go, die–guy). The first stage of the treatment was auditory discrimination. This process was also important to ensure that the stimuli words were familiar and recognizable from the pictures being used. The child was required to discriminate accurately (e.g. sort into words with a final sound versus words without a final sound) and recognize each pair of words. The child was then required to start producing the minimal pairs, initially in imitation, and then spontaneously. Feedback was given regarding the pattern being targeted: for example, the presence of a final sound ('bee', no; 'beep', yes), what the final sound was (e.g. 'beep has a /p/ on the end: bee-p') and whether or not the child had used the sound appropriately (e.g. 'I didn't hear a /p/ on the end when you said beep; it sounded like bee to me'). Similar linguistic and communicative feedback was given throughout each stage of intervention and for each error pattern targeted. The meaning or communicative basis for the contrast was maximized throughout intervention. Activities were planned that resulted in communicative breakdown if the child did not use the correct form.

(2) Core vocabulary therapy (targeting consistency of word production): a modified core vocabulary approach to that described previously in the literature was implemented (Dodd and Iacono 1989, Holm and Dodd 1999, Dodd and Bradford 2000). The complete intervention programme (e.g. therapy activities, information provided to parents/teachers) used in the current study is detailed in Dodd et al. (2004). The child, parents and teacher selected a list of 50 words that were functionally

powerful for the child. The types of words commonly included on the children's lists were people's names (e.g. family, teacher, friends), pet names, places (e.g. home street, school, toilet, shops), function words (e.g. please, sorry, thank you), foods (e.g. weetbix, cornflakes, toast, water, chips, drink) and the child's favourite things (e.g. Simpsons, Polly Pocket, teddy, games). The words were not selected according to word shape or segments. They were chosen because the child frequently used the words in their functional communication. The child's increasingly intelligible use of the words selected motivated the use of consistent productions.

Each week, ten words were randomly selected from the set of 50 target words. The clinician established the child's best production of each target word. The child's best production was achieved by teaching the word sound-by-sound, using cues such as syllable segmentation, imitation and cued articulation as outlined in Passy (1990). For example, to teach 'Joseph', the clinician might say: 'Joseph has two syllables, [dzoo] and [sef]. The first syllable [dzoo] has two sounds, /dz/ and /oo/, and the second syllable [sef] has three sounds, /s/, /e/ and /f/. You try it: [dzoo]' (child's imitation, SLP's feedback, child tries again). 'Now [sef]' (response, feedback, try again). 'Now put it together: [dzoo-sef].' For some children, a highly effective technique is to link sounds to letters. Usually, children with inconsistent speech disorder are able to imitate all (or most) sounds. If it is not possible to elicit a correct production then the best production may include developmental errors (e.g. [doosef] for Joseph, [taemre] for camera). After the best production was established, the child was required to produce those ten words in the same way throughout the week. The parents and teacher practised the words daily with the child, and reinforced productions of those words in everyday communication situations.
The SLP emphasized to parents and other people involved with the child (i.e. teacher, child care worker) that the primary target of the intervention was to make sure the child said the ten words exactly the same way each time they attempted to say them, not the achievement of error-free productions. The ten target words were revised in games and activities during the second weekly session with the SLP. During the core vocabulary therapy, it was considered important to be explicit about the purpose of therapy, the nature of the errors made, and how they could be corrected. If the child produced a target that deviated from the best production, the clinician imitated the production and explicitly explained that the word differed and how it differed. For example, if the child's target word was 'sun' and he produced [gn], the clinician would say: '[gn], that's different to how I say it. That had a [g] sound at the start but you need to make it a [s], [sn].' The SLP avoided asking the child to imitate the target word since imitation provides a phonological plan that inconsistent children can use without having to generate their own plan for the word. Instead, the SLP provided information about the plan.
At the end of the second weekly session, the child was asked to produce the ten words three times. Any words they produced consistently were removed from the list of 50 words. Inconsistently produced words remained on the list from which the next week's ten words were chosen randomly. Once a fortnight a set of ten untreated probe words of two or more syllables (e.g. giraffe, elephant) was elicited three times to monitor generalization.

Results

The effects of core vocabulary and phonological contrast therapy were compared for children with inconsistent and consistent speech disorder on two outcome measures: inconsistency of word production and speech accuracy (PCC calculated from the phonology assessment). Difference scores were calculated for each child on the two outcome measures following each type of therapy. Thus, each child had four scores:

- Difference in PCC following core vocabulary therapy.
- Difference in PCC following phonological contrast therapy.
- Difference in inconsistency score following core vocabulary therapy.
- Difference in inconsistency score following phonological contrast therapy.

Analysis of variance with repeated measures compared the outcome measures (within-subjects factor of therapy: core vocabulary versus phonological contrast; between-subjects factor of subgroup: inconsistent versus consistent speech disorder). No evidence was found against the claim that the distribution for the difference scores was normal. A Kolmogorov–Smirnov test for goodness-of-fit was non-significant for each of the difference scores (p > 0.05): difference in PCC following core vocabulary therapy, Z = 0.69, p = 0.72; difference in PCC following phonological contrast therapy, Z = 0.79, p = 0.56; difference in inconsistency score following core vocabulary therapy, Z = 0.50, p = 0.96; difference in inconsistency score following phonological contrast therapy, Z = 0.94, p = 0.34.

Effect of therapy on consistency of word production

Table 2 shows the PCC and inconsistency scores for each child across the three main assessments in the study. Table 3 shows the mean (SD) difference in inconsistency scores following each type of therapy by subgroup of speech disorder. An ANOVA with repeated measures compared the amount of change on the inconsistency measure (difference between initial assessment and following treatment) made by the two subgroups of children with speech disorder (inconsistent and consistent) during the two types of therapy. The results showed a significant effect of therapy (F(1,17) = 5.62, p < 0.05) and group (F(1,17) = 5.77, p < 0.05). The interaction between the type of therapy and subgroup of speech disorder was also significant (F(1,17) = 13.79, p < 0.005). Core vocabulary resulted in greater change to consistency than phonological contrast therapy. As predicted, children with inconsistent speech disorder changed more than children with consistent speech disorder.
The interaction was examined by plotting each subgroup's mean difference on the inconsistency measure following core vocabulary and phonological contrast therapy (figure 1). The consistency of the children with inconsistent speech increased most through core vocabulary therapy. In contrast, the consistency of children with consistent speech disorder changed more when they received phonological contrast therapy.

Effect of therapy on speech accuracy

Table 3 shows the mean (SD) difference in PCC following each type of therapy by subgroup of speech disorder. An ANOVA with repeated measures compared the change in speech accuracy (difference on PCC between initial assessment and following treatment) made by the two subgroups of children with speech disorder (inconsistent and consistent) during the two types of therapy. The results show a significant effect of therapy (F(1,17) = 4.52, p < 0.05). Overall, phonological contrast

therapy was more effective in changing the PCC than core vocabulary therapy. The effect of group was not significant (F(1,17) = 0.98, p = 0.34). The results show a significant interaction between the type of therapy and subgroup of speech disorder (F(1,17) = 18.75, p < 0.001). Figure 2 illustrates the interaction by plotting each subgroup's mean difference on PCC measure following core vocabulary and phonological contrast therapy. Phonological contrast therapy was most effective in changing the PCC of children with a consistent speech
Table 2. Inconsistency and PCC measures at initial assessment and after each block of intervention for each participant

                                                   PCC                               Inconsistency (%)
Child  CA (months)  Gender  Subgroup      Order    Initial  Block 1  Block 2  Post   Initial  Block 1  Block 2  Post
1      77           M       inconsistent  CV-PC    43       75       89       82     56       24       12       20
2      57           F       inconsistent  CV-PC    62       89       98       99     60       16       0        0
3      60           M       inconsistent  CV-PC    51       54       62       62     40       20       20       16
4      60           M       inconsistent  CV-PC    51       66       64       73     60       40       36       20
5      67           F       inconsistent  CV-PC    46       53       60       64     46       28       30       36
6      66           M       inconsistent  CV-PC    25       37       44       67     44       24       32       20
7      56           F       inconsistent  CV-PC    50       60       82       90     56       32       24       0
8      57           M       inconsistent  PC-CV    34       41       57       32     56       48       24       28
9      60           M       inconsistent  PC-CV    53       61       84       92     40       44       12       8
10     67           M       inconsistent  PC-CV    31       48       61       58     56       48       36       28
11     60           M       consistent    PC-CV    55       68       70       81     36       36       24       24
12     65           M       consistent    PC-CV    48       90       98       100    36       8        4        0
13     67           M       consistent    PC-CV    76       83       90       94     36       28       20       12
14     58           M       consistent    PC-CV    45       67       73       87     12       8        12       4
15     60           F       consistent    PC-CV    32       65       68       87     32       32       24       12
16     62           F       consistent    CV-PC    36       47       80       77     28       36       4        8
17     59           F       consistent    CV-PC    56       65       75       69     24       20       16       32
18     62           F       consistent    CV-PC    49       57       92       88     24       12       12       4

CA, chronological age; Order, order of therapy; CV, core vocabulary; PC, phonological contrast.

disorder. In contrast, the PCC of children with inconsistent speech disorder increased when they received core vocabulary therapy.

Maintenance of progress

All children were assessed 8 weeks post-intervention to examine whether the gains made during therapy were maintained. An analysis of variance with repeated
Table 3. Group summary of change in inconsistency score and PCC following each intervention

                       Change in inconsistency (%), mean (SD)     Change in PCC, mean (SD)
Group                  Core vocabulary   Phonological contrast    Core vocabulary   Phonological contrast
Consistent (n = 8)     5.00 (7.63)       9.50 (12.99)             6.75 (3.01)       24.62 (12.88)
Inconsistent (n = 10)  24.6 (9.14)       4.20 (7.57)              15.80 (9.05)      9.70 (6.57)
Overall                15.89 (12.99)     6.56 (10.35)             11.78 (8.28)      16.33 (12.22)

Figure 1. Mean change in inconsistency for each therapy by subgroup of speech disorder.

Figure 2. Mean change in PCC for each therapy by subgroup of speech disorder.


measures compared the PCC of the two subgroups of children with speech disorder (inconsistent and consistent) at the post-therapy and follow-up assessments. The results show no effect of assessment time (immediately following therapy versus follow-up, F(1,17) = 4.19, p = 0.06), group (F(1,17) = 2.82, p = 0.11) or interaction between time of assessment and group (F(1,17) = 0.81, p = 0.38). All children maintained the accuracy gains made during therapy.

An analysis of variance with repeated measures compared the inconsistency score of the two subgroups of children with speech disorder (inconsistent and consistent) at the post-therapy and follow-up assessments. The results show an effect of assessment time (F(1,17) = 6.36, p < 0.05) but not group (F(1,17) = 2.15, p = 0.16). The interaction between time of assessment and group was not significant (F(1,17) = 1.19, p = 0.29). The mean inconsistency scores continued to decrease following withdrawal of therapy (children with inconsistent speech disorder: mean change 6.3%; children with consistent speech disorder: mean change 2.5%). An analysis of variance confirmed no significant differences at follow-up between the subgroups of children (between group: inconsistent versus consistent speech disorder) on either the PCC measure (F(1,17) = 3.08, p = 0.99) or inconsistency score (F(1,17) = 1.05, p = 0.32).

Order effects for children with inconsistent speech disorder

It was hypothesized that children with inconsistent speech disorder would make more progress (accuracy: PCC) if they received phonological contrast therapy after core vocabulary therapy. An analysis of variance (between-subjects factor of therapy order: CV-PC versus PC-CV) examined whether there were any order of therapy effects for children with inconsistent speech disorder. There were no significant differences between the groups on difference in PCC following phonological contrast therapy (F(1,9) = 0.08, p = 0.78) or the PCC at the follow-up assessment (F(1,9) = 0.001, p = 0.98).
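The four difference scores analysed in the Results can be derived from the per-child data in Table 2. The Python sketch below is one reading of that calculation, not the authors' own procedure: it assumes that the difference score for a therapy is the change across the block in which that therapy was delivered (block 1 against the initial assessment, block 2 against block 1), and that change in inconsistency is expressed as a reduction. The tuple layout and function names are illustrative.

```python
from statistics import mean

# (child, subgroup, order, PCC at initial/block 1/block 2,
#  inconsistency % at initial/block 1/block 2), copied from Table 2.
TABLE_2 = [
    (1, "inconsistent", "CV-PC", (43, 75, 89), (56, 24, 12)),
    (2, "inconsistent", "CV-PC", (62, 89, 98), (60, 16, 0)),
    (3, "inconsistent", "CV-PC", (51, 54, 62), (40, 20, 20)),
    (4, "inconsistent", "CV-PC", (51, 66, 64), (60, 40, 36)),
    (5, "inconsistent", "CV-PC", (46, 53, 60), (46, 28, 30)),
    (6, "inconsistent", "CV-PC", (25, 37, 44), (44, 24, 32)),
    (7, "inconsistent", "CV-PC", (50, 60, 82), (56, 32, 24)),
    (8, "inconsistent", "PC-CV", (34, 41, 57), (56, 48, 24)),
    (9, "inconsistent", "PC-CV", (53, 61, 84), (40, 44, 12)),
    (10, "inconsistent", "PC-CV", (31, 48, 61), (56, 48, 36)),
    (11, "consistent", "PC-CV", (55, 68, 70), (36, 36, 24)),
    (12, "consistent", "PC-CV", (48, 90, 98), (36, 8, 4)),
    (13, "consistent", "PC-CV", (76, 83, 90), (36, 28, 20)),
    (14, "consistent", "PC-CV", (45, 67, 73), (12, 8, 12)),
    (15, "consistent", "PC-CV", (32, 65, 68), (32, 32, 24)),
    (16, "consistent", "CV-PC", (36, 47, 80), (28, 36, 4)),
    (17, "consistent", "CV-PC", (56, 65, 75), (24, 20, 16)),
    (18, "consistent", "CV-PC", (49, 57, 92), (24, 12, 12)),
]


def difference_scores(order, pcc, inconsistency):
    """Per-therapy change for one child: PCC gain and inconsistency reduction."""
    pcc_change = (pcc[1] - pcc[0], pcc[2] - pcc[1])  # block 1, block 2
    inc_change = (inconsistency[0] - inconsistency[1],
                  inconsistency[1] - inconsistency[2])
    first, second = order.split("-")  # therapy delivered in block 1, block 2
    return {
        first: {"pcc": pcc_change[0], "inconsistency": inc_change[0]},
        second: {"pcc": pcc_change[1], "inconsistency": inc_change[1]},
    }


def group_means(subgroup):
    """Mean difference score per therapy and measure for one subgroup."""
    rows = [difference_scores(order, pcc, inc)
            for _, group, order, pcc, inc in TABLE_2 if group == subgroup]
    return {
        (therapy, measure): round(mean(r[therapy][measure] for r in rows), 2)
        for therapy in ("CV", "PC")
        for measure in ("pcc", "inconsistency")
    }
```

Under this reading, the means for the inconsistent subgroup agree with Table 3 (15.8 and 9.7 for PCC; 24.6 and 4.2 for inconsistency). Two cells for the consistent subgroup come out slightly different from the printed values, which may reflect rounding or transcription differences in the tables as reproduced here.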
Discussion

The purpose of this study was to evaluate the relative effects of two different types of therapy on the consistency of word production and speech accuracy of children with consistent or inconsistent speech disorder. Eighteen children with severe speech disorder participated in an intervention programme that compared phonological contrast and core vocabulary therapy. All the children increased their consonant accuracy during intervention. However, core vocabulary therapy resulted in greater change in children with inconsistent speech disorder and phonological contrast therapy resulted in greater change in children with consistent speech disorder. The results provide evidence that treatment targeting the speech-processing deficit underlying the child's speech disorder will result in generalization. Core vocabulary therapy provided to children diagnosed with an inconsistent disorder resulted in consistent phonological output of both treated and untreated words. Similarly, phonological contrast therapy resulted in suppression of error patterns, not just remediation of targeted lexical items.

The experimental designs in the majority of therapy efficacy studies fail to account for the heterogeneous nature of children with speech disorder. Consequently, conflicting results emerge. Children present with different speech disorders (i.e. surface characteristics). Grouping children with speech disorder according to severity, causal factors or linguistic symptomatology is unsatisfactory because it fails to explain the mental operations that result in the production of disordered speech. Psycholinguistic profiling approaches (e.g. Stackhouse and Wells 1997) enable specific intervention targets to be selected based on the individual

child's needs. Dodd (1995) proposed four subgroups of functional speech disorder that reflect different breakdowns in the speech-processing chain. It seems logical that therapy targeting the specific breakdown will most effectively change the surface speech characteristics.

The introduction outlined three parameters across which interventions differ. This study examined one of these broad parameters: the intervention approach. Before manipulating individual variables such as target selection, more research is needed to determine whether specific intervention approaches are more effective with different types of speech disorder. Once treatment selection has been determined, target selection and other more specific variables become an issue. This study examined phonological contrast therapy that targeted a cognitive-linguistic deficit (unit: phonological error patterns) and core vocabulary therapy that targeted a deficit in phonological planning (unit: whole word).

This study recruited children from two subgroups of speech disorder: consistent and inconsistent phonological disorder. Children were classified as having a consistent speech disorder if they used consistent non-developmental speech error patterns. These children also used some developmental rules that were appropriate for their chronological age, or delayed. However, the presence of unusual, non-developmental error patterns signals an impaired ability to derive and organize knowledge about the nature of their native phonological system (Dodd and McCormack 1995). It was hypothesized that therapy highlighting the phonological contrasts in error would result in an increase in phonological accuracy as measured by PCC. The results confirmed this hypothesis. Core vocabulary therapy was not hypothesized to alter the child's phonology significantly because it targets a different aspect of the speech-processing chain.
When children with consistent speech disorder received core vocabulary therapy, analysis showed little change to their inconsistency score or PCC. It is not surprising that a therapy approach targeting consistency of production did not promote change as the children were already consistent. This type of therapy did not highlight homonymy and so the children did not receive the information they required about the contrastive nature of phonemes. In contrast, phonological contrast therapy resulted in significant system-wide changes. The interaction between accuracy (PCC) and the type of therapy showed that while therapy worked for both groups of children, children with consistent speech disorder made greater accuracy gains than children with inconsistent speech disorder when they received phonological contrast therapy. This finding is consistent with previous research (Dodd and Bradford 2000) and provides evidence for the hypothesis that children whose speech errors are consistent and atypical have a cognitive-linguistic deficit that benefits from therapy that targets reorganization of their phonological knowledge.

Children were identified with inconsistent speech disorder if their phonological output had a high degree of variability (at least 40%) and was characterized by multiple error forms for the same lexical item in the same linguistic context. The surface speech characteristics reflect a deficit in phonological planning. It was hypothesized that core vocabulary therapy targeting the ability to form, or access, phonological plans (templates) on-line would increase consistency in children with inconsistent speech disorder. The results supported the hypothesis. Core vocabulary therapy resulted in increased consistency of production in children with inconsistent speech disorder. An unexpected finding was that core vocabulary therapy created system-wide

change for children with inconsistent speech disorder. Not only did the specific aspect being targeted change (i.e. consistency), but also a global measure of accuracy (PCC) increased. The results provide support for the hypothesis that the underlying deficit for this subgroup of children was phonological planning and not a cognitive linguistic deficit. By improving the ability to form or access phonological plans, the phonological system was able to self-correct and operate successfully. Dodd and Bradford (2000) observed that the sequence of therapies that target different speech deficits might affect phonological outcome. Specifically, children with inconsistent speech disorder may benefit more from phonological contrast therapy once they established consistency of production. To investigate order of therapy effects, the current study used an alternating treatment design. It was hypothesized that children with inconsistent speech disorder who received phonological contrast therapy after receiving core vocabulary would have a better outcome (PCC) on the phonological therapy than children with inconsistent speech disorder who received the therapy approaches in the alternative order (i.e. phonological contrast therapy first followed by core vocabulary). The results did not support the hypothesis. Core vocabulary resulted in the most change to PCC with no differences noted in the amount of change due to phonological contrast therapy, irrespective of whether it was the first or second block of intervention. This finding was surprising. It is logical to assume that a consistent system will be more open to change from phonological contrast therapy than an inconsistent system as it is difficult to identify any patterns in the inconsistent system, let alone target them effectively. It is this factor that might be obscuring the results slightly. 
The children with inconsistent speech who received phonological contrast therapy first had therapy that targeted structural error patterns (e.g. final consonant deletion, cluster reduction). It is possible that the inconsistent children used some of the information provided in the phonological contrast therapy to improve their phonological planning anyway. Phonological planning involves selection and sequencing phonemes. The phonological contrast therapy given to these children gave them specific feedback regarding one aspect of this planning: the consonant–vowel structure of words. Further research needs to examine whether inconsistent children respond differently to phonological contrasts targeting other error patterns such as fronting or stopping. Regardless of the lack of order effect, the most significant finding remains that core vocabulary therapy was more effective than phonological contrast therapy in terms of changes in both the consistency and accuracy of the children with inconsistent speech.

Successful interventions not only need to create system-wide change, but also need to maintain that change. An intervention that targets the underlying deficit and not just the surface characteristics of a speech disorder should do both. It should promote real phonological change that is maintained. This study showed significant differences in accuracy and consistency measures between initial assessment and final assessment. Accuracy and consistency improved. All children maintained the gains in the per cent of consonants correct after therapy was withdrawn. This study revealed a difference in consistency measures between the final assessment and post-therapy (8 weeks of therapy withdrawal). Consistency of word production continued to improve. The greatest change was observed in the children with inconsistent speech disorder.
This pattern of maintenance may reflect a phonological system continuing to integrate a new processing skill.

Therapy that teaches or refines a child's ability to formulate phonological plans would influence the speech-processing system. The period of monitoring in this study (8 weeks) may not have been long enough to observe the final result of integration.

The results differ from previous efficacy studies of children with inconsistent speech in terms of change in accuracy and maintenance of progress made in therapy. Forrest and Elbert (2001) reported a treatment programme for four boys who had variable substitution patterns and who had made limited progress in therapy. A multiple baseline treatment design was implemented. The target sound was a fricative omitted from the phonetic inventory by each child. The boys received two 45-minute sessions per week. Therapy targeted the chosen sound in word-final position in three words. The stages of therapy were auditory exposure, imitation and spontaneous production elicited by picture stimuli. Generalization probes measured change in untreated contexts. Only one child met the criteria for treatment termination. Two children showed some generalization to untreated word positions. One child did not show any evidence of generalizing the treated sound to untreated word positions. Forrest and Elbert interpreted the results as evidence that children with variable productions of a sound not in their inventory are rigid when they learn to produce the sound and are unable to recognize that the sound can be produced in different contexts (e.g. other word positions).

The differences between the current study and Forrest and Elbert's study may reflect significant methodological differences. Forrest and Elbert implemented a different categorization of inconsistency. The subject details given in their paper do not allow comparison with the subjects in the current study. The second major difference between the studies is the approach implemented. Forrest and Elbert used an articulatory approach that did not directly target inconsistency.
Core vocabulary therapy specifically targeted consistency of word production. Treating consistency created system wide change that subjects maintained. Conclusions The results indicate that different parts of the speech-processing chain respond differently to therapy targeting different processing skills. A phonological planning deficit can be targeted effectively using a whole word approach. A cognitive linguistic deficit responds best to a phonological contrast approach. Clinically, it is essential to differentially diagnose consistent from inconsistent phonological disorders. The two are caused by different deficits in the speech-processing chain and respond best to different therapeutic approaches. The results provide an evidence-based choice of phonological treatment for children with moderatesevere speech disorder. 488 S. Crosbie et al. Acknowledgments The PPP Healthcare Medical Trust supported this research. The authors are grateful to the children, their parents and teachers who participated in the study, Education Queensland, and the Speech Language Pathologists who referred children for the study. References
ALCORN, M., JARRATT, T., MARTIN, W. and DODD, B., 1995, Intensive group therapy: efficacy of a whole-language approach. In B. Dodd (ed.), Differential Diagnosis and Treatment of Children with Speech Disorders (London: Whurr), pp. 181-198.
BAKER, E. and MCLEOD, S., 2004, Evidence based management of phonological impairment in children. Child Language Teaching and Therapy, 20, 261-285.
BEERY, K. and BUKTENICA, N., 1997, Developmental Test of Visual-Motor Integration (Toronto: Modern Curriculum).

BLACHE, S. and PARSONS, C., 1980, A linguistic approach to distinctive feature learning. Language Speech and Hearing Services in Schools, 11, 203-207.
BRADFORD, A. and DODD, B., 1994, The motor planning abilities of phonologically disordered children. European Journal of Disorders of Communication, 23, 349-369.
BRIERLY, A., 1987, Phonological Disorder in Children (Sydney: Macquarie University).
BROOMFIELD, J. and DODD, B., 2004, The nature of referred subtypes of primary speech disability. Child Language Teaching and Therapy, 20, 135-151.
DEAN, E., HOWELL, J., WATERS, D. and REID, J., 1995, Metaphon: a metalinguistic approach to the treatment of phonological disorder in children. Clinical Linguistics and Phonetics, 9, 1-19.
DODD, B., 1995, Differential Diagnosis and Treatment of Children with Speech Disorders (London: Whurr).
DODD, B. and BRADFORD, A., 2000, A comparison of three therapy methods for children with different types of developmental phonological disorder. International Journal of Language and Communication Disorders, 35, 189-209.
DODD, B., CROSBIE, S. and HOLM, A., 2004, Core Vocabulary Therapy: An Intervention for Children with Inconsistent Speech Disorder (Brisbane: Perinatal Research Centre, Royal Brisbane & Women's Hospital, University of Queensland).
DODD, B. and GILLON, G., 2001, Letters to Editor: Phonological awareness therapy and articulation training approaches. International Journal of Language and Communication Disorders, 36, 265-269.
DODD, B., HOLM, A., ZHU, H. and CROSBIE, S., 2003, Phonological development: normative data from British English-speaking children. Clinical Linguistics and Phonetics, 17, 617-643.
DODD, B. and IACONO, T., 1989, Phonological disorders in children: changes in phonological process use during treatment. British Journal of Disorders of Communication, 24, 333-351.
DODD, B., LEAHY, J. and HAMBLY, G., 1989, Phonological disorders in children: underlying cognitive deficits. British Journal of Developmental Psychology, 7, 55-71.
DODD, B. and MCCORMACK, P., 1995, A model of speech processing for differential diagnosis of phonological disorders. In B. Dodd (ed.), Differential Diagnosis and Treatment of Children with Speech Disorders (London: Whurr), pp. 65-89.
DODD, B., ZHU, H., CROSBIE, S., HOLM, A. and OZANNE, A., 2002, Diagnostic Evaluation of Articulation and Phonology (London: Psychological Corporation).
DODD, B. J., ZHU, H. and SHATFORD, C., 2000, Does speech disorder spontaneously resolve? In I. Barriere, G. Morgan, S. Chiat and B. Woll (eds), Child Language Seminar 1999 Proceedings (London: City University Press), pp. 3-10.
FERRIER, E. and DAVIS, M., 1973, A lexical approach to the remediation of final sound omissions. Journal of Speech and Hearing Disorders, 38, 126-131.
FEY, M., 1986, Language Intervention with Young Children (San Diego, CA: College-Hill Press).
FEY, M., 1992, Clinical Forum: Phonological assessment and treatment. Articulation and phonology: inextricable constructs in speech pathology. Language, Speech and Hearing Services in Schools, 23, 225-232.
FORREST, K., DINNSEN, D. and ELBERT, M., 1997, Impact of substitution patterns on phonological learning by misarticulating children. Clinical Linguistics and Phonetics, 11, 63-76.



FORREST, K. and ELBERT, M., 2001, Treatment for phonologically disordered children with variable substitution patterns. Clinical Linguistics and Phonetics, 15, 41-45.
FORREST, K., ELBERT, M. and DINNSEN, D., 2000, The effect of substitution patterns on phonological treatment outcomes. Clinical Linguistics and Phonetics, 14, 519-531.
FOX, A. and DODD, B., 2001, Phonologically disordered German-speaking children. American Journal of Speech Language Pathology, 10, 291-307.
GIERUT, J., 1990, Linguistic foundations of language teaching: phonology. Journal of Speech-Language Pathology and Audiology, 14, 5-21.
GIERUT, J., 1991, Homonymy in phonological change. Clinical Linguistics and Phonetics, 5, 119-137.
GIERUT, J., 1998, Treatment efficacy: functional phonological disorders in children. Journal of Speech, Language and Hearing Research, 41, S85-S100.
GIERUT, J., 2001, Complexity in phonological treatment: clinical factors. Language, Speech and Hearing Services in Schools, 32, 229-241.
GIERUT, J., MORRISETTE, M., HUGHES, M. and ROWLAND, S., 1996, Phonological treatment efficacy and developmental norms. Language, Speech and Hearing Services in Schools, 27, 215-230.
GOLDSTEIN, B., 1995, Spanish phonological development. In H. Kayser (ed.), Bilingual Speech-Language Pathology: An Hispanic Focus (San Diego, CA: Singular), pp. 17-38.
GRUNDY, K., 1989, Developmental speech disorders. In K. Grundy (ed.), Linguistics in Clinical Practice (London: Taylor & Francis), pp. 255-280.
GRUNWELL, P., 1981, The Nature of Phonological Disability in Children (London: Academic Press).
GRUNWELL, P., 1992, Process of phonological change in developmental speech disorders. Clinical Linguistics and Phonetics, 6, 101-122.
GRUNWELL, P., 1997, Developmental phonological disability: order in disorder. In B. Hodson and M. Edwards (eds), Perspectives in Applied Phonology (Gaithersburg, MD: Aspen), pp. 61-104.
HAYDEN, D., 1999, PROMPT Manual Level 1 and 2 (Santa Fe, NM: PROMPT Institute).
HESKETH, A., ADAMS, C., NIGHTINGALE, C. and HALL, R., 2000, Phonological awareness therapy and articulatory training approaches for children with phonological disorders: a comparative outcome study. International Journal of Language and Communication Disorders, 35, 337-354.
HODSON, B. and PADEN, E., 1991, Phonological remediation cycles and targets. In B. Hodson and E. Paden (eds), Targeting Intelligible Speech (Austin, TX: Pro-Ed), pp. 95-113.
HOLM, A. and DODD, B., 1999, An intervention case study of a bilingual child with phonological disorder. Child Language Teaching and Therapy, 15, 139-158.
HOLM, A., OZANNE, A. and DODD, B., 1997, Efficacy of intervention for a bilingual child making articulation and phonological errors. International Journal of Bilingualism, 1, 55-69.
INGRAM, D., 1989, Phonological Disability in Children (London: Cole & Whurr).
LEONARD, L., 1985, Unusual and subtle phonological behaviour in the speech of phonologically disordered children. Journal of Speech and Hearing Disorders, 50, 4-13.
MICCIO, A., 2002, Clinical problem solving: assessment of phonological disorders. American Journal of Speech Language Pathology, 11, 221-229.
MICCIO, A., ELBERT, M. and FORREST, K., 1999, The relationship between stimulability and phonological acquisition in children with normally developing and disordered phonologies. American Journal of Speech Language Pathology, 8, 347-363.
OZANNE, A., 1995, The search for developmental verbal dyspraxia. In B. Dodd (ed.), Differential Diagnosis and Treatment of Children with Speech Disorders (London: Whurr), pp. 91-109.
PASSY, J., 1990, Cued Articulation (Hawthorn, Victoria: ACER).
POWELL, T. and MICCIO, A., 1996, Stimulability: a useful clinical tool. Journal of Communication Disorders, 29, 237-254.
RVACHEW, S. and NOWAK, M., 2001, The effect of target-selection in phonological learning. Journal of Speech, Language and Hearing Research, 44, 610-623.
SO, L. and DODD, B., 1994, Phonologically disordered Cantonese-speaking children. Clinical Linguistics and Phonetics, 8, 235-255.
STACKHOUSE, J. and WELLS, B., 1997, Children's Speech and Literacy Difficulties 1 (London: Whurr).
TOPBAS, S. and KONROT, A., 1996, Variability in phonological disorders: a search for systematicity? Evidence from Turkish speaking children. In International Clinical Phonetics and Linguistics Association 5th Annual Conference, Munich, Germany, 16-18 September.
TYLER, A., LEWIS, K. and WELCH, C., 2003, Predictors of phonological change following intervention. American Journal of Speech Language Pathology, 12, 289-298.



WEINER, F., 1981, Treatment of phonological disability using the method of meaningful minimal contrast: two case studies. Journal of Speech and Hearing Disorders, 46, 29-34.
WIIG, E., SEMEL, E. and SECORD, W., 1992, Clinical Evaluation of Language Fundamentals Preschool (San Antonio, TX: Psychological Corporation).
WILLIAMS, A., 2000, Multiple oppositions: case studies of variables in phonological intervention. American Journal of Speech-Language Pathology, 9, 282-288.
WILLIAMS, A., 2003, Speech Disorders: Resource Guide for Preschool Children (Clifton Park, NY: Thomson/Delmar Learning).
WILLIAMS, P. and STACKHOUSE, J., 2000, Rate, accuracy and consistency: diadochokinetic performance of young, normally developing children. Clinical Linguistics and Phonetics, 14, 267-293.
ZHU, H. and DODD, B., 2000, Putonghua (modern standard Chinese)-speaking children with speech disorder. Clinical Linguistics and Phonetics, 14, 165-191.


From Wikipedia, the free encyclopedia

Speech perception is the process by which the sounds of language are heard, interpreted and understood. The study of speech perception is closely linked to the fields of phonetics and phonology in linguistics and cognitive psychology and perception in psychology. Research in speech perception seeks to understand how human listeners recognize speech sounds and use this information to understand spoken language. Speech perception research has applications in building computer systems that can recognize speech, in improving speech recognition for hearing- and language-impaired listeners, as well as in foreign-language teaching.

The process of perceiving speech begins at the level of the sound signal and the process of audition. (For a complete description of the process of audition see Hearing.) After processing the initial auditory signal, speech sounds are further processed to extract acoustic cues and phonetic information. This speech information can then be used for higher-level language processes, such as word recognition.

Acoustic cues

Figure 1: Spectrograms of syllables "dee" (top), "dah" (middle), and "doo" (bottom) showing how the onset formant transitions that define perceptually the consonant [d] differ depending on the identity of the following vowel. (Formants are highlighted by red dotted lines; transitions are the bending beginnings of the formant trajectories.)

The speech sound signal contains a number of acoustic cues that are used in speech perception. The cues differentiate speech sounds belonging to different phonetic categories. For example, one of the most studied cues in speech is voice onset time or VOT. VOT is a primary cue signaling the difference between voiced and voiceless plosives, such as "b" and "p". Other cues differentiate sounds that are produced at different places of articulation or manners of articulation. The speech system must also combine these cues to determine the category of a specific speech sound. This is often thought of in terms of abstract representations of phonemes. These representations can then be combined for use in word recognition and other language processes.

It is not easy to identify what acoustic cues listeners are sensitive to when perceiving a particular speech sound:

"At first glance, the solution to the problem of how we perceive speech seems deceptively simple. If one could identify stretches of the acoustic waveform that correspond to units of perception, then the path from sound to meaning would be clear. However, this correspondence or mapping has proven extremely difficult to find, even after some forty-five years of research on the problem."[1]

If a specific aspect of the acoustic waveform indicated one linguistic unit, a series of tests using speech synthesizers would be sufficient to determine such a cue or cues. However, there are two significant obstacles:

1. One acoustic aspect of the speech signal may cue different linguistically relevant dimensions. For example, the duration of a vowel in English can indicate whether or not the vowel is stressed, or whether it is in a syllable closed by a voiced or a voiceless consonant, and in some cases (like American English // and //) it can distinguish the identity of vowels.[2] Some experts even argue that duration can help distinguish what are traditionally called short and long vowels in English.[3]

2. One linguistic unit can be cued by several acoustic properties. For example, in a classic experiment, Alvin Liberman (1957) showed that the onset formant transitions of /d/ differ depending on the following vowel (see Figure 1), but they are all interpreted as the phoneme /d/ by listeners.[4]

Linearity and the segmentation problem


Main article: Speech segmentation

Figure 2: A spectrogram of the phrase "I owe you". There are no clearly distinguishable boundaries between speech sounds.

Although listeners perceive speech as a stream of discrete units (phonemes, syllables, and words), this linearity is difficult to see in the physical speech signal (see Figure 2 for an example). Speech sounds do not strictly follow one another; rather, they overlap.[5] A speech sound is influenced by the ones that precede it and the ones that follow it. This influence can even be exerted at a distance of two or more segments (and across syllable and word boundaries).[5]

Once the linearity of the speech signal is disputed, the problem of segmentation arises: one encounters serious difficulties trying to delimit a stretch of speech signal as belonging to a single perceptual unit. This is again illustrated by the fact that the acoustic properties of the phoneme /d/ depend on the production of the following vowel (because of coarticulation).

Lack of invariance


Research on speech perception and its applications must deal with several problems that result from what has been termed the lack of invariance. As suggested above, reliable, constant relations between a phoneme of a language and its acoustic manifestation in speech are difficult to find. There are several reasons for this:

Context-induced variation. Phonetic environment affects the acoustic properties of speech sounds. For example, /u/ in English is fronted when surrounded by coronal consonants.[6] Similarly, the VOT values marking the boundary between voiced and voiceless plosives are different for labial, alveolar and velar plosives, and they shift under stress or depending on the position within a syllable.[7]

Variation due to differing speech conditions. One important factor that causes variation is differing speech rate. Many phonemic contrasts are constituted by temporal characteristics (short vs. long vowels or consonants, affricates vs. fricatives, plosives vs. glides, voiced vs. voiceless plosives, etc.) and they are certainly affected by changes in speaking tempo.[1] Another major source of variation is articulatory carefulness vs. sloppiness, which is typical for connected speech (articulatory "undershoot" is obviously reflected in the acoustic properties of the sounds produced).

Variation due to different speaker identity. The resulting acoustic structure of concrete speech productions depends on the physical and psychological properties of individual speakers. Men, women, and children generally produce voices having different pitch. Because speakers have vocal tracts of different sizes (due to sex and age especially), the resonant frequencies (formants), which are important for recognition of speech sounds, will vary in their absolute values across individuals[8] (see Figure 3 for an illustration of this). Research shows that infants at the age of 7.5 months are unable to recognize information presented by speakers of different genders; however, by the age of 10.5 months, they are able to detect the similarities.[9] Dialect and foreign accent can also cause variation, as can the social characteristics of the speaker and listener.[10]

Perceptual constancy and normalization

Figure 3: The left panel shows the 3 peripheral American English vowels /i/, /ɑ/, and /u/ in a standard F1 by F2 plot (in Hz). The mismatch between male, female, and child values is apparent. In the right panel formant distances (in Bark) rather than absolute values are plotted using the normalization procedure proposed by Syrdal and Gopal in 1986.[11] Formant values are taken from Hillenbrand et al. (1995).[8]

Despite the great variety of different speakers and different conditions, listeners perceive vowels and consonants as constant categories. It has been proposed that this is achieved by means of the perceptual normalization process in which listeners filter out the noise (i.e. variation) to arrive at the underlying category. Vocal-tract-size differences result in formant-frequency variation across speakers; therefore a listener has to adjust his/her perceptual system to the acoustic characteristics of a particular speaker. This may be accomplished by considering the ratios of formants rather than their absolute values.[11][12][13] This process has been called vocal tract normalization (see Figure 3 for an example). Similarly, listeners are believed to adjust the perception of duration to the current tempo of the speech they are listening to; this has been referred to as speech rate normalization.

Whether or not normalization actually takes place, and what its exact nature is, is a matter of theoretical controversy (see theories below). Perceptual constancy is a phenomenon not specific to speech perception only; it exists in other types of perception too.
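The idea of plotting formant distances in Bark rather than absolute values in Hz can be sketched in a few lines. The sketch below is only illustrative: it assumes Traunmüller's approximation for converting Hz to Bark, which is one common choice and not necessarily the conversion behind Figure 3, and the function names and the example formant values are hypothetical.

```python
def hz_to_bark(frequency_hz: float) -> float:
    """Convert a frequency in Hz to the Bark scale.

    Assumption: Traunmueller's approximation, z = 26.81 * f / (1960 + f) - 0.53.
    """
    return 26.81 * frequency_hz / (1960.0 + frequency_hz) - 0.53


def bark_distance(lower_formant_hz: float, higher_formant_hz: float) -> float:
    """Distance between two formants in Bark (e.g. F2 minus F1)."""
    return hz_to_bark(higher_formant_hz) - hz_to_bark(lower_formant_hz)


if __name__ == "__main__":
    # Hypothetical F1/F2 values for one vowel token; not measured data.
    f1_hz, f2_hz = 300.0, 2300.0
    print(f"F2 - F1 in Hz:   {f2_hz - f1_hz:.0f}")
    print(f"F2 - F1 in Bark: {bark_distance(f1_hz, f2_hz):.2f}")
```

Comparing such Bark distances across speakers, instead of raw Hz values, is one way to express the relative rather than absolute formant patterns that the normalization account relies on.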

Categorical perception


Main article: Categorical perception

Figure 4: Example identification (red) and discrimination (blue) functions

Categorical perception is involved in processes of perceptual differentiation. People perceive speech sounds categorically, that is to say, they are more likely to notice the differences between categories (phonemes) than within categories. The perceptual space between categories is therefore warped, the centers of categories (or "prototypes") working like a sieve[14] or like magnets[15] for incoming speech sounds.

In an artificial continuum between a voiceless and a voiced bilabial plosive, each new step differs from the preceding one in the amount of VOT. The first sound is a prevoiced [b], i.e. it has a negative VOT. Then, as VOT is increased, it reaches zero, i.e. the plosive is a plain unaspirated voiceless [p]. Gradually, adding the same amount of VOT at a time, the plosive eventually becomes a strongly aspirated voiceless bilabial [pʰ]. (Such a continuum was used in an experiment by Lisker and Abramson in 1970.[16] The sounds they used are available online.) In this continuum of, for example, seven sounds, native English listeners will identify the first three sounds as /b/ and the last three sounds as /p/, with a clear boundary between the two categories.[16] A two-alternative identification (or categorization) test will yield a discontinuous categorization function (see red curve in Figure 4).

In tests of the ability to discriminate between two sounds with varying VOT values but having a constant VOT distance from each other (20 ms for instance), listeners are likely to perform at chance level if both sounds fall within the same category and at nearly 100% level if each sound falls in a different category (see the blue discrimination curve in Figure 4).

The conclusion to draw from both the identification and the discrimination test is that listeners will have different sensitivity to the same relative increase in VOT depending on whether or not the boundary between categories was crossed. Similar perceptual adjustment is attested for other acoustic cues as well.
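The relation between the identification and discrimination functions can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model taken from the studies cited: identification is modelled as a logistic function of VOT with a hypothetical boundary and slope, and discrimination is predicted on the simplifying assumption that listeners can compare only the category labels of the two sounds.

```python
import math

# Hypothetical parameters, chosen only for illustration.
BOUNDARY_MS = 25.0    # assumed /b/-/p/ category boundary on the VOT axis
SLOPE_PER_MS = 0.5    # assumed steepness of the identification function


def p_voiceless(vot_ms: float) -> float:
    """Probability that a stimulus is labelled /p/ (logistic identification function)."""
    return 1.0 / (1.0 + math.exp(-SLOPE_PER_MS * (vot_ms - BOUNDARY_MS)))


def predicted_discrimination(vot_a_ms: float, vot_b_ms: float) -> float:
    """Predicted proportion correct if only category labels can be compared.

    Chance is 0.5; the score rises with the squared difference in labelling
    probability between the two stimuli.
    """
    difference = p_voiceless(vot_a_ms) - p_voiceless(vot_b_ms)
    return 0.5 + 0.5 * difference ** 2


def discrimination_curve(start_ms: int = -30, stop_ms: int = 60,
                         step_ms: int = 10, distance_ms: int = 20):
    """Predicted discrimination for stimulus pairs a fixed VOT distance apart."""
    return [(a, a + distance_ms, predicted_discrimination(a, a + distance_ms))
            for a in range(start_ms, stop_ms + 1, step_ms)]


if __name__ == "__main__":
    for vot_a, vot_b, score in discrimination_curve():
        print(f"{vot_a:>4} ms vs {vot_b:>4} ms: {score:.2f}")
```

With these assumed parameters, pairs that fall on the same side of the boundary stay near chance while pairs that straddle it approach ceiling, which is the qualitative pattern described for Figure 4.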

Top-down influences


The process of speech perception is not necessarily uni-directional. That is, higher-level language processes connected with morphology, syntax, or semantics may interact with basic speech perception processes to aid in recognition of speech sounds. It may be the case that it is not necessary and maybe even not possible for a listener to recognize phonemes before recognizing higher units, like words for example. After obtaining at least a fundamental piece of information about phonemic structure of the perceived entity from the acoustic signal, listeners are able to compensate for missing or noise-masked phonemes using their knowledge of the spoken language. In a classic experiment, Richard M. Warren (1970) replaced one phoneme of a word with a cough-like sound. His subjects restored the missing speech sound perceptually without any difficulty and what is more, they were not able to identify accurately which phoneme had been disturbed.[17] This is known as the phonemic restoration effect. Another basic experiment compares recognition of naturally spoken words presented in a sentence (or at least a phrase) and the same words presented in isolation. Perception accuracy usually drops in the latter condition. Garnes and Bond (1976) also used carrier sentences when researching the influence of semantic knowledge on perception. They created series of words differing in one phoneme (bay/day/gay, for example). The quality of the first phoneme changed along a continuum. All these stimuli were put into different sentences each of which made sense with one of the words only. Listeners had a tendency to judge the ambiguous words (when the first segment was at the boundary between categories) according to the meaning of the whole sentence.[18]

Brain Damage

Speech Perception with Brain Disabilities

The earliest hypotheses of speech perception were applied to patients who had an auditory comprehension deficit, also known as receptive aphasia. Since then, many disabilities have been classified, which has resulted in a true definition of speech perception.[19] The term speech perception describes the process of interest that employs sublexical contexts in the probe process. It involves many different linguistic and grammatical functions, such as features, segments (phonemes), syllabic structure (units of pronunciation), phonological word forms (how sounds are grouped together), grammatical features, morphemic information (prefixes and suffixes), and semantic information (the meaning of words). In the early years, researchers were more interested in the acoustics of speech; for instance, they looked at the differences between /ba/ and /da/. Research is now directed at the brain's response to the stimuli.

In recent years, a model known as the Dual Stream Model has been developed to describe how speech perception works. This model has drastically changed how psychologists look at perception. The first section of the Dual Stream Model is the ventral pathway. This pathway incorporates the middle temporal gyrus, the inferior temporal sulcus and perhaps the inferior temporal gyrus. The ventral pathway maps phonological representations onto lexical or conceptual representations, that is, the meaning of words. The second section of the Dual Stream Model is the dorsal pathway. This pathway includes the sylvian parietotemporal area, inferior frontal gyrus, anterior insula, and premotor cortex. Its primary function is to take the sensory or phonological stimuli and transform them into an articulatory-motor representation (formation of speech).[20]

Brain Disabilities

Aphasia: There are two different kinds of aphasic patients: those with expressive aphasia (also known as Broca's aphasia) and those with receptive aphasia (also known as Wernicke's aphasia). There are three distinctive dimensions to phonetics: manner of articulation, place of articulation, and voicing.[21]

Expressive aphasia: Patients who suffer from this condition typically have lesions in the left inferior frontal cortex. These patients are described as having severe syntactic deficits, which means that they have extreme difficulty in forming sentences correctly. Expressive aphasic patients have more difficulty with regular, rule-governed principles of sentence formation, a pattern closely related to that seen in Alzheimer patients. For instance, instead of saying "the red ball bounced", both of these patients would say "bounced ball the red". This is just one example of what a person might say; there are of course many possibilities.[22]

Receptive aphasia: These patients suffer from lesions or damage located in the left temporoparietal lobe. Receptive aphasic patients mostly suffer from lexical-semantic difficulties, but also have difficulties in comprehension tasks. Though they have difficulty saying or describing things, these patients showed that they could do well in online comprehension tasks. This is closely related to Parkinson's disease because patients with either condition have trouble distinguishing irregular verbs. For instance, using the example of "the dog went home", a person suffering from expressive aphasia or Parkinson's disease would say "the dog goed home".[23]

Parkinson's disease: This disease attacks the brain and makes patients unable to stop shaking. The effects can include difficulty in walking, communicating, or functioning. Over time the symptoms go from mild to severe, which can cause extreme difficulties in a person's life. Many psychologists relate Parkinson's disease to progressive nonfluent aphasia, which would cause a person to have comprehension deficits and difficulty recognizing irregular verbs. For instance, using the example of "the dog went home", a person suffering from expressive aphasia or Parkinson's disease would say "the dog goed home".[24]

Treatments

Aphasia: A group of psychologists conducted a study to test the McGurk effect with aphasia patients and speech reading.[25] The subjects watched dubbed videos in which the audio and visual did not match. After they completed the first part of the experiment, the experimenters taught the aphasic patients to speech read, that is, to read lips. The experimenters then conducted the same test and found that the patients still performed better with audio only than with visual only, but also that they did better with audio-visual than with audio alone. The patients also improved their place of articulation and their manner of articulation. This suggests that aphasic patients might benefit from learning how to speech read (lip reading).

Parkinson's disease: There are quite a few drug therapies available for Parkinson's disease (e.g. Sinemet). Since there is no cure, the patient will probably end up having surgery to relieve some of the symptoms. A patient who has surgery will most likely receive deep brain stimulation, which keeps the brain stimulated even though the disease tries to disable it. A recent study tested whether patients were more aware of their symptoms after surgery than before it. The symptoms were still present, but the patients were more aware of their difficulties than before they had surgery.[26] This suggests that surgery can improve a patient's speech perception, even though it might not cure the disease.

Research topics


Infant speech perception
Infants begin the process of language acquisition by being able to detect very small differences between speech sounds. They are able to discriminate all possible speech contrasts (phonemes). Gradually, as they are exposed to their native language, their perception becomes language-specific, i.e. they learn how to ignore the differences within phonemic categories of the language (differences that may well be contrastive in other languages; for example, English distinguishes two voicing categories of plosives, whereas Thai has three categories; infants must learn which differences are distinctive in their native language, and which are not). As infants learn how to sort incoming speech sounds into categories, ignoring irrelevant differences and reinforcing the contrastive ones, their perception becomes categorical. Infants learn to contrast different vowel phonemes of their native language by approximately 6 months of age. The native consonantal contrasts are acquired by 11 or 12 months of age.[27]

Some researchers have proposed that infants may be able to learn the sound categories of their native language through passive listening, using a process called statistical learning. Others even claim that certain sound categories are innate, that is, they are genetically specified (see discussion about innate vs. acquired categorical distinctiveness).

If day-old babies are presented with their mother's voice speaking normally, abnormally (in monotone), and a stranger's voice, they react only to their mother's voice speaking normally. When a human and a non-human sound are played, babies turn their head only to the source of the human sound. It has been suggested that auditory learning begins already in the pre-natal period.[28]

How do researchers know if infants can distinguish between speech sounds? One of the techniques used to examine how infants perceive speech, besides the head-turn procedure mentioned above, is measuring their sucking rate. In such an experiment, a baby is sucking a special nipple while presented with sounds. First, the baby's normal sucking rate is established. Then a stimulus is played repeatedly. When the baby hears the stimulus for the first time the sucking rate increases but as the baby becomes habituated to the stimulation the sucking rate decreases and levels off. Then, a new stimulus is played to the baby. If the baby perceives the newly introduced stimulus as different from the background stimulus the sucking rate will show an increase.[28] The sucking-rate and the head-turn method are some of the more traditional, behavioral methods for studying speech perception. Among the new methods (see Research methods below) that help us to study speech perception, near-infrared spectroscopy is widely used in infants.[27]
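The logic of the sucking-rate procedure can be sketched as a simple comparison between the habituated rate and the rate after the stimulus change. This is only an illustration: the function name, the 20% criterion and the example rates are hypothetical, and real studies define their own habituation and recovery criteria.

```python
from statistics import mean


def shows_renewed_interest(habituated_rates: list[float],
                           post_change_rates: list[float],
                           min_relative_increase: float = 0.2) -> bool:
    """Return True if sucking recovers after the new stimulus is introduced.

    The mean rate after the change must exceed the habituated mean by at least
    `min_relative_increase` (a hypothetical criterion chosen for illustration).
    """
    baseline = mean(habituated_rates)
    recovered = mean(post_change_rates)
    return recovered >= baseline * (1.0 + min_relative_increase)


if __name__ == "__main__":
    # Hypothetical sucks-per-minute values; not data from any study.
    print(shows_renewed_interest([30, 31, 29], [40, 42, 41]))  # rate recovers
    print(shows_renewed_interest([30, 31, 29], [31, 30, 32]))  # rate stays flat
```

A recovery in sucking rate is taken as evidence that the infant perceived the new stimulus as different from the background stimulus; no recovery leaves the question open.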

Cross-language and second-language speech perception


A large amount of research has studied how users of a language perceive foreign speech (referred to as cross-language speech perception) or second-language speech (second-language speech perception). The latter falls within the domain of second language acquisition.

Languages differ in their phonemic inventories. Naturally, this creates difficulties when a foreign language is encountered. For example, if two foreign-language sounds are assimilated to a single mother-tongue category the difference between them will be very difficult to discern. A classic example of this situation is the observation that Japanese learners of English will have problems with identifying or distinguishing English liquid consonants /l/ and /r/.[29]

Best (1995) proposed a Perceptual Assimilation Model which describes possible cross-language category assimilation patterns and predicts their consequences.[30] Flege (1995) formulated a Speech Learning Model which combines several hypotheses about second-language (L2) speech acquisition and which predicts, in simple words, that an L2 sound that is not too similar to a native-language (L1) sound will be easier to acquire than an L2 sound that is relatively similar to an L1 sound (because it will be perceived as more obviously "different" by the learner).[31][32]

Speech perception in language or hearing impairment

Research into how people with language or hearing impairment perceive speech is not only intended to discover possible treatments; it can also provide insight into the principles that underlie non-impaired speech perception. Two areas of research serve as examples:

Listeners with aphasia. Aphasia affects both the expression and reception of language. The two most common types, expressive aphasia and receptive aphasia, both affect speech perception to some extent. Expressive aphasia causes moderate difficulties in language understanding; the effect of receptive aphasia on understanding is much more severe. It is generally agreed that people with aphasia suffer from perceptual deficits: they are usually unable to fully distinguish place of articulation and voicing.[33] For other features, the difficulties vary. It has not yet been established whether low-level speech-perception skills are affected in people with aphasia or whether their difficulties are caused by higher-level impairment alone.[33]

Listeners with cochlear implants. Cochlear implantation restores access to the acoustic signal in individuals with sensorineural hearing loss. The acoustic information conveyed by an implant is usually sufficient for implant users to recognize the speech of people they know, even without visual cues.[34] It is more difficult for cochlear implant users to understand unfamiliar speakers and sounds. The perceptual abilities of children who received an implant after the age of two are significantly better than those of people who were implanted in adulthood. A number of factors have been shown to influence perceptual performance, in particular the duration of deafness prior to implantation, age of onset of deafness, age at implantation (such age effects may be related to the critical period hypothesis) and the duration of implant use. There are differences between children with congenital and acquired deafness: postlingually deaf children have better results than the prelingually deaf and adapt to a cochlear implant faster.[34] In both children with cochlear implants and children with normal hearing, discrimination of vowels and voice onset time develops before the ability to discriminate place of articulation. Several months after implantation, children with cochlear implants are able to normalize speech perception.

Noise

One of the basic problems in the study of speech is how to deal with noise in the speech signal. The difficulty is illustrated by computer speech recognition systems, which can recognize speech well when they have been trained on a specific speaker's voice and are used under quiet conditions, but often do poorly in more realistic listening situations where humans understand speech without difficulty.

See also: King-Kopetzky syndrome and Auditory processing disorder

Music-Language Connection


Research into the relationship between music and cognition is an emerging field related to the study of speech perception. Originally it was theorized that the neural signals for music were processed in a specialized "module" in the right hemisphere of the brain, while the neural signals for language were processed by a similar "module" in the left hemisphere.[35] However, research using technologies such as fMRI has shown that two regions of the brain traditionally considered to process speech exclusively, Broca's and Wernicke's areas, also become active during musical activities such as listening to a sequence of musical chords.[35] Other studies, such as one performed by Marques et al. in 2006, showed that 8-year-olds who were given six months of musical training improved both in pitch detection performance and in electrophysiological measures when listening to an unknown foreign language.[36]

Conversely, some research has revealed that, rather than music affecting our perception of speech, our native speech can affect our perception of music. One example is the tritone paradox, in which a listener is presented with two computer-generated tones (such as C and F-sharp) that are half an octave (a tritone) apart and is then asked to determine whether the pitch of the sequence is descending or ascending. One such study, performed by Diana Deutsch, found that the listener's interpretation of ascending or descending pitch was influenced by the listener's language or dialect, showing variation between those raised in the south of England and those raised in California, and between those from Vietnam and native English speakers in California.[35] A second study, performed in 2006 on a group of English speakers and three groups of East Asian students at the University of Southern California, found that English speakers who had begun musical training at or before age 5 had an 8% chance of having perfect pitch. Among the East Asian students who were fluent in their native tone language, 92 percent had perfect pitch.[35]

Research methods


The methods used in speech perception research can be roughly divided into three groups: behavioral, computational, and, more recently, neurophysiological methods. Behavioral experiments are based on an active role of the participant, i.e. subjects are presented with stimuli and asked to make conscious decisions about them. This can take the form of an identification test, a discrimination test, a similarity rating, etc. These types of experiments help to provide a basic description of how listeners perceive and categorize speech sounds. Computational modeling has also been used to simulate how speech may be processed by the brain to produce the behaviors that are observed. Computer models have been used to address several questions in speech perception, including how the sound signal itself is processed to extract the acoustic cues used in speech, as well as how speech information is used for higher-level processes, such as word recognition.[37] Neurophysiological methods rely on information stemming from more direct and not necessarily conscious (pre-attentive) processes. Subjects are presented with speech stimuli in different types of tasks and the responses of the brain are measured. The brain itself can be more sensitive than it appears to be through behavioral responses. For example, a subject may not show sensitivity to the difference between two speech sounds in a discrimination test, yet brain responses may reveal sensitivity to that difference.[27] Methods used to measure neural responses to speech include event-related potentials, magnetoencephalography, and near-infrared spectroscopy. One important response used with event-related potentials is the mismatch negativity, which occurs when a speech stimulus is acoustically different from a stimulus that the subject heard previously.
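The mismatch negativity mentioned above is commonly inspected as a difference wave: the averaged response to the frequent "standard" stimulus is subtracted from the averaged response to the rare "deviant" stimulus. The Python sketch below illustrates that computation only; the epochs are made-up values in microvolts, and the function names are hypothetical rather than part of any EEG toolkit.

```python
def average(epochs):
    """Average equally long epochs sample by sample."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]

def difference_wave(standard_epochs, deviant_epochs):
    """Deviant average minus standard average, sample by sample."""
    standard = average(standard_epochs)
    deviant = average(deviant_epochs)
    return [d - s for d, s in zip(deviant, standard)]

# Hypothetical epochs: four samples each, values in microvolts.
standard_epochs = [[0.0, 0.5, 1.0, 0.5], [0.0, 0.7, 1.2, 0.3]]
deviant_epochs = [[0.0, 0.1, -1.0, 0.2], [0.0, -0.1, -1.4, 0.4]]

wave = difference_wave(standard_epochs, deviant_epochs)
peak_sample = wave.index(min(wave))  # sample with the most negative difference
print(peak_sample, round(wave[peak_sample], 2))
```

A negative deflection in the difference wave is what would be interpreted as a mismatch response; real analyses also involve filtering, baseline correction and statistics that are omitted here.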
Neurophysiological methods were introduced into speech perception research for several reasons. Behavioral responses may reflect late, conscious processes and be affected by other systems such as orthography, and thus they may mask speakers' ability to recognize sounds based on lower-level acoustic distributions.[38] Because participants do not need to take an active part in the test, even infants can be tested; this feature is crucial in research into acquisition processes. The possibility of observing low-level auditory processes independently of higher-level ones makes it possible to address long-standing theoretical issues, such as whether or not humans possess a specialized module for perceiving speech[39][40] or whether or not some complex acoustic invariance (see lack of invariance above) underlies the recognition of a speech sound.[41]

Theories

Research into speech perception (SP) has by no means explained every aspect of the processes involved, and much of what has been said about SP remains a matter of theory. Several theories have been devised to address some of the issues mentioned above as well as other open questions. Not all of them give satisfactory explanations of every problem; however, the research they inspired has yielded a lot of useful data.

Speech Mode Hypothesis


The Speech Mode Hypothesis is the idea that the perception of speech requires specialized mental processing.[42][43] It is an offshoot of Fodor's modularity theory (see Modularity of Mind) and posits a vertical processing mechanism in which a limited range of stimuli is processed by special-purpose, stimulus-specific areas of the brain.[43]

Two versions of the Speech Mode Hypothesis

Weak version: Listening to speech engages previous knowledge of language.[42]

Strong version: Listening to speech engages specialized speech mechanisms for perceiving speech.[42]

Three important experimental paradigms have evolved in the search for evidence for the Speech Mode Hypothesis: dichotic listening, categorical perception, and duplex perception.[42] Research in these paradigms suggests that there may not be a mode specific to speech, but rather one for auditory codes that require complicated auditory processing. It also seems that modularity is learned in perceptual systems.[42] Nevertheless, the evidence and counter-evidence for the Speech Mode Hypothesis remain unclear, and further research is needed.

Motor theory


Main article: Motor theory of speech perception

Some of the earliest work in the study of how humans perceive speech sounds was conducted by Alvin Liberman and his colleagues at Haskins Laboratories.[44] Using a speech synthesizer, they constructed speech sounds that varied in place of articulation along a continuum from /b/ to /d/ to /ɡ/. Listeners were asked to identify which sound they heard and to discriminate between two different sounds. The results of the experiment showed that listeners grouped sounds into discrete categories, even though the sounds they were hearing varied continuously. Based on these results, the researchers proposed the notion of categorical perception as a mechanism by which humans are able to identify speech sounds. More recent research using different tasks and methodologies suggests that listeners are highly sensitive to acoustic differences within a single phonetic category, contrary to a strict categorical account of speech perception.

In order to provide a theoretical account of the categorical perception data, Liberman and colleagues[45] worked out the motor theory of speech perception, in which "the complicated articulatory encoding was assumed to be decoded in the perception of speech by the same processes that are involved in production"[1] (this is referred to as analysis-by-synthesis). For instance, the English consonant /d/ may vary in its acoustic details across different phonetic contexts (see above), yet every /d/ perceived by a listener falls within one category (voiced alveolar plosive), because "linguistic representations are abstract, canonical, phonetic segments or the gestures that underlie these segments."[1] When describing units of perception, Liberman later abandoned articulatory movements in favor of the neural commands to the articulators[46] and, even later, intended articulatory gestures,[47] so that "the neural representation of the utterance that determines the speaker's production is the distal object the listener perceives".[47] The theory is closely related to the modularity hypothesis, which proposes the existence of a special-purpose module that is supposed to be innate and probably human-specific.
The theory has been criticized in terms of not being able to "provide an account of just how acoustic signals are translated into intended gestures"[48] by listeners. Furthermore, it is unclear how indexical information (e.g. talker-identity) is encoded/decoded along with linguistically relevant information.
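The categorical perception pattern described in this section, continuous stimuli but discrete labelling, can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the seven-step continuum, the logistic shape of the identification function, and its boundary and slope are hypothetical and do not reproduce the Haskins data.

```python
from math import exp

def identification(step, boundary=4.5, slope=2.0):
    """Hypothetical probability of labelling a continuum step as the
    second category, modelled as a logistic function of the step."""
    return 1.0 / (1.0 + exp(-slope * (step - boundary)))

steps = list(range(1, 8))  # hypothetical 7-step synthetic continuum
labelling = [identification(s) for s in steps]

# Difference in labelling between neighbouring steps: under a strictly
# categorical account, pairs with a larger difference are easier to discriminate.
differences = [abs(labelling[i + 1] - labelling[i]) for i in range(len(steps) - 1)]
peak = differences.index(max(differences))
print("sharpest change between steps", steps[peak], "and", steps[peak + 1])
```

Although the steps are equally spaced, the labelling changes abruptly around the category boundary and barely changes elsewhere, which is the signature of grouping into discrete categories. The more recent findings of within-category sensitivity mentioned above are not captured by this simple sketch.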

Direct realist theory


The direct realist theory of speech perception (mostly associated with Carol Fowler) is a part of the more general theory of direct realism, which postulates that perception allows us to have direct awareness of the world because it involves direct recovery of the distal source of the event that is perceived. For speech perception, the theory asserts that the objects of perception are actual vocal tract movements, or gestures, and not abstract phonemes or (as in the Motor Theory) events that are causally antecedent to these movements, i.e. intended gestures. Listeners perceive gestures not by means of a specialized decoder (as in the Motor Theory) but because information in the acoustic signal specifies the gestures that form it.[49] By claiming that the actual articulatory gestures that produce different speech sounds are themselves the units of speech perception, the theory bypasses the problem of lack of invariance.

Fuzzy-logical model

The fuzzy logical theory of speech perception developed by Dominic Massaro[50] proposes that people remember speech sounds in a probabilistic, or graded, way. It suggests that people remember descriptions of the perceptual units of language, called prototypes. Within each prototype various features may combine. However, features are not simply binary (true or false); instead, each has a fuzzy value corresponding to how likely it is that a sound belongs to a particular speech category. Thus, when we perceive a speech signal, our decision about what we actually hear is based on the relative goodness of the match between the stimulus information and the values of particular prototypes. The final decision is based on multiple features or sources of information, including visual information (this explains the McGurk effect).[48] Computer models of the fuzzy logical theory have been used to demonstrate that the theory's predictions of how speech sounds are categorized correspond to the behavior of human listeners.[51]
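The "relative goodness of the match" can be sketched as follows, assuming the common formulation in which the fuzzy values supporting a prototype are multiplied together and the products are then normalised across prototypes. The function name and all numbers are hypothetical and chosen only to illustrate a McGurk-like case.

```python
from math import prod

def relative_goodness(support):
    """Multiply the fuzzy feature values for each prototype, then
    normalise the products so they sum to 1 across prototypes."""
    goodness = {label: prod(values) for label, values in support.items()}
    total = sum(goodness.values())
    if total == 0:
        raise ValueError("no prototype receives any support")
    return {label: g / total for label, g in goodness.items()}

# Hypothetical fuzzy values in [0, 1]: (auditory support, visual support).
support = {
    "ba": (0.9, 0.1),  # sounds like "ba" but does not look like it
    "da": (0.6, 0.7),  # moderately supported by both sources
    "ga": (0.1, 0.9),  # looks like "ga" but does not sound like it
}

probabilities = relative_goodness(support)
best = max(probabilities, key=probabilities.get)
print(best, round(probabilities[best], 2))
```

With these made-up values the prototype that is moderately supported by both sources wins over either prototype that is strongly supported by only one, which is the kind of audiovisual integration the McGurk effect is taken to show.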

Acoustic landmarks and distinctive features


Main article: Acoustic landmarks and distinctive features

In addition to the proposals of Motor Theory and Direct Realism about the relation between phonological features and articulatory gestures, Kenneth N. Stevens proposed another kind of relation: between phonological features and auditory properties. According to this view, listeners inspect the incoming signal for so-called acoustic landmarks, which are particular events in the spectrum carrying information about the gestures that produced them. Since these gestures are limited by the capacities of the human articulators and listeners are sensitive to their auditory correlates, the lack of invariance simply does not exist in this model. The acoustic properties of the landmarks constitute the basis for establishing the distinctive features. Bundles of them uniquely specify phonetic segments (phonemes, syllables, words).[52]

Exemplar theory


Exemplar models of speech perception differ from the four theories mentioned above, which suppose that there is no connection between word recognition and talker recognition and that the variation across talkers is "noise" to be filtered out. The exemplar-based approaches claim that listeners store information for both word recognition and talker recognition. According to this theory, particular instances of speech sounds are stored in the memory of a listener. In the process of speech perception, the remembered instances of, for example, a syllable are compared with the incoming stimulus so that the stimulus can be categorized. Similarly, when a talker is being recognized, all the memory traces of utterances produced by that talker are activated and the talker's identity is determined. Supporting this theory are several experiments reported by Johnson[13] which suggest that our signal identification is more accurate when we are familiar with the talker or when we have a visual representation of the talker's gender. When the talker is unpredictable or the sex is misidentified, the error rate in word identification is much higher. The exemplar models have to face several objections, two of which are (1) insufficient memory capacity to store every utterance ever heard and, concerning the ability to produce what was heard, (2) whether the talker's own articulatory gestures are also stored or are computed when producing utterances that would sound like the auditory memories.
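The comparison of an incoming stimulus with stored instances can be sketched in Python as a similarity-weighted vote. This is a minimal illustration, not the model of any particular author: the exponential similarity function is a common choice in exemplar models of categorization but is an assumption here, and the two-dimensional feature values are hypothetical.

```python
from math import exp

def categorize(stimulus, exemplars, sensitivity=1.0):
    """Sum the similarity of the stimulus to every stored exemplar of each
    label and return the most activated label with all activations."""
    activation = {}
    for label, features in exemplars:
        distance = sum(abs(a - b) for a, b in zip(stimulus, features))
        similarity = exp(-sensitivity * distance)  # assumed similarity function
        activation[label] = activation.get(label, 0.0) + similarity
    return max(activation, key=activation.get), activation

# Hypothetical stored instances: (label, two acoustic measurements in kHz).
memory = [
    ("i", (0.28, 2.30)),
    ("i", (0.31, 2.20)),
    ("a", (0.75, 1.20)),
    ("a", (0.70, 1.10)),
]

label, activation = categorize((0.30, 2.25), memory)
print(label)
```

The same scheme could be keyed by talker instead of by speech sound to mimic talker recognition. Note that it stores every instance, which is exactly the memory-capacity objection raised above.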

See also


Related to the case study of Genie (feral child)
Neurocomputational speech processing
Speech-Language Pathology

References
1. Nygaard, L.C., Pisoni, D.B. (1995). "Speech Perception: New Directions in Research and Theory". In J.L. Miller, P.D. Eimas. Handbook of Perception and Cognition: Speech, Language, and Communication. San Diego: Academic Press.
2. Klatt, D.H. (1976). "Linguistic uses of segmental duration in English: Acoustic and perceptual evidence". Journal of the Acoustical Society of America 59 (5): 1208-1221. DOI:10.1121/1.380986. PMID 956516.
3. Halle, M., Mohanan, K.P. (1985). "Segmental phonology of modern English". Linguistic Inquiry 16 (1): 57-116.
4. Liberman, A.M. (1957). "Some results of research on speech perception" (PDF). Journal of the Acoustical Society of America 29 (1): 117-123. DOI:10.1121/1.1908635. http://www.haskins.yale.edu/Reprints/HL0016.pdf. Retrieved 2007-05-17.
5. Fowler, C.A. (1995). "Speech production". In J.L. Miller, P.D. Eimas. Handbook of Perception and Cognition: Speech, Language, and Communication. San Diego: Academic Press.
6. Hillenbrand, J.M., Clark, M.J., Nearey, T.M. (2001). "Effects of consonant environment on vowel formant patterns". Journal of the Acoustical Society of America 109 (2): 748-763. DOI:10.1121/1.1337959. PMID 11248979.
7. Lisker, L., Abramson, A.S. (1967). "Some effects of context on voice onset time in English plosives" (PDF). Language and Speech 10 (1): 1-28. PMID 6044530. http://www.haskins.yale.edu/Reprints/HL0067.pdf. Retrieved 2007-05-17.
8. Hillenbrand, J., Getty, L.A., Clark, M.J., Wheeler, K. (1995). "Acoustic characteristics of American English vowels". Journal of the Acoustical Society of America 97 (5 Pt 1): 3099-3111. DOI:10.1121/1.411872. PMID 7759650.
9. Houston, Derek M.; Jusczyk, Peter W. (October 2000). "The role of talker-specific information in word segmentation by infants". Journal of Experimental Psychology: Human Perception and Performance 26 (5): 1570-1582. DOI:10.1037/0096-1523.26.5.1570. http://babytalk.iupui.edu/pdfs/HoustonJusczyk_2000.pdf. Retrieved 1 March 2012.
10. Hay, Jennifer; Drager, Katie (2010). "Stuffed toys and speech perception". Linguistics 48 (4): 865-892. DOI:10.1515/LING.2010.027.
11. Syrdal, A.K., Gopal, H.S. (1986). "A perceptual model of vowel recognition based on the auditory representation of American English vowels". Journal of the Acoustical Society of America 79 (4): 1086-1100. DOI:10.1121/1.393381. PMID 3700864.
12. Strange, W. (1999). "Perception of vowels: Dynamic constancy". In J.M. Pickett. The Acoustics of Speech Communication: Fundamentals, Speech Perception Theory, and Technology. Needham Heights (MA): Allyn & Bacon.
13. Johnson, K. (2005). "Speaker Normalization in speech perception". In Pisoni, D.B., Remez, R. The Handbook of Speech Perception. Oxford: Blackwell Publishers. http://corpus.linguistics.berkeley.edu/~kjohnson/papers/revised_chapter.pdf. Retrieved 2007-05-17.
14. Trubetzkoy, Nikolay S. (1969). Principles of phonology. Berkeley and Los Angeles: University of California Press. ISBN 0-520-01535-5.
15. Iverson, P., Kuhl, P.K. (1995). "Mapping the perceptual magnet effect for speech using signal detection theory and multidimensional scaling". Journal of the Acoustical Society of America 97 (1): 553-562. DOI:10.1121/1.412280. PMID 7860832.
16. Lisker, L., Abramson, A.S. (1970). "The voicing dimension: Some experiments in comparative phonetics" (PDF). Proc. 6th International Congress of Phonetic Sciences. Prague: Academia. pp. 563-567. http://www.haskins.yale.edu/Reprints/HL0087.pdf. Retrieved 2007-05-17.
17. Warren, R.M. (1970). "Restoration of missing speech sounds". Science 167 (3917): 392-393. DOI:10.1126/science.167.3917.392. PMID 5409744.
18. Garnes, S., Bond, Z.S. (1976). "The relationship between acoustic information and semantic expectation". Phonologica 1976. Innsbruck. pp. 285-293.
19. Poeppel, David; Philip J. Monahan (2008). "Speech Perception: Cognitive Foundations and Cortical Implementation". Current Directions in Psychological Science 17 (2): 80-85.
20. Hickok, Gregory; David Poeppel (May 2007). "The cortical organization of speech processing". Nature Reviews Neuroscience 8: 393-402.
21. Hessler, Dorte; Jonkers, Bastiaanse (December 2010). "The influence of phonetic dimensions on aphasic speech perception". Clinical Linguistics and Phonetics 24 (12): 980-996.
22. Hessler, Dorte; Jonkers, Bastiaanse (December 2010). "The influence of phonetic dimensions on aphasic speech perception". Clinical Linguistics and Phonetics 24 (12): 980-996.
23. Hessler, Dorte; Jonkers, Bastiaanse (December 2010). "The influence of phonetic dimensions on aphasic speech perception". Clinical Linguistics and Phonetics 24 (12): 980-996.
24. Frost, Eleanor; Tripoliti, Hariz, Pring, Limousin (2010). "Self-perception of speech changes in patients with Parkinson's disease following deep brain stimulation of the subthalamic nucleus". International Journal of Speech-Language Pathology 12 (5): 399-404.
25. Hessler, Dorte; Jonkers, Bastiaanse (December 2010). "The influence of phonetic dimensions on aphasic speech perception". Clinical Linguistics and Phonetics 24 (12): 980-996.
26. Frost, Eleanor; Tripoliti, Hariz, Pring, Limousin (2010). "Self-perception of speech changes in patients with Parkinson's disease following deep brain stimulation of the subthalamic nucleus". International Journal of Speech-Language Pathology 12 (5): 399-404.
27. Minagawa-Kawai, Y., Mori, K., Naoi, N., Kojima, S. (2007). "Neural Attunement Processes in Infants during the Acquisition of a Language-Specific Phonemic Contrast". The Journal of Neuroscience 27 (2): 315-321. DOI:10.1523/JNEUROSCI.1984-06.2007. PMID 17215392.
28. Crystal, David (2005). The Cambridge Encyclopedia of Language. Cambridge: CUP. ISBN 0-521-55967-7.
