Vowel Duration Preceding Voiced and Voiceless Stops
Duration of the vowel nucleus, including onset and offset formant transitions
VOWEL DURATION
Cue: VD
Removal of the closure interval & release burst forces the listener to rely on available cues that are present in the vocalic segments.
A short vocalic interval makes a long closure interval seem even longer, and thereby favors perception of a final consonant as voiceless.
This is analogous to the case in vision, where a gray stimulus appears darker when viewed adjacent to a white stimulus than when viewed adjacent to a black one.
Vocalic duration thus cues voicing by enhancing the saliency of the closure duration cue: in audition, a long closure segment seems even longer when heard in the context of a short vocalic segment.
Because the contrast hypothesis is based upon general principles of perception, it predicts that the capacity for vowel duration to modify perception of closure duration (and thus the ability to use vocalic duration as a voicing cue) should be universal.
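The contrast idea described above can be made concrete with a small sketch. Expressing the effect as a ratio of closure duration to vocalic duration is an assumption made here for illustration, as are the function name and the durations; none of them come from the cited studies.

```python
def closure_to_vocalic_ratio(closure_ms: float, vocalic_ms: float) -> float:
    """Return closure duration relative to the preceding vocalic duration.

    Hypothetical illustration of durational contrast, not a fitted model.
    """
    if closure_ms <= 0 or vocalic_ms <= 0:
        raise ValueError("durations must be positive")
    return closure_ms / vocalic_ms


# The same closure (an assumed 80 ms) after a short and a long vocalic interval.
short_context = closure_to_vocalic_ratio(80.0, 150.0)
long_context = closure_to_vocalic_ratio(80.0, 300.0)

# The closure is relatively longer in the short-vowel context, which under
# the contrast account favors a voiceless percept.
print(short_context > long_context)
```

Under this reading, shortening the vowel raises the relative prominence of an unchanged closure, which is the direction of effect the contrast hypothesis predicts.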
For English, durational differences between vowels and vocalic nuclei preceding voiced vs. voiceless stops are large compared to many other languages.
Reason
Lack of experience with final stop consonants in Japanese and Mandarin. Neither Japanese nor Mandarin allows CVCs ending in stops. Therefore, native speakers of these languages make less extensive use of VD as a cue before the stop.
The long versus short vowel contrast may have improved their performance. Mandarin: no phonemically long and short vowels.
[Figure: panels (a) to (c), one per vowel context: //, /I/, /i/]
The range of durations in panel (b) is shorter than those in panels (a) and (c) because of intrinsic duration differences between short and long vowels.
VD - Sufficient cue
Several studies in which preceding vowel duration was varied found it to be a sufficient cue to the voicing distinction (Krause, 1982; O'Kane, 1978; Raphael, 1972; Raphael, Dorman, & Liberman, 1980; Raphael et al., 1975).
Raphael (1972)
Used the Pattern Playback to synthesize syllables ending in voiced and voiceless stops
Final consonants
Voiceless: preceded by short-duration vowels.
Voiced: preceded by long-duration vowels.
After recording the voiced series, each of the voiced sounds was converted to a voiceless stimulus by eliminating the final 50-msec F1 transition. This produces the bottom stimulus.
Conclusion
Preceding vowel duration was both a necessary & sufficient cue to syllable-final voicing. Similar findings were also reported in two subsequent synthesis studies by Raphael et al. (1975, 1980).
Studies which do not support vowel duration
Experimental evidence supporting VD consists primarily of a series of synthetic speech studies by Raphael & his colleagues (Raphael, 1972; Raphael et al., 1975, 1980).
2nd Cue
F1 Onset, Steady state and Offset
Why do we need to study different vowel contexts?
Because temporal and spectral properties may be weighted differently as a function of steady-state level, it is important to examine the properties in different vowel contexts (Hillenbrand et al., 1984; Summers, 1988).
Summers, 1988
F1 structure provides information for final-consonant voicing
F1 offset values may play a role in voicing perception only in some vowel contexts.
He examined whether differences in F1 frequency in the initial-transition and steady-state portions of preceding vowels provide perceptual information about postvocalic voicing. Stimuli were produced with cascade formant synthesis software (Klatt, 1980).
Summers 1988
Varied the steady-state F1 value of the vowel (, ) in /bVb/ & /bVp/ syllables & found that a lower steady-state F1 yielded more final [+voice] identification responses. This perceptual experiment followed an earlier production study (Summers, 1987) showing that, before [+voice] consonants, F1 is lower throughout most of the vowel.
Both vowels examined contained high F1 frequencies.
The results do not support F1 offset frequency as a voicing cue: the LH versus LL comparison, in which offset frequency differences were present, did not yield statistically significant results.
Low F1 offset frequency: voiced. High F1 offset frequency: voiceless.
Steady-state information may outweigh the onset cues for conveying final voicing. Because the steady state is longer, it conveys more information.
Learning about
Combination of frequency and intensity characteristics associated with gradual versus abrupt termination of the preceding vowel
Offset characteristics of vowels (the F1 transition) preceding final stop consonants are important to the perception of voicing (Parker, 1974; Walsh & Parker, 1981, 1983; Walsh, Parker, & Miller, 1987).
Voiced
Voiceless
Reason
Perceptual use of F1 may be less affected by native language experience, or relatively easy to learn. F1 offset may be a universal cue. Alternatively, it may be a language-specific cue that is more easily learned by non-native speakers. (Studies involving very inexperienced subjects will be needed to verify this.)
Synthetic CVC stimuli. They found that both VD and the frequency of F1 offset affected listeners' perception of voicing class.
[Figure: vowel contexts //, /I/, /i/, //; F1 values labeled 650 Hz, 600 Hz, 200 Hz, 400 Hz]
Vowel context
The influence of the F1 transition-offset frequency on voicing perception appeared to be related to vowel context, specifically, the frequency of F1 steady-state value for a particular vowel.
Production constraints
Production constraints restrict the extent of frequency change in vowel transition offset relative to vowel steady state for /i/ compared to // and /I/.
In English the extent of the F1 transition is greater for the voiced than for the unvoiced stop consonants (Liberman, Delattre and Cooper, 1959)
Contrary to the results of Walsh et al. (1987), Fischer and Ohde found that the fastest transition rate of 10 Hz/ms did not necessarily elicit the perception of voiced sounds.
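A transition rate in Hz/ms, as quoted above, can be read as the F1 change between steady state and offset divided by the transition duration. The sketch below shows that arithmetic only; the function name and the example frequencies and duration are illustrative assumptions, not stimulus values from the studies.

```python
def f1_transition_rate(steady_hz: float, offset_hz: float, duration_ms: float) -> float:
    """Average rate of F1 change (Hz/ms) from vowel steady state to vowel offset."""
    if duration_ms <= 0:
        raise ValueError("transition duration must be positive")
    return abs(steady_hz - offset_hz) / duration_ms


# Assumed example: F1 falls from 700 Hz to a 200 Hz offset over 50 ms.
rate = f1_transition_rate(steady_hz=700.0, offset_hz=200.0, duration_ms=50.0)
print(rate)  # 10.0
```

For a fixed offset frequency, a vowel with a higher F1 steady state needs either a longer transition or a faster rate to reach it, which is one reason transition rate and vowel context are hard to separate.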
2ND STUDY
The following figure shows the mean rating of final consonant voicing as a function of vowel duration and the change in rate of the F1 offset transition.
Low vowel //: high-F1 steady-state value.
High vowel /i/: low-F1 steady-state value.
// 500 Hz F1 offset
// 200 Hz F1 offset
For the // continua with the 500-Hz F1 offset frequency, transition rate did not significantly influence voicing perception at any vowel duration.
The other two vowel continua, both with lower F1 offset frequency values of 200 Hz, demonstrated some influence of transition rate for syllables in the range of 200-300 ms as previously reported by Walsh et al. (1987)
Stimuli with the fastest transition rate generally elicited the lowest voicing ratings.
Krause (1982a)
Krause (1982a) examined development of vowel length as a cue to phonological voicing in post-vocalic stops among children. She synthesized three monosyllabic spectral configurations to represent the pairs bip/bib, pot/pod, and back/bag. Pot/pod & back/bag contained low F1 transitions. Listeners: 3- and 6-year-old children, along with adults. Result: as age increased, shorter VDs were sufficient to change perception from voiceless (VL) to voiced (V).
One group always labeled back/bag stimuli as bag. A second group labeled bip/bib stimuli (level F1 transition) as bip. Krause concluded that, for some children, the presence of an F1 offset transition may always cue a voiced consonant and the absence of an F1 transition may always cue a VL stop, independent of the VD.
VOWEL TILT
The onset spectrum of the stop changes relative to the following vowel
Vowel tilts that are shallower (more positive) than the consonant onset tilt are expected to result in more labial responses.
Vowel tilts that are steeper (more negative) than the consonant onset tilt are expected to result in more alveolar responses.
The 6 dB/oct consonant onset tilt diverged to different vowel tilts.
Twenty-one subjects identified each of a series of 40 CVs as /ba/ or /da/. The CVs varied along both F2-onset frequency (in eight steps) and vowel tilt.
Short-term spectra for the first four pitch pulses for the stimuli.
Stimuli with an F2-onset frequency of 1400 Hz. The first panel elicits more labial responses and the second panel more alveolar responses, despite identical stimulus onset spectra, because the change in tilt is different. For the first panel, tilt becomes shallower until it reaches a flat spectrum that is sustained for the duration of the vowel, whereas for the second panel, tilt becomes steeper until it reaches a steeply negative spectrum for the vowel.
The first panel shows a relative flattening of spectral tilt (6 to 0 dB/oct) from consonant onset (t = 0 ms) to vowel steady state (t = 30 ms). This pattern of change is predicted to increase the perception of a labial stop consonant. In contrast, the second panel is predicted to increase the perception of an alveolar stop consonant because spectral tilt becomes steeper over the course of the consonant transition.
Mean data for the experiment: the probability of responding /da/ as a function of F2-onset frequency is plotted separately for each vowel tilt. Maximum likelihood fits of the identification functions are displayed for the mean data at each vowel tilt as different lines (see the legend).
The experiment demonstrated the relative influence of spectral tilt change as a perceptual cue to stop consonant identification in a CV context without bursts.
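As a rough sketch of how tilt and tilt change can be quantified, the code below estimates spectral tilt as the least-squares slope of level (dB) against log2 frequency and then applies the directional prediction stated above. The regression method, the function names, and the spectra are assumptions for illustration. In particular, the onset tilt is taken here to be a falling slope of 6 dB/oct, which is an assumption about the sign.

```python
import math


def spectral_tilt_db_per_octave(freqs_hz, levels_db):
    """Least-squares slope of level (dB) against log2(frequency in Hz)."""
    if len(freqs_hz) != len(levels_db) or len(freqs_hz) < 2:
        raise ValueError("need at least two matching frequency/level points")
    xs = [math.log2(f) for f in freqs_hz]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(levels_db) / len(levels_db)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, levels_db))
    if sxx == 0:
        raise ValueError("frequencies must not all be equal")
    return sxy / sxx


def predicted_place_shift(onset_tilt, vowel_tilt):
    """Direction of the predicted response shift from the change in tilt."""
    change = vowel_tilt - onset_tilt
    if change > 0:
        return "labial"    # tilt becomes shallower (more positive)
    if change < 0:
        return "alveolar"  # tilt becomes steeper (more negative)
    return "none"


# Made-up spectra sampled at octave-spaced frequencies.
freqs = [500.0, 1000.0, 2000.0, 4000.0]
onset_tilt = spectral_tilt_db_per_octave(freqs, [0.0, -6.0, -12.0, -18.0])
flat_vowel_tilt = spectral_tilt_db_per_octave(freqs, [0.0, 0.0, 0.0, 0.0])
steep_vowel_tilt = spectral_tilt_db_per_octave(freqs, [0.0, -12.0, -24.0, -36.0])

print(predicted_place_shift(onset_tilt, flat_vowel_tilt))
print(predicted_place_shift(onset_tilt, steep_vowel_tilt))
```

The point of the sketch is that the prediction depends on the change in tilt relative to the onset, not on the onset spectrum itself, so two stimuli with identical onsets can be pushed in opposite directions.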
FORMANT TRANSITION
Revoile (1981)
1ST STUDY
Studied transition-switched and transition-deleted stimuli. Switching the vowel transition (VT) resulted in listeners perceiving the voicing characteristics of the following stop to be those of the stop in the syllable in which the vowel transition was produced.
Deletion of the VT impaired the overall identification of voicing in the following stop.
The acoustic shape of the formant transition varies as a function of the following vowel. Therefore, the vowel-formant transitions are necessarily context-dependent cues for stop consonants.
The results indicate that:
With the exception of voiceless stops identified from forward CV transitions, consonants were identified considerably better than chance from CV and VC vowel transitions.
More correct identifications of consonants were made from VC transitions than from CV transitions in both the forward and backward play conditions.
Backward-play CV transitions produced much higher identification scores than those played forward.
Listeners derive more information from transitions when they are pre-consonantal than when they are post-consonantal.
Task: Subjects (17) had to check the appropriate consonant on a list provided.
They studied:
1) Transition position (CV/VC)
2) Voiced / Voiceless
3) Place (alveolar / labial / velar)
Neutral vowel // was used with each stimulus.
Aspiration of the stop release was removed.
Subtests
CV transitions: voiced stops & voiceless stops
VC transitions: voiced stops & voiceless stops
Place of production
Labial consonants: Transitions from the neutral vowel are neutral or negative; that is, the second formant is not higher in frequency than the steady-state portion of the vowel.
Palatal & alveolar: Transitions from the neutral vowel are positive; that is, the second formant frequency increases relative to the vowel steady state (SS).
The negative portion of the formant transition produces fewer confusions than the positive inflections, since the former is indicative of only the labial position.
The discrimination of vowel duration by infants
Rebecca E. Eilers, Dale H. Bull, D. Kimbrough Oller, and Diana C. Lewis
INFANT PERCEPTION
VD perception in infancy
Three groups of nine 5- to 11-month-old infants provided evidence of discrimination of speech-like stimuli differing only in vowel duration. Ease of discrimination was directly related to the magnitude of the ratio of the longer to shorter vowel.
Infants were tested for discrimination of synthetic stimuli differing by 100-, 200-, and 300- ms vowel duration in one-, two-, and three- syllable contexts and on a final position stop voicing contrast cued by voice excitation only.
In all cases, the contrasting durations were carried by the last vowel of the synthetic word.
Group one: Discriminated three vowel duration contrasts (with ratios of 0.33, 0.67, and 1.0) embedded in a synthetic /mad/ syllable.
Group two: Discriminated these same duration contrasts within the bisyllable /samad/.
Group three: Discriminated them in the trisyllable /masamad/.
House and Fairbanks (1953) showed that voicing of final consonants in English is cued by a 69% increase in VD preceding a voiced consonant. Since infants showed fairly good discrimination of VD increments of 67%, they may be able to make phonemic discriminations of final consonant voicing in English.
Conclusion: The dominant cue for final consonant voicing is the relative duration of the pre-consonantal vowel.
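The comparison between the infant stimuli and the 69% figure can be checked with simple arithmetic. The sketch below assumes a 300 ms base vowel, which is inferred from the 100, 200, and 300 ms differences and the 0.33, 0.67, and 1.0 ratios quoted earlier and is not stated in the source; the function and variable names are hypothetical.

```python
def percent_increase(short_ms: float, long_ms: float) -> float:
    """Increase of the longer vowel over the shorter one, in percent."""
    if short_ms <= 0:
        raise ValueError("short duration must be positive")
    return 100.0 * (long_ms - short_ms) / short_ms


BASE_MS = 300.0                        # assumed base vowel duration
INCREMENTS_MS = (100.0, 200.0, 300.0)  # duration differences named in the slides
ENGLISH_INCREASE = 69.0                # House and Fairbanks (1953), as cited

percents = [percent_increase(BASE_MS, BASE_MS + inc) for inc in INCREMENTS_MS]

# Which tested increment is closest to the English production difference?
closest_increment = min(
    INCREMENTS_MS,
    key=lambda inc: abs(percent_increase(BASE_MS, BASE_MS + inc) - ENGLISH_INCREASE),
)

print([round(p) for p in percents])
print(closest_increment)
```

On these assumptions the middle contrast (a 200 ms difference) is the one that sits nearest the English production difference, which is the step the argument relies on.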
These same three infant groups failed to provide evidence of discrimination of a final-position released stop consonant contrast (/mat/ versus /mad/) cued by voice excitation during closure of the /d/ and not the /t/.
Thank You
Tense vowels are longer than lax vowels, and low vowels are longer than high vowels.
http://dspace.vidyanidhi.org.in:8080/dspace/bitstream/2009/1389/7/UOM-1996-041-6.pdf
Production differences
The vocal folds abduct sooner in the productions with voiceless final stops.
Organization of the whole word differs as a function of the voicing of the final stop. The jaw lowers faster & moves to more open positions in syllables with voiceless, rather than voiced, final stops (Gracco, 1994; Summers, 1987).
In words with voiceless final stops the jaw is also quicker to close (Summers, 1987).
The tongue is quicker to move away from its vowel-related posture (Raphael, 1975).
For words with voiceless final stops, laryngeal vibration ceases before vocal-tract closure is achieved, whereas laryngeal vibration continues into closure for words with voiced final stops.
All these articulatory differences create numerous acoustic differences between words with voiced and voiceless final stops: words with voiceless final stops have shorter vocalic portions than words with voiced final stops.
f0
A lower steady-state f0 and a lower offset f0 both increased voiced labeling responses.
[Figure labels: Higher F2, Lower F2, Vowel length, VD, Offset cues]
The stimuli included 10 English vowel contexts, 11 levels of F2 onset per vowel, and 3 levels of F3 onset, orthogonally varied with the F2 variables.
10 vowels x 11 F2 onsets x 3 F3 onsets = 330 stimuli
Vowels in order of ascending F2: /, o, a, , u, , , I, e, i/.
Task: Subjects were asked to identify each stimulus as most similar to b, d, g, w, or no consonant.
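The fully crossed design described above can be enumerated to confirm the stimulus count. This is a sketch of the bookkeeping only: the vowel labels are placeholders (V1 to V10) and the onset levels are plain indices, since the actual vowel symbols and formant frequencies are not recoverable here.

```python
from itertools import product

# Placeholder labels and level indices; real vowels and frequencies are not assumed.
VOWELS = [f"V{i}" for i in range(1, 11)]  # 10 vowel contexts
F2_ONSET_LEVELS = list(range(1, 12))      # 11 F2 onset levels per vowel
F3_ONSET_LEVELS = list(range(1, 4))       # 3 F3 onset levels

# Orthogonal variation means every combination of the three factors occurs once.
stimuli = [
    {"vowel": vowel, "f2_onset_level": f2, "f3_onset_level": f3}
    for vowel, f2, f3 in product(VOWELS, F2_ONSET_LEVELS, F3_ONSET_LEVELS)
]

print(len(stimuli))  # 330
```

Because the factors are crossed, each F3 level appears with every F2 onset in every vowel context, which is what allows F3 effects to be assessed separately from the F2 variables.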
Across tokens various parameters were kept constant (stimuli were burstless).
300 ms
F2 onset: frequency of the second formant transition sampled at its onset.
F2 vowel: frequency of the second formant sampled in the middle of the following vowel, for a coarticulated consonant.
The results strongly indicate that F2 onset and F2 vowel, in combination, serve as important cues for stop consonant place of articulation.
F3 carries most weight when the F2 vocalic transitional cues are not distinctive.
Effects of F3
There was no effect of the F3 condition on /b/ identification in back vowel contexts or on /g/ identification in front vowel contexts, while there were effects on /d/ versus /g/ identification in back vowel contexts and on /b/ versus /d/ identification in front vowel contexts.
Comparing patterns of F3
F3 effects appear in those regions in which there is an overlap between the different stop places of articulation, and not where there is none (back vowel /b/ and front vowel /g/).
There are tradeoff consequences between the overlapping stops in the region of their overlap: /d/ and /g/ overlap in back vowel space, /b/ and /d/ in front vowel space.
These tradeoffs are in the natural directions, with g-like F3 elevating /g/ versus /d/ identifications, & b-like F3 elevating /b/ versus /d/ identifications.
Results of discriminant analysis showing percent classification of place of articulation. Predictor variables were F2 onset and F2 vowel frequencies.
Results of discriminant analysis showing percent classification of place of articulation, across all vowel contexts. Predictor variables were F2 onset and F2 vowel frequencies. Sussman (1991)
Clearly, additional information (F3) is needed to discriminate /d/ versus /g/ in back vowel contexts.
How do transitions vary depending on the place of articulation and the vowel context?
Bilabials
Transitions are longer before unrounded than before rounded vowels.
Apical stops
The distance between the point of occlusion and vowel target configuration varies, so we can expect both devoiced and voiced transitions to be more effective cues to /d/ before back vowels, where transitions are relatively long, than before front vowels, where transitions are short.
Velars
The determining factor is the degree of similarity between the velar tongue constriction and that of the following vowel; in general, close vowels such as /i/ will have little transition and open vowels such as /a/ will have a marked transition.
The results of this study indicated a tendency toward reciprocal performance on bursts and transitions:
When transitions are brief (for /b/ before rounded vowels, for /d/ before high front vowels, and for /g/ before close vowels), the burst lies near the major spectral peak of the following vowel and contributes significantly to the perceptual outcome.
Where transitions are extensive (for /b/ before middle, unrounded vowels, and for /d/ before central-back vowels), the burst is distinct from the major spectral peak of the following vowel and contributes little to the perceptual outcome.