CS 551/651: Structure of Spoken Language. Spectrogram Reading: Stops. John-Paul Hosom, Fall 2010.


2 Stops/Plosives

There are six plosives (oral stops) in American English:

           bilabial  alveolar  velar
unvoiced |   /p/       /t/      /k/
voiced   |   /b/       /d/      /g/

plus the flap /dx/, which is a very short /t/ or /d/.

Plosives can be difficult to identify and discriminate; contextual cues can be varied.

Cue (1) is the formant transitions of neighboring vowels:
  for bilabials, F2 drops at the CV boundary
  for alveolars, F2 goes toward 1800 Hz at the CV boundary
  for velars, F2 may meet F3 (velar pinch) or be fairly flat

Cue (2) is that voiced plosives may have pre-voicing; this is more likely when the plosive is between two vowels.
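As a rough illustration of Cue (1), the sketch below turns the three F2 observations into a list of candidate places of articulation. Only the 1800 Hz alveolar target comes from the slide; the function name, its arguments, and all three tolerance values are assumptions chosen for illustration, and the result is a set of hypotheses rather than a decision, since the cues can overlap.

```python
ALVEOLAR_F2_TARGET_HZ = 1800.0  # from Cue (1)


def place_hints_from_f2(f2_mid_hz, f2_boundary_hz, f3_boundary_hz,
                        flat_tol_hz=100.0, locus_tol_hz=150.0,
                        pinch_gap_hz=300.0):
    """Return the places of articulation consistent with Cue (1).

    f2_mid_hz:      F2 in the steady part of the vowel
    f2_boundary_hz: F2 at the consonant-vowel boundary
    f3_boundary_hz: F3 at the consonant-vowel boundary
    The three tolerances are illustrative assumptions, not course values.
    """
    hints = []

    # Bilabial: F2 is clearly lower at the boundary than inside the vowel.
    if f2_boundary_hz < f2_mid_hz - flat_tol_hz:
        hints.append("bilabial")

    # Alveolar: F2 at the boundary sits near the 1800 Hz target.
    if abs(f2_boundary_hz - ALVEOLAR_F2_TARGET_HZ) <= locus_tol_hz:
        hints.append("alveolar")

    # Velar: F2 approaches F3 (velar pinch), or F2 is fairly flat.
    pinched = (f3_boundary_hz - f2_boundary_hz) <= pinch_gap_hz
    flat = abs(f2_boundary_hz - f2_mid_hz) <= flat_tol_hz
    if pinched or flat:
        hints.append("velar")

    return hints
```

More than one place can come back for the same measurements, which matches the slide's warning that plosives are hard to discriminate from any single cue.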

3 Stops/Plosives

Cue (3) is that voiced plosives usually have VOT of < 30 msec, but unvoiced plosives usually have VOT of > 50 msec.

Cue (4) is that the VOT is shortest for bilabials, longer for alveolars, and longest for velars (VOT /p/ < /t/ < /k/ and /b/ < /d/ < /g/).

Cue (5) is that aspirated (unvoiced) plosives show evidence of F2 and F3 during aspiration; voiced plosives usually don’t.

Cue (6) is the spectral shape; in theory, the shape of the spectrum at burst release can be used to distinguish plosives:
  /p/ and /b/ have energy low in frequency or weakly spread throughout the spectrum,
  /t/ and /d/ have more energy above 4 kHz (related to alveolar fricatives /s/ and /z/),
  /k/ and /g/ tend to have more well-defined peaks in the spectrum (near formant locations).
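The VOT thresholds in Cue (3) can be written as a small decision rule. This is a minimal sketch, assuming VOT has already been measured in milliseconds; the function name and the "ambiguous" label for the 30 to 50 msec gap are illustrative choices, since the slide only says what is usual on either side of that gap.

```python
def voicing_from_vot(vot_msec):
    """Guess plosive voicing from voice onset time, following Cue (3).

    Below 30 msec is usually voiced, above 50 msec is usually unvoiced.
    Values from 30 to 50 msec are left undecided (an assumption here).
    """
    if vot_msec < 30:
        return "voiced"
    if vot_msec > 50:
        return "unvoiced"
    return "ambiguous"
```

Because Cue (4) says VOT also varies with place of articulation, an "ambiguous" result is better resolved with the other cues than by moving the thresholds.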

4 Stops/Plosives

Other cues related to spectral shape:

Cue (7a): In the context of front vowels, /k/ and /g/ have a spectral peak just above F2 of the adjacent vowel, making them confusable with /t/ and /d/; but front vowels show more “velar pinch”.

Cue (7b): In the context of back vowels, /k/ and /g/ have one spectral peak between 1000 and 1500 Hz and a second peak between 3000 and 4500 Hz.

Cue (8): Velar bursts also sometimes display a “double burst”, or a second burst during the frication.

Cue (9): Post-vocalic consonants are often unreleased; they can be identified by (a) glottalization, (b) a sudden drop in vowel energy, or (c) formant movement at the end of the vowel.
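Cue (7b) gives two concrete frequency bands, so it can be checked mechanically once burst peaks have been picked. The sketch below assumes the peak frequencies are already available as a list in Hz; the function name and the peak-picking step it relies on are hypothetical and not part of the slides.

```python
LOW_BAND_HZ = (1000.0, 1500.0)   # first peak range from Cue (7b)
HIGH_BAND_HZ = (3000.0, 4500.0)  # second peak range from Cue (7b)


def matches_velar_back_vowel_pattern(burst_peaks_hz):
    """True if the burst has a peak in each band named in Cue (7b)."""
    has_low = any(LOW_BAND_HZ[0] <= p <= LOW_BAND_HZ[1]
                  for p in burst_peaks_hz)
    has_high = any(HIGH_BAND_HZ[0] <= p <= HIGH_BAND_HZ[1]
                   for p in burst_peaks_hz)
    return has_low and has_high
```

A match only says the burst is consistent with /k/ or /g/ before a back vowel; it does not rule out other plosives on its own.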

5 Stops/Plosives

Cue (10): When the plosive is unreleased, the voicing distinction is based more on the length of the preceding vowel; voiced plosives are associated with longer vowels, unvoiced plosives with shorter vowels.

Cue (11): In V1 C1 C2 V2 patterns, where both C are plosives, the existence of two plosives is indicated by the different formant transitions in V1 and V2, the longer duration of closure, and sometimes a brief “click” in the spectrum indicating a change in place of articulation.

Cue (12): Plosives have different characteristics in stressed vs. unstressed environments. VOT for unvoiced plosives before unstressed vowels is shorter than VOT for unvoiced plosives before stressed vowels; plosives in an unstressed-vowel environment are less spectrally clear; in unstressed syllables, /t/ and /d/ may be realized as a flap /dx/.

6 Stops/Plosives

Cue (13): Flaps have short duration (< 30 msec), a dip in energy levels between two vowels, weak F2 and F3, and F2 tends toward 1800 Hz.

Cue (14): Consonant clusters can provide restrictions; for 3-consonant clusters (beginning with /s/-plosive), the only valid combinations are:
  /s p l/, /s p r/, /s p y/
  /s t r/, /s t y/
  /s k l/, /s k r/, /s k y/, /s k w/

Cue (15): In /s/-plosive-vowel combinations, VOT tends to be shorter and the duration of /s/ shorter than normal.
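The cluster restriction in Cue (14) works as a lookup table: once the /s/ and the third consonant are identified, only some plosives remain possible. The sketch below encodes exactly the nine clusters listed on the slide; the names VALID_S_CLUSTERS, is_valid_s_cluster, and possible_plosives are illustrative and not from the course.

```python
# Third consonants allowed after /s/ + plosive, per Cue (14).
VALID_S_CLUSTERS = {
    "p": {"l", "r", "y"},
    "t": {"r", "y"},
    "k": {"l", "r", "y", "w"},
}


def is_valid_s_cluster(phones):
    """True if phones is one of the 3-consonant clusters in Cue (14)."""
    if len(phones) != 3 or phones[0] != "s":
        return False
    return phones[2] in VALID_S_CLUSTERS.get(phones[1], set())


def possible_plosives(third_consonant):
    """Plosives that may sit between /s/ and the given third consonant."""
    return sorted(plosive for plosive, thirds in VALID_S_CLUSTERS.items()
                  if third_consonant in thirds)
```

For example, if the segment after the plosive is clearly /w/, possible_plosives("w") leaves /k/ as the only candidate, while /r/ leaves all three plosives open.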

7 Plosives: Unvoiced Initial in Front-Vowel Context: /p iy t iy k iy/

8 Plosives: Voiced Initial in Front-Vowel Context: /b iy d iy g iy/

9 Plosives: Unvoiced Initial in Mid-Vowel Context: /p ah t ah k ah/

10 Plosives: Voiced Initial in Mid-Vowel Context: /b ah d ah g ah/

11 Plosives: Unvoiced Initial in Back-Vowel Context: /p aa t aa k aa/

12 Plosives: Voiced Initial in Back-Vowel Context: /b aa d aa g aa/

13 Plosives: Unvoiced Final in Front-Vowel Context: /iy p iy t iy k/

14 Plosives: Voiced Final in Front-Vowel Context: /iy b iy d iy g/

15 Plosives: Unvoiced Final in Mid-Vowel Context: /ah p ah t ah k/

16 Plosives: Voiced Final in Mid-Vowel Context: /ah b ah d ah g/

17 Plosives: Unvoiced Final in Back-Vowel Context: /aa p aa t aa k/

18 Plosives: Voiced Final in Back-Vowel Context: /aa b aa d aa g/

