1 CS 551/651: Structure of Spoken Language Lecture 4: Characteristics of Manner of Articulation John-Paul Hosom Fall 2008
2 Self-Study
If you want to look at spectrograms of your own voice, there are several programs available:
1. Matlab: use the “specgram” command; the color map can be changed using “colormap gray” or similar commands.
2. CSLU Toolkit: download from http://www.cslu.ogi.edu/toolkit (registration required, but free for educational use); plot spectrograms with the “SpeechView” tool.
3. Praat: download from http://www.fon.hum.uva.nl/praat/ (free and available for Windows, Linux, Macintosh, etc.).
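For readers without any of these tools at hand, the idea behind a spectrogram can be sketched in plain Python: cut the signal into short overlapping frames, window each frame, and take the magnitude of its discrete Fourier transform. This is a slow, minimal illustration and not how Matlab, the CSLU Toolkit, or Praat implement it; the function name, the frame and hop sizes, and the synthetic test tone are all assumptions made for the example.

```python
import cmath
import math

def spectrogram(samples, frame_len=64, hop=32):
    """Return a list of frames; each frame is a list of DFT magnitudes
    for bins 0 .. frame_len // 2."""
    # A Hamming window reduces spectral leakage at the frame edges.
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT, O(N^2) per frame: fine for a demonstration only.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames

# Synthetic input (an assumption): a 1000 Hz tone sampled at 8000 Hz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(512)]
spec = spectrogram(tone)

# Bin k corresponds to k * SAMPLE_RATE / frame_len Hz (125 Hz per bin here),
# so the tone should peak in bin 8.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / 64
```

Plotting each frame as a column, with darkness proportional to magnitude, gives the familiar time-frequency picture.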
6 Acoustic-Phonetic Features: Manner of Articulation
Approximately 8 manners of articulation:

Name         Sub-Types          Examples
Vowel        vowel, diphthong   aa, iy, uw, eh, ow, …
Approximant  liquid, glide      l, r, w, y
Nasal                           m, n, ng
Plosive      unvoiced, voiced   p, t, k, b, d, g
Fricative    unvoiced, voiced   f, th, s, sh, v, dh, z, zh
Affricate    unvoiced, voiced   ch, jh
Aspiration                      h
Flap                            dx, nx

Change in manner of articulation is usually abrupt and visible; manner provides much information about the location of phonemes.
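The manner classes listed above can be held in a small lookup table, which also makes the point about manner changes concrete: wherever the manner label changes between neighboring phonemes, a visible boundary is expected. The sketch below is only an illustration; the function names are hypothetical, and the vowel list contains only the examples given on the slide (the slide's list is open-ended).

```python
# Phoneme symbols and groupings copied from the slide.
MANNERS = {
    "vowel": ["aa", "iy", "uw", "eh", "ow"],  # slide's list continues ("…")
    "approximant": ["l", "r", "w", "y"],
    "nasal": ["m", "n", "ng"],
    "plosive": ["p", "t", "k", "b", "d", "g"],
    "fricative": ["f", "th", "s", "sh", "v", "dh", "z", "zh"],
    "affricate": ["ch", "jh"],
    "aspiration": ["h"],
    "flap": ["dx", "nx"],
}

# Invert the table: phoneme symbol -> manner name.
MANNER_OF = {p: manner for manner, plist in MANNERS.items() for p in plist}

def manner_sequence(phonemes):
    """Map each phoneme to its manner; symbols not in the table become 'unknown'."""
    return [MANNER_OF.get(p, "unknown") for p in phonemes]

def manner_boundaries(phonemes):
    """Return indices i where the manner differs between phonemes[i-1] and phonemes[i]."""
    manners = manner_sequence(phonemes)
    return [i for i in range(1, len(manners)) if manners[i] != manners[i - 1]]

# The utterance /f iy th iy s iy sh iy/ alternates fricative and vowel,
# so every phoneme boundary is also a manner boundary.
utterance = ["f", "iy", "th", "iy", "s", "iy", "sh", "iy"]
boundaries = manner_boundaries(utterance)
```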
7 Acoustic-Phonetic Features: Manner of Articulation Approximants (/l/, /r/, /w/, /y/): vowel-like properties, but more constriction; /l/ has tongue tip touching alveolar ridge; /r/ has tongue tip curled up/back (retroflex), raised and “bunched” dorsum, sides of tongue touching molars; /w/ has tongue back and lips rounded; /y/ has tongue toward front and very high. Glides (/w/, /y/) can be viewed as “extreme” productions of a vowel (sometimes called semivowels): /w/ corresponds to /uw/, and /y/ to /iy/.
8 Acoustic-Phonetic Features: Manner of Articulation Approximants (/l/, /r/, /w/, /y/): movement of tongue slower than other vowel-to-vowel or consonant-to-vowel transitions, but not as slow as diphthong movement; sometimes voiceless when following a voiceless plosive (“play”); /l/ may have slight discontinuity when tongue makes/breaks contact with alveolar ridge; other approximants have no discontinuity.
9 Acoustic-Phonetic Features: Manner of Articulation Nasal (/m/, /n/, /ng/): produced with velic port open and obstruction in vocal tract; sound travels through nasal cavities; these cavities filter speech with both poles (resonances) and zeros (anti-resonances); longer pathway causes primary resonance to be low (220-300 Hz); anti-resonances cause higher frequencies to have lower power.
[Figure: spectrum of /m/, with labels F1 to F6, P1, P2, Z1, Z2]
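How a pole and a zero shape a spectrum can be illustrated with a toy digital filter. The sketch below places one conjugate pole pair at 250 Hz (inside the 220-300 Hz range quoted above) and one conjugate zero pair at 1000 Hz, then evaluates the magnitude response at a few frequencies. The zero frequency, the radius, and the sample rate are assumptions chosen for illustration, not measured values for any nasal.

```python
import cmath
import math

SAMPLE_RATE = 8000.0  # Hz (assumed)
POLE_HZ = 250.0       # resonance, inside the 220-300 Hz range from the slide
ZERO_HZ = 1000.0      # hypothetical anti-resonance frequency
RADIUS = 0.97         # hypothetical; closer to 1 gives a sharper peak or dip

def conj_pair(freq_hz, radius):
    """Return a complex-conjugate pair of roots at the given frequency."""
    angle = 2 * math.pi * freq_hz / SAMPLE_RATE
    root = cmath.rect(radius, angle)
    return root, root.conjugate()

def magnitude_db(freq_hz):
    """Magnitude (dB) of H(z) = prod(z - zeros) / prod(z - poles) on the unit circle."""
    z = cmath.exp(2j * math.pi * freq_hz / SAMPLE_RATE)
    num = 1.0
    for zero in conj_pair(ZERO_HZ, RADIUS):
        num *= abs(z - zero)
    den = 1.0
    for pole in conj_pair(POLE_HZ, RADIUS):
        den *= abs(z - pole)
    return 20 * math.log10(num / den)

at_pole = magnitude_db(250.0)    # peak: response is boosted near the pole
at_zero = magnitude_db(1000.0)   # dip: response is suppressed near the zero
elsewhere = magnitude_db(2000.0) # away from both, the response is in between
```

With these settings the response peaks at the pole frequency and dips at the zero frequency, which is the qualitative pattern described for nasal spectra.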
10 Acoustic-Phonetic Features: Manner of Articulation Nasal (/m/, /n/, /ng/): formant structure obscured by pole-zero pairs; all three English nasals look and sound similar (place of articulation has little effect on spectrum) and can be distinguished primarily by coarticulatory effects on adjacent vowel(s); sometimes very brief duration (“camp”, “winner”); occasional confusion with /w/, /l/ (if F3 not visible), and closure portion of voiced plosives; often sharp discontinuity with adjacent vowel; adjacent vowel may be nasalized.
11 Acoustic-Phonetic Features: Manner of Articulation Plosive (Oral Stop) (/p/, /t/, /k/, /b/, /d/, /g/):
1. closure along vocal tract (lips, alveolar ridge, velum)
2. buildup of air pressure behind closure
3. release of closure
4. burst of air
5. possible aspiration following burst
Complex process, several changes over brief time span; some context-dependent attributes, some semi-invariant ones; voiced bursts sometimes have “voice bar” in low-frequency region, caused by vocal fold vibration with complete oral and velic closure; sometimes voice bar is excellent cue; sometimes can be confused with a nasal.
13 Acoustic-Phonetic Features: Manner of Articulation Plosive (Oral Stop) (/p/, /t/, /k/, /b/, /d/, /g/): closure and the time required to build pressure result in a “silence” region of spectrum prior to burst; burst airflow is a step function, which becomes similar to an impulse, which has equal energy at all frequencies; identity of a plosive is contained in (at least) three areas: (1) voice-onset time (VOT) / duration of aspiration, (2) formant transitions in neighboring vowels/approximants, (3) spectral shape of burst; “voiced” plosives may not show any real voicing (!)
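The claim that an impulse has equal energy at all frequencies can be checked numerically. The sketch below takes the DFT of a unit impulse and confirms that every bin has magnitude 1. It also shows one way a step can turn into an impulse, namely by taking a first difference; treating the step-to-impulse relation as differentiation is an assumption of this example, since the slide does not say which mechanism it has in mind.

```python
import cmath
import math

def dft_magnitudes(x):
    """Magnitude of each DFT bin of the sequence x (naive O(N^2) DFT)."""
    n_points = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_points)
                    for n in range(n_points)))
            for k in range(n_points)]

N = 32
ONSET = 5  # arbitrary position; shifting the impulse changes phase, not magnitude

# Unit impulse at sample ONSET.
impulse = [0.0] * N
impulse[ONSET] = 1.0

# Unit step that switches on at sample ONSET.
step = [0.0] * ONSET + [1.0] * (N - ONSET)

# First difference of the step (assumed stand-in for differentiation).
difference = [step[0]] + [step[n] - step[n - 1] for n in range(1, N)]

impulse_mags = dft_magnitudes(impulse)
is_flat = all(abs(m - 1.0) < 1e-9 for m in impulse_mags)
```

The flat magnitude spectrum is why a burst tends to appear as a thin vertical stripe spanning the frequency axis of a spectrogram.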
14 Acoustic-Phonetic Features: Manner of Articulation Fricative (/f/, /th/, /s/, /sh/, /v/, /dh/, /z/, /zh/): fricatives are produced by forcing air through a constriction in the mouth; constriction located anywhere from the labiodental region (/f/, /v/) to the palato-alveolar region (/sh/, /zh/); all English fricatives come in voiced and unvoiced varieties; voicing may not be present in voiced fricatives (!), making duration an important distinguishing cue (voiced shorter); the location and type of the constriction create spectral anti-resonances as well as resonances; the main difference between /s/ and /f/ is in frequencies above 4000 Hz, while telephone-band speech has a limit of 4 kHz.
15 Acoustic-Phonetic Features: Manner of Articulation Fricative (/f/, /th/, /s/, /sh/, /v/, /dh/, /z/, /zh/): Rules for distinguishing between /dh/ and /v/:
/dh/: formant structure is clearly visible OR frication is stronger at 5000 Hz and not so strong at low frequencies
/v/: formants not visible at location of maximum frication OR low-frequency energy is as strong as the energy at 5000 Hz
However, due to the difficulty of distinguishing /dh/ from /v/ and distinguishing /th/ from /f/, in the spectrogram reading exercises we will treat them as the same.
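The two rules above can be written down as a small decision function. Because each rule is an OR of two cues, the cues can disagree, so this sketch counts the cues on each side and reports "dh/v" when they split, which matches the decision to treat the two as the same in the exercises. The voting scheme, the function name, the dB inputs, and the 3 dB margin are all assumptions made for illustration and are not part of the original rules.

```python
def dh_or_v(formants_visible, energy_5khz_db, energy_low_db, margin_db=3.0):
    """Apply the /dh/ vs. /v/ cues as a simple vote.

    formants_visible: True if formant structure is visible at the point of
        maximum frication (a judgment made by the spectrogram reader).
    energy_5khz_db, energy_low_db: hypothetical energy measurements in dB
        near 5000 Hz and at low frequencies.
    margin_db: hypothetical threshold for "stronger at 5000 Hz".
    """
    stronger_at_5k = (energy_5khz_db - energy_low_db) > margin_db

    # Cues for /dh/: visible formants; frication stronger at 5000 Hz.
    dh_votes = int(formants_visible) + int(stronger_at_5k)
    # Cues for /v/: no visible formants; low-frequency energy roughly as
    # strong as (approximated here as "not clearly weaker than") 5000 Hz.
    v_votes = int(not formants_visible) + int(not stronger_at_5k)

    if dh_votes > v_votes:
        return "dh"
    if v_votes > dh_votes:
        return "v"
    return "dh/v"  # cues disagree: leave the pair merged

# Hypothetical measurements, for illustration only.
clear_dh = dh_or_v(True, -20.0, -35.0)
clear_v = dh_or_v(False, -30.0, -30.0)
mixed = dh_or_v(True, -30.0, -30.0)
```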
16 Acoustic-Phonetic Features: Manner of Articulation Affricate (/ch/, /jh/): affricates are conceptually like diphthongs: two separate phonemes considered as one. English has two affricates: /ch/ = /t sh/ and /jh/ = /d zh/. Sometimes the cue to an affricate is in the burst preceding the fricative, or in the closure between vowel and fricative. Sometimes the cue to an affricate is in voicing or duration.
17 Acoustic-Phonetic Features: Manner of Articulation Aspiration (/h/): like vowels, except usually no voicing; can usually see formant structure; formant patterns similar to surrounding vowel(s). Example: /ah h aw s/ = “a house”
18 Acoustic-Phonetic Features: Manner of Articulation Flaps (/dx/, /nx/): allophone of /t/, /d/, or /n/; very brief duration; no closure; /dx/ indicated by dip in energy and F2 near 1800 Hz. Example: “write another”
19 Spectrogram Reading: Fricatives: usually can divide fricatives into “strong” and “weak”: strong = /s/, /sh/, /z/, /zh/; weak = /f/, /v/, /th/, /dh/. Voicing may be present only in the transition into a voiced fricative, and sometimes not at all; voiced fricatives tend to be shorter than unvoiced, relative to the duration of the neighboring vowel; place of articulation causes some change in spectral shape: /sh/ and /zh/ have greater energy at lower frequency than /s/, /z/.
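The strong/weak grouping and the relative-duration cue can be sketched as follows. The grouping is taken directly from the slide; the helper names and the example durations are hypothetical, and the comparison only says which of two tokens looks more like a voiced fricative on the duration cue alone, not that either one is voiced.

```python
STRONG_FRICATIVES = {"s", "sh", "z", "zh"}
WEAK_FRICATIVES = {"f", "v", "th", "dh"}

def fricative_strength(phoneme):
    """Return 'strong' or 'weak' using the grouping from the slide."""
    if phoneme in STRONG_FRICATIVES:
        return "strong"
    if phoneme in WEAK_FRICATIVES:
        return "weak"
    raise ValueError(f"not a fricative in this table: {phoneme!r}")

def relative_duration(fricative_ms, vowel_ms):
    """Fricative duration as a fraction of the neighboring vowel's duration."""
    if vowel_ms <= 0:
        raise ValueError("vowel duration must be positive")
    return fricative_ms / vowel_ms

def more_likely_voiced(token_a, token_b):
    """Compare two (fricative_ms, vowel_ms) tokens on the duration cue only.

    Returns 'a' or 'b' for the token with the shorter relative fricative
    duration, or 'tie' if the ratios are equal.
    """
    ratio_a = relative_duration(*token_a)
    ratio_b = relative_duration(*token_b)
    if ratio_a < ratio_b:
        return "a"
    if ratio_b < ratio_a:
        return "b"
    return "tie"

# Hypothetical durations in milliseconds, for illustration only.
guess = more_likely_voiced((60, 180), (120, 150))
```

Normalizing by the neighboring vowel matters because absolute durations vary with speaking rate.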
20 Spectrogram Reading: Fricatives: /th/ sometimes has adjacent vowel’s F3, F4, F5 extend into /th/, in contrast with /f/; /th/ and /f/ often have weak energy during middle part of fricative; sometimes /f/ and /th/ are best distinguished by formant transitions of neighboring vowel(s)… more labial vs. more alveolar characteristics of transitions; sometimes /f/ has strong low-frequency energy (breath noise in a close-talking microphone); sometimes /th/ has more high-frequency energy above 4 kHz.
21 Spectrogram Reading: Fricatives /f iy th iy s iy sh iy/