Vowels (again) February 23, 2010
The News For Thursday: Give me a (one paragraph or so) description of what you’re thinking of doing for a term project. Also note: two new readings have been posted: 1. Peterson & Barney (1952) 2. Liljencrants & Lindblom (1972)
Fun Stuff Who is producing each of these vowels? (And which vowel are they producing?)
Source/Filter Lab Review Silke made predictions on the basis of her formant values:
Practical Stuff So you want to plot your formant space…
Source/Filter Lab Review Stephanie made an interesting (general) prediction:
Peterson & Barney (1952) Gordon Peterson and Harold Barney conducted a landmark study of variability in the production and perception of English vowels way back in 1952. Methods: 1. Recorded speakers of “General American English” reading a list of 10 hVd words (heed, hid, head, etc.) twice. 2. 76 speakers (33 men, 28 women, 15 children). 3. Measured the F0, F1, F2 and F3 from the midpoint of all 1520 vowels. 4. Presented all 1520 vowels to 70 listeners in a vowel identification experiment (in eight sessions).
Peterson & Barney (1952) Acoustically, they found much variability in vowel production. Also: much overlap in terms of absolute formant frequencies. General confirmation of the F1-F2 vowel space schema. “herd” is distinguished by low F3.
Peterson & Barney (1952)
They organized their response data in the form of a confusion matrix. Each row corresponds to the “intended vowel” = the stimulus category. Each column corresponds to the classification made by the listeners = the response category.
Peterson & Barney (1952) Some confusion matrix basics: Entries on the main diagonal represent correct responses. Entries off the main diagonal represent the “confusions”. Popular confusions here include: “hod” perceived as “hawed” (1013 / 10273, about 9.9%); “hid” perceived as “head” (694 / 10279, about 6.8%).
Peterson & Barney (1952) Summing up the columns provides a rough sense of the listeners’ response bias = tendency to favor one response category over another, independent of the stimulus presented. Popular options: “had” (10906), “hawed” (10737). Not-so-popular: “hid” (9813), “hud” (9956).
Peterson & Barney (1952) Note: listeners identified only 94.4% of vowels correctly. “heed”, “who’d” and “herd” were highly distinct; “hod” and “head” were not. The available response options in the neighborhood matter…
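The confusion-matrix basics above (main diagonal, off-diagonal confusions, column sums, percent correct) can be sketched in a few lines of Python. The counts and the three-vowel response set below are invented for illustration; they are not Peterson & Barney's data, and the function names are hypothetical.

```python
# Toy confusion matrix: rows = stimulus (intended vowel), columns = response.
# The counts are invented for illustration; they are NOT Peterson & Barney's data.
labels = ["hid", "head", "had"]
matrix = [
    [90, 10, 0],   # stimulus "hid"
    [5, 85, 10],   # stimulus "head"
    [0, 5, 95],    # stimulus "had"
]

def percent_correct(m):
    """Share of all responses that fall on the main diagonal."""
    correct = sum(m[i][i] for i in range(len(m)))
    total = sum(sum(row) for row in m)
    return 100.0 * correct / total

def column_sums(m):
    """Total number of times each response category was chosen (response bias)."""
    return [sum(row[j] for row in m) for j in range(len(m[0]))]

def confusions(m, names):
    """Off-diagonal cells as (stimulus, response, count), largest count first."""
    cells = [(names[i], names[j], m[i][j])
             for i in range(len(m)) for j in range(len(m[i]))
             if i != j and m[i][j] > 0]
    return sorted(cells, key=lambda c: c[2], reverse=True)

print(percent_correct(matrix))
print(dict(zip(labels, column_sums(matrix))))
print(confusions(matrix, labels))
```

In this toy example every stimulus is presented 100 times, so unequal column sums can only come from listeners favoring some response categories over others.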
Source/Filter Lab Review Sue plotted some confusion matrices:
Source/Filter Lab Review Rhonda (and Jon) broke things down by features:
Class Confusion Matrix This is the response data summed across all conditions… From all five listeners.
Back to Perturbation Theory Basic idea #1: vocal tract resonances (formants) are the result of standing waves in the vocal tract. These standing waves have areas where velocity alternates between high and low (anti-nodes), and areas where velocity does not change (nodes).
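As a rough illustration of the standing-wave idea, the sketch below assumes the simplest textbook idealization, which the slide itself does not state: a uniform tube closed at the glottis and open at the lips. The tube length (17.5 cm) and speed of sound (35,000 cm/s) are assumed round values, and the function names are hypothetical.

```python
import math

SPEED_OF_SOUND_CM_S = 35000.0  # assumed round value
TUBE_LENGTH_CM = 17.5          # assumed vocal tract length

def resonance_hz(n, length_cm=TUBE_LENGTH_CM, c=SPEED_OF_SOUND_CM_S):
    """n-th resonance of a tube closed at one end: odd quarter wavelengths."""
    return (2 * n - 1) * c / (4.0 * length_cm)

def velocity_amplitude(n, x_cm, length_cm=TUBE_LENGTH_CM):
    """Relative velocity amplitude of the n-th standing wave at distance x
    from the closed (glottis) end: 0 at a node, 1 at an anti-node."""
    return abs(math.sin((2 * n - 1) * math.pi * x_cm / (2.0 * length_cm)))

for n in (1, 2, 3):
    # Resonance frequency, then velocity amplitude at the glottis and at the lips.
    print(n, resonance_hz(n),
          round(velocity_amplitude(n, 0.0), 3),
          round(velocity_amplitude(n, TUBE_LENGTH_CM), 3))
```

Under these assumptions every resonance has a velocity node at the glottis and a velocity anti-node at the lips; the higher resonances add further nodes and anti-nodes in between.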
Perturbation Principles Basic Idea #2: constriction at a velocity anti-node decreases a resonant frequency. (Diagram label: anti-node)
Perturbation Principles Basic Idea #3: constriction at a velocity node increases a resonant frequency. (Diagram label: node)
Constrictions in the labial region are at anti-nodes for both F1 and F2. Labial constrictions decrease both F1 and F2. (Diagram label: Labial)
Constrictions in the palatal region are at an F2 node and near an F1 anti-node. F1 decreases; F2 increases. (Diagram labels: Labial, Palatal)
Constrictions in the velar region are at an F2 anti-node and near an F1 anti-node. F1 decreases; F2 decreases. (Diagram labels: Labial, Palatal, Velar)
Constrictions in the pharyngeal region are at an F2 anti-node and near an F1 node. F1 increases; F2 decreases. (Diagram labels: Labial, Palatal, Velar, Pharynx)
Constrictions in the laryngeal region are at an F2 node and an F1 node. F1 increases; F2 increases. (Diagram labels: Labial, Palatal, Velar, Pharynx, Larynx)
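The node/anti-node rule can be checked against an idealized uniform tube (closed at the glottis, open at the lips), which is an assumption rather than something the slides spell out. In that model, the sign of a resonance shift for a small constriction at relative position x follows cos((2n-1)πx), which is +1 at velocity nodes and -1 at velocity anti-nodes. The constriction locations below are schematic guesses, and the function name is hypothetical. The velar region is left out because it falls near the middle of the tube, where this idealization is very sensitive to the exact constriction location.

```python
import math

def formant_shift_sign(n, x_rel):
    """Direction in which a small constriction at relative position x_rel
    (0 = glottis, 1 = lips) moves the n-th resonance of the idealized tube.
    Returns +1 (raised), -1 (lowered) or 0 (no first-order change)."""
    # cos(...) is +1 at velocity nodes and -1 at velocity anti-nodes.
    value = math.cos((2 * n - 1) * math.pi * x_rel)
    if abs(value) < 1e-9:
        return 0
    return 1 if value > 0 else -1

# Assumed, schematic constriction locations (fractions of the tube length).
regions = {"laryngeal": 0.0, "pharyngeal": 1 / 3, "palatal": 2 / 3, "labial": 1.0}

for name, x in regions.items():
    print(name, "F1", formant_shift_sign(1, x), "F2", formant_shift_sign(2, x))
```

With these assumed locations the signs match the labial, palatal, pharyngeal and laryngeal predictions listed above.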
Different Sources For a particular articulatory configuration, the vocal tract will resonate at a certain set of frequencies… no matter what the sound source is. (Remember the talk box.) Let’s see what happens when we change our sound source to a duck call…
Duck Call Vowels (Figure annotation: duck call is placed here.) Now let’s filter the duck call with differently shaped plastic tubes… Care to make any predictions?
Another View [i]
Duck Call Spectrograms [i]
Duck Call Spectra [i]
How About These? (Figure annotation: duck call is placed on this side.)
[i] vs. [e] (Panel labels: [i], [e])
[u] vs. [o] (Panel labels: [u], [o])
Philosophical Fragments Consider the Cardinal Vowels. Two “anchor” vowels: [i] - Cardinal Vowel 1 - highest, frontest vowel possible; [ɑ] - Cardinal Vowel 5 - lowest, backest vowel possible. Remaining vowels are spaced at equal intervals of frontness and height between the anchor vowels. Note: [u] - Cardinal Vowel 8 - may serve as a third anchor as the highest, backest, roundest vowel possible. Q: Why are the first two anchors unrounded… while the third anchor is rounded?
Cardinal Vowel Diagram
Secondary Cardinal Vowels
Perturbation to the Rescue! Rounding back vowels takes advantage of an acoustic synergy… which lowers both F1 and F2. (Diagram labels: Labial, Palatal, Velar, Pharynx, Larynx) Q: Is there anything wrong with rounding other (non-back) vowels?
A “Bad” Vowel Space One answer is found in the typical structure of vowel systems. For instance, a five-vowel system is rarely, if ever, distributed thusly: [i] [e] [æ]
Five Vowel Spaces Many languages with only five vowels spread them out evenly in the vowel space in a triangle. Here’s a popular vowel space option: [i] [u], [e] [o], [a]
Gujarati Vowel Space
A Complicated Vowel Space The language is Swedish.
Adaptive Dispersion Theory Developed by Bjorn Lindblom and Johan Liljencrants (Swedish speakers). Adaptive Dispersion theory says: Vowels should be as acoustically distinct from each other as possible. (This helps listeners identify them correctly.) So… languages tend to maximize the distance between vowels in acoustic space. Note: lack of ~ distinction in Canadian English.
Liljencrants + Lindblom (1972) Attempted to quantify “contrast” in the vowel space, to emphasize the importance of perception in the formation of phonological structure. They start with an articulatory model of the limits of the vowel space. Note: space is plotted in three formants… and in mels (auditory equivalent of frequency).
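To give a feel for what plotting in mels does, here is a minimal sketch using one widely used Hz-to-mel approximation, 2595 * log10(1 + f / 700). This is an assumed stand-in and not necessarily the conversion used in the paper; the function name is hypothetical.

```python
import math

def hz_to_mel(f_hz):
    """One common mel approximation (an assumption here, not taken from
    the paper): 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

for f in (500, 1000, 2000, 4000):
    print(f, round(hz_to_mel(f)))
```

The scale is roughly linear at low frequencies and compressed at high ones, so equal steps in Hz count for less auditory distance in the F2 and F3 range than in the F1 range.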
Liljencrants + Lindblom (1972) Quantification of contrast in the space: Given m pairs of n vowels, where $m = n(n-1)/2$ and $r_i$ = the Euclidean distance between the ith pair of vowels in formant space, the perceptual goal of the system is to minimize $E = \sum_{i=1}^{m} 1 / r_i^2$. I.e., the more formant space between vowels, the easier they will be to distinguish from one another. Note: floating magnets analogy. Also: crowded elevator analogy.
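A minimal sketch of this kind of contrast measure, assuming the quantity to be minimized is the sum of inverse squared distances over all vowel pairs (the way Liljencrants & Lindblom's criterion is usually stated). The two five-vowel systems and their (F1, F2) coordinates are invented for illustration, only two dimensions are used, and the function name is hypothetical.

```python
from itertools import combinations

def dispersion_energy(vowels):
    """Return (number of pairs, sum of 1 / r^2 over all vowel pairs).
    A lower sum means the vowels are better dispersed."""
    pairs = list(combinations(vowels.values(), 2))
    energy = sum(1.0 / sum((a - b) ** 2 for a, b in zip(p, q))
                 for p, q in pairs)
    return len(pairs), energy

# Hypothetical (F1, F2) positions in mels, invented for illustration only.
triangle = {"i": (300, 1800), "e": (500, 1600), "a": (800, 1200),
            "o": (500, 900), "u": (300, 700)}
crowded = {"v1": (300, 1800), "v2": (380, 1750), "v3": (460, 1700),
           "v4": (540, 1650), "v5": (620, 1600)}

m, e_triangle = dispersion_energy(triangle)
_, e_crowded = dispersion_energy(crowded)
print(m, e_triangle, e_crowded, e_triangle < e_crowded)
```

Because each pair contributes 1 / r^2, close neighbors dominate the total, much like repelling magnets: the crowded front-only system scores far worse than the triangular one.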
Liljencrants & Lindblom (1972) In perceptually optimal systems… vowels tend to spread out around the edges of the available space. There is also a trend for more high vowel contrasts than are normally found in language.