Speech Perception Richard Wright Linguistics 453.

1

2 Speech Perception Richard Wright Linguistics 453

3 Class Overview Physiology Auditory Shaping of the signal Auditory Cues Normalization and Context Experiment types

4 Physiology 1: The Ear Outer: Pinna, Ear Canal, Ear Drum Middle: Ossicles, Oval Window Inner: Cochlea — Basilar Membrane, Tectorial Membrane, Hair Cells

5

6 Physiology 1: The Outer Ear Pinna: directional hearing Ear Canal: high frequency emphasis (very short resonator closed at one end) Ear Drum: membrane’s vibrations convert pressure fluctuations to mechanical movement
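The high-frequency emphasis of the ear canal follows from treating it as a tube closed at one end, which resonates at odd multiples of a quarter wavelength, f_k = (2k - 1)c / 4L. The sketch below works that out; the canal length (2.5 cm) and speed of sound (343 m/s) are assumed typical values, not figures from the slides.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: air at roughly 20 degrees C


def quarter_wave_resonances(length_m, n_modes=3, c=SPEED_OF_SOUND_M_PER_S):
    """Resonant frequencies (Hz) of a tube closed at one end and open at the other."""
    # Only odd multiples of the quarter-wavelength frequency are supported.
    return [(2 * k - 1) * c / (4.0 * length_m) for k in range(1, n_modes + 1)]


# Assumed ear canal length of 2.5 cm (a typical adult figure, not from the slides).
canal_resonances = quarter_wave_resonances(0.025)
print([round(f) for f in canal_resonances])
```

Under these assumptions the lowest resonance falls in the 3-4 kHz region, which is consistent with the boost described on the slide.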

7 Physiology 1: The Middle Ear Ossicles (Malleus, Incus, Stapes): Convert eardrum movement to movement of oval window, overcoming the air-to-fluid impedance mismatch. Lower frequency emphasis (500-4000 Hz). Lessen impact of very loud noises by stiffening (damping)

8 Physiology 1: The Inner Ear Cochlea: fluid-filled cavity; wave propagation in fluid caused by movement of oval window. Basilar Membrane: stiff and narrow at base, wide and flaccid at apex: base = high frequencies and apex = low frequencies (acts like a series of band-pass filters). Most of membrane is devoted to sounds below 5000 Hz. Shearing between Basilar and Tectorial membranes displaces hair cells, exciting cochlear nerve endings
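One way to illustrate the claim that most of the basilar membrane serves frequencies below 5000 Hz is a place-to-frequency map. The sketch below uses the Greenwood (1990) function with its commonly cited human constants; that function and its constants are an assumption brought in for illustration, not something stated on the slides.

```python
import math

# Greenwood (1990) constants for the human cochlea (assumed, not from the slides).
A, SLOPE, K = 165.4, 2.1, 0.88


def place_to_frequency(x):
    """Characteristic frequency (Hz) at relative distance x from the apex (0 to 1)."""
    return A * (10 ** (SLOPE * x) - K)


def frequency_to_place(f_hz):
    """Relative distance from the apex (0 to 1) whose characteristic frequency is f_hz."""
    return math.log10(f_hz / A + K) / SLOPE


# Fraction of the membrane (measured from the apex) tuned below 5000 Hz.
share_below_5k = frequency_to_place(5000.0)
print(f"{share_below_5k:.2f} of the membrane length lies below 5000 Hz")
```

With these constants roughly seven tenths of the membrane length is tuned below 5000 Hz, even though that range is only about a quarter of the nominal 20 Hz to 20 kHz span.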

9 Physiology 2: Neural Pathway Cochlear Nerve Cochlear Nucleus Lateral Lemniscus Auditory Cortex

10

11 Auditory Shaping of the Signal Frequency Selectivity: Changes in frequency of stimulus do not result in equivalent changes in sensitivity Non-linear loudness sensitivity Phase Locking and noise reduction Lateral Inhibition and Tuning Onsets and neural spikes

12 Frequency Selectivity

13 Onset Advantage Delgutte and Kiang (1984)

14 What are Cues? Cues: information in the signal that listeners use in recovering the segmental content of the utterance –Place cues –Manner cues –Voicing cues –Vowel quality cues

15 Distribution of Cues Place cues

16 Distribution of Cues Manner cues

17 Distribution of Cues Voicing cues

18 Distribution of Cues Stop release bursts are very brief and difficult to recover: stops rely on formant transition cues

19 Distribution of Cues Stop release bursts are very brief and difficult to recover: stops rely on formant transition cues Fricative noise, particularly sibilant, contains robust cues: fricatives may be recovered in the absence of formant transitions

20 Distribution of Cues Stop release bursts are very brief and difficult to recover: stops rely on formant transition cues Fricative noise, particularly sibilant, contains robust cues: fricatives may be recovered in the absence of formant transitions Nasals contain strong manner cues but weak place cues

21 Onset Advantage Redundancy advantage: Onset stops automatically have both a release burst and a set of formant transitions Coda stops may be unreleased and therefore have less cue redundancy

22 Onset Advantage Onset consonant with flanking vowels

23 Experimental Tasks Identification Discrimination Rating Method of Adjustment (MOA)

24 Exp.Tasks 1: Identification Listeners are asked to identify stimuli as speech sounds... Open set: options open Forced choice: listeners' choices constrained

25 Experiment 1: Onset vs Coda Stimuli –male speaker of American English –/ba, da, ga, ab, ad, ag/, bursts excised –16 bit, 22 kHz –mixed in three levels of white noise: no noise; noise at 2 dB above RMS of signal; noise at 2 dB below RMS of signal
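The three noise conditions can be sketched as follows: the white noise is scaled so that its RMS sits a given number of dB above or below the RMS of the signal. This is a minimal illustration under stated assumptions, not the procedure actually used in the experiment. The 220 Hz tone is a placeholder for a recorded syllable, "22 kHz" is taken to mean 22050 Hz, and the function names are hypothetical.

```python
import math
import random

SAMPLE_RATE = 22050  # assumed reading of the slide's "22 kHz"


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def add_white_noise(signal, noise_db_re_signal, seed=0):
    """Return (mixture, noise), with noise RMS set in dB relative to signal RMS."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    # Convert the dB offset to an amplitude ratio (20 * log10 convention).
    target_rms = rms(signal) * 10 ** (noise_db_re_signal / 20.0)
    gain = target_rms / rms(noise)
    scaled = [gain * n for n in noise]
    return [s + n for s, n in zip(signal, scaled)], scaled


# Placeholder stimulus: 100 ms of a 220 Hz sine standing in for a syllable.
tone = [math.sin(2 * math.pi * 220 * t / SAMPLE_RATE) for t in range(2205)]

conditions = {"no noise": None, "+2 dB": 2.0, "-2 dB": -2.0}
mixed = {}
for name, level in conditions.items():
    mixed[name] = list(tone) if level is None else add_white_noise(tone, level)[0]
```

A real stimulus set would also need a check that the mixture does not clip at 16 bits before it is written to disk.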

26 Experiment 1: Onset vs Coda Task –onsets & codas mixed and randomized –presented binaurally over headphones –3 way forced choice task: “B D G” –labeled button press –self paced
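A mixed, randomized trial list for the three-way forced choice task, and its scoring by syllable position, might be set up along these lines. The number of repetitions and the simulated listener are hypothetical; only the stimuli, noise levels, and response labels come from the slides.

```python
import random
from collections import Counter

STIMULI = ["ba", "da", "ga", "ab", "ad", "ag"]  # from the slide
NOISE_LEVELS = ["none", "+2 dB", "-2 dB"]        # from the slide
CHOICES = ("B", "D", "G")                        # response button labels


def correct_answer(stimulus):
    """The button label matching the consonant in a CV or VC syllable."""
    return next(ch for ch in stimulus if ch != "a").upper()


def position(stimulus):
    """'onset' for CV syllables, 'coda' for VC syllables."""
    return "onset" if stimulus[0] != "a" else "coda"


def build_trials(repetitions, seed=0):
    """Onsets and codas mixed together and shuffled into one presentation order."""
    rng = random.Random(seed)
    trials = [(s, n) for s in STIMULI for n in NOISE_LEVELS for _ in range(repetitions)]
    rng.shuffle(trials)
    return trials


def accuracy_by_position(trials, responses):
    """Proportion correct, computed separately for onset and coda stimuli."""
    correct, total = Counter(), Counter()
    for (stimulus, _noise), response in zip(trials, responses):
        pos = position(stimulus)
        total[pos] += 1
        correct[pos] += response == correct_answer(stimulus)
    return {pos: correct[pos] / total[pos] for pos in total}


# Hypothetical listener: perfect on onsets, always presses "B" for codas.
trials = build_trials(repetitions=4)
fake_responses = [
    correct_answer(s) if position(s) == "onset" else "B" for s, _ in trials
]
scores = accuracy_by_position(trials, fake_responses)
print(scores)
```

The same tally could be broken down further by noise level to compare onset and coda accuracy within each condition.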

27 Exp.Tasks 2: Discrimination Listeners are asked to respond “same” or “different” to presented sets of stimuli AX discrimination: fixed initial stimulus, variable second stimulus (same/different) ABX discrimination: two fixed initial stimuli, variable third stimulus (same A, same B)
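The difference between the two designs is easiest to see in how a trial is built. The sketch below constructs AX and ABX trial lists; the stimulus labels ("step1", "step2", ...) and the repetition count are hypothetical placeholders.

```python
import random


def ax_trials(standard, comparisons, seed=0):
    """AX: fixed first stimulus A, variable second stimulus X, same/different answer."""
    rng = random.Random(seed)
    trials = [
        {"A": standard, "X": x, "answer": "same" if x == standard else "different"}
        for x in comparisons
    ]
    rng.shuffle(trials)
    return trials


def abx_trials(a, b, repetitions, seed=0):
    """ABX: A and B are fixed, X repeats one of them, and the answer names which."""
    rng = random.Random(seed)
    trials = []
    for _ in range(repetitions):
        for answer, x in (("A", a), ("B", b)):
            trials.append({"A": a, "B": b, "X": x, "answer": answer})
    rng.shuffle(trials)
    return trials


# Hypothetical stimulus labels standing in for audio files.
ax = ax_trials("step1", ["step1", "step2", "step3", "step1"])
abx = abx_trials("step1", "step2", repetitions=3)
print(ax[0], abx[0])
```

Balancing "same" and "different" trials in AX, or A and B answers in ABX, keeps a listener from scoring well by favouring one response.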

28 Experiment 2: vowel discrimination Stimuli –Synthetic vowel continuum –Equal steps: 2.37 Bark along F1-F2 dimension –16 bit, 11 kHz –variable AX design
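Equal steps in Bark are not equal steps in Hz, so a continuum of this kind is usually laid out in Bark and converted back to Hz for synthesis. The sketch below uses Traunmüller's (1990) Hz-to-Bark approximation, which is an assumption here, since the slides do not say which conversion was used. The endpoint formant values and the number of stimuli are hypothetical.

```python
import math


def hz_to_bark(f_hz):
    """Traunmueller (1990) approximation of the Bark scale (assumed conversion)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_to_hz(z):
    """Algebraic inverse of hz_to_bark."""
    return 1960.0 * (z + 0.53) / (26.28 - z)


def bark_continuum(start_hz, end_hz, n_stimuli):
    """(F1, F2) pairs in Hz, equally spaced in Bark between two endpoints."""
    start = [hz_to_bark(f) for f in start_hz]
    end = [hz_to_bark(f) for f in end_hz]
    points = []
    for i in range(n_stimuli):
        t = i / (n_stimuli - 1)
        z = [s + t * (e - s) for s, e in zip(start, end)]
        points.append(tuple(bark_to_hz(v) for v in z))
    return points


def bark_distance(p, q):
    """Euclidean distance between two (F1, F2) points, measured in Bark."""
    return math.dist([hz_to_bark(f) for f in p], [hz_to_bark(f) for f in q])


# Hypothetical endpoints: an /i/-like and an /a/-like vowel, as (F1, F2) in Hz.
continuum = bark_continuum((300.0, 2300.0), (700.0, 1200.0), n_stimuli=7)
steps = [bark_distance(continuum[i], continuum[i + 1]) for i in range(6)]
print([f"{s:.3f}" for s in steps])
```

Here the step size follows from the chosen endpoints and stimulus count; to impose a fixed step size instead, the number of stimuli or one endpoint would be adjusted.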

29 Experiment 2: vowel discrimination Task –same/different response to vowel pairs –presented binaurally over headphones –labeled button press –speeded (limited time to decide)

30 Exp.Tasks 3: Ratings Listeners are asked to rate a stimulus in some way: goodness, similarity, accentedness Example: Effect of intonational contour on naturalness: listeners hear sentences with and without f0 contour and rate naturalness on a 1-5 scale.

31 Exp.Tasks 4: MOA Listeners are asked to adjust a stimulus along some dimensions until it fits some criterion: matches another stimulus, sounds most natural, matches a category, etc. (can be identification, discrimination, or rating exp.)

32 Advantages and shortcomings 1 Open identification –Good: most natural, subjects understand –Bad: time consuming, little control of variables, stats difficult (non-comparable responses across subjects) Forced choice identification –Good: less time consuming, control of response variables –Bad: not as natural

33 Advantages and shortcomings 2 Discrimination –Good: allows experimenter to map relationship between classification and discrimination –Bad: very time consuming, not at all natural, unintuitive to subjects

34 Advantages and shortcomings 3 Rating –Good: allows experimenter to map preferences in a multidimensional space, allows for correlation between one or more aspects of stimulus –Bad: hard to control interactions between preferences and stimulus variables, not that natural

35 Advantages and shortcomings 4 Method of adjustment (MOA) –Good: much quicker method of mapping a multidimensional perceptual space –Bad: not natural, complex interaction of stimulus variables

