Speech Perception Richard Wright Linguistics 453.

Class Overview: Physiology, Auditory Shaping of the Signal, Auditory Cues, Normalization and Context, Experiment Types

Physiology 1: The Ear Outer: Pinna, Ear Canal, Ear Drum; Middle: Ossicles, Oval Window; Inner: Cochlea (Basilar Membrane, Tectorial Membrane, Hair Cells)

Physiology 1: The Outer Ear Pinna: directional hearing Ear Canal: high frequency emphasis (very short resonator closed at one end) Ear Drum: membrane’s vibrations convert pressure fluctuations to mechanical movement
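
The ear canal's high-frequency emphasis follows from treating it as a tube closed at one end (the eardrum), whose resonances fall at odd multiples of a quarter wavelength. The sketch below is illustrative only: the 2.5 cm canal length and 343 m/s speed of sound are assumed textbook values, not figures from these slides, and the function name is hypothetical.

```python
def quarter_wave_resonances(length_m, speed_of_sound=343.0, n_modes=3):
    """Resonances (Hz) of a uniform tube closed at one end.

    f_n = (2n - 1) * c / (4 * L) for n = 1, 2, 3, ...
    """
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, n_modes + 1)]


# Assumed canal length of 2.5 cm; real ear canals vary and are not uniform tubes.
for freq in quarter_wave_resonances(0.025):
    print(f"{freq:.0f} Hz")
```

Under these assumptions the lowest resonance lands near 3.4 kHz, which is the region the canal boosts.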

Physiology 1: The Middle Ear Converts eardrum movement to movement of the oval window, overcoming the air-to-fluid impedance mismatch. Lower frequency emphasis (500-4000 Hz). Lessens impact of very loud noises by stiffening (damping). Ossicles: Malleus, Incus, Stapes.

Physiology 1: The Inner Ear Cochlea: fluid-filled cavity; wave propagation in the fluid is caused by movement of the oval window. Basilar Membrane: stiff and narrow at base, wide and flaccid at apex; base = high frequencies and apex = low frequencies (acts like a series of band-pass filters). Most of the membrane is devoted to sounds below 5000 Hz. Shearing between the Basilar and Tectorial membranes displaces hair cells, exciting cochlear nerve endings.
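
The base-to-apex frequency layout can be made concrete with a place-frequency map. The sketch below uses Greenwood's function with commonly cited human constants; both the choice of function and the constants are assumptions brought in for illustration and do not appear in the slides.

```python
import math

# Commonly cited human constants for Greenwood's function (assumed here,
# not given in the slides). x is the proportion of membrane length
# measured from the apex (0.0) to the base (1.0).
A, ALPHA, K = 165.4, 2.1, 0.88


def place_to_frequency(x):
    """Characteristic frequency (Hz) at relative position x from the apex."""
    return A * (10 ** (ALPHA * x) - K)


def frequency_to_place(freq_hz):
    """Relative position from the apex whose characteristic frequency is freq_hz."""
    return math.log10(freq_hz / A + K) / ALPHA


for hz in (500, 1000, 5000, 10000):
    print(f"{hz:>5} Hz -> {frequency_to_place(hz):.2f} of the way from apex to base")
```

With these constants, 5000 Hz maps to roughly 0.7 of the distance from apex to base, in line with the point that most of the membrane serves frequencies below 5000 Hz.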

Physiology 2: Neural Pathway Cochlear Nerve, Cochlear Nucleus, Lateral Lemniscus, Auditory Cortex

Auditory Shaping of the Signal Frequency Selectivity: Changes in frequency of stimulus do not result in equivalent changes in sensitivity Non-linear loudness sensitivity Phase Locking and noise reduction Lateral Inhibition and Tuning Onsets and neural spikes

Frequency Selectivity

Onset Advantage Delgutte and Kiang (1984)

What are Cues? Cues: information in the signal that listeners use in recovering the segmental content of the utterance –Place cues –Manner cues –Voicing cues –Vowel quality cues

Distribution of Cues Place cues

Distribution of Cues Manner cues

Distribution of Cues Voicing cues

Distribution of Cues Stop release bursts are very brief and difficult to recover: stops rely on formant transition cues. Fricative noise, particularly sibilant, contains robust cues: fricatives may be recovered in the absence of formant transitions. Nasals contain strong manner cues but weak place cues.

Onset Advantage Redundancy advantage: Onset stops automatically have both a release burst and a set of formant transitions. Coda stops may be unreleased and therefore have less cue redundancy.

Onset Advantage Onset consonant with flanking vowels

Experimental Tasks: Identification, Discrimination, Rating, Method of Adjustment (MOA)

Exp.Tasks 1: Identification Listeners are asked to identify stimuli as speech sounds... Open set: options open. Forced choice: listeners' choices constrained.

Experiment 1: Onset vs Coda Stimuli –male speaker of American English –/ba, da, ga, ab, ad, ag/ bursts excised –16 bit, 22 kHz –mixed in three levels of white noise: no noise noise at 2 dB above RMS of signal noise at 2 dB below RMS of signal
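
The noise levels above are stated relative to the RMS of the signal, so mixing comes down to scaling the noise to a target RMS. The following is a minimal sketch of that step, assuming Gaussian white noise and a synthetic tone standing in for the recorded syllables; the function names, the 22050 Hz rate, and the stand-in signal are hypothetical and not the original procedure.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def add_white_noise(signal, noise_db_re_signal, rng):
    """Return (mixture, scaled_noise). The noise RMS sits noise_db_re_signal dB
    relative to the signal RMS (positive = noise louder than the signal)."""
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    target_rms = rms(signal) * 10 ** (noise_db_re_signal / 20.0)
    gain = target_rms / rms(noise)
    scaled = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(signal, scaled)]
    return mixture, scaled


# Stand-in "speech": a 200 Hz tone sampled at 22050 Hz (hypothetical input).
rate = 22050
signal = [0.1 * math.sin(2 * math.pi * 200 * t / rate) for t in range(rate // 10)]
rng = random.Random(0)
for level in (+2.0, -2.0):
    _, noise = add_white_noise(signal, level, rng)
    measured = 20 * math.log10(rms(noise) / rms(signal))
    print(f"requested {level:+.1f} dB, measured {measured:+.2f} dB")
```

The sketch does not guard against clipping; with 16-bit audio the mixture would need headroom or rescaling before being written out.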

Experiment 1: Onset vs Coda Task –onsets & codas mixed and randomized –presented binaurally over headphones –3 way forced choice task: “B D G” –labeled button press –self paced
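
One way to build and score such a session is sketched below. The syllables, noise conditions, and response labels come from the slides; the number of repetitions, the seed, and the function names are hypothetical.

```python
import random

# Stimulus -> correct response label, following the six syllables on the slide.
ANSWERS = {"ba": "B", "da": "D", "ga": "G", "ab": "B", "ad": "D", "ag": "G"}
NOISE_CONDITIONS = ["no noise", "noise +2 dB", "noise -2 dB"]


def build_trials(repetitions, seed):
    """Cross syllables with noise conditions, repeat, and shuffle so that
    onsets and codas are intermixed in one randomized list."""
    trials = [
        {
            "stimulus": syllable,
            "position": "onset" if syllable[0] in "bdg" else "coda",
            "noise": noise,
            "answer": answer,
        }
        for syllable, answer in ANSWERS.items()
        for noise in NOISE_CONDITIONS
        for _ in range(repetitions)
    ]
    random.Random(seed).shuffle(trials)
    return trials


def proportion_correct(trials, responses, position):
    """Score three-way forced-choice responses ("B", "D", "G") for one position."""
    scored = [r == t["answer"]
              for t, r in zip(trials, responses)
              if t["position"] == position]
    return sum(scored) / len(scored)


trials = build_trials(repetitions=2, seed=1)
print(len(trials), "trials; first:", trials[0])
```

Scoring onsets and codas separately is what lets the two positions be compared within each noise condition.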

Exp.Tasks 2: Discrimination Listeners are asked to respond “same” or “different” to presented sets of stimuli AX discrimination: fixed initial stimulus, variable second stimulus (same/different) ABX discrimination: two fixed initial stimuli, variable third stimulus (same A, same B)
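
The difference between the two designs is easiest to see in how a trial is assembled. This is a small sketch with hypothetical stimulus labels ("v1", "v3", "v5") and function names.

```python
def ax_trials(standard, comparisons):
    """AX: the first stimulus is fixed; the second varies.
    The expected response is "same" or "different"."""
    return [(standard, x, "same" if x == standard else "different")
            for x in comparisons]


def abx_trials(a, b):
    """ABX: A and B are fixed; X repeats one of them.
    The expected response names which one X matches."""
    return [(a, b, a, "same as A"), (a, b, b, "same as B")]


print(ax_trials("v3", ["v1", "v3", "v5"]))
print(abx_trials("v3", "v5"))
```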

Experiment 2: vowel discrimination Stimuli –Synthetic vowel continuum –Equal steps: 2.37 Bark along F1-F2 dimension –16 bit, 11 kHz –variable AX design
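
Equal steps in Bark are not equal steps in Hz, which is the reason for building the continuum on an auditory scale. The sketch below converts with Traunmüller's (1990) approximation, an assumed formula since the slides do not say which Bark conversion was used. It steps a single formant, and the 300 Hz starting value is hypothetical; only the 2.37 Bark step size comes from the slide.

```python
def hz_to_bark(freq_hz):
    """Traunmüller's (1990) approximation of the Bark scale (assumed here)."""
    return 26.81 * freq_hz / (1960.0 + freq_hz) - 0.53


def bark_to_hz(bark):
    """Algebraic inverse of hz_to_bark."""
    return 1960.0 * (bark + 0.53) / (26.28 - bark)


def equal_bark_steps(start_hz, step_bark, n_steps):
    """Frequencies (Hz) spaced by a constant Bark step, starting at start_hz."""
    start = hz_to_bark(start_hz)
    return [bark_to_hz(start + i * step_bark) for i in range(n_steps)]


# Hypothetical starting value; only the step size is taken from the slide.
for hz in equal_bark_steps(start_hz=300.0, step_bark=2.37, n_steps=4):
    print(f"{hz:7.1f} Hz = {hz_to_bark(hz):5.2f} Bark")
```

The Hz spacing between successive steps grows as frequency rises even though the Bark spacing stays constant.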

Experiment 2: vowel discrimination Task –same/different response to vowel pairs –presented binaurally over headphones –labeled button press –speeded (limited time to decide)

Exp.Tasks 3: Ratings Listeners are asked to rate a stimulus in some way: goodness, similarity, accentedness Example: Effect of intonational contour on naturalness: listeners hear sentences with and without f0 contour and rate naturalness on a 1-5 scale.

Exp.Tasks 4: MOA Listeners are asked to adjust a stimulus along some dimensions until it fits some criterion: matches another stimulus, sounds most natural, matches a category, etc. (can be identification, discrimination, or rating exp.)

Advantages and shortcomings 1 Open identification –Good: most natural, subjects understand –Bad: time consuming, little control of variables, stats difficult (non-comparable responses across subjects) Forced choice identification –Good: less time consuming, control of response variables –Bad: not as natural

Advantages and shortcomings 2 Discrimination –Good: allows experimenter to map relationship between classification and discrimination –Bad: very time consuming, not at all natural, unintuitive to subjects

Advantages and shortcomings 3 Rating –Good: allows experimenter to map preferences in a multidimensional space, allows for correlation between one or more aspects of stimulus –Bad: hard to control interactions between preferences and stimulus variables, not that natural

Advantages and shortcomings 4 Method of adjustment (MOA) –Good: much quicker method of mapping a multidimensional perceptual space –Bad: not natural, complex interaction of stimulus variables
