
2 Auditory Perception April 9, 2009

3 Auditory vs. Acoustic
So far, we've seen two different auditory measures:
1. Mels (unit of perceived pitch): the auditory correlate of Hertz (frequency)
2. Sones (unit of perceived loudness): the auditory correlate of decibels (intensity)
Both were derived from pitch and loudness estimation experiments.
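As a rough illustration (not the scale derived in the original estimation experiments), the widely used O'Shaughnessy formula approximates the Hz-to-mel conversion; the sketch below is a minimal Python version of it.

```python
import math

def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (O'Shaughnessy's approximation)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Invert the mel conversion back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

# Equal steps in Hz are not equal steps in mels:
for f in (100, 500, 1000, 2000, 4000):
    print(f"{f:5d} Hz -> {hz_to_mel(f):7.1f} mels")
```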

4 Masking
Another scale for measuring auditory frequency emerged in the 1960s, inspired by the phenomenon of auditory masking: one sound can "mask", or obscure, the perception of another (for example, a band of noise masking a sinewave).
Q: How narrow can we make the bandwidth of the masking noise before the sinewave becomes perceptible?
A: The masking bandwidth is narrower at lower frequencies.

5 Critical Bands
Using this methodology, researchers eventually determined that there are 24 critical bands of hearing. The auditory system integrates all acoustic energy within each band:
Two tones within the same critical band of frequencies sound like one tone.
Ex: critical band #9 ranges from 920-1080 Hz, so F1 and F2 of a vowel falling in that range might merge together.
Each critical band corresponds to roughly 0.9 mm on the basilar membrane.
In effect, the auditory system consists of 24 band-pass filters. Each filter corresponds to one unit on the Bark scale.

6 Bark Scale of Frequency The Bark scale converts acoustic frequencies into numbers for each critical band

7 Bark Table
Band  Center (Hz)  Bandwidth (Hz)     Band  Center (Hz)  Bandwidth (Hz)
 1       50         20-100             13     1850       1720-2000
 2      150        100-200             14     2150       2000-2320
 3      250        200-300             15     2500       2320-2700
 4      350        300-400             16     2900       2700-3150
 5      450        400-510             17     3400       3150-3700
 6      570        510-630             18     4000       3700-4400
 7      700        630-770             19     4800       4400-5300
 8      840        770-920             20     5800       5300-6400
 9     1000        920-1080            21     7000       6400-7700
10     1170       1080-1270            22     8500       7700-9500
11     1370       1270-1480            23    10500       9500-12000
12     1600       1480-1720            24    13500      12000-15500
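Analytic approximations to the Bark scale also exist; the sketch below uses the Zwicker & Terhardt (1980) formula (an approximation, not the table above) to map Hz to Bark and to test whether two tones fall within roughly one critical band of each other.

```python
import math

def hz_to_bark(f_hz):
    """Approximate Bark (critical-band number) for a frequency in Hz,
    using the Zwicker & Terhardt (1980) analytic formula."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)

def same_critical_band(f1_hz, f2_hz):
    """Rough test: two tones less than ~1 Bark apart fall in the same
    critical band and tend to be integrated by the auditory system."""
    return abs(hz_to_bark(f1_hz) - hz_to_bark(f2_hz)) < 1.0

print(hz_to_bark(1000))               # ~8.5 (band 9, 920-1080 Hz, in the table)
print(same_critical_band(950, 1050))  # True: likely heard as one tone
print(same_critical_band(500, 1500))  # False: clearly separate bands
```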

8 Your Grandma’s Spectrograph Originally, spectrographic analyzing filters were constructed to have either wide or narrow bandwidths.

9 Spectral Differences Acoustic vs. auditory spectra of F1 and F2

10 Cochleagrams
Cochleagrams are spectrogram-like representations which incorporate auditory transformations for both pitch and loudness perception.
Acoustic spectrogram vs. auditory cochleagram representation of a Cantonese word.
Check out Peter's vowels in Praat.

11 Cochlear Implants
Cochlear implants transmit sound directly to the cochlea through a series of band-pass filters, like the critical bands in our native auditory system.
These devices can benefit profoundly deaf listeners with nerve deafness (the loss of working hair cells in the inner ear).
Contrast with a hearing aid, which is simply an amplifier:
Old style: amplifies all frequencies.
New style: amplifies specific frequencies, based on a listener's particular hearing capabilities.

12 Cochlear Implants
A cochlear implant artificially stimulates the nerves which are connected to the cochlea.

13 Nuts and Bolts
The cochlear implant chain of events:
1. Microphone
2. Speech processor
3. Electrical stimulation
What the CI user hears is entirely determined by the code in the speech processor.
The number of electrodes stimulating the cochlea ranges from 8 to 22, which means poor frequency resolution.
Also: cochlear implants cannot stimulate the low-frequency regions of the auditory nerve.

14 Noise Vocoding The speech processor operates like a series of critical bands. It divides up the frequency scale into 8 (or 22) bands and stimulates each electrode according to the average intensity in each band. This results in what sounds (to us) like a highly degraded version of natural speech.
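A minimal sketch of what noise vocoding might look like in code is given below: filter the signal into bands, take each band's amplitude envelope, and use it to modulate band-limited noise. The logarithmic band spacing, filter orders, and 30 Hz envelope cutoff are illustrative choices, not the processing strategy of any actual implant.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def noise_vocode(signal, fs, n_bands=8, f_lo=100.0, f_hi=7000.0):
    """Noise-vocode a 1-D signal (assumes fs is high enough that f_hi
    is below the Nyquist frequency, e.g. fs >= 16000)."""
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)                 # band edges in Hz
    env_lp = butter(2, 30.0, btype="lowpass", fs=fs, output="sos")  # envelope smoother
    noise = np.random.randn(len(signal))
    out = np.zeros(len(signal))
    for lo, hi in zip(edges[:-1], edges[1:]):
        bp = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(bp, signal)                  # speech limited to this band
        envelope = sosfiltfilt(env_lp, np.abs(band))    # rectify + low-pass = envelope
        carrier = sosfiltfilt(bp, noise)                # noise limited to the same band
        out += np.clip(envelope, 0, None) * carrier     # envelope modulates the noise
    return out / (np.max(np.abs(out)) + 1e-12)          # normalize to +/- 1
```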

15 What CIs Sound Like
Check out some nursery rhymes that have been processed through a CI simulator.

16 CI Perception
One thing that is missing from vocoded speech is F0; it only encodes spectral change.
Last year, Aaron Byrnes put together an experiment testing intonation perception in CI-simulated speech for his honors thesis. It tested:
1. Discrimination of questions vs. statements
2. Identification of the most prominent word in a sentence
(Stimuli were presented in 8-channel and 22-channel versions.)

17 The Findings
CI user:
Excellent identification of the most prominent word.
At chance (50%) when distinguishing between statements and questions.
Normal-hearing listeners (hearing simulated speech):
Good (90-95%) identification of the prominent word.
Not too shabby (75%) at distinguishing statements and questions.
Conclusion 1: F0 information doesn't get through the CI.
Conclusion 2: Noise-vocoded speech might not be a completely accurate CI simulation.

18 Mitigating Factors
The amount of success with cochlear implants is highly variable.
Works best for those who had hearing before they became deaf.
The earlier a person receives an implant, the better they can function with it later in life.
Works best for (in order):
1. Environmental sounds
2. Speech
3. Speaking on the telephone (bad)
4. Music (really bad)

19 Practical Considerations
It is largely unknown how well anyone will perform with a cochlear implant before they receive it.
Possible predictors:
1. Lipreading ability (rapid cues for place are largely obscured by the noise vocoding process)
2. fMRI scans of brain activity during presentation of auditory stimuli

20 Infrared Implants? Some very recent research has shown that cells in the inner ear can be activated through stimulation by infrared light. This may enable the eventual development of cochlear implants with very precise frequency and intensity tuning. Another research strategy is that of trying to regrow hair cells in the inner ear.

21 One Last Auditory Thought Frequency coding of sound is found all the way up in the auditory cortex. Also: some neurons only fire when sounds change.

22 A Philosophical Interlude
Q: What's a category?
A classical answer: a category is defined by properties.
All members of the category exhibit the same properties.
No non-members of the category exhibit all of those properties.
The properties of any member of the category may be split into:
1. Definitive properties
2. Incidental properties

23 Classical Example
A rectangle (in Euclidean geometry) may be defined as having the following properties:
1. Four-sided, two-dimensional figure (quadrilateral)
2. Four right angles
This is a rectangle.

24 Classical Example
Adding a third property gives the figure a different category classification:
1. Four-sided, two-dimensional figure (quadrilateral)
2. Four right angles
3. Four equally long sides
This is a square.

25 Classical Example
Altering other properties does not change the category classification:
1. Four-sided, two-dimensional figure (quadrilateral)  [definitive property]
2. Four right angles  [definitive property]
3. Four equally long sides  [definitive property]
A. Is red  [incidental property]
This is still a square.

26 Classical Linguistic Categories
Formal phonology traditionally defined all possible speech sounds in terms of a limited number of properties, known as "distinctive features" (Chomsky & Halle, 1968):
[d] = [CORONAL, +voice, -continuant, -nasal, etc.]
[n] = [CORONAL, +voice, -continuant, +nasal, etc.]
Similar approaches have been applied in syntactic analysis (Chomsky, 1974):
Adjectives = [+N, +V]
Prepositions = [-N, -V]
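To make the classical picture concrete, here is a small sketch of feature-based, all-or-none categorization: a token belongs to a category only if it matches every definitive property. The feature specifications are simplified and hypothetical, not a full SPE feature set.

```python
# Hypothetical, simplified feature specifications (illustrative only)
CATEGORIES = {
    "d": {"place": "CORONAL", "voice": True,  "continuant": False, "nasal": False},
    "n": {"place": "CORONAL", "voice": True,  "continuant": False, "nasal": True},
    "t": {"place": "CORONAL", "voice": False, "continuant": False, "nasal": False},
}

def classify(token_features):
    """Classical categorization: membership requires matching every
    definitive property; there is no notion of 'closeness'."""
    for label, definition in CATEGORIES.items():
        if all(token_features.get(k) == v for k, v in definition.items()):
            return label
    return None  # matches no category

print(classify({"place": "CORONAL", "voice": True,
                "continuant": False, "nasal": True}))   # -> 'n'
```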

27 Prototypes
The psychological reality of classical categories was called into question by a series of studies conducted by Eleanor Rosch in the 1970s.
Rosch claimed that categories were organized around privileged category members, known as prototypes (instead of being defined by properties).
Evidence for this theory initially came from linguistic tasks:
1. Semantic verification (Rosch, 1975): Is a robin a bird? Is a penguin a bird?
2. Category member naming

28 Prototype Category Example: “Bird”

29 Exemplar Categories
Cognitive psychologists in the late '70s (e.g., Medin & Schaffer, 1978) questioned the need for prototypes: phenomena explained by prototype theory could be explained without recourse to a category prototype.
The basic idea: categories are defined by extension, so neither prototypes nor properties are necessary.
Categorization works by comparing new tokens to all exemplars in memory. Generalization happens on the fly.

30 A Category, Exemplar-style “square”

31 Back to Perception
When people used to talk about categorical perception, they meant perception of classical categories.
A stop is either a [b] or a [g] (no in between).
Remember: in classical categories, there are definitive properties and incidental properties.
Q: What are the properties that define a stop category?
The definitive properties must be invariant (shared by all category members).
So... what are the invariant properties of stop categories?

32 The Acoustic Hypothesis
People have looked long and hard for invariant acoustic properties of stops, with little success (and some people are still looking).
Figure: frequency values of compact (synthetic) bursts cueing different places of articulation, in various vowel contexts (Liberman et al., 1952).

33 Theoretical Revision
Since invariant acoustic properties could not be found (especially for velars), it was assumed that listeners perceive (articulatory) gestures, not (acoustic) sounds.
Q: What invariant articulatory properties define stop categories?
A: If they exist, they're hard to find.
Motor Theory Revision #2: listeners perceive "intended" gestures.
Note: "intentions" are kind of impossible to observe. But they must be invariant... right?

34 Another Brick in the Wall Another problem for motor theory: Perception of speech sounds isn’t always categorical. In particular: vowels are perceived in a more gradient fashion than stops. However, vowel perception becomes more categorical when the vowels are extremely short.

35 It's also hard to identify any invariant acoustic properties for vowels.
Variation is rampant across: tokens, speakers, genders, dialects, age groups, etc.
Variability = a huge problem for speech perception.

36 More Problems Also: infants exhibit categorical perception, too… Even though they don’t know category labels. Chinchillas can do it, too!

37 An Alternative
It has been proposed that phoneme categories are defined by prototypes, which we use to identify vowels in speech.
One relevant finding: the perceptual magnet effect.
Part 1: play listeners a continuum of synthetic vowels in the neighborhood of [i].
Task: judge how much each one sounds like [i].
Some are better = prototypical; others are worse = non-prototypes.

38 Perceptual Magnets
Part 2: define either a prototype or a non-prototype as a category center.
Task: determine whether other vowels on the continuum belong to that category ("same" or "different").
Result: more "same" responses when the category center is a prototype.
The prototype acts as a "perceptual magnet".

39 Prototypes, continued
The perceptual magnet prototypes are usually located at a listener's average F1 and F2 values for [i].
4-month-olds exhibit the perceptual magnet effect... but monkeys do not.
Note: the prototype is the only thing that has to be "invariant" about the category; particular properties aren't important.
Testing a prototype model on the Peterson & Barney data yielded 51% correct classification. (Human listeners got 94% correct.)
Variability is still hard to deal with.
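A minimal sketch of such a prototype classifier over (F1, F2) values is shown below. The prototype values and the Euclidean distance metric are illustrative assumptions, not the model actually tested on the Peterson & Barney data.

```python
import math

# Hypothetical prototypes: (mean F1, mean F2) in Hz for a few vowel categories
PROTOTYPES = {
    "i": (270, 2290),
    "a": (730, 1090),
    "u": (300,  870),
}

def classify_vowel(f1, f2):
    """Prototype model: assign a token to the category whose single
    prototype is closest in (F1, F2) space."""
    return min(PROTOTYPES,
               key=lambda v: math.dist((f1, f2), PROTOTYPES[v]))

print(classify_vowel(300, 2200))   # -> 'i'
```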

40 Flipping the Script
Another approach to speech perception is to preserve all the variability that we hear, rather than boiling it down to properties or prototypes.
In this model, speech categories are defined by extension: they consist of exemplars.
So, your mental representation of /b/ consists of every token of /b/ you've ever heard in your life, rather than any particular acoustic or articulatory properties.
Analogy: phonetics field project notes (your mind is a pack rat).

41 Exemplar Categorization
1. Stored memories of speech experiences are known as traces. Each trace is linked to a category label.
2. Incoming speech tokens are known as probes.
3. A probe activates the traces it is similar to.
Note: the amount of activation is proportional to the similarity between trace and probe. Traces that closely match a probe are activated a lot; traces that have no similarity to a probe are not activated much at all.

42 A (pretend) example: traces = vowels from the Peterson & Barney data set.
Activation of each trace depends on its distance (in vowel space) from the probe: traces near the probe are highly activated; distant traces receive low activation.

43 Echoes from the Past
The activations of the exemplars in memory are combined (summed) to create an echo in the perceptual system.
This echo is more general than either the traces or the probe.
Inspiration: Francis Galton.

44 Exemplar Categorization II
For each category label, the activations of the traces linked to it are summed up. The category with the most total activation wins.
Note: we use all exemplars in memory to help us categorize new tokens.
Also: any single trace can be linked to different kinds of category labels.
Test: Peterson & Barney vowel data. The exemplar model classified 81% of the vowels correctly.
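For contrast with the prototype sketch above, here is a minimal exemplar-style classifier: every stored trace contributes activation, and activations are summed per label. The stored traces, the exponential similarity function, and the sensitivity parameter are all illustrative assumptions, not the model actually tested on the Peterson & Barney data.

```python
import math

# Pretend memory: each trace is ((F1, F2) in Hz, category label)
TRACES = [
    ((270, 2290), "i"), ((310, 2020), "i"), ((390, 1990), "i"),
    ((730, 1090), "a"), ((660, 1200), "a"),
    ((300,  870), "u"), ((330,  980), "u"),
]

def activation(trace, probe, sensitivity=0.005):
    """Activation falls off exponentially with distance from the probe."""
    return math.exp(-sensitivity * math.dist(trace, probe))

def classify(probe):
    """Exemplar categorization: sum activations per category label
    and pick the label with the most total activation."""
    totals = {}
    for features, label in TRACES:
        totals[label] = totals.get(label, 0.0) + activation(features, probe)
    return max(totals, key=totals.get)

print(classify((350, 2100)))   # -> 'i'
```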

45 Exemplar Predictions
Point: all properties of all exemplars play a role in categorization, not just the "definitive" ones.
Prediction: non-invariant properties of speech categories should have an effect on speech perception.
E.g., the voice in which a [b] is spoken, or even the room in which a [b] is spoken.
Is this true? Let's find out...

46 Another Experiment!
Circle whether each word is a new or old word in the list. (Answer sheet for items 1-24.)

47 Another Experiment!
Circle whether each word is a new or old word in the list. (Answer sheet for items 25-40.)

48 Continuous Word Recognition
In a "continuous word recognition" task, listeners hear a long sequence of words, some of which are new words in the list and some of which are repeats.
Task: decide whether each word is new or a repeat.
Twist: some repeats are presented in a new voice; others are presented in the old (same) voice.
Finding: repetitions are identified more quickly and more accurately when they're presented in the old voice (Palmeri et al., 1993).
Implication: we store voice + word info together in memory.

