Rob van der Willigen Auditory Perception.

Presentation on theme: "Rob van der Willigen Auditory Perception."— Presentation transcript:

1 Rob van der Willigen http://~robvdw/cnpa04/coll1/AudPerc_2007_P8.ppt Auditory Perception

2 Today’s goal Understanding the problem of Auditory Scene Analysis (ASA): - Higher-level principles of organization - Complex waveform analysis - Neural Activity Pattern (NAP) analysis

3 OBJECTS?

4 Psychoacoustics The Problem of Auditory Scene Analysis (ASA) “Conversion of auditory sensory input into an adequate representation of reality.” ASA allows an organism to obtain and react appropriately to complex sounds from the environment. Albert S. Bregman (Auditory scene analysis, 1999; p. 1): “Only by being aware of how sound is created and shaped in the world can we know how to use it to derive the properties of the sound-producing events around us.”

5 Psychoacoustics ASA: Objects compared to Streams In vision we intuitively focus on objects. In fact, the visual system uses light reflections to form separate descriptions of the individual objects. These descriptions include the object’s shape, size, distance, color, etc. But how is object information determined from sound?

6 Psychoacoustics Information Carrying Capacity and ASA For an ideal coding system, Shannon showed that $C = B \log_2(1 + S/N)$, where C is the channel capacity (in bits/s), B is the channel bandwidth (in Hz), S and N are the average received signal and noise powers respectively, and the noise is additive white Gaussian noise. This equation is referred to as the Shannon–Hartley law, and it acts as a benchmark for various practical modulation/demodulation schemes since it defines the absolute maximum information rate, R, which can be reliably (without error) sent over the channel. Eyes: 2×10^8 receptors, 2×10^6 axons, capacity 10^7 bits/s. Ears: 3×10^4 receptors, 2×10^4 axons, capacity 10^5 bits/s.
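
The Shannon–Hartley law named on this slide, C = B log2(1 + S/N), can be evaluated directly. The following is a minimal sketch; the function name and the example values (a 3 kHz channel with S/N = 1023) are illustrative assumptions and do not come from the slides.

```python
import math


def channel_capacity(bandwidth_hz: float, signal_power: float, noise_power: float) -> float:
    """Shannon-Hartley capacity in bits per second: C = B * log2(1 + S/N)."""
    if bandwidth_hz < 0 or signal_power < 0 or noise_power <= 0:
        raise ValueError("bandwidth and signal power must be >= 0, noise power > 0")
    return bandwidth_hz * math.log2(1.0 + signal_power / noise_power)


# Hypothetical example values, chosen only so that log2(1 + S/N) is a round number.
capacity_bits_per_s = channel_capacity(3000.0, 1023.0, 1.0)
print(capacity_bits_per_s)
```

Doubling the bandwidth doubles the capacity, whereas doubling the signal-to-noise ratio adds at most B bits/s, which is why bandwidth tends to dominate such estimates.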

7 Psychoacoustics Information Carrying Capacity and ASA Ears: 3×10^4 receptors, 2×10^4 axons, capacity 10^5 bits/s. Intensity differences of 1 dB over a range of about 120 dB give 120 levels, which can be encoded in 7 bits (2^7 = 128). 24 non-overlapping frequency bands.
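
The bit count on this slide can be checked with a few lines of arithmetic. This is a sketch only: the helper name is hypothetical, and the final product is just the number of bits in one instantaneous snapshot across the 24 bands, not a derivation of the capacity figure quoted above.

```python
def bits_for_levels(n_levels: int) -> int:
    """Smallest number of bits that can label n_levels distinct values."""
    if n_levels < 1:
        raise ValueError("need at least one level")
    # (n - 1).bit_length() equals ceil(log2(n)) using exact integer arithmetic.
    return (n_levels - 1).bit_length()


dynamic_range_db = 120   # from the slide
step_db = 1              # from the slide
n_bands = 24             # from the slide

n_levels = dynamic_range_db // step_db
bits_per_band = bits_for_levels(n_levels)
bits_per_snapshot = n_bands * bits_per_band  # illustrative product, see lead-in

print(n_levels, bits_per_band, bits_per_snapshot)
```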

8

9 Psychoacoustics ASA: Objects compared to Streams

10 Psychoacoustics ASA: Objects compared to Streams

11 Psychoacoustics ASA: Objects compared to Streams

12 Psychoacoustics The Problem of Auditory Scene Analysis (ASA) Complex waveforms are the superposition of all individual sounds plus the acoustic effects of the environment between transmitter and receiver. The input to the auditory system consists of complex waveforms. An important part of building a representation of individual sounds is to determine which parts of the sensory stimulation (complex waveforms) originate from the same event and environmental object. The individual waveforms are not easily recognizable from the mixture.
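
Superposition means that the waveform reaching the ear is the sample-by-sample sum of the source waveforms. The sketch below mixes two sinusoids; the frequencies, amplitudes, and sample rate are arbitrary assumptions, and environmental effects such as echoes are left out.

```python
import math
from typing import List


def sine(freq_hz: float, duration_s: float, sample_rate_hz: int, amplitude: float = 1.0) -> List[float]:
    """Sampled sinusoid starting at zero phase."""
    n_samples = int(round(duration_s * sample_rate_hz))
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz)
            for i in range(n_samples)]


def superpose(*waveforms: List[float]) -> List[float]:
    """Sample-wise sum of equally long waveforms."""
    if len({len(w) for w in waveforms}) != 1:
        raise ValueError("need at least one waveform, all of the same length")
    return [sum(samples) for samples in zip(*waveforms)]


sample_rate_hz = 8000                                       # assumed
source_a = sine(440.0, 0.01, sample_rate_hz)                # assumed source 1
source_b = sine(1000.0, 0.01, sample_rate_hz, amplitude=0.5)  # assumed source 2
mixture = superpose(source_a, source_b)
```

Going from `source_a` and `source_b` to `mixture` is a single addition; going back from `mixture` alone is the hard inverse problem that ASA has to solve.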

13 The Problem of ASA Problem I: Sound localization can only result from the neural processing of acoustic cues in the tonotopic input of the (two) ear(s)! Problem II: How does the auditory system parse the superposition of distinct sounds into the original acoustic input?

14 The Problem of Hearing Tonotopy is preserved throughout the auditory system, up to and including the auditory cortex. “The composition of a sound from separate tones can be compared to the way white light splits into separate colors when it passes through a prism.” John A.J. van Opstal (Al kijkend hoort men, 2006; p. 8)

15 Psychoacoustics ASA: Objects compared to Streams Streams play the same role in audition as objects do in vision. The conceptual point is that the notions of objects and streams require a mental representation of the visual and auditory input, respectively. The line drawing shows an object with three legs, or does it? Can the same occur in the auditory domain? Gestalt psychologists would argue that “laws” of perceptual organization are innate and transcend modality.

16 Psychoacoustics Gestalt Principles in Vision Proximity: –grouping of nearby dots Similarity: –grouping of similar dots Closure: –recognition of incomplete patterns Good continuation: –e.g. 2 lines crossing

17 Proximity Why perceive rows vs. columns? Psychoacoustics Gestalt Principles in Vision

18 Similarity Why perceive rows vs. columns? Psychoacoustics Gestalt Principles in Vision

19 Closure Psychoacoustics Gestalt Principles in Vision

20 Good continuation Psychoacoustics Gestalt Principles in Vision

21 Psychoacoustics Gestalt Principles in Audition ? Questions: 1. Which factors determine an “auditory object”? 2. Which cues separate “auditory objects” from each other? 3. Which rules organize our auditory perceptions? Seminal book: “Auditory Scene Analysis” (Albert S. Bregman, 1990)

22 Listeners are capable of parsing an acoustic scene (a complex sound) to form a mental representation of each sound source – stream – in the perceptual process of auditory scene analysis (Bregman, 1990) from events to streams Two conceptual processes of ASA: Segmentation. Decompose the acoustic mixture into sensory elements (segments) Grouping. Combine segments into streams, so that segments in the same stream originate from the same source Psychoacoustics Gestalt Principles in Audition ?
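
The two conceptual stages can be illustrated with a toy example. The sketch assumes segmentation has already produced tone segments given as (onset, frequency) pairs, and it groups them by frequency proximity alone. The 5-semitone threshold and the greedy assignment rule are assumptions made for illustration; this is not Bregman's model.

```python
import math
from typing import List, Tuple

Segment = Tuple[float, float]  # (onset time in s, frequency in Hz)


def semitone_distance(f1_hz: float, f2_hz: float) -> float:
    """Absolute pitch distance between two frequencies in semitones."""
    return abs(12.0 * math.log2(f1_hz / f2_hz))


def group_into_streams(segments: List[Segment], max_jump_semitones: float = 5.0) -> List[List[Segment]]:
    """Greedy grouping: each segment joins the stream whose most recent
    segment is closest in pitch, or starts a new stream if none is close enough."""
    streams: List[List[Segment]] = []
    for segment in sorted(segments):          # process in order of onset time
        best = None
        best_distance = max_jump_semitones
        for stream in streams:
            distance = semitone_distance(segment[1], stream[-1][1])
            if distance <= best_distance:
                best, best_distance = stream, distance
        if best is None:
            streams.append([segment])
        else:
            best.append(segment)
    return streams


# Hypothetical alternating low/high tone sequence.
segments = [(0.0, 400.0), (0.1, 1600.0), (0.2, 420.0),
            (0.3, 1500.0), (0.4, 440.0), (0.5, 1700.0)]
streams = group_into_streams(segments)
for stream in streams:
    print([freq for _, freq in stream])
```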

23 Psychoacoustics ASA: Objects compared to Streams AI CSAITT STIOTOS The upper pattern of letters does not appear to be meaningful. The lower pattern HAS meaning due to spatial segregation.

24 Psychoacoustics ASA: Objects compared to Streams Stream segregation in a Cycle of six tones: An example of good continuation / Proximity? http://www.psych.mcgill.ca/labs/auditory/bregmancd.html Compact disk of demonstrations of auditory scene analysis

25 Psychoacoustics ASA: Objects compared to Streams Stream segregation in a Cycle of six tones: An example of good continuation / Proximity?

26 Psychoacoustics ASA: Objects compared to Streams Stream segregation due to Frequency gap: An example of Good continuation/Proximity? Loss of rhythmic information as a result of stream segregation. When a repeating cycle breaks into two streams, the rhythm of the full sequence is lost and replaced by those of the component streams (Panel 1). This change can be heard clearly if the rhythm of the whole sequence is quite different from those of the component streams. In the present example, we use triplets of tones separated by silences, HLH-HLH-HLH-... (where H represents a high tone, L a low one, and the hyphen corresponds to a silence equal in duration to a single tone). We perceive this pattern as having a galloping rhythm. An interesting fact about this pattern is that when it breaks up into high and low streams, neither the high nor the low one has a galloping rhythm. We hear two concurrent streams of sound in each of which the tones are isochronous (equally spaced in time).
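
The loss of the galloping rhythm can be verified by listing tone onsets for the HLH- pattern and for each stream separately. This is a small sketch with hypothetical helper names; time is measured in units of one tone duration.

```python
from typing import List, Optional


def tone_onsets(pattern: str, n_cycles: int, only: Optional[str] = None) -> List[int]:
    """Onset times of the tones in a repeated pattern.
    '-' marks a silence; `only` restricts the result to one tone label."""
    sequence = pattern * n_cycles
    return [t for t, symbol in enumerate(sequence)
            if symbol != "-" and (only is None or symbol == only)]


def inter_onset_intervals(onsets: List[int]) -> List[int]:
    """Time differences between successive onsets."""
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]


pattern = "HLH-"
whole = inter_onset_intervals(tone_onsets(pattern, 3))
high = inter_onset_intervals(tone_onsets(pattern, 3, only="H"))
low = inter_onset_intervals(tone_onsets(pattern, 3, only="L"))

print(whole)  # uneven short-short-long intervals: the gallop
print(high)   # constant intervals: isochronous
print(low)    # constant intervals: isochronous
```

The full sequence has unequal intervals, while each stream taken on its own has a single constant interval, which is the isochrony described on the slide.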

27 Psychoacoustics ASA: Objects compared to Streams “Good continuation” dominates “Pitch proximity” dominates

28 Psychoacoustics ASA: Objects compared to Streams Stream segregation in a Cycle of six tones: An example of Good continuation/Proximity? http://www.psych.mcgill.ca/labs/auditory/bregmancd.html Compact disk of demonstrations of auditory scene analysis

29 Psychoacoustics ASA: Objects compared to Streams Gliding tone through a noise burst: An example of good continuation?

30 Psychoacoustics Gestalt Principles in Speech Perception offset synchrony onset synchrony common AM continuity “… pure pleasure … ” harmonicity

31 Psychoacoustics Gestalt Principles in Speech Perception offset synchrony onset synchrony continuity “… pure pleasure … ” harmonicity

32 Psychoacoustics Gestalt Principles in Audition Auditory peripheral processing amounts to a decomposition of the acoustic signal. ASA cues essentially reflect structural coherence of a sound source. A subset of cues believed to be strongly involved in ASA: Simultaneous organization: Periodicity, temporal modulation, onset. Sequential organization: Location, pitch contour and other source characteristics (e.g. vocal tract).

33 Psychoacoustics Gestalt Principles in Audition Sequential (temporal, melodic) integration proximity (pitch, time, location) similarity (timbre, loudness) lack of sudden changes Simultaneous (spectral, harmonic) integration simultaneity of onsets coherence of changes –frequency, SPL, spectral envelope harmonicity

34 Multiple positions have identical ILD, ITD Psychoacoustics Auditory versus Visual Scene Analysis
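
A simplified model shows why more than one position can share the same interaural time difference (ITD). The sketch assumes a distant source and two point receivers with no head between them, so ITD = d sin(azimuth) / c. The ear separation and speed of sound are assumed values, and the interaural level difference (ILD) is not modeled.

```python
import math

SPEED_OF_SOUND_M_S = 343.0   # assumed, air at about 20 degrees C
EAR_SEPARATION_M = 0.18      # assumed, illustrative value


def itd_seconds(azimuth_deg: float) -> float:
    """ITD for a distant source under a two-point, no-head model.
    Azimuth 0 = straight ahead, 90 = directly to the side, 180 = behind."""
    return EAR_SEPARATION_M * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND_M_S


front = itd_seconds(30.0)    # source in front, 30 degrees to the side
back = itd_seconds(150.0)    # mirror position behind the listener
print(front, back)
```

Because sin(30°) equals sin(150°), the front and back positions produce the same ITD in this model, so the cue alone cannot tell them apart.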

35 Psychoacoustics Auditory versus Visual Scene Analysis Elevation: in humans, mid-frequencies also exhibit a prominent notch that varies in frequency (6 to 11 kHz) with changes in sound source elevation. [Figure: amplitude (dB) versus frequency (kHz) for elevations from -40 to +60 deg.]

36 Psychoacoustics Auditory versus Visual Scene Analysis Distance estimation: determine how far away a sound is. Cue: the relative amounts of direct vs. reverberant energy; a closer sound has relatively more direct energy.
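
The direct-versus-reverberant cue can be expressed as a ratio in decibels. The sketch rests on two simplifying assumptions that are not stated on the slide: direct energy falls off with the inverse square of distance, and reverberant energy in the room is roughly independent of distance. All numeric values are illustrative.

```python
import math


def direct_to_reverberant_db(distance_m: float,
                             direct_energy_at_1m: float = 1.0,
                             reverberant_energy: float = 1.0) -> float:
    """Direct-to-reverberant energy ratio in dB under the stated assumptions."""
    if distance_m <= 0 or direct_energy_at_1m <= 0 or reverberant_energy <= 0:
        raise ValueError("all arguments must be positive")
    direct_energy = direct_energy_at_1m / distance_m ** 2   # inverse-square law
    return 10.0 * math.log10(direct_energy / reverberant_energy)


for distance_m in (1.0, 2.0, 4.0, 10.0):
    print(distance_m, round(direct_to_reverberant_db(distance_m), 2))
```

The ratio falls monotonically with distance, so a higher ratio indicates a closer source.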

37 Psychoacoustics Auditory versus Visual Scene Analysis Gestalt principles focus on similarities between the different modalities, such as vision and audition, but there are differences as well due to the difference in physical properties. In audition, sound-emitting properties rather than sound-reflecting properties of the environment are important. Sound is used to discover the time and frequency pattern of the source, not its spatial shape. In other words, acoustic events are transparent; they do not occlude energy from what lies behind. Echoes (reflections) obscure the original properties of sounds. Although echoes are delayed copies (containing all the original information), the superposition of the original sound and its echoes creates redundant information. Acoustic information can only be used effectively from large objects such as rooms or mountains. Only then are the effects of the environment between transmitter and receiver noticeable.

38 Psychoacoustics How does ASA work ? What does the brain Need/Do? Spectrogram: plot of log energy across time and frequency (linear frequency scale). Cochleogram: cochlear filtering by the gammatone filter bank (or other models of cochlear filtering), followed by a stage of nonlinear rectification; the latter corresponds to hair cell transduction, implemented by either a hair cell model or simple compression operations (log and cube root). The cochleogram has a quasi-logarithmic frequency scale, and its filter bandwidth is frequency-dependent. Previous work suggests better resilience to noise than the spectrogram. [Figure panels: Spectrogram, Cochleogram.]
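
The spectrogram definition above (log energy across time and frequency on a linear frequency scale) can be sketched with a windowed discrete Fourier transform. This is a deliberately slow standard-library version for illustration. The frame length, hop size, Hann window, and test tone are assumptions, and the gammatone-based cochleogram is not implemented here.

```python
import cmath
import math
from typing import List


def log_spectrogram(signal: List[float], frame_length: int, hop: int) -> List[List[float]]:
    """Log-energy spectrogram in dB.
    Rows are time frames; columns are linearly spaced frequency bins
    0 .. frame_length // 2 (bin k corresponds to k * sample_rate / frame_length Hz)."""
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / frame_length)
              for n in range(frame_length)]                  # Hann window
    n_bins = frame_length // 2 + 1
    rows = []
    for start in range(0, len(signal) - frame_length + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_length)]
        row = []
        for k in range(n_bins):
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_length)
                        for n in range(frame_length))        # naive DFT
            row.append(10.0 * math.log10(abs(coeff) ** 2 + 1e-12))  # log energy
        rows.append(row)
    return rows


sample_rate_hz = 8000   # assumed
tone_hz = 1000.0        # assumed test tone
signal = [math.sin(2.0 * math.pi * tone_hz * n / sample_rate_hz) for n in range(512)]

spec = log_spectrogram(signal, frame_length=64, hop=32)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), len(spec[0]), peak_bin * sample_rate_hz / 64)
```

A cochleogram would replace the equally spaced DFT bins with a bank of filters whose center frequencies and bandwidths grow with frequency, followed by compression such as a log or cube root.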

39 Psychoacoustics How does ASA work ? What does the brain Need/Do? Simulation of (a) the basilar membrane motion, (b) the neural activity pattern (NAP), and (c) the stabilized auditory image produced by a pulse train with a rate of 125 pulses per second. The narrow, low-frequency filters isolate individual harmonics of 125 Hz; the broader high-frequency filters emit impulse responses (a). The transduction process compresses the dynamic range and sharpens the features in the pattern at the same time. The temporal integration mechanism stabilizes the pattern and removes global phase differences (c). The auditory image produced in response to a single acoustic pulse is shown on an expanded time scale in (d). NAP in response to repetitive clicks (b) NAP in response to a single click (d)
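
Whether a filter isolates a single harmonic of the 125 Hz pulse train depends on how its bandwidth compares with the harmonic spacing. The sketch uses the Glasberg and Moore equivalent rectangular bandwidth (ERB) approximation as a stand-in for the auditory filter bandwidth. That formula is an assumption brought in for illustration and is not taken from the slides, and the "one harmonic per ERB" criterion is only a rough rule of thumb.

```python
def erb_hz(center_hz: float) -> float:
    """Equivalent rectangular bandwidth (Glasberg and Moore approximation)."""
    return 24.7 * (4.37 * center_hz / 1000.0 + 1.0)


PULSE_RATE_HZ = 125.0   # from the slide: harmonics are spaced 125 Hz apart


def harmonics_per_filter(center_hz: float) -> float:
    """Rough number of harmonics falling inside one filter's bandwidth."""
    return erb_hz(center_hz) / PULSE_RATE_HZ


for center_hz in (250.0, 500.0, 1000.0, 2000.0, 4000.0):
    print(center_hz, round(erb_hz(center_hz), 1), round(harmonics_per_filter(center_hz), 2))
```

Values below 1 correspond to narrow low-frequency filters that pass one harmonic at a time; values above 1 correspond to broad high-frequency filters in which several harmonics interact and the output resembles a train of impulse responses.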

40 Psychoacoustics How does ASA work ? What does the brain Need/Do?

41 Psychoacoustics Fundamental problem of ASA Auditory scene analysis requires: Analysis over long time windows Analysis over broad spectral widths A sensitive auditory system requires: Analysis over very short time windows Analysis over narrow frequency bands

42 Psychoacoustics Fundamental problem of ASA Auditory scene analysis requires: Analysis over long time windows Analysis over broad spectral widths A sensitive auditory system requires: Analysis over very short time windows Analysis over narrow frequency bands
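
The tension between these requirements can be made concrete for a Fourier-style analysis, where a window of duration T gives a frequency resolution of about 1/T. The sketch below is an illustration under that assumption; the window durations are arbitrary, and the auditory system is not claimed to work this way.

```python
from typing import Tuple


def analysis_resolution(window_s: float) -> Tuple[float, float]:
    """Return (time resolution in s, frequency resolution in Hz)
    for a Fourier-style analysis window of the given duration."""
    if window_s <= 0:
        raise ValueError("window duration must be positive")
    return window_s, 1.0 / window_s


for window_s in (0.005, 0.05, 0.5):   # assumed window durations
    time_res_s, freq_res_hz = analysis_resolution(window_s)
    print(f"{time_res_s * 1000:.0f} ms window -> about {freq_res_hz:.0f} Hz resolution")
```

A short window resolves timing finely but blurs frequency, and a long window does the opposite, so no single window serves both sets of requirements.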

