1 Source Localization in Complex Listening Situations: Selection of Binaural Cues Based on Interaural Coherence. Christof Faller (Mobile Terminals Division, Agere Systems) and Juha Merimaa (Institut für Kommunikationsakustik, Ruhr-Universität Bochum)

2 Complex listening situations. [Figure labels: Jazz; "Blaah, blaah, blaah"; Hum] Speech source at -15°, good music at 50°, and noise through an open door at -125° azimuth

3 This work. A model to extract binaural cues corresponding to human localization performance in several complex listening situations

4 Outline. 1. Model description 2. Simulation results A) Independent sources in free field B) Precedence effect C) Independent sources and reverberation 3. Comparison with earlier models 4. Summary

5 [Figure: block diagram of the model. Labels: Stimulus 1 ... Stimulus N; HRTF/BRIR 1 ... HRTF/BRIR N for each ear; left ear input; right ear input; gammatone filterbank; model of neural transduction; internal noise; normalized cross-correlation and level difference calculation; exponential time window, 10 ms; Bernstein et al. 1999]

6 Extraction of binaural cues. Estimated at each time instant: – Interaural time difference (ITD): time lag of the maximum of the normalized cross-correlation – Interaural level difference (ILD): ratio of signal energies within the time window – Interaural coherence (IC): maximum of the normalized cross-correlation
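
The three cue definitions above can be sketched in a few lines of Python. This is a simplified illustration, not the presented model: it skips the gammatone filterbank, the neural transduction stage and the internal noise, and evaluates the cues only at the last sample. The function name `binaural_cues`, the sign convention (positive ITD means the left signal leads) and the ILD direction (left relative to right) are assumptions made for this example.

```python
import math

def binaural_cues(left, right, fs, max_lag, window_s=0.010):
    """Estimate (ITD in s, ILD in dB, IC) at the last sample of two ear signals.

    Assumed conventions (not from the slides): a positive ITD means the left
    signal leads, and ILD is the left level relative to the right level.
    """
    n_last = len(left) - 1
    decay = math.exp(-1.0 / (window_s * fs))  # per-sample forgetting factor
    # Exponential weights: the most recent sample gets weight 1.
    weights = [decay ** (n_last - n) for n in range(len(left))]

    best_lag, best_corr = 0, -1.0
    for lag in range(-max_lag, max_lag + 1):
        a12 = a11 = a22 = 0.0
        for n in range(max_lag, n_last + 1):
            x1 = left[n - max(lag, 0)]
            x2 = right[n - max(-lag, 0)]
            a12 += weights[n] * x1 * x2
            a11 += weights[n] * x1 * x1
            a22 += weights[n] * x2 * x2
        denom = math.sqrt(a11 * a22)
        corr = a12 / denom if denom > 0.0 else 0.0
        if corr > best_corr:  # IC is the maximum, ITD is its lag
            best_lag, best_corr = lag, corr

    # ILD from exponentially weighted energies (assumes non-silent signals).
    e_left = sum(w * x * x for w, x in zip(weights, left))
    e_right = sum(w * x * x for w, x in zip(weights, right))
    ild_db = 10.0 * math.log10(e_left / e_right)
    return best_lag / fs, ild_db, best_corr

# Toy input: a 500 Hz tone, with the right ear delayed by 3 samples at half amplitude.
fs = 16000
left = [math.sin(2 * math.pi * 500 * n / fs) for n in range(800)]
right = [0.0] * 3 + [0.5 * v for v in left[:-3]]
itd, ild, ic = binaural_cues(left, right, fs, max_lag=8)
```

For this toy input the estimated ITD is 3 samples (187.5 μs), the IC is close to 1, and the ILD is close to 6 dB, as expected for a halved amplitude. The lag range is kept below half the tone period so that the correlation maximum is unambiguous.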

7 Assumption for correct localization. The auditory system needs to acquire ITD and ILD cues similar to those evoked by each source separately in an anechoic environment

8 Example: Two active sound sources. Superposition with different level and phase relations at the left and right ears. For independent or non-stationary source signals: – Time-varying binaural cues – Reduced IC

9 How to obtain correct localization cues? Simply select ITDs and ILDs only when IC is above a set threshold – An adaptive threshold is assumed
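
A minimal sketch of this selection rule, assuming the cues have already been estimated per time instant. The function name `select_cues` is hypothetical, and the threshold is passed in as a fixed number because the slides do not specify how the adaptive threshold is set.

```python
def select_cues(itds, ilds, ics, threshold):
    """Keep (ITD, ILD) pairs only at time instants where IC exceeds the threshold."""
    return [
        (itd, ild)
        for itd, ild, ic in zip(itds, ilds, ics)
        if ic > threshold
    ]

# Toy data: three time instants, only the first and last have high coherence.
itds_us = [400.0, -120.0, 410.0]
ilds_db = [6.0, -1.5, 5.8]
ics = [0.995, 0.80, 0.992]
selected = select_cues(itds_us, ilds_db, ics, threshold=0.99)
# selected == [(400.0, 6.0), (410.0, 5.8)]
```

The low-coherence instant, where the cues do not match either value seen at high coherence, is discarded.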

10 Simulation results

11 1. Effect of number of sources. Speech sources at the same overall level (Hawley et al. 1999; Drullman & Bronkhorst 2000) – One or two distracters have little effect on localization performance – Performance is still good for 5 competing sources. Simulations with different phonetically balanced sentences recorded by the same male speaker

12 Two talkers, ±40° azimuth. 65 and 58 % selected signal power
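
The slides do not define "selected signal power". One plausible reading, assumed here, is the share of total signal power that falls on time instants whose IC exceeds the threshold. The helper `selected_power_percent` is a hypothetical name for a sketch of that computation.

```python
def selected_power_percent(powers, ics, threshold):
    """Percentage of total power at time instants where IC exceeds the threshold."""
    total = sum(powers)
    if total == 0:
        return 0.0
    selected = sum(p for p, ic in zip(powers, ics) if ic > threshold)
    return 100.0 * selected / total

# Toy data: instants 0 and 2 pass the threshold and carry 1 + 4 of 10 power units.
powers = [1.0, 3.0, 4.0, 2.0]
ics = [0.995, 0.90, 0.999, 0.98]
share = selected_power_percent(powers, ics, threshold=0.99)  # 50.0
```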

13 3 and 5 talkers. Simulated at 500 Hz critical band. 3 talkers: 0° and ±40° azimuth. 5 talkers: 0°, ±40°, and ±80° azimuth

14 [Figure panels: All cues; Selected cues] 3 talkers: c_0 = 0.99, p_0 = 54 %. 5 talkers: c_0 = 0.99, p_0 = 22 %

15 2. Effect of target-to-distracter ratio. Click-train target in presence of a white noise distracter – Target is localizable down to a few dB above detection threshold (Good & Gilkey 1996; Good et al. 1997) – High frequencies are more important for localization (Lorenzi et al. 1999)

16 Simulation. 2 kHz critical band. White noise at 0° azimuth. 100 Hz click train at 30° azimuth. -3, -9, and -21 dB absolute target-to-distracter ratios (T/D) – Corresponds to 8, 2, and -10 dB T/D relative to detection threshold, as defined by Good & Gilkey (1996)
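
The relation between the absolute and relative T/D values is a fixed offset, which a short check makes explicit. The detection threshold of -11 dB absolute T/D is inferred here by subtraction and is not stated on the slide.

```python
absolute_td_db = [-3, -9, -21]   # absolute target-to-distracter ratios
relative_td_db = [8, 2, -10]     # same conditions, relative to detection threshold

# Absolute T/D minus relative T/D gives the implied detection threshold.
implied_threshold_db = [a - r for a, r in zip(absolute_td_db, relative_td_db)]
# implied_threshold_db == [-11, -11, -11]
```

All three conditions give the same offset, so the two scales are consistent with each other.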

17 [Figure panels: All cues; Selected cues] -3 dB T/D: c_0 = 0.990, p_0 = 3 %. -9 dB T/D: c_0 = 0.992, p_0 = 9 %. -21 dB T/D: c_0 = 0.992, p_0 = 99 %

18 Precedence effect. Perception of subsequent sound events – Fusion – Localization dominance by the first event – Suppression of directional discrimination of later events. Depends on interstimulus delay – Summing localization (approx. 0-1 ms) – Localization dominance by first event (stimulus dependent, until 2-50 ms) – Independent localization

19 1. Click pairs. Classical precedence effect experiment: two consecutive clicks with the same level from different directions

20 Lead: 40°, lag: -40°, ICI: 5 ms

21 Click pairs as a function of inter-click interval (ICI). Simulations for ICI between 0 and 20 ms. Same click sources: ±40° azimuth. 500 Hz critical band. A single threshold did not predict all cases correctly – Threshold was determined for each ICI such that the standard deviation of ITD is 15 μs
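
One way to determine such a per-ICI threshold is a simple scan over candidate values. The sketch below is an assumption about the procedure, not the authors' method: it returns the lowest candidate for which the standard deviation of the selected ITDs is at or below the target, and it requires at least two selected values. The function name and the candidate grid are hypothetical.

```python
import statistics

def lowest_threshold_for_itd_spread(itds_us, ics, target_std_us=15.0, candidates=None):
    """Return the lowest IC threshold whose selected ITDs have std <= target, or None."""
    if candidates is None:
        # Hypothetical search grid: 0.900, 0.901, ..., 0.999
        candidates = [round(0.90 + 0.001 * k, 3) for k in range(100)]
    for c0 in candidates:
        selected = [itd for itd, ic in zip(itds_us, ics) if ic > c0]
        if len(selected) >= 2 and statistics.pstdev(selected) <= target_std_us:
            return c0
    return None

# Toy data: three consistent high-IC cues near 400 us and two low-IC outliers.
itds_us = [400.0, 410.0, 390.0, -300.0, 0.0]
ics = [0.999, 0.998, 0.997, 0.95, 0.96]
c0 = lowest_threshold_for_itd_spread(itds_us, ics, candidates=[0.90, 0.95, 0.97, 0.99])
# c0 == 0.97: only then are both outliers rejected and the ITD spread below 15 us
```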

22 Click pairs as a function of ICI

23

24 Note on cross-frequency processing. At certain small ICIs the required IC threshold gets very high – Anomalies of the precedence effect have been reported for bandpass-filtered clicks (Blauert & Cobben 1978). Some characteristic power peaks occur at different ICIs in different critical bands. Processing across frequency bands would allow extraction of correct cues

25 2. Sinusoidal tones and a reflection. Steady-state cues are a result of coherent summation of sound at the ears of a listener. Localization depends on onset rate (Rakerd & Hartmann 1986) – Correct localization with a fast onset – Localization based on misleading steady-state cues for tones with a slow onset

26 Sinusoidal tones: Simulation. 500 Hz sinusoidal tone. Direct sound from 0° azimuth. Reflection after 1.4 ms from 30°. Linear onset ramp. Steady-state level of 65 dB SPL
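
Coherent summation of a tone and its delayed copy can be illustrated with phasor addition: two sinusoids of the same frequency sum to one sinusoid whose amplitude and phase depend on the delay. The sketch below evaluates this at a single point for the 500 Hz tone and 1.4 ms delay given above. It ignores the head-related filtering that makes the sum differ between the two ears, and the reflection gain of 1.0 is an assumption, since the slide does not give the reflection level.

```python
import cmath
import math

def steady_state_phasor(freq_hz, delay_s, reflection_gain=1.0):
    """Amplitude and phase (rad) of a unit tone plus a delayed, scaled copy."""
    direct = 1.0 + 0.0j
    reflection = reflection_gain * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    total = direct + reflection
    return abs(total), cmath.phase(total)

amplitude, phase = steady_state_phasor(500.0, 1.4e-3)
# The delay is 0.7 of a period, so the equal-level sum has amplitude
# 2*|cos(0.7*pi)| (about 1.18) and a phase of 0.3*pi relative to the direct sound.
```

Because the resulting amplitude and phase depend on the path delays, which differ at the two ears, the steady-state ITD and ILD need not match those of either the direct sound or the reflection.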

27

28 Sinusoidal tones: Results. The model cannot as such explain discounting of the steady-state cues. Dependence on onset rate can be explained by considering cues at the time when the signal level gets high enough above the internal noise

29 Independent sources and reverberation. Final test for the model. Simulation at 2 kHz critical band – One speech source at 30° azimuth – Two speech sources at ±30° azimuth. BRIRs measured in a hall with RT = 1.4 s at 2 kHz octave band

30 [Figure panels: All cues; Selected cues] 1 talker: c_0 = 0.99, p_0 = 1 %. 2 talkers: c_0 = 0.99, p_0 = 1 %

31 Comparison with earlier models

32 Weighting of localization cues with signal power. Not done outside the 10 ms analysis window. Contribution of each time instant to localization is defined by IC. The model can neglect information corresponding to high power when it is due to concurrent activity of several sources. Power still affects how often ITDs and ILDs of individual sources are sampled

33 Lindemann (1986). Based on contralateral inhibition using a fixed (10 ms) time constant. Tends to hold cross-correlation peaks with high IC. Differences – Operation of the cue selection method is not limited to the 10 ms time window – When necessary (complex situations), the “memory” of past cues can last longer

34 Zurek (1987). Localization inhibition controlled by onset detection. In precedence effect conditions, the cue selection naturally derives most localization cues from onsets. Differences – Cue selection is not limited to getting information from signal onsets

35 Summary. A method was proposed for modeling auditory localization in the presence of concurrent sound. ITD and ILD cues are selected only when they coincide with a large IC. Operation of the model was verified with results of several psychoacoustical studies from the literature

36 Thank you!

