
Spatially Relocated Frequencies and Their Effect on the Localization of a Stereo Image This research was supported by Delphi Automotive Systems Robert G. Hartman May 2003 Copyright © 2003 Rob Hartman All rights reserved

Content Introduction “Hard to localize?” Localization Cues Localization Cue Salience Auditory Scene Analysis Experimentation Results and Analysis Conclusions References

Introduction

Introduction Rmic Rspkr Lmic Lspkr

? vs. ? Introduction Tweeter Midrange / Woofer Tweeter

Demonstration: Note piano shift from center to right

“Hard to localize?”

“Hard to localize?” Popular opinion: “low” frequencies / subwoofers. Smyth [1] states: “experimental evidence suggests that it is difficult to localize mid-to-high frequency signals above about 2.5 kHz, and therefore any stereo imagery is largely dependent on the accurate reproduction of only the low-frequency components of the audio signal” (p. 18). Minimum Audible Angle (MAA) tests [2, 3]: generally, humans are least sensitive to spatial changes in the “middle frequency” (2-4 kHz) range.

“Hard to localize?” Do certain conditions make it harder? Steady/continuous sounds are harder to localize than impulsive sounds (which provide transient ILD and ITD localization cues). Narrow-band signals (octaves, sinusoids, etc.) are harder to localize than wide-band (complex) signals: less bandwidth means fewer cues to compare. Frequency range and the acoustic space: in a free field or an anechoic space, middle-frequency tones are harder to localize than low or high ones [2, 3]; room modes and reflections most affect the ability to localize low frequencies [4].

“Hard to localize?” Ultimately, localization is a complex process. It depends on the type of localization cues present: the physical presence of cues and the agreement between them, their psychoacoustical importance relative to one another, and the correlation between sources (multi-source). These factors vary with the spectral content of the sources and their positions relative to the listener.

Localization Cues

Localization Cues 1) Interaural Time Differences (ITD) Due to the constant speed of sound combined with path-length differences to the ears. Variants: Arrival (IATD), Phase (IPD), Envelope (IETD).
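The path-length idea above can be sketched numerically. A minimal model, assuming a rigid spherical head of radius ~8.75 cm (the classic Woodworth approximation — a textbook model, not a formula stated on the slide):

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, c=343.0):
    """Approximate ITD (seconds) for a spherical head (Woodworth model).

    Assumed model: ITD = (a/c) * (theta + sin(theta)), combining the
    straight-line and around-the-head path differences to the two ears.
    """
    theta = math.radians(azimuth_deg)
    return (head_radius_m / c) * (theta + math.sin(theta))

# ITD grows from zero on the median plane toward the side of the head
print(round(woodworth_itd(0) * 1e6))    # prints 0 (us)
print(round(woodworth_itd(90) * 1e6))   # prints 656 (us)
```

The ~650 µs value at 90° matches the full-lateralization ITD range quoted later in the talk (630-1000 µs).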

Localization Cues 2) Interaural Level Differences (ILD) Due to acoustical interaction of sound with the head and body. ILD varies significantly with frequency

Localization Cues 3) Monaural / Pinnae Cues Spectral influences in the 5-12 kHz range. Differentiate sources with the same ILD or ITD cues (cone of confusion). Help avoid front/back confusions and determine the vertical height of sources. Least dominant cue; head turns can provide similar information [5].

Localization Cue Salience

Localization Cue Salience Salience based on Physical and Perceptual factors Physical variations ITD due to Spectral Content

Localization Cue Salience Physical variations cont. ITD due to Spatial Position

Localization Cue Salience Physical variations cont. ILD due to Spectral and Spatial factors

Localization Cue Salience Physical variations cont. In reality, complex patterns exist for spectral and spatial (azimuth) variations [6]

Localization Cue Salience Perceptual Salience Assuming the physical levels of cues are identical, salience depends on the spectral content and relative dominance of the cues. “Trading experiments”: tests which remove any physical limitations and study only perceptual sensitivity [6]. Sensitivity to IPD is greatest for f < 800 Hz; above this, its influence is reduced, having no effect above 1.6 kHz. ILD has impact over the entire frequency range, with a slight increase in sensitivity around 2 kHz.

Localization Cue Salience “Trading experiments” cont. Generally, low-frequency ITDs dominate, followed by ILDs; pinnae cues have the least influence. For low frequencies, 40 µs of ITD is traded against 1 dB of ILD, whereas higher frequencies exhibit only ILD sensitivity. Full lateral displacement occurs at 630-1000 µs ITD or 15-20 dB ILD.
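The trading figures on this slide can be turned into a toy calculation. The 40 µs/dB ratio and the lateralization limits come directly from the slide; the function name is mine, and the ratio itself varies across studies:

```python
# Time-intensity trading at low frequencies: the slide's figure of
# 40 us of ITD being perceptually equivalent to 1 dB of ILD.
TRADING_RATIO_US_PER_DB = 40.0  # slide value; study-dependent

def equivalent_ild_db(itd_us):
    """ILD (dB) judged equivalent to a given low-frequency ITD (us)."""
    return itd_us / TRADING_RATIO_US_PER_DB

# Full lateralization begins around 630 us ITD (slide: 630-1000 us,
# or 15-20 dB ILD); the trading ratio lands inside that dB range:
print(equivalent_ild_db(630))   # prints 15.75
```

The consistency check — 630 µs / (40 µs/dB) = 15.75 dB, inside the quoted 15-20 dB span — is the point of the sketch.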

Localization Cue Salience Summary figures

Auditory Scene Analysis

Auditory Scene Analysis It is important to consider the effect that multiple “streams” of sound can have on the resulting stereo image. Depending on the temporal and spectral correlation of the streams, the resulting image could SHIFT (summing localization), SPLIT, WIDEN, etc.

Auditory Scene Analysis Precedence Effect Perceptually suppresses similar events occurring 20-30 msec after the original event. Delayed events still affect the perceived location, as has been shown (summing localization). Above 30 msec, audible “echoes” begin to occur.

Auditory Scene Analysis Auditory Stream Segregation [8] Cocktail party effect. Temporal interrelationship: increased time differences cause segregation (precedence effect). Relative similarity of fundamental frequencies: increased difference in perceived pitch causes segregation (binaural beats, etc.). Spectral distribution (harmonics): timbre helps differentiate similar instruments. Perceptual location of the auditory events: sounds with the above differences are more likely to be segregated when their perceived spatial locations do not coincide.

Experimentation

Experimentation The test setup was a stereo pair with an additional offset “Spatially Relocated” (SR) speaker.

Experimentation Listeners were asked to comment on the relative shift of the central stereo image caused by moving “low” and “high” frequency bands from the L channel to the SR channel.

Experimentation Frequency bands were chosen based on known localization cue performance [6] Band A = 20 – 800 Hz (dominant ITDs) Band B = 800 – 1,600 Hz (reduced ITDs) Band C = 1,600 – 5,000 Hz (reduced ILD) Band D = 5,000 – 12,000 Hz (ILD / pinnae) Band E = 12,000 – 20,000 Hz (dominant ILD)
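A hypothetical reconstruction of this band split in Python. The slide gives only the band edges; the filter type (Butterworth), order, and sample rate here are assumptions:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

# Band edges (Hz) from the slide; filter design is assumed, not stated.
BANDS = {"A": (20, 800), "B": (800, 1600), "C": (1600, 5000),
         "D": (5000, 12000), "E": (12000, 20000)}

def split_bands(x, fs=48000, order=4):
    """Split signal x into the five experiment bands (zero-phase filtering)."""
    out = {}
    for name, (lo, hi) in BANDS.items():
        hi = min(hi, 0.999 * fs / 2)  # keep the upper edge below Nyquist
        sos = butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
        out[name] = sosfiltfilt(sos, x)
    return out

rng = np.random.default_rng(0)
noise = rng.standard_normal(48000)    # 1 s of white noise at 48 kHz
bands = split_bands(noise)
print(sorted(bands))                  # prints ['A', 'B', 'C', 'D', 'E']
```

Summing band A into the SR channel while the others stay in L/R would reproduce one of the test conditions below.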

Experimentation Actual test variables compare moving “high” vs. “low” frequency bands to the SR channel: E vs. Stereo (STR), E vs. A, E vs. AB; DE vs. STR, DE vs. A, DE vs. AB, DE vs. ABC; CDE vs. STR, CDE vs. A, CDE vs. AB. Test signals: the ideal signal was spectrally-balanced white noise; a music track was also used, despite its typically “low levels” of high-frequency energy.

Demonstration: White Noise Bursts (E vs. A) Candy Perfume Girl (E vs. A) What is Hip? (E vs. A)

Results and Analysis


Results and Analysis Noise Results Confident “no shift” for E vs. STR and “right” for E vs. A: Band A causes more shift than Band E. DE vs. STR suggests DE is just right of STR, whereas A is definitely right of DE: Band A causes more shift than band DE. CDE is right of STR, but A is also to the right of CDE: Band A causes more shift than band CDE!


Results and Analysis Music Results Confident “no shift” for E vs. STR and “right” for E vs. A: Band A causes more shift than Band E. Confident “no shift” for DE vs. STR and “right” for DE vs. A: the spectrogram shows low energy in band DE. CDE is “right” of STR, but the A judgment is less confident: while band CDE does have some energy, it is less than in the white noise; thus CDE does not cause as much shift.

Results and Analysis How does moving bands to the SR channel affect the localization cues? The change in azimuth (15°) creates a new path to the ears. IATD will decrease, due to a smaller path difference. IPD is more complex because of its dependence on spectral content. ILD is expected to change minimally, more for HF than LF.
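The IATD geometry can be sketched with a free-field, two-point-ear model — a simplification that ignores head diffraction. The 2 m source distance and 17.5 cm ear separation are assumptions for illustration, not values from the talk:

```python
import math

def iatd_us(azimuth_deg, dist_m=2.0, ear_sep_m=0.175, c=343.0):
    """Interaural arrival-time difference (us) from straight-line path
    lengths to two point ears; no head shadow or diffraction modeled."""
    th = math.radians(azimuth_deg)
    sx, sy = dist_m * math.sin(th), dist_m * math.cos(th)  # source position
    half = ear_sep_m / 2
    d_left = math.hypot(sx + half, sy)    # path to the far (left) ear
    d_right = math.hypot(sx - half, sy)   # path to the near (right) ear
    return (d_left - d_right) / c * 1e6

# A 15-degree azimuth change (e.g., 30 -> 45 deg) alters the path
# difference, and hence the arrival-time difference, at the ears:
print(round(iatd_us(30)), round(iatd_us(45)))   # prints 255 361
```

Whether the IATD grows or shrinks depends on where the SR speaker sits relative to the original L speaker; the model only shows the dependence on azimuth.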

Results and Analysis Why do the low-frequency (LF) bands create further shifts of the stereo image than the high-frequency (HF) bands? SR band loudness? Are the LF bands louder than the HF bands? Type of test signals? Would music with more high-frequency energy produce results similar to the white-noise test track? Localization cue salience? Is the relative dominance of LF vs. HF cues the cause?

Results and Analysis SR Band LOUDNESS Calculated using Stevens' Mark VII method [7]: Band E is 1.5 times louder than band A! Also performed “loudness” listening experiments showing similar results. HF bands are typically louder than LF bands.

Demonstration: SR Band Loudness (E, A, DE, AB, ABC)

Results and Analysis Type of Test Signals To study the apparent difficulty in noticing the spatial relocation of high frequencies, ABX testing was performed with two music tracks, comparing “stereo” (A) against shifting the left signal's HF (> 10 kHz) to the SR channel (B).

Results and Analysis Type of Test Signals Sampled 52 critical-listening music tracks and chose the tracks with the greatest energy above 10 kHz. Only ~3% of average energy (i.e., large quantities of HF energy are not common in music). Results show an inability to reliably notice HF spatial relocation (39% and 49% correct responses, no better than guessing).
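Whether a 39% score is credibly below chance depends on the trial count, which the talk does not report. A quick binomial sketch with a hypothetical 100 trials per condition:

```python
from math import comb

def binomial_p_at_most(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p): one-sided probability of scoring
    k or fewer correct by pure guessing in an n-trial ABX test."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Hypothetical 100 trials (not from the talk): how surprising is 39/100
# or 49/100 correct under guessing?
print(round(binomial_p_at_most(39, 100), 3))
print(round(binomial_p_at_most(49, 100), 3))
```

With 100 trials, 49/100 is entirely consistent with guessing, while 39/100 would be unusually low — at typical ABX trial counts, though, "no better than guessing" is the safe reading of both scores.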

Results and Analysis Localization Cue Salience? Moving LF bands affects ITD cues with minimal ILD changes. Moving HF bands affects ILD cues with negligible ITD changes. ITD cues are the most dominant [6] and could be expected to create more noticeable changes to the stereo image.

Conclusions

Conclusions Which sounds are “hard to localize?” Continuous sounds are more difficult than impulsive ones. Low frequencies in a reflecting room. Middle frequencies (2-4 kHz), over high or low, in an anechoic room. Narrow-band signals (octaves, sinusoids) are more difficult than wideband ones (more cues to compare gives a better sense of localization).

Conclusions How do we localize? Most sensitive to ITDs (particularly IPD), caused by spectral content and path-length differences between the ears. Lesser sensitivity to ILDs, although most sensitive in the mid-frequency range (~2 kHz). Least sensitive to monaural/pinnae cues, which help avoid front/back confusion and determine the height of a sound; head tilt and turn provide similar information.

Conclusions The RESULTS Will the LF or HF bands cause a greater shift of the stereo image? Moving LF bands caused more shift than HF bands with white noise. Moving band “E” (> 12 kHz) was typically not noticed. In ABX music testing, moving HF energy (> 10 kHz) was difficult to discern (no better than guessing). Is this explainable? NOT due to band loudness: the HF bands were shown to be LOUDER than the LF bands, and large proportions of HF energy are uncommon in music. Most likely due to the relative dominance of low-frequency ITD cues.

Conclusions Future Research Mono tweeter system: HF image shift due to improper ILD cues; using HRTFs may help reduce noticeability; overall perception still dominated by the “stereo” low-mid range speakers. Simpler experiments: relocate frequencies vertically instead of horizontally, which will create less noticeable shifts. Different variables: “equally loud” bands, different loudspeakers, the acoustic space, varied speaker locations. A more “scientific” approach with analysis of ear recordings.

Thank you for your participation! Questions & Answers

References
[1] Smyth, M. (1999). White paper: An overview of the coherent acoustics coding system. Retrieved October 1, 2001, from http://www.dtsonline.com/whitepaper.pdf
[2] Mills, A. W. (1958). On the minimum audible angle. J. Acoust. Soc. Am., 30, 237-246.
[3] Stevens, S. S., & Newman, E. B. (1936). Localization of actual sources of sound. Amer. J. Psychol., 48, 297-306.
[4] Hartmann, W. M. (1983). Localization of sound in rooms. J. Acoust. Soc. Am., 74(5), 1380-1392.
[5] Fisher, H., & Freedman, S. J. (1968). The role of the pinna in auditory localization. J. Audi. Res., 8, 15-26.
[6] Blauert, J. (1999). Spatial hearing: The psychophysics of human sound localization. Cambridge, Mass.: The MIT Press. (Original work published 1974)
[7] Stevens, S. S. (1972). Perceived level of noise by Mark VII and decibels (E). J. Acoust. Soc. Am., 51, 575-601.
[8] Bregman, A. S. (1999). Auditory scene analysis: The perceptual organization of sound. Cambridge, Mass.: MIT Press.