Pitch perception in auditory scenes

2 Papers on pitch perception…
– of a single sound source: LOTS (too many?)
– of more than one sound source: almost none (± a few)

3 Need for sound segregation
– Ears receive a mixture of sounds
– We hear each sound source as having its own appropriate timbre, pitch and location
– Stored information about sounds concerns a single source

4 Bach: Musical Offering (strings)

5 Talk outline
– Low-level grouping cues in pitch perception:
  – harmonic structure
  – conjoint or disjoint allocation
  – onset-time
  – context
  – spatial cues
– Use of a difference in harmonic structure between two sound sources to help identify, track over time, and localise simultaneous sound sources.

6 Excitation pattern of a complex tone on the basilar membrane [figure: excitation from base to apex on a roughly logarithmic frequency axis; low harmonics are resolved, high harmonics unresolved. Insets show the output of a 1600-Hz filter and of a 200-Hz filter, each repeating with period 1/200 s = 5 ms]

7 Mistuned harmonic's contribution to pitch declines as a Gaussian function of mistuning. Experimental evidence from mistuning experiments: Moore, Glasberg & Peters, JASA (1985); Darwin (1992), in M. E. H. Schouten (Ed.), The Auditory Processing of Speech: From Sounds to Words, Berlin: Mouton de Gruyter. [figure: matched low pitch; ΔF0 (Hz) against frequency of the mistuned harmonic f (Hz)] Fitted function: ΔF0 = a − k·Δf·exp(−Δf² / 2s²), with s = 19.9, k = 0.073.
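A minimal numeric sketch of that fitted function, in Python. The values of s and k are those quoted above; the offset a is not given on the slide, so it is assumed to be zero here.

```python
import numpy as np

def pitch_shift(delta_f_hz, a=0.0, k=0.073, s=19.9):
    """Pitch shift of the complex (Hz) for a mistuning delta_f_hz (Hz) of one
    harmonic, using the form on the slide:
        dF0 = a - k * df * exp(-df**2 / (2 * s**2)).
    The offset a is not quoted on the slide, so 0 is assumed."""
    return a - k * delta_f_hz * np.exp(-delta_f_hz**2 / (2.0 * s**2))

# The mistuned harmonic's contribution peaks at |delta_f| = s (about 20 Hz)
# and dies away for larger mistunings.
print(np.round(pitch_shift(np.arange(0.0, 81.0, 10.0)), 3))
```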

8 Harmonic sieve: only consider frequencies that are close enough to a harmonic. Useful as a front-end to a Goldstein-type model of pitch perception. Duifhuis, Willems & Sluyter, JASA (1982). [figure: sieve with slots at the harmonic spacing along the frequency axis (Hz); components falling outside a slot are blocked]
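A harmonic sieve can be sketched in a few lines. This is only an illustration, not the Duifhuis et al. implementation; the 3% tolerance is borrowed from the summary slide later in the talk.

```python
import numpy as np

def harmonic_sieve(components_hz, f0_hz, tolerance=0.03):
    """Keep only components whose frequency lies within `tolerance`
    (proportional, here 3%) of an integer multiple of the candidate F0;
    everything else is 'blocked' and excluded from the pitch estimate."""
    components_hz = np.asarray(components_hz, dtype=float)
    nearest_harmonic = np.round(components_hz / f0_hz) * f0_hz
    passes = np.abs(components_hz - nearest_harmonic) <= tolerance * nearest_harmonic
    return components_hz[passes]

# A 200-Hz complex whose 4th harmonic is mistuned by 8%: the 864-Hz component
# falls between sieve slots and is blocked.
print(harmonic_sieve([200, 400, 600, 864, 1000], f0_hz=200))
```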

9 Is a "harmonic sieve" necessary with autocorrelation models?
– Autocorrelation could in principle explain the mistuning effect:
  – the mistuned harmonic initially shifts the autocorrelation peak
  – then produces its own peak
– But the numbers do not work out:
  – the Meddis & Hewitt model is too tolerant of mistuning.
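For comparison, here is a bare summary-autocorrelation pitch estimate. It is a sketch only, not the Meddis & Hewitt model (which adds an auditory filterbank and hair-cell stages), but it shows how little a plain autocorrelator is disturbed by one mistuned harmonic.

```python
import numpy as np

def autocorrelation_pitch(signal, fs, f0_min=80.0, f0_max=400.0):
    """Return the frequency corresponding to the lag with the largest
    autocorrelation within the plausible F0 range."""
    signal = signal - np.mean(signal)
    acf = np.correlate(signal, signal, mode='full')[len(signal) - 1:]
    lo, hi = int(fs / f0_max), int(fs / f0_min)
    best_lag = lo + int(np.argmax(acf[lo:hi + 1]))
    return fs / best_lag

fs = 16000
t = np.arange(0, 0.09, 1 / fs)             # 90-ms tone, as in the experiments above
freqs = [100.0 * n for n in range(1, 13)]  # F0 = 100 Hz, 12 harmonics
freqs[3] = 400.0 * 1.08                    # mistune the 4th harmonic by 8%
tone = sum(np.sin(2 * np.pi * f * t) for f in freqs)
print(autocorrelation_pitch(tone, fs))     # stays essentially at 100 Hz
```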

10 Disjoint allocation? Will the 200-Hz series grab the 600-Hz component, and so make it give less of a pitch shift to the 155-Hz series? [figure: double complex, ipsilateral; the 600-Hz component is shared between target and match] Darwin, Buffa, Williams and Ciocca (1992), in Auditory Physiology and Perception, edited by Cazals, Horner and Demany (Pergamon, Oxford).

11 Disjoint allocation… (2) No evidence of the 600-Hz component being grabbed by the "in-tune" 200-Hz complex.

12 Pitch determined by conjoint allocation: the decision on each simultaneous pitch is taken independently.

13 Mistuning & pitch [figure: mean pitch shift (Hz) as a function of % mistuning of the 4th harmonic, for a vowel and for a complex tone; 8 subjects, 90-ms stimuli]

14 Onset asynchrony & pitch [figure: mean pitch shift (Hz) as a function of the onset asynchrony T (ms) of the mistuned harmonic, for a vowel and for a complex tone; 8 subjects, ±3% mistuning, 90-ms stimuli]

15 Adaptation? Increases the effect of mistuning.

16 Use of onset-time
– A long onset-time removes a harmonic from the pitch calculation.
– Not just due to adaptation, because of the regrouping experiment.
– NB: the onset-time needed is much longer for pitch (100s of ms) than for hearing the sound out as a separate harmonic or for timbre judgements of the complex (10s of ms).

17 Repeated context and pitch. Darwin, Hukin & Al-Khatib (1995). "Grouping in pitch perception: evidence for sequential constraints," J. Acoust. Soc. Am. 98.

18 Pitch perception is not a bacon slicer (Speckschneidermaschine)
– Pitch perception does not operate on independent time-slices.
– It uses, for example, the "Old + New" heuristic to parse which components are temporally relevant (see the sketch below).
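The Old + New heuristic itself is a perceptual principle (Bregman), but the bookkeeping it implies can be illustrated with a toy spectral-subtraction sketch: energy already present in the preceding context is treated as a continuation of the old sound, and only the excess is assigned to the new event. The frame length and the simple subtraction rule are assumptions for illustration, not anything from the talk.

```python
import numpy as np

def old_plus_new(mixture_frame, old_frame, n_fft=1024):
    """Toy Old + New bookkeeping on magnitude spectra: whatever energy was
    already present in the 'old' frame is treated as a continuation; only
    the excess magnitude is assigned to the new event."""
    mix_mag = np.abs(np.fft.rfft(mixture_frame, n_fft))
    old_mag = np.abs(np.fft.rfft(old_frame, n_fft))
    return np.maximum(mix_mag - old_mag, 0.0)

# e.g. a 500-Hz tone that started earlier ('old') plus a complex that starts later:
fs = 16000
t = np.arange(0, 0.064, 1 / fs)            # one 1024-sample frame
old = np.sin(2 * np.pi * 500 * t)
mixture = old + sum(np.sin(2 * np.pi * f * t) for f in (200, 400, 600, 800))
new_part = old_plus_new(mixture, old)      # 500-Hz energy is largely removed;
                                           # the new complex's components remain
```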

19 These conclusions apply to resolved harmonics
– Separate simultaneous pitches are only audible with unresolved harmonics when each pitch comes predominantly from a separate frequency region: a problem for autocorrelation models.
– And the Old+New heuristic breaks down: no advantage for onset-time.

20 Superimposing unresolved harmonics [figure: waveforms over 0–0.1 s of a complex with F0 = 100 Hz, a complex with F0 = 125 Hz, both filtered into the same high-frequency region (up to 3 kHz), and their sum] Carlyon, R. P. (1996). "Encoding the fundamental frequency of a complex tone in the presence of a spectrally overlapping masker," J. Acoust. Soc. Am. 99.
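A sketch of how such stimuli might be generated. The slide shows complexes filtered into a high-frequency region extending up to 3 kHz; the lower band edge used here (1.5 kHz) is an assumption, as is the 4th-order Butterworth filter.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def unresolved_complex(f0, fs=16000, dur=0.1, band=(1500.0, 3000.0)):
    """Harmonic complex band-pass filtered into a high-frequency region
    where individual harmonics are unresolved."""
    t = np.arange(0, dur, 1 / fs)
    tone = sum(np.sin(2 * np.pi * f * t) for f in np.arange(f0, fs / 2, f0))
    sos = butter(4, band, btype='bandpass', fs=fs, output='sos')
    return sosfiltfilt(sos, tone)

fs = 16000
mixture = unresolved_complex(100, fs) + unresolved_complex(125, fs)
# The two F0s now occupy the same auditory filters, which is why listeners
# cannot hear two separate pitches in the sum (Carlyon, 1996).
```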

21 Spatial cues? A contralateral mistuned component contributes almost as much to the pitch shift as an ipsilateral one (k = 0.073 vs 0.061). Cf. Beerends and Houtsma, JASA (1989). [figure: ΔF0 (Hz) against frequency of the mistuned harmonic f (Hz), ipsilateral and contralateral] Fitted function ΔF0 = a − k·Δf·exp(−Δf² / 2s²): ipsilateral s = 19.9, k = 0.073; contralateral s = 17.4, k = 0.061.

22 Summary of constraints for pitch
– Harmonic (Gaussian s.d. 3%)
– Onset time (~100s of ms)
– Old + New
– Repeated context
– Very weak spatial constraint
– Independent estimates of different simultaneous F0s (conjoint allocation)
N.B. These constraints are different for unresolved harmonics and for, e.g., timbre / vowel perception.

23 Using pitch differences to separate different objects
– Helps intelligibility of simultaneous speech
– Helps to track one voice in the presence of others
– Helps to separate objects for independent localisation

24 ∆Fo between two sentences (Bird & Darwin 1998; after Brokx & Nooteboom, 1982)
– Two sentences (same talker), only voiced consonants (with very few stops), thus maximising the Fo effect
– Task: write down the target sentence
– Replicates & extends Brokx & Nooteboom
– Target sentence Fo = 140 Hz; masking sentence Fo = 140 Hz ± 0, 1, 2, 5, 10 semitones

25 Example of using harmonicity to group components for another property. How do we localise complex sounds when more than one sound source is present? Maybe we group sounds first and then pool the localisation estimates of the component frequencies for that sound. More stable than grouping by location?

26 Localisation by ITD: Jeffress / Trahiotis & Stern
– 500-Hz narrow-band noise, right ear leading by 1.5 ms: heard on the left (phase ambiguity)
– 500-Hz wider-band noise, right ear leading by 1.5 ms: heard on the right (consistent ITD)
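A bare interaural cross-correlation sketch of the narrow-band case. This is a single cross-correlator with a plausibility limit on ITD, not the full Trahiotis & Stern model, which combines information across frequency channels and so resolves the ambiguity for the wider-band noise; the noise-band construction is illustrative, not the one used in the demonstration.

```python
import numpy as np

def estimate_itd(left, right, fs, max_itd=0.0008):
    """Return the interaural delay (s) with the largest cross-correlation
    among plausible delays (|ITD| <= max_itd). Positive = right ear leads."""
    n = len(left)
    xcorr = np.correlate(left, right, mode='full')   # lags -(n-1) .. n-1
    lags = np.arange(-(n - 1), n) / fs               # delay of left re right
    keep = np.abs(lags) <= max_itd
    return lags[keep][int(np.argmax(xcorr[keep]))]

fs = 16000
rng = np.random.default_rng(0)
source = rng.standard_normal(int(0.2 * fs) + 400)

# Narrow-band noise around 500 Hz: convolving white noise with a 20-ms
# 500-Hz burst gives a band roughly 50 Hz wide.
burst = np.sin(2 * np.pi * 500 * np.arange(0, 0.02, 1 / fs))
narrow = np.convolve(source, burst, mode='same')

delay = int(round(0.0015 * fs))                      # right ear leads by 1.5 ms
right, left = narrow[delay:], narrow[:-delay]
# At 500 Hz (period 2 ms) a 1.5-ms right lead is indistinguishable from a
# 0.5-ms left lead, so the within-range peak lies on the left:
print(estimate_itd(left, right, fs) * 1000)          # about -0.5 ms
```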

27 Resolution of phase ambiguity

28 Asynchrony & localisation [figure: pointer IID (dB) as a function of the asynchrony of the 500-Hz tone (ms); 6 subjects]

29 Mistuning & localisation [figure: pointer IID (dB) as a function of percent mistuning]