Gammachirp Auditory Filter


1 Gammachirp Auditory Filter
Alex Park May 7th, 2003

2 Project Overview
Goal: Investigate the use of (non-linear) auditory filters for speech analysis.
Background: Sound analysis in the auditory periphery is similar to a wavelet transform.
Comparison:
  Traditional Short-Time Fourier analysis
  Gammatone wavelet-based analysis (auditory filter)
Extension: The Gammachirp filter has level-dependent parameters, which can model non-linear characteristics of the auditory periphery.
Implementation:
  Specifics of the Gammachirp implementation
  How to incorporate level dependency

3 Auditory Physiology
Sound pressure variation in the air is transduced through the outer and middle ears onto the end of the cochlea.
The basilar membrane, which runs throughout the cochlea, maps place of maximal displacement to frequency.
[Figure labels: Outer ear, Middle ear, Cochlea, Auditory Nerve, Cortex; Basilar Membrane, from Low freq (200 Hz) to High freq (20 kHz)]

4 Motivation – Why better auditory models?
Automatic Speech Recognition (ASR)
  ASR systems perform adequately in ‘clean’ conditions.
  Robustness is a major problem; degradation in low-SNR conditions is much worse than for humans.
Hearing research
  Build better hearing aids and cochlear implants.
  Hearing-impaired subjects with a damaged cochlea have trouble understanding speech in noisy environments.
  Current hearing aids perform linear amplification, so they amplify noise as well as the signal.
Is the lack of compressive non-linearity in the front end a common link?

5 Non-stationary Nature of Speech
Why is speech a good candidate for local frequency analysis?
[Figure: waveform of the word “tapestry”, with segments labeled /t/ (transient), /ae/ (tone), /s/ (noise)]

6 Time-Frequency Representation
The most common way of representing changing spectral content is the Short-Time Fourier Transform (STFT).
[Figure: STFT processing diagram; labels: FFT, Power]

7 Spectrogram from STFT “tapestry”

8 STFT Characteristics
We can think of the STFT as filtering using the following basis.
[Figure: STFT basis functions]
In the frequency domain, we are using a filterbank consisting of linearly spaced, constant-bandwidth filters.
[Figure: filter frequency responses; axis: Freq (Hz)]
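The filterbank view of the STFT can be checked numerically. The sketch below is a minimal illustration, not the code behind the slides: the frame length, hop size, Hann window, and 16 kHz sampling rate are all assumed values. It shows that the analysis frequencies are equally spaced (fs / frame_len apart), which is the "linearly spaced, constant bandwidth" property described above.

```python
import numpy as np

def stft_power(x, fs, frame_len=256, hop=128):
    """Power spectrogram via a windowed FFT (frame_len and hop are assumed values)."""
    n = np.arange(frame_len)
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_len)  # periodic Hann window
    starts = range(0, len(x) - frame_len + 1, hop)
    frames = np.stack([x[s:s + frame_len] * window for s in starts])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)  # centre frequency of each "filter"
    return freqs, power

fs = 16000                                  # assumed sampling rate
t = np.arange(fs) / fs
tone = np.sin(2.0 * np.pi * 1000.0 * t)     # 1 kHz test tone
freqs, power = stft_power(tone, fs)

print("channel spacing (Hz):", freqs[1] - freqs[0])
print("strongest channel (Hz):", freqs[np.argmax(power[0])])
```

Every channel uses the same window, so every channel has the same bandwidth regardless of its centre frequency.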

9 Auditory Filterbanks
Unlike the STFT, physiological data indicate that auditory filters:
  are spaced more closely at lower frequencies than at high frequencies
  have narrower bandwidths at lower frequencies (constant-Q)
The Gammatone filterbank proposed by Patterson models these characteristics using a wavelet transform.
The mother wavelet, or kernel function, is
[Equation: Gammatone kernel, with terms labeled "Gamma Envelope" and "Tone carrier"]
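The kernel equation itself is not reproduced in this transcript. The form usually given in the literature is g(t) = a t^(n-1) exp(-2 pi b ERB(fc) t) cos(2 pi fc t + phi), where the first two factors are the gamma envelope and the cosine is the tone carrier. The sketch below assumes that form, together with the Glasberg and Moore ERB formula and the commonly quoted values n = 4 and b = 1.019; none of these are stated on the slide, and the function names are illustrative.

```python
import numpy as np

def erb(f):
    """Equivalent rectangular bandwidth in Hz (Glasberg & Moore formula, assumed here)."""
    return 24.7 * (4.37 * f / 1000.0 + 1.0)

def erb_space(f_low, f_high, num):
    """Centre frequencies equally spaced on the ERB-rate scale."""
    lo = 21.4 * np.log10(4.37 * f_low / 1000.0 + 1.0)
    hi = 21.4 * np.log10(4.37 * f_high / 1000.0 + 1.0)
    rate = np.linspace(lo, hi, num)
    return (10.0 ** (rate / 21.4) - 1.0) * 1000.0 / 4.37

def gammatone(fc, fs, n=4, b=1.019, duration=0.05, phase=0.0):
    """Gammatone impulse response: gamma envelope times tone carrier, peak-normalised."""
    t = np.arange(1, int(duration * fs) + 1) / fs
    envelope = t ** (n - 1) * np.exp(-2.0 * np.pi * b * erb(fc) * t)
    carrier = np.cos(2.0 * np.pi * fc * t + phase)
    g = envelope * carrier
    return g / np.max(np.abs(g))

centers = erb_space(200.0, 6000.0, 16)   # assumed frequency range and channel count
bank = [gammatone(fc, 16000) for fc in centers]
```

With this spacing, neighbouring channels sit closer together at low frequencies, and each filter's bandwidth grows with its centre frequency.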

10 Gammatone Characteristics
Unlike the STFT, the Gammatone filterbank uses the following basis.
[Figure: Gammatone basis functions]
The corresponding frequency responses are
[Figure: Gammatone filter frequency responses; axis: Freq (Hz)]

11 What are we missing?
The Gammatone filterbank has constant-Q bandwidths and logarithmic spacing of center frequencies.
Also, the Gamma envelope guarantees compact support.
But the filters are 1) symmetric and 2) linear.
Psychophysical experiments indicate that auditory filter shapes are:
  1) Asymmetric
     Sharper drop-off on the high-frequency side
  2) Non-linear
     Filter shape and gain change depending on input level
     Compressive non-linearity of the cochlea
     Important for hearing in noise and for dynamic range

12 Gammachirp Characteristics
The Gammachirp filter developed by Irino & Patterson uses a modified version of the Gammatone kernel.
[Equation: Gammachirp kernel, with terms labeled "Gamma Envelope", "Tone carrier", and "Chirp term"]
The frequency response is asymmetric and can fit the passive filter.
Level-dependent parameters can fit changes due to the stimulus.
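The kernel equation is missing from this transcript. In the published gammachirp, the chirp term is c ln(t) added to the carrier phase, giving g(t) = a t^(n-1) exp(-2 pi b ERB(fr) t) cos(2 pi fr t + c ln t + phi). The sketch below assumes that form; c = -2.0, n = 4, b = 1.019, and the ERB formula are illustrative choices, not values taken from the slides. Setting c = 0 reduces it to the Gammatone.

```python
import numpy as np

def erb(f):
    """Equivalent rectangular bandwidth in Hz (Glasberg & Moore formula, assumed here)."""
    return 24.7 * (4.37 * f / 1000.0 + 1.0)

def gammachirp(fr, fs, c=-2.0, n=4, b=1.019, duration=0.05, phase=0.0):
    """Gammachirp impulse response; c = 0 gives a plain gammatone."""
    t = np.arange(1, int(duration * fs) + 1) / fs   # start at 1/fs so log(t) is finite
    envelope = t ** (n - 1) * np.exp(-2.0 * np.pi * b * erb(fr) * t)
    carrier = np.cos(2.0 * np.pi * fr * t + c * np.log(t) + phase)
    g = envelope * carrier
    return g / np.max(np.abs(g))

def magnitude_response(h, fs, nfft=8192):
    """Zero-padded FFT magnitude of an impulse response."""
    return np.fft.rfftfreq(nfft, 1.0 / fs), np.abs(np.fft.rfft(h, nfft))

fs, fr = 16000, 1000.0                       # assumed sampling rate and filter frequency
freqs, mag_gt = magnitude_response(gammachirp(fr, fs, c=0.0), fs)
_, mag_gc = magnitude_response(gammachirp(fr, fs, c=-2.0), fs)

print("gammatone peak (Hz): ", freqs[np.argmax(mag_gt)])
print("gammachirp peak (Hz):", freqs[np.argmax(mag_gc)])
```

With a negative c, the response peak moves below fr and the high-frequency skirt falls off faster than the low-frequency one, matching the asymmetry described on the previous slide.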

13 Implementation
Looking in the frequency domain, the Gammachirp can be obtained by cascading a fixed Gammatone filter with an asymmetric filter.
To fit psychophysical data, a fixed Gammachirp is cascaded with level-dependent asymmetric IIR filters.
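The first cascade can be sketched directly on magnitude responses. This assumes the analytic result from the gammachirp literature that the Gammachirp magnitude equals the Gammatone magnitude times exp(c theta(f)), with theta(f) = arctan((f - fr) / (b ERB(fr))). The parameter values and function names below are illustrative, and the level-dependent IIR stage is not reproduced here.

```python
import numpy as np

def erb(f):
    """Equivalent rectangular bandwidth in Hz (Glasberg & Moore formula, assumed here)."""
    return 24.7 * (4.37 * f / 1000.0 + 1.0)

def gammatone_magnitude(f, fr, n=4, b=1.019):
    """Symmetric gammatone magnitude response, normalised to 1 at fr."""
    return (1.0 + ((f - fr) / (b * erb(fr))) ** 2) ** (-n / 2.0)

def asymmetric_function(f, fr, c, b=1.019):
    """Asymmetric factor exp(c * theta(f)); equal to 1 at f = fr."""
    theta = np.arctan((f - fr) / (b * erb(fr)))
    return np.exp(c * theta)

def gammachirp_magnitude(f, fr, c, n=4, b=1.019):
    """Cascade: fixed gammatone response times the asymmetric factor."""
    return gammatone_magnitude(f, fr, n, b) * asymmetric_function(f, fr, c, b)

f = np.linspace(0.0, 4000.0, 40001)          # 0.1 Hz grid
fr, c, n, b = 1000.0, -2.0, 4, 1.019         # illustrative values
mag = gammachirp_magnitude(f, fr, c, n, b)
peak = f[np.argmax(mag)]
expected_peak = fr + c * b * erb(fr) / n     # from setting the log-magnitude derivative to zero
```

Under these assumptions the peak lands at fr + c b ERB(fr) / n, so making c more negative both lowers the peak frequency and steepens the high-frequency side.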

14 Comparison: Tone vs. Passive Chirp outputs
Gammatone output seems to have better frequency resolution.
Passive Gammachirp output seems to have better time resolution.

15 Comparison: Tone vs. Active Chirp Outputs

16 Incorporating level dependency
As illustrated in the previous slide, the passive Gammachirp output offers little advantage on clean speech at fixed stimulus levels.
We can incorporate parameter control via feedback:
  Compute the passive GC spectrogram
  Segment into frames
  For each time frame:
    Get the stimulus level per channel
    Filter with the level-specific filter (one of S1, S2, ..., SN-1, SN)
  Reconstruct the frames
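The frame-wise loop above can be sketched for a single channel as follows. This is a hypothetical illustration rather than the implementation behind the slides: the 50%-overlap periodic Hann framing, the RMS level measure, the level thresholds, and the FIR stand-ins for the level-specific filters S1..SN are all assumptions.

```python
import numpy as np

def frame_level_db(frame, eps=1e-12):
    """Mean-square level of one frame in dB relative to amplitude 1.0."""
    return 10.0 * np.log10(np.mean(frame ** 2) + eps)

def select_filter(level_db, level_edges_db):
    """Index of the level-specific filter for a level (edges are assumed thresholds)."""
    return int(np.searchsorted(level_edges_db, level_db))

def level_dependent_filter(channel, filters, level_edges_db, frame_len=256):
    """Filter each frame with the filter chosen by its level, then overlap-add."""
    hop = frame_len // 2
    n = np.arange(frame_len)
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_len)  # sums to 1 at 50% overlap
    out = np.zeros(len(channel))
    chosen = []
    for start in range(0, len(channel) - frame_len + 1, hop):
        frame = channel[start:start + frame_len]
        idx = select_filter(frame_level_db(frame), level_edges_db)
        out[start:start + frame_len] += np.convolve(frame * window, filters[idx], mode="same")
        chosen.append(idx)
    return out, chosen

fs = 16000
t = np.arange(2048) / fs
quiet = 0.01 * np.sin(2.0 * np.pi * 1000.0 * t)
loud = 1.0 * np.sin(2.0 * np.pi * 1000.0 * t)
x = np.concatenate([quiet, loud])

# Two placeholder "level-specific" filters (identity here) split at -20 dB.
filters = [np.array([1.0]), np.array([1.0])]
y, chosen = level_dependent_filter(x, filters, level_edges_db=[-20.0])
```

With identity filters the overlap-add returns the input away from the edges, which is a useful sanity check before substituting real level-dependent filters.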

17 Sample outputs
[Figure panels: Clean, 40 dB SNR, 30 dB SNR, 20 dB SNR]

18 References
Bleeck, S., Patterson, R.D., and Ives, T. (2003). Auditory Image Model for Matlab. Centre for the Neural Basis of Hearing.
Irino, T. and Patterson, R.D. (2001). “A compressive gammachirp auditory filter for both physiological and psychophysical data,” J. Acoust. Soc. Am. 109,
Pickles, J.O. (1988). An Introduction to the Physiology of Hearing (Academic, London).
Slaney, M. (1993). “An efficient implementation of the Patterson-Holdsworth auditory filterbank,” Apple Computer Technical Report #35.
Slaney, M. (1998). “Auditory Toolbox for Matlab,” Interval Research Technical Report #

19 Sidenote
[Figure panels: Clean, 40 dB SNR, 30 dB SNR]

