AES 120th Convention, Paris, France, 2006. Adaptive Time-Frequency Resolution for Analysis and Processing of Audio. Alexey Lukin, AES Student Member, Moscow.

Presentation transcript:

AES 120th Convention, Paris, France, 2006
Adaptive Time-Frequency Resolution for Analysis and Processing of Audio
Alexey Lukin, AES Student Member, Moscow State University, Moscow, Russia
Jeremy Todd, AES Member, iZotope, Inc., Cambridge, MA

A. Lukin, J. Todd “Adaptive Time-Frequency Resolution” 2/15
Short-Time Fourier Transform
Most commonly used transform for audio:
► Spectral analysis
► Noise reduction (spectral subtraction algorithms)
► Time-variable filters and other effects
+ Very fast implementation for large number of bands via FFT
+ Good energy compaction for many musical signals
– Many oscillations in basis functions → ringing (Gibbs phenomenon)
– Uniform frequency resolution → inadequate resolution at lows

Filter banks
Idea: x[n] → Decomposition → Processing of subband signals → Synthesis → y[n]
Decompositions of time-frequency plane
[Figure: time-frequency plane tilings for STFT and DWT; axes t (time) and f (frequency); label: Uncertainty principle]

Suggested approach
Transforms must vary their time-frequency resolution in a perceptually motivated way:
► Imitation of time-frequency resolution of human hearing
► Adaptation of resolution to local signal features

Spectrograms
Problems:
► Most perceptually meaningful energy is concentrated in the narrow band below 4 kHz → can’t see useful details
► Time/frequency resolution trade-off
[Figure: Conventional STFT spectrogram (linear frequency scale)]

Spectrograms
Problems:
► Poor frequency resolution at low frequencies → can’t separate bass harmonics from bass drum
► Time/frequency resolution trade-off
[Figure: Mel-scale STFT spectrogram (window size = 12 ms)]

Spectrograms
Problems:
► Poor time resolution at transients → time-smearing of drums
[Figure: Mel-scale STFT spectrogram (window size = 93 ms)]
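The trade-off between the 12 ms and 93 ms windows can be made concrete with a little arithmetic. The sketch below assumes a 44.1 kHz sample rate and power-of-two window lengths of 512 and 4096 samples; neither is stated on the slides, but they give roughly 12 ms and 93 ms. The function name is hypothetical.

```python
# Assumption: 44.1 kHz sample rate (not stated on the slides).
SAMPLE_RATE_HZ = 44100


def stft_resolution(window_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return (window duration in ms, FFT bin spacing in Hz) for one window length."""
    duration_ms = 1000.0 * window_samples / sample_rate_hz
    bin_spacing_hz = sample_rate_hz / window_samples
    return duration_ms, bin_spacing_hz


# Assumed window lengths: 512 and 4096 samples.
for n in (512, 4096):
    ms, hz = stft_resolution(n)
    print(f"{n:5d} samples: {ms:5.1f} ms window, {hz:5.1f} Hz per bin")
```

Under these assumptions the short window spaces bins about 86 Hz apart, which is too coarse to separate low bass harmonics, while the long window spaces them about 11 Hz apart at the cost of an eight times longer analysis frame.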

Spectrograms
Simple solution: combine spectrograms with different resolutions
► Take bass from spectrogram with good freq. resolution
► Take treble from spectrogram with good time resolution

Spectrograms
Simple solution:
► Combine spectrograms with different resolutions: take bass from spectrogram with good frequency resolution, take treble from spectrogram with good time resolution
[Figure: Combined resolution spectrogram (window sizes from 12 to 93 ms)]
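A minimal sketch of the "simple solution", using only two resolutions and a hard switch between them. The slides give no crossover frequency, hop size, or window shape, so the 1 kHz crossover, the 256-sample hop, the Hann window, and the function names below are all assumptions. Both spectrograms share one hop and one FFT size (the short window is zero-padded) so that their time-frequency grids coincide.

```python
import numpy as np


def stft_mag(x, win_len, fft_len, hop):
    """Magnitude STFT with a Hann window (even win_len <= fft_len), zero-padded to fft_len."""
    window = np.hanning(win_len)
    pad = fft_len // 2
    xp = np.pad(x, (pad, pad))  # lets every window be centred on t * hop
    n_frames = 1 + len(x) // hop
    frames = np.empty((n_frames, fft_len // 2 + 1))
    for t in range(n_frames):
        centre = pad + t * hop
        seg = xp[centre - win_len // 2 : centre + win_len // 2] * window
        # Divide by the window sum so both window lengths share one magnitude scale.
        frames[t] = np.abs(np.fft.rfft(seg, n=fft_len)) / window.sum()
    return frames


def combined_spectrogram(x, sr, short_win=512, long_win=4096, hop=256,
                         crossover_hz=1000.0):
    """Bass from the long-window spectrogram, treble from the short-window one."""
    short = stft_mag(x, short_win, long_win, hop)
    long_ = stft_mag(x, long_win, long_win, hop)
    freqs = np.fft.rfftfreq(long_win, d=1.0 / sr)
    use_long = freqs < crossover_hz  # assumed hard crossover, not from the slides
    return np.where(use_long[np.newaxis, :], long_, short), freqs


# Usage on a short synthetic signal: a 100 Hz tone plus a click.
sr = 44100
t = np.arange(sr // 4) / sr
x = np.sin(2 * np.pi * 100.0 * t)
x[5000] += 1.0
S, freqs = combined_spectrogram(x, sr)
print(S.shape)
```

The figure caption mentions window sizes from 12 to 93 ms, which suggests more than two resolutions blended across frequency; the two-band switch here is only the simplest instance of the idea.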

Spectrograms
Better approach: select best resolution for each time-frequency neighborhood
Criteria?
► Better frequency resolution at bass (reflects a-priori psychoacoustical knowledge)
► Maximal energy compaction (to minimize spectral smearing in both time and frequency)
[Figure: best STFT window size; legend: 6 ms, 12 ms, 24 ms, 48 ms, 96 ms]

Spectrograms
Calculation of energy compaction (energy smearing in the given block for all given resolutions)
Here $a_{i,r}$ are descendingly sorted STFT magnitudes in the block, $S_r$ is the energy smearing for the given resolution $r$, $r_0$ is the resolution with best energy compaction.
[Figure: best STFT window size; legend: 6 ms, 12 ms, 24 ms, 48 ms, 96 ms]
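The selection step can be sketched as follows. The smearing formula itself does not survive in this transcript, so the measure used here (the rank-weighted mean of the descendingly sorted magnitudes) is a stand-in chosen for illustration and should not be read as the authors' definition of $S_r$. The function names are hypothetical, and the blocks compared are assumed to cover the same time-frequency neighborhood with the same number of coefficients.

```python
import numpy as np


def smearing(block):
    """Stand-in smearing measure: rank-weighted mean of the sorted magnitudes.

    Small when a few coefficients hold most of the magnitude (good compaction),
    large when the magnitude is spread evenly over the block.
    """
    a = np.sort(np.abs(block).ravel())[::-1]  # a_i, sorted in descending order
    total = a.sum()
    if total == 0.0:
        return 0.0
    ranks = np.arange(1, a.size + 1)
    return float((ranks * a).sum() / total)


def best_resolution(blocks_by_resolution):
    """Return (label of the resolution with the least smearing, all scores)."""
    scores = {r: smearing(b) for r, b in blocks_by_resolution.items()}
    return min(scores, key=scores.get), scores


# Toy blocks: one resolution concentrates the energy, the other spreads it.
compact = np.zeros((4, 4))
compact[0, 0] = 4.0
spread = np.ones((4, 4))
r0, scores = best_resolution({"12 ms": spread, "93 ms": compact})
print(r0, scores)
```

This covers only the energy-compaction criterion. The first criterion on the slides, a preference for better frequency resolution at bass, would need an additional frequency-dependent bias that is not modeled here.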

Spectrograms
Benefits:
► Sharper bass drum hits and other transients, even in mid-frequency range
► Sharper guitar harmonics in high frequencies
[Figure: Adaptive resolution spectrogram (window sizes from 12 to 93 ms)]

Spectrograms
More examples
[Figures: Tone onset waveform; Conventional STFT spectrogram]

Spectrograms
More examples
[Figures: Combined resolution spectrogram; Adaptive resolution spectrogram]

Processing framework
General framework for multi-resolution processing:
► Perform processing with several different resolutions
► Adaptively combine (mix) results in time-frequency space
► Mixing is controlled by a-priori knowledge of psychoacoustics and analysis of local signal features (e.g. transience)

Noise reduction
[Diagram blocks: Spectral subtraction (short windows); Spectral subtraction (long windows); Transience analysis (control); Mixer of coefficients; STFT Synthesis. Signals: x1[t], x2[t], x3[t], y[t]]
Spectral subtraction algorithm modifications:
► Better frequency resolution at low frequencies (according to the human hearing resolution)
► Better temporal resolution near signal transients (for reduction of Gibbs phenomenon)
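One way to picture the mixer of coefficients is as a per-bin crossfade between the short-window and long-window results, driven by a transience measure. The sketch below is an illustration under several assumptions that are not on the slides: both results have already been brought onto a common time-frequency grid, transience is estimated from positive spectral flux, and bins below an assumed 500 Hz always use the long-window result. All function names are hypothetical.

```python
import numpy as np


def transience(mag_short):
    """Per-frame transience in [0, 1] from positive spectral flux (assumed detector)."""
    rise = np.maximum(np.diff(mag_short, axis=0, prepend=mag_short[:1]), 0.0)
    flux = rise.sum(axis=1)
    peak = flux.max()
    return flux / peak if peak > 0 else np.zeros_like(flux)


def mix_coefficients(proc_short, proc_long, freqs, trans, bass_hz=500.0):
    """Crossfade two processed spectrograms of shape (frames, bins).

    trans[t] is the weight of the short-window result in frame t.
    """
    trans = np.asarray(trans, dtype=float)
    w = np.repeat(trans[:, np.newaxis], freqs.size, axis=1)
    w[:, freqs < bass_hz] = 0.0  # assumed rule: long windows at low frequencies
    return w * proc_short + (1.0 - w) * proc_long


# Toy usage: 4 frames, 8 bins, with a transient in frame 1.
freqs = np.linspace(0.0, 7000.0, 8)
proc_short = np.ones((4, 8))
proc_long = np.zeros((4, 8))
mixed = mix_coefficients(proc_short, proc_long, freqs, [0.0, 1.0, 0.5, 0.0])
print(mixed)
```

Near a transient the weight approaches 1 and the short-window result dominates, which corresponds to the better temporal resolution the slide asks for, while stationary frames and the bass region fall back to the long-window result.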

Noise reduction
Results of single-resolution and multi-resolution algorithms
[Figure: Noisy recording (guitar + castanets)]

Noise reduction
Results of single-resolution and multi-resolution algorithms
[Figures: Single resolution; Multi-resolution]

Your questions?
Demo web page:
Poster session P17: Monday, 9:00 – 10:30 a.m.