Speech Enhancement with Binaural Cues Derived from a Priori Codebook

Presentation transcript:

Speech Enhancement with Binaural Cues Derived from a Priori Codebook
Students and teachers, good afternoon. I am glad to have the chance to give my presentation here. Today I would like to talk to you about some of our work in the field of codebook-based speech enhancement. The title of my presentation is "Speech Enhancement with Binaural Cues Derived from a Priori Codebook".
Reporter: Nan Chen, Beijing University of Technology
http://www.bjut.edu.cn/sci/voice/index.htm

Contents
1 Introduction
2 The Proposed Method
3 Results and Conclusions
I would like to give this presentation in three parts. First, I will introduce the background of the work. Then the proposed method is described in detail. Finally, the experimental results are presented and the conclusions are summarized.

1 Introduction

Introduction: Noise
Noise types: street, car, babble, office.

Introduction: Traditional Methods of Speech Enhancement
1 Spectral-subtractive algorithms
2 Wiener filtering
3 Statistical-model-based methods
4 Subspace algorithms
Monaural speech enhancement is still a challenging task for speech communication applications such as speech coding and speech recognition. The traditional methods listed above obtain good performance for stationary noise, but their performance degrades when non-stationary noise is introduced. The reason is that an accurate noise estimate cannot be obtained from the noisy observation alone. If some prior information about the speech and the noise is known in advance, the performance can be improved.
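To make the first of these traditional methods concrete, here is a minimal spectral-subtraction sketch in Python/NumPy. It is not taken from the presentation; the frame length, overlap, spectral floor, and the assumption that the leading frames contain only noise are all illustrative choices.

```python
# Minimal magnitude spectral subtraction (illustrative sketch, not the
# presentation's method). Noise is estimated from the first few frames,
# which are assumed to contain no speech.
import numpy as np

def spectral_subtraction(noisy, frame_len=512, hop=256, noise_frames=6):
    window = np.hanning(frame_len)
    # Slice the signal into overlapping, windowed frames.
    n_frames = 1 + (len(noisy) - frame_len) // hop
    frames = np.stack([noisy[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spec = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spec), np.angle(spec)

    # Noise magnitude estimate: average of the leading (speech-free) frames.
    noise_mag = mag[:noise_frames].mean(axis=0)

    # Subtract the noise estimate and apply a small spectral floor.
    clean_mag = np.maximum(mag - noise_mag, 0.05 * noise_mag)

    # Rebuild the waveform with the noisy phase and weighted overlap-add.
    enhanced = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len, axis=1)
    out = np.zeros(len(noisy))
    for i, f in enumerate(enhanced):
        out[i * hop:i * hop + frame_len] += f * window
    return out
```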

Introduction: Binaural Cue Coding (BCC) Framework
Purpose: recover the perception of the original input signals.
BCC analysis: extract the side information from the input signals.
BCC synthesis: recover the input signals using the side information and the mono signal.
Now I want to say a few words about the BCC framework, which is shown in Figure 1. From Figure 1 we can see that the analysis stage extracts the side information, and the synthesis stage uses it, together with the mono signal, to recover the input channels.
Figure 1: Block diagram of analysis and synthesis for BCC

Introduction: BCC Synthesis
Once the discrete Fourier transform (DFT) coefficients of the mono signal are known, the DFT coefficients of each output channel S_{c,k} can be calculated by Eqs. (1)-(2), where one cue is the inter-channel level difference (ICLD) between channel 1 and channel c for the nth sub-band and the other is a random variable controlled by the ICC. As can be seen in Figure 1, Eq. (3) defines the factor f, which is used to determine a level modification of the DFT coefficients; c is the index of the channel and n is the frequency (sub-band) index.
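The equation bodies on this slide did not survive extraction. As a stand-in, the LaTeX below sketches the standard BCC-style synthesis relations that the surrounding text describes: a per-sub-band level factor applied to the mono DFT coefficients and an ICLD defined between channel 1 and channel c. The symbols f_{c,n} and ΔL_{c,n} and the power-preserving normalization are assumptions, and the ICC-driven random component is omitted here.

```latex
% Assumed BCC-style synthesis relations (reconstruction, not the original slide's equations).
\begin{align}
  \hat{S}_{c,k} &= f_{c,n}\, S_k , \qquad k \in \text{sub-band } n, \\
  \Delta L_{c,n} &= 10 \log_{10}
      \frac{\sum_{k \in n} |S_{c,k}|^{2}}{\sum_{k \in n} |S_{1,k}|^{2}} , \\
  f_{c,n} &= \frac{10^{\Delta L_{c,n}/20}}
                  {\sqrt{\sum_{c'=1}^{C} 10^{\Delta L_{c',n}/10}}} .
\end{align}
```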

Introduction: From BCC to Speech Enhancement
BCC: recovering the perception of the original input signals.
Speech enhancement: separating the clean signal from the noisy signal.
The BCC principle is therefore introduced to estimate the clean signal: the noisy speech is enhanced with the BCC principle, where channel 1 is assumed to be the clean speech and channel 2 is regarded as the noise.
(Diagram: clean speech and noise make up the noisy speech; BCC synthesis recovers the clean speech and the noise from it.)
BCC aims at recovering the perception of the original input signals, while the main purpose of speech enhancement is to separate the clean signal from the noisy one. For this reason, we introduce the BCC technique into the monaural speech enhancement procedure, but we need to find appropriate side information.

2 The Proposed Method

The Proposed Method: Side Information
The clean cue: speech and noise level difference (SNLD); speech and noise correlation (SNC).
The pre-enhanced cue: pre-enhanced speech and noise level difference (PNLD); pre-enhanced speech and noise correlation (PNC); posterior SNR (PSNR); speech presence probability (SPP).
In the BCC scheme, the binaural cues are considered as side information, but here the clean cue, which corresponds to the binaural cues, cannot be obtained directly. We obtain the clean cue through the pre-enhanced cue, so in our method the side information contains both the clean cue and the pre-enhanced cue: the clean cue consists of the SNLD and the SNC, and the pre-enhanced cue consists of the PNLD, PNC, PSNR and SPP. A sketch of how such cues could be computed is given below.
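The presentation does not spell out formulas for these cues. The following Python/NumPy sketch shows one plausible way to compute a per-sub-band level difference and a normalized correlation between two spectra; the uniform sub-band grouping, the dB scaling, and the names `level_difference` and `correlation` are illustrative assumptions, not the authors' definitions.

```python
# Illustrative per-sub-band cue computation (assumed definitions). X and Y are
# complex DFT frames, e.g. the spectra of (pre-enhanced) speech and noise.
import numpy as np

def subband_slices(n_bins, n_bands=20):
    # Simple uniform grouping of DFT bins into sub-bands (an assumption;
    # a perceptual, e.g. ERB-like, grouping could be used instead).
    edges = np.linspace(0, n_bins, n_bands + 1, dtype=int)
    return [slice(edges[b], edges[b + 1]) for b in range(n_bands)]

def level_difference(X, Y, bands):
    # Level difference in dB between the two signals, per sub-band.
    eps = 1e-12
    return np.array([10 * np.log10((np.sum(np.abs(X[s]) ** 2) + eps) /
                                   (np.sum(np.abs(Y[s]) ** 2) + eps))
                     for s in bands])

def correlation(X, Y, bands):
    # Normalized cross-correlation (coherence-like cue), per sub-band.
    eps = 1e-12
    return np.array([np.abs(np.sum(X[s] * np.conj(Y[s]))) /
                     (np.sqrt(np.sum(np.abs(X[s]) ** 2) *
                              np.sum(np.abs(Y[s]) ** 2)) + eps)
                     for s in bands])
```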

The Proposed Method
Figure 2 describes the proposed method, which has two parts: an offline training stage and an online enhancing stage. In the training stage, the pre-enhanced speech is obtained through pre-processing and the pre-enhanced cue is extracted from it; the clean cue is extracted from the clean speech, with the noisy speech and the clean speech in one-to-one correspondence. Finally, the pre-enhanced cue and the clean cue are used to train the codebook. In the enhancing stage, the pre-enhanced speech is obtained first, and then the online clean cue is estimated by weighted codebook mapping using the trained codebook and the online pre-enhanced cue.
Figure 2: Block diagram of the proposed monaural speech enhancement method

The Proposed Method: Weighted Codebook Mapping
Figure 3 shows the scheme of the weighted codebook mapping (WCBM) algorithm.
Figure 3: Block diagram of the weighted codebook mapping

The Proposed Method: Estimation of the Clean Cue
1) By comparing the Euclidean distance (ED) between the online pre-enhanced cue and the trained pre-enhanced cues, we choose the M code-vectors with the smallest ED from the trained codebook.
2) Calculate the degree of membership ρ of the chosen code-vectors (Eq. (4)).
3) Define the weight of each chosen code-vector from its degree of membership (Eq. (5)).
4) Obtain the online clean cue by weighting the trained clean cues stored in the chosen code-vectors.
This is how the clean cue is estimated by the WCBM algorithm; a sketch of the procedure is given below.
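The membership and weight equations are not present in the extracted transcript, so this Python/NumPy sketch uses a common inverse-distance (fuzzy-membership-style) weighting as a stand-in; the exponent p, the membership definition, and the function name `estimate_clean_cue` are assumptions, not the presentation's Eqs. (4)-(5).

```python
# Weighted codebook mapping sketch (assumed weighting rule).
import numpy as np

def estimate_clean_cue(online_pre_cue, codebook_pre, codebook_clean, M=8, p=2.0):
    """codebook_pre / codebook_clean: arrays of shape (K, D_pre) and (K, D_clean)
    holding the trained pre-enhanced and clean cues of the K code-vectors."""
    # 1) Euclidean distance to every trained pre-enhanced cue; keep the M closest.
    dist = np.linalg.norm(codebook_pre - online_pre_cue, axis=1)
    idx = np.argsort(dist)[:M]

    # 2)-3) Degree of membership and weights: an inverse-distance rule,
    # normalized so the weights sum to one (an assumption).
    rho = 1.0 / (dist[idx] ** p + 1e-12)
    w = rho / rho.sum()

    # 4) Online clean cue = weighted combination of the stored clean cues.
    return w @ codebook_clean[idx]
```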

The Proposed Method: Speech Enhancement
According to the BCC principle, we have Eqs. (6)-(7), where the additional term is a random function with zero mean and constant variance. Finally, the noisy speech is enhanced by Eq. (8). Once we have the online clean cue, which contains the speech and noise level difference and the speech and noise correlation, we can enhance the noisy speech.
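Eqs. (6)-(8) are not recoverable from the transcript, so the following Python sketch only illustrates the general idea of this final step: turning an estimated speech-to-noise level difference into a per-sub-band gain applied to the noisy DFT coefficients. The Wiener-like gain rule and the function name `enhance_frame` are assumptions, not the presentation's equations.

```python
# Illustrative enhancement step (assumed gain rule, not the authors' Eqs. (6)-(8)).
import numpy as np

def enhance_frame(noisy_spec, snld_db, bands):
    """noisy_spec: complex DFT coefficients of one noisy frame.
    snld_db: estimated speech-to-noise level difference per sub-band (dB).
    bands: list of slices mapping sub-bands to DFT bins."""
    xi = 10.0 ** (snld_db / 10.0)      # level difference as a power ratio
    gain = xi / (1.0 + xi)             # Wiener-like gain per sub-band (assumption)
    enhanced = noisy_spec.copy()
    for g, s in zip(gain, bands):
        enhanced[s] *= g               # apply the sub-band gain to its DFT bins
    return enhanced
```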

3 Results and Conclusions

Results: SSNR
This table shows the segmental SNR improvement under different input SNR conditions for various noise types. Ref. A denotes the MMSE spectral amplitude estimation method, and Ref. B denotes the codebook-based MMSE method. From this table, we can see that the proposed method achieves better performance than the two reference methods in most cases.

Results: PESQ
This table gives the PESQ test results. We can see that the proposed method performs better than the two reference methods, especially under noisy conditions with a high input SNR.

Results: LSD
Table 3 shows the test results for the log-spectral distance. According to these results, the proposed method performs better than Ref. A. However, compared to Ref. B, it does not perform as well in some noisy conditions. Ref. B models the spectral envelope, which gives it good performance on this measure.

Results: Demos (5 dB babble noise)
Clean, Ref. A, proposed, Ref. B.
These are some audio demos of the clean speech, the reference methods and the proposed method; I will play them at the end.

Results: Demos (10 dB babble noise)
Clean, Ref. A, Ref. B, proposed.

Conclusions
In this presentation, we present two contributions. First, we enhance the noisy speech by modeling the spectral detail, which is why the method can reduce the noise between harmonics. Second, noise classification is no longer needed, because we introduce the binaural cues, which are not correlated with the type of noise, as a priori information.

Thank You! http://www.bjut.edu.cn/sci/voice/index.htm