By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition.

Presentation transcript:

By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition

By Sarita Jondhale2 Introduction Spectral analysis is the process of representing the speech signal by a set of parameters for further processing, e.g. short-term energy, zero-crossing rates, level-crossing rates, and so on. Methods for spectral analysis are therefore considered the core of the signal-processing front end in a speech recognition system.

By Sarita Jondhale3 Spectral Analysis methods Two methods: –The Filter Bank spectrum –The Linear Predictive coding (LPC)

By Sarita Jondhale4 Spectral Analysis models Pattern recognition model Acoustic phonetic model

By Sarita Jondhale5 Spectral Analysis Model Parameter measurement is common in both the systems

By Sarita Jondhale6 Pattern recognition Model The three basic steps in pattern recognition model are –1. parameter measurement –2. pattern comparison –3. decision making

By Sarita Jondhale7 1. Parameter measurement The goal is to represent the relevant acoustic events in the speech signal in terms of a compact, efficient set of speech parameters. The choice of which parameters to use is dictated by other considerations, e.g. –computational efficiency, –type of implementation, –available memory. The way in which the representation is computed is based on signal-processing considerations.

By Sarita Jondhale8 Acoustic phonetic Model

By Sarita Jondhale9 Spectral Analysis Two methods: –The Filter Bank spectrum –The Linear Predictive coding (LPC)

By Sarita Jondhale10 The Filter Bank spectrum (Figure labels: digital input, spectral representation.) The coverage of the bandpass filters spans the frequency range of interest in the signal.

By Sarita Jondhale11 1. The Bank of Filters Front end Processor One of the most common approaches for processing the speech signal is the bank-of-filters model. This method takes a speech signal as input and passes it through a set of filters in order to obtain the spectral representation of each frequency band of interest.

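By Sarita Jondhale12 The frequency range of interest depends on the signal, e.g. one range (in Hz) for telephone-quality signals and another for broadband signals. The individual filters generally do overlap in frequency. The output of the ith bandpass filter is $X_n(e^{j\omega_i})$, where $\omega_i = 2\pi f_i / F_s$ is the normalized frequency.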

By Sarita Jondhale13 Each bandpass filter processes the speech signal independently to produce the spectral representation Xn

By Sarita Jondhale14 The Bank of Filters Front end Processor

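By Sarita Jondhale15 The Bank of Filters Front end Processor The sampled speech signal, s(n), is passed through a bank of Q bandpass filters, giving the signals $s_i(n) = s(n) * h_i(n) = \sum_{m=0}^{M_i-1} h_i(m)\,s(n-m)$, $1 \le i \le Q$, where $h_i(m)$ is the impulse response of the ith bandpass filter, with a duration of $M_i$ samples.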

By Sarita Jondhale16 The Bank of Filters Front end Processor The bank-of-filters approach obtains the energy of the speech signal in each band through the following steps: Signal enhancement and noise elimination: makes the speech signal more evident to the bank of filters. Set of bandpass filters: separates the signal into frequency bands (uniform or nonuniform filters).

By Sarita Jondhale17 Nonlinearity: the filtered signal in every band is passed through a nonlinear function (for example a full-wave or half-wave rectifier), which shifts the bandpass spectrum to the low-frequency band.

By Sarita Jondhale18 The Bank of Filters Front end Processor Lowpass filter: eliminates the high-frequency components generated by the nonlinear function. Sampling rate reduction and amplitude compression: the resulting signals are represented more economically by resampling at a reduced rate and compressing the signal's dynamic range. The role of the final lowpass filter is to eliminate the undesired spectral peaks.
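
The processing chain described on these slides (bandpass filter, nonlinearity, lowpass filter, sampling rate reduction) can be sketched for a single channel as follows. This is a minimal illustration, not the presenter's implementation: the windowed-sinc bandpass design, the moving-average lowpass, the 8 kHz sampling rate, the band edges, and all function names are assumptions chosen for the example, and amplitude compression is omitted.

```python
import math

def hamming(length):
    """Hamming window of the given length."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]

def bandpass_fir(f_lo, f_hi, fs, length=101):
    """Windowed-sinc bandpass filter (an illustrative design choice)."""
    mid = (length - 1) // 2
    win = hamming(length)
    taps = []
    for n in range(length):
        k = n - mid
        if k == 0:
            ideal = 2.0 * (f_hi - f_lo) / fs
        else:
            ideal = (math.sin(2 * math.pi * f_hi * k / fs)
                     - math.sin(2 * math.pi * f_lo * k / fs)) / (math.pi * k)
        taps.append(ideal * win[n])
    return taps

def convolve(x, h):
    """Direct convolution, truncated to len(x) output samples."""
    return [sum(h[m] * x[n - m] for m in range(min(len(h), n + 1)))
            for n in range(len(x))]

def channel_output(s, f_lo, f_hi, fs, smooth=80, decim=40):
    band = convolve(s, bandpass_fir(f_lo, f_hi, fs))   # bandpass filter
    rect = [abs(v) for v in band]                      # full-wave rectifier
    low = convolve(rect, [1.0 / smooth] * smooth)      # crude lowpass (moving average)
    return low[::decim]                                # sampling rate reduction

fs = 8000
s = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(800)]  # 1 kHz test tone
in_band = channel_output(s, 800, 1200, fs)     # channel containing the tone
out_band = channel_output(s, 2000, 2400, fs)   # channel away from the tone
```

With a 1 kHz test tone, the channel whose passband contains the tone settles near the mean of a rectified unit sinusoid, while the distant channel stays close to zero.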

By Sarita Jondhale19 The Bank of Filters Front end Processor Assume that the output of the ith bandpass filter is a pure sinusoid at frequency $\omega_i$, i.e. $s_i(n) = \alpha_i \sin(\omega_i n)$. If a full-wave rectifier is used as the nonlinearity, then $v_i(n) = |s_i(n)|$.

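By Sarita Jondhale20 The Bank of Filters Front end Processor The output of a full-wave rectifier can be written as $v_i(n) = s_i(n)\,w(n)$, where $w(n) = +1$ when $s_i(n) \ge 0$ and $-1$ otherwise, so its spectrum is the convolution $V_i(e^{j\omega}) = S_i(e^{j\omega}) \circledast W(e^{j\omega})$.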
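
To see why the rectifier moves energy to low frequencies, the sketch below rectifies a pure sinusoid and inspects its DFT. The DFT length and the bin on which the sinusoid sits are illustrative assumptions; the point is only that the rectified signal has a DC term plus components at even multiples of the original frequency, and nothing left at the original frequency itself.

```python
import cmath
import math

N = 64    # DFT length (illustrative)
k0 = 4    # the sinusoid sits exactly on DFT bin k0, i.e. w_i = 2*pi*k0/N

s_i = [math.sin(2 * math.pi * k0 * n / N) for n in range(N)]
v_i = [abs(v) for v in s_i]   # full-wave rectifier

def dft_magnitude(x):
    """Normalized DFT magnitudes for bins 0..len(x)/2."""
    size = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / size)
                    for n in range(size))) / size
            for k in range(size // 2 + 1)]

mag = dft_magnitude(v_i)
dc_level = mag[0]               # close to 2/pi, the mean of |sin|
at_original = mag[k0]           # essentially zero
at_second_harmonic = mag[2 * k0]
```

The lowpass filter that follows the nonlinearity keeps the DC term and removes the components at 2ω_i and above.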

By Sarita Jondhale21 Types of Filter Bank Used For Speech Recognition uniform filter bank Non uniform filter bank

By Sarita Jondhale22 uniform filter bank The most common filter bank is the uniform filter bank. The center frequency, $f_i$, of the ith bandpass filter is defined as $f_i = \frac{F_s}{N}\, i$, $1 \le i \le Q$, where $F_s$ is the sampling rate, N is the number of uniformly spaced filters required to span the frequency range of the speech, and Q is the number of filters used in the bank of filters.

By Sarita Jondhale23 uniform filter bank The actual number of filters used in the filter bank satisfies $Q \le N/2$. With $b_i$ the bandwidth of the ith filter, $b_i \ge F_s/N$; equality means there is no frequency overlap between adjacent filter channels, and inequality means adjacent channels overlap.

By Sarita Jondhale24 uniform filter bank If $b_i < F_s/N$, certain portions of the speech spectrum would be missing from the analysis, and the resulting speech spectrum would not be considered very meaningful.
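
As a small worked example of the uniform layout, the sketch below places Q center frequencies at multiples of Fs/N and reports Fs/N as the smallest bandwidth that leaves no gaps. The function name and the numbers (8 kHz sampling rate, N = 40, Q = 15) are assumptions picked for illustration.

```python
def uniform_bank(fs, n_points, q):
    """Return (center_frequency, minimum_bandwidth) in Hz for each of q filters."""
    if q > n_points // 2:
        # Center frequencies above fs/2 would lie beyond the Nyquist frequency.
        raise ValueError("q must not exceed n_points / 2")
    spacing = fs / n_points
    return [(spacing * i, spacing) for i in range(1, q + 1)]

bank = uniform_bank(fs=8000, n_points=40, q=15)
first_center, min_bandwidth = bank[0]
last_center = bank[-1][0]
```

Here the filters are spaced 200 Hz apart, from 200 Hz up to 3000 Hz; any bandwidth below 200 Hz would leave parts of the spectrum uncovered.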

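By Sarita Jondhale25 nonuniform filter bank An alternative to the uniform filter bank is the nonuniform filter bank. The criterion is to space the filters uniformly along a logarithmic frequency scale. For a set of Q bandpass filters with center frequencies $f_i$ and bandwidths $b_i$, $1 \le i \le Q$, we set $b_1 = C$ and $b_i = \alpha\, b_{i-1}$, $2 \le i \le Q$.

By Sarita Jondhale26 nonuniform filter bank $f_i = f_1 + \sum_{j=1}^{i-1} b_j + \frac{b_i - b_1}{2}$, where C and $f_1$ are the bandwidth and center frequency of the first filter, and $\alpha$ is the logarithmic growth factor.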

By Sarita Jondhale27 The most commonly used value is α = 2, which gives an octave-band spacing of adjacent filters; α = 4/3 gives approximately 1/3-octave filter spacing.
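
The logarithmic spacing can be illustrated with a short calculation. The sketch assumes contiguous bands whose bandwidth grows by a factor α from one filter to the next, so each center frequency is the lower band edge plus half the bandwidth; the starting values (first center 300 Hz, first bandwidth 200 Hz) and the function name are hypothetical.

```python
def nonuniform_bank(f1, c, alpha, q):
    """Return (center_frequency, bandwidth) for q contiguous bands.

    f1 and c are the center frequency and bandwidth of the first filter;
    each later bandwidth is alpha times the previous one.
    """
    bands = []
    bandwidth = c
    lower_edge = f1 - c / 2
    for _ in range(q):
        bands.append((lower_edge + bandwidth / 2, bandwidth))
        lower_edge += bandwidth   # next band starts where this one ends
        bandwidth *= alpha
    return bands

octave_bank = nonuniform_bank(f1=300.0, c=200.0, alpha=2.0, q=4)
```

With α = 2, both the center frequencies and the bandwidths double from filter to filter, which is the octave spacing mentioned on the slide.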

By Sarita Jondhale28 Implementations of Filter Banks Depending on the design method, the filter bank can be implemented in various ways. Design methods for digital filters fall into two classes: –Infinite impulse response (IIR) (recursive filters) –Finite impulse response (FIR) (nonrecursive filters)

By Sarita Jondhale29 The FIR filter: (finite impulse response) or nonrecursive filter. The present output depends on the present input sample and previous input samples. The impulse response is restricted to a finite number of samples.

By Sarita Jondhale30 Advantages: –Stable, and noise effects are less severe –Excellent design methods are available for various kinds of FIR filters –Phase response can be made linear Disadvantages: –Costly to implement –Memory requirement and execution time are high –Require powerful computational facilities

By Sarita Jondhale31 The IIR filter: (infinite impulse response) or recursive filter. The present output sample depends on the present input, past input samples, and past output samples. The impulse response extends over an infinite duration.

By Sarita Jondhale32 Advantages: –Simple to design –Efficient Disadvantages: –Phase response is nonlinear –More affected by noise –Stability is not guaranteed
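
The difference between the two filter classes shows up directly in their impulse responses. The sketch below implements both difference equations in plain Python; the coefficient values are arbitrary illustrations, and the function names are not from the slides.

```python
def fir_filter(b, x):
    """y(n) = sum_k b[k] * x(n-k): depends on present and past inputs only."""
    return [sum(b[k] * x[n - k] for k in range(min(len(b), n + 1)))
            for n in range(len(x))]

def iir_filter(b, a, x):
    """y(n) = sum_k b[k] * x(n-k) - sum_{k>=1} a[k] * y(n-k), with a[0] == 1."""
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(min(len(b), n + 1)))
        acc -= sum(a[k] * y[n - k] for k in range(1, min(len(a), n + 1)))
        y.append(acc)
    return y

impulse = [1.0] + [0.0] * 19

h_fir = fir_filter([0.25] * 4, impulse)                 # zero after 4 samples
h_iir = iir_filter([1.0], [1.0, -0.5], impulse)         # 0.5**n, decays but never ends
h_unstable = iir_filter([1.0], [1.0, -1.5], impulse)    # pole outside the unit circle
```

The FIR response ends after four samples, the first IIR response decays geometrically, and the second IIR response grows without bound, which is why stability has to be checked for recursive designs.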

By Sarita Jondhale33 FIR Filters

By Sarita Jondhale34 FIR Filters A less expensive implementation can be derived by representing each bandpass filter by a fixed lowpass window $w(n)$ modulated by the complex exponential $e^{j\omega_i n}$, i.e. $h_i(n) = w(n)\,e^{j\omega_i n}$.
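
A quick numerical check of the modulated-window idea: multiplying a lowpass window by a complex exponential at ω_i shifts its frequency response so that it is centered on ω_i. The Hamming window, its length, and the chosen center frequency are assumptions for this sketch.

```python
import cmath
import math

L = 32
w = [0.54 - 0.46 * math.cos(2 * math.pi * n / (L - 1)) for n in range(L)]  # Hamming

omega_i = 2 * math.pi * 5 / 32                     # illustrative center frequency
h_i = [w[n] * cmath.exp(1j * omega_i * n) for n in range(L)]

def freq_response(h, omega):
    """Evaluate H(e^{j omega}) = sum_n h(n) e^{-j omega n}."""
    return sum(h[n] * cmath.exp(-1j * omega * n) for n in range(len(h)))

gain_at_center = abs(freq_response(h_i, omega_i))        # equals sum(w)
gain_off_center = abs(freq_response(h_i, omega_i + 1.0)) # window sidelobe level
```

One window therefore serves every channel; only the modulating frequency changes from filter to filter.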

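By Sarita Jondhale35 Frequency Domain Interpretation For Short Term Fourier Transform At $n = n_0$: $S_{n_0}(e^{j\omega_i}) = \sum_m s(m)\,w(n_0-m)\,e^{-j\omega_i m} = \mathrm{FT}[s(m)\,w(n_0-m)]\big|_{\omega=\omega_i}$ (A), where FT[·] denotes the Fourier transform. $S_{n_0}(e^{j\omega_i})$ is the conventional Fourier transform of the windowed signal, $s(m)\,w(n_0-m)$, evaluated at the frequency $\omega = \omega_i$.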
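
The frequency-domain interpretation can be written out directly: window the signal around n0, then evaluate an ordinary Fourier sum at the frequency of interest. The sketch assumes a causal Hamming window of length L (so only samples n0-L+1 through n0 contribute) and a cosine test signal; `stft_at` is a hypothetical helper name.

```python
import cmath
import math

def stft_at(s, w, n0, omega):
    """S_{n0}(e^{j omega}) = sum_m s(m) w(n0 - m) e^{-j omega m}."""
    total = 0j
    for k in range(len(w)):        # k = n0 - m indexes the window
        m = n0 - k
        if 0 <= m < len(s):
            total += s[m] * w[k] * cmath.exp(-1j * omega * m)
    return total

L = 64
w = [0.54 - 0.46 * math.cos(2 * math.pi * n / (L - 1)) for n in range(L)]
omega_0 = 2 * math.pi * 8 / 64
s = [math.cos(omega_0 * n) for n in range(200)]

on_tone = abs(stft_at(s, w, n0=150, omega=omega_0))
off_tone = abs(stft_at(s, w, n0=150, omega=omega_0 + 1.0))
```

Evaluated at the tone's own frequency the magnitude is large; one radian away it drops to the window's sidelobe level.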

By Sarita Jondhale36 Frequency Domain Interpretation For Short Term Fourier Transform The figure shows which parts of s(m) are used in the computation of the short-time Fourier transform.

By Sarita Jondhale37 Frequency Domain Interpretation For Short Term Fourier Transform Since w(m) is an FIR filter of length L, it follows from the definition of $S_n(e^{j\omega_i})$ that: –If L is large relative to the signal periodicity, $S_n(e^{j\omega_i})$ gives good frequency resolution –If L is small relative to the signal periodicity, $S_n(e^{j\omega_i})$ gives poor frequency resolution
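
One way to quantify the resolution statement is to measure how wide the window's main lobe is for different lengths. The sketch below scans upward from ω = 0 until the Hamming window's response falls 3 dB below its peak; the window type, the two lengths, and the scan step are assumptions made for illustration.

```python
import cmath
import math

def hamming(length):
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]

def half_power_width(length, step=0.0005):
    """Frequency (rad/sample) at which |W| first drops 3 dB below its peak."""
    w = hamming(length)
    threshold = sum(w) / math.sqrt(2)
    omega = 0.0
    while abs(sum(w[n] * cmath.exp(-1j * omega * n)
                  for n in range(length))) > threshold:
        omega += step
    return omega

narrow = half_power_width(500)   # long window: narrow main lobe
wide = half_power_width(50)      # short window: wide main lobe
```

The main lobe widens roughly in inverse proportion to L, so the short window smears neighbouring pitch harmonics together while the long window keeps them apart.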

By Sarita Jondhale38 Frequency Domain Interpretation For Short Term Fourier Transform A Hamming window of L = 500 points is applied to a section of voiced speech. The periodicity of the signal is seen in the windowed time waveform as well as in the short-time spectrum, in which the fundamental frequency and its harmonics show up as narrow peaks at equally spaced frequencies.

By Sarita Jondhale39 Frequency Domain Interpretation For Short Term Fourier Transform For short windows, the time sequence s(m)w(n-m) does not show the signal periodicity, nor does the signal spectrum. The spectrum does, however, show the broad spectral envelope very well.

By Sarita Jondhale40 Frequency Domain Interpretation For Short Term Fourier Transform Shows irregular series of local peaks and valleys due to the random nature of the unvoiced speech

By Sarita Jondhale41 Frequency Domain Interpretation For Short Term Fourier Transform Using the shorter window smoothes out the random fluctuations in the short time spectral magnitude and shows the broad spectral envelope very well

By Sarita Jondhale42 Linear Filtering Interpretation of the short-time Fourier Transform In the linear filtering interpretation of the short-time Fourier transform, $S_n(e^{j\omega_i})$ is a convolution of the lowpass window, w(n), with the speech signal, s(n), modulated to the center frequency $\omega_i$. From A: $S_n(e^{j\omega_i}) = \sum_m s(m)\,e^{-j\omega_i m}\,w(n-m) = [s(n)\,e^{-j\omega_i n}] * w(n)$
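
The linear filtering view corresponds to a two-step structure: multiply the signal by a complex exponential, then lowpass filter with the window. The sketch below applies it to a cosine at the channel's center frequency; the Hamming window, its length, and the test signal are illustrative assumptions.

```python
import cmath
import math

L = 64
w = [0.54 - 0.46 * math.cos(2 * math.pi * n / (L - 1)) for n in range(L)]
omega_i = 2 * math.pi * 8 / 64
s = [math.cos(omega_i * n) for n in range(256)]

# Step 1: modulate s(n) by e^{-j omega_i n}, moving the band of interest to DC.
modulated = [s[n] * cmath.exp(-1j * omega_i * n) for n in range(len(s))]

# Step 2: convolve with the lowpass window w(n).
S = [sum(w[k] * modulated[n - k] for k in range(min(L, n + 1)))
     for n in range(len(s))]

steady_state = [abs(v) for v in S[L - 1:]]   # skip the start-up transient
expected_level = sum(w) / 2                  # half the window's DC gain
```

Once the window is fully inside the signal, the channel magnitude stays essentially constant, since the lowpass window keeps the component moved to DC and rejects the one at 2ω_i.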

By Sarita Jondhale43 FFT Implementation of Uniform Filter Bank Based on the Short-Time FT

By Sarita Jondhale44 FFT Implementation of Uniform Filter Bank Based on the Short-Time FT

By Sarita Jondhale45 FFT Implementation of Uniform Filter Bank Based on The Short Time FT The FFT implementation is more efficient than the direct form structure
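
The efficiency claim rests on the fact that one FFT of a windowed frame yields all uniformly spaced channels at once. The sketch below compares a simple recursive radix-2 FFT with the direct per-channel sum; it assumes the window length equals the FFT size N and uses an arbitrary test signal, and the phase factor accounts for the frame starting at sample n-N+1 rather than at 0.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 decimation-in-time FFT (len(x) must be a power of 2)."""
    size = len(x)
    if size == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * size
    for k in range(size // 2):
        t = cmath.exp(-2j * math.pi * k / size) * odd[k]
        out[k], out[k + size // 2] = even[k] + t, even[k] - t
    return out

N, n = 16, 40                                   # channels, analysis time index
w = [0.54 - 0.46 * math.cos(2 * math.pi * r / (N - 1)) for r in range(N)]
s = [math.sin(0.3 * m) + 0.5 * math.cos(1.1 * m) for m in range(64)]

# Windowed frame covering samples n-N+1 .. n, i.e. s(m) w(n-m).
frame = [s[n - N + 1 + r] * w[N - 1 - r] for r in range(N)]
channels_fft = fft(frame)                       # all N channels in one pass

def direct_channel(k):
    """S_n(e^{j w_k}) computed straight from the definition."""
    omega_k = 2 * math.pi * k / N
    return sum(s[m] * w[n - m] * cmath.exp(-1j * omega_k * m)
               for m in range(n - N + 1, n + 1))

max_error = max(
    abs(direct_channel(k)
        - cmath.exp(-2j * math.pi * k * (n - N + 1) / N) * channels_fft[k])
    for k in range(N))
```

The direct form needs on the order of N² multiplications per frame, whereas the FFT needs on the order of N log N.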

By Sarita Jondhale46 Nonuniform FIR Filter Bank Implementations The most general form of a nonuniform FIR filter bank

By Sarita Jondhale47 Nonuniform FIR Filter Bank Implementations The kth bandpass filter impulse response, $h_k(n)$, represents a filter with center frequency $\omega_k$ and bandwidth $\Delta\omega_k$. The set of Q bandpass filters covers the frequency range of interest for the intended speech recognition application.

By Sarita Jondhale48 Nonuniform FIR Filter Bank Implementations Each bandpass filter is implemented via a direct convolution. Each bandpass filter is designed via the windowing design method. The composite frequency response of the Q-channel filter bank is independent of the number and distribution of the individual filters.

By Sarita Jondhale49 Nonuniform FIR Filter Bank Implementations A filter bank with the three filters has the exact same composite frequency response as the filter bank with the seven filters shown in figure above

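By Sarita Jondhale50 Nonuniform FIR Filter Bank Implementations The impulse response of the kth bandpass filter is $h_k(n) = w(n)\,\hat{h}_k(n)$, where $w(n)$ is the FIR window and $\hat{h}_k(n)$ is the impulse response of the ideal bandpass filter. The frequency response of the kth bandpass filter is $H_k(e^{j\omega}) = W(e^{j\omega}) \circledast \hat{H}_k(e^{j\omega})$.

By Sarita Jondhale51 Nonuniform FIR Filter Bank Implementations Thus the frequency response of the composite filter bank is $H(e^{j\omega}) = \sum_{k=1}^{Q} H_k(e^{j\omega}) = \sum_{k=1}^{Q} W(e^{j\omega}) \circledast \hat{H}_k(e^{j\omega}) = W(e^{j\omega}) \circledast \sum_{k=1}^{Q} \hat{H}_k(e^{j\omega})$ (1)

By Sarita Jondhale52 Nonuniform FIR Filter Bank Implementations The sum of the ideal responses is $\hat{H}(e^{j\omega}) = \sum_{k=1}^{Q} \hat{H}_k(e^{j\omega}) = 1$ for $\omega_{min} \le |\omega| \le \omega_{max}$ and 0 otherwise, where $\omega_{min}$ is the lowest frequency in the filter bank and $\omega_{max}$ is the highest frequency. Equation 1 can be written as $H(e^{j\omega}) = W(e^{j\omega}) \circledast \hat{H}(e^{j\omega})$, which is independent of the number of ideal filters, Q, and their distribution in frequency.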
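
The independence result can be checked numerically by designing two banks over the same frequency range with the window method and summing their impulse responses. The sketch assumes contiguous, non-overlapping ideal bands, a Hamming window of odd length, and arbitrary band edges (in rad/sample); the three-band and seven-band splits are illustrative.

```python
import math

def ideal_bandpass(w_lo, w_hi, length):
    """Impulse response of an ideal bandpass filter, centered in the window."""
    mid = (length - 1) // 2
    taps = []
    for n in range(length):
        k = n - mid
        if k == 0:
            taps.append((w_hi - w_lo) / math.pi)
        else:
            taps.append((math.sin(w_hi * k) - math.sin(w_lo * k)) / (math.pi * k))
    return taps

def composite_response(edges, window):
    """Sum of the windowed bandpass impulse responses for contiguous bands."""
    total = [0.0] * len(window)
    for w_lo, w_hi in zip(edges, edges[1:]):
        ideal = ideal_bandpass(w_lo, w_hi, len(window))
        for n in range(len(window)):
            total[n] += window[n] * ideal[n]
    return total

L = 51
window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (L - 1)) for n in range(L)]

three_bands = composite_response([0.4, 0.8, 1.6, 2.4], window)
seven_bands = composite_response([0.4, 0.6, 0.8, 1.0, 1.3, 1.6, 2.0, 2.4], window)
largest_difference = max(abs(a - b) for a, b in zip(three_bands, seven_bands))
```

Both banks produce the same composite impulse response up to rounding, so their composite frequency responses are identical as well.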

By Sarita Jondhale53 FFT-Based Nonuniform Filter Banks Nonuniformity can be created by combining two or more uniform channels. Consider taking an N-point DFT of the sequence x(n): $X_k = \sum_{n=0}^{N-1} x(n)\,e^{-j 2\pi nk/N}$, $0 \le k \le N-1$.

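By Sarita Jondhale54 FFT-Based Nonuniform Filter Banks The equivalent kth channel value, $X_k'$, formed by adding the DFT outputs $X_k$ and $X_{k+1}$, can be obtained by weighting the sequence x(n) by the complex sequence $2\exp(-j\pi n/N)\cos(\pi n/N)$: $X_k' = X_k + X_{k+1} = \sum_{n=0}^{N-1} \left[x(n)\,2e^{-j\pi n/N}\cos(\pi n/N)\right] e^{-j 2\pi nk/N}$. If more than two channels are combined, a different equivalent weighting sequence results.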
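
The weighting-sequence identity for two combined channels is easy to confirm numerically. The sketch below uses a plain DFT sum, an arbitrary test sequence, and an arbitrary channel index; all three are illustrative choices.

```python
import cmath
import math

N, k = 16, 3
x = [math.sin(0.7 * n) + 0.1 * n for n in range(N)]   # arbitrary test sequence

def dft_bin(seq, index):
    """Single DFT coefficient X_index of seq."""
    size = len(seq)
    return sum(seq[n] * cmath.exp(-2j * math.pi * n * index / size)
               for n in range(size))

# Combine two adjacent uniform channels.
combined = dft_bin(x, k) + dft_bin(x, k + 1)

# Same value from one DFT bin of the weighted sequence.
weighted = [x[n] * 2 * cmath.exp(-1j * math.pi * n / N) * math.cos(math.pi * n / N)
            for n in range(N)]
equivalent = dft_bin(weighted, k)
```

The identity follows from factoring $e^{-j2\pi nk/N}\,(1 + e^{-j2\pi n/N})$ and rewriting the bracket as $2e^{-j\pi n/N}\cos(\pi n/N)$.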

By Sarita Jondhale55 Tree Structure Realizations of Nonuniform Filter Banks In this method the speech signal is filtered in the stages, and the sampling rate is successively reduced at each stage

By Sarita Jondhale56 Tree Structure Realizations of Nonuniform Filter Banks

By Sarita Jondhale57 Tree Structure Realizations of Nonuniform Filter Banks The original speech signal, s(n), is filtered initially into two bands, a low band and a high band. The high band is downsampled by 2 and represents the highest octave band ($\pi/2 \le \omega \le \pi$) of the filter bank. The low band is similarly downsampled by 2 and fed into a second filtering stage, in which the signal is again split into two equal bands. Again, the high band of stage 2 is downsampled by 2 and is used as the next-highest filter bank output.

By Sarita Jondhale58 Tree Structure Realizations of Nonuniform Filter Banks The low band is also downsampled by 2 and fed into a third stage of filters. The third-stage outputs, after downsampling by a factor of 2, are used as the two lowest filter bands.
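
The three-stage tree can be sketched with one half-band lowpass filter and its highpass mirror, reused at every stage. The windowed-sinc half-band design, the filter length, the test tones, and the function names are assumptions for this illustration; practical systems typically use carefully matched filter pairs.

```python
import math

def halfband_lowpass(length=31):
    """Windowed-sinc lowpass with cutoff pi/2 (illustrative design)."""
    mid = (length - 1) // 2
    taps = []
    for n in range(length):
        k = n - mid
        ideal = 0.5 if k == 0 else math.sin(math.pi * k / 2) / (math.pi * k)
        taps.append(ideal * (0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))))
    return taps

def convolve(x, h):
    """Direct convolution, truncated to len(x) output samples."""
    return [sum(h[m] * x[n - m] for m in range(min(len(h), n + 1)))
            for n in range(len(x))]

def split(x, h0):
    """Split x into low and high half-bands, each downsampled by 2."""
    mid = (len(h0) - 1) // 2
    # Alternating the signs of the lowpass taps shifts its response by pi.
    h1 = [c if (n - mid) % 2 == 0 else -c for n, c in enumerate(h0)]
    return convolve(x, h0)[::2], convolve(x, h1)[::2]

def octave_tree(s, stages=3):
    """Return bands ordered from highest to lowest frequency."""
    h0 = halfband_lowpass()
    bands, low = [], list(s)
    for _ in range(stages):
        low, high = split(low, h0)
        bands.append(high)
    bands.append(low)
    return bands

def energy(x):
    return sum(v * v for v in x)

high_tone = [math.sin(0.75 * math.pi * n) for n in range(256)]
low_tone = [math.sin(0.05 * math.pi * n) for n in range(256)]
high_energies = [energy(b) for b in octave_tree(high_tone)]
low_energies = [energy(b) for b in octave_tree(low_tone)]
band_lengths = [len(b) for b in octave_tree(high_tone)]
```

A tone at 0.75π lands in the first (highest octave) output, while a tone at 0.05π passes through all three lowpass stages and ends up in the lowest band; each stage halves the number of samples.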

By Sarita Jondhale59 Summary of considerations for speech recognition filter banks 1st: The type of digital filter used (IIR (recursive) or FIR (nonrecursive)). IIR: Advantage: simple to implement and efficient. Disadvantage: phase response is nonlinear. FIR: Advantage: phase response is linear. Disadvantage: expensive to implement.

By Sarita Jondhale60 Summary of considerations for speech recognition filter banks 2nd: The number of filters to be used in the filter bank. 1. For uniform filter banks, the number of filters, Q, cannot be too small, or else the ability of the filter bank to resolve the speech spectrum is greatly impaired. Values of Q less than 8 are generally avoided. 2. The value of Q cannot be too large, because the filter bandwidths would eventually be too narrow for some talkers (e.g. high-pitched female voices), i.e. no prominent harmonics would fall within the band. (In practical systems, Q ≤ 32.)

By Sarita Jondhale61 Summary of considerations for speech recognition filter banks In order to reduce overall computation, many practical systems have used nonuniform spaced filter banks

By Sarita Jondhale62 Summary of considerations for speech recognition filter banks 3rd: The choice of nonlinearity and LPF used at the output of each channel. Nonlinearity: full-wave or half-wave rectifier. LPF: varies from a simple integrator to a good-quality IIR lowpass filter.
