Linear Prediction.



Outline: Windowing; LPC; Introduction to Vocoders; Excitation modeling; Pitch Detection.

Short-Time Processing: The speech signal is inherently non-stationary. For continuant phonemes there are stationary periods of at least 20-25 ms. Short-time speech frames are therefore assumed to be stationary. The frame length should be chosen to include just one phoneme or allophone; frame lengths are usually chosen to be between 10 and 50 ms. We consider rectangular and Hamming windows here.
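As a rough illustration of short-time processing, the sketch below cuts a signal into overlapping frames and applies a rectangular or Hamming window. The function names, the 25 ms frame length and the 10 ms hop are assumptions made for this example (the slides do not specify a hop size); the Hamming window uses the standard coefficients 0.54 and 0.46.

```python
import numpy as np


def hamming_window(length):
    """Hamming window w[n] = 0.54 - 0.46 cos(2 pi n / (N - 1)); assumes length > 1."""
    n = np.arange(length)
    return 0.54 - 0.46 * np.cos(2.0 * np.pi * n / (length - 1))


def frame_signal(signal, sample_rate, frame_ms=25.0, hop_ms=10.0, window="hamming"):
    """Split a 1-D signal into windowed frames (one frame per row).

    frame_ms and hop_ms are illustrative defaults, not values from the slides.
    """
    signal = np.asarray(signal, dtype=float)
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop = int(round(sample_rate * hop_ms / 1000.0))
    if window == "hamming":
        w = hamming_window(frame_len)
    elif window == "rectangular":
        w = np.ones(frame_len)
    else:
        raise ValueError("window must be 'hamming' or 'rectangular'")
    if len(signal) < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Each row is one short-time frame multiplied by the window.
    return np.stack([signal[i * hop: i * hop + frame_len] * w for i in range(n_frames)])
```

At an 8 kHz sampling rate these defaults give 200-sample frames that advance by 80 samples, so each frame stays within the 10-50 ms range quoted above.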

Rectangular Window

Hamming Window

Comparison of Windows

Comparison of Windows (cont’d)

Linear Prediction Coding (LPC): LPC is based on an all-pole model of the speech production system. In the time domain, this means that we can predict s[n] as a function of the p previous signal samples (and the excitation). The set of coefficients {a_k} is one way of representing the time-varying filter. There are many other ways to represent this filter (e.g., pole values, lattice filter values, line spectral pairs (LSP), …).
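The equations on this slide did not survive in the transcript. A standard way of writing the all-pole model is sketched below. The sign convention (a minus sign in the denominator), the excitation symbol u[n] and the use of Θ0 for the gain (borrowed from the block diagram labels) are assumptions; some texts use the opposite sign for the coefficients.

```latex
% All-pole model of the speech production system (assumed sign convention)
H(z) = \frac{S(z)}{U(z)} = \frac{\Theta_0}{1 - \sum_{k=1}^{p} a_k z^{-k}}

% Equivalent time-domain difference equation
s[n] = \sum_{k=1}^{p} a_k \, s[n-k] + \Theta_0 \, u[n]
```

Under this convention the first term is the prediction from the p previous samples and the second term is the contribution of the excitation.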

LPC parameter estimation: There are many methods to estimate the LPC parameters. In the autocorrelation method, optimizing the coefficients a results in a set of p linear equations. The covariance method is an alternative. Procedures such as Levinson-Durbin, Burg, and Le Roux estimate these parameters efficiently.
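The following is a minimal sketch of the autocorrelation method solved with the Levinson-Durbin recursion. It assumes the prediction convention ŝ[n] = Σ a_k s[n-k], so the p linear equations read Σ_k a_k r(|i-k|) = r(i) for i = 1..p. The function names are hypothetical and this is not the code behind the slides.

```python
import numpy as np


def autocorrelation(frame, order):
    """Autocorrelation lags r[0..order] of a (windowed) frame."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    return np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])


def levinson_durbin(r, order):
    """Solve the p normal equations recursively.

    Returns (a, error, reflection): predictor coefficients a[0..p-1]
    (a[0] multiplies s[n-1]), the final prediction error energy, and
    the reflection coefficients produced along the way.
    """
    r = np.asarray(r, dtype=float)
    a = np.zeros(order + 1)          # a[0] is unused; a[i] multiplies s[n-i]
    reflection = np.zeros(order)
    error = r[0]
    for i in range(1, order + 1):
        # Correlation not yet explained by the order-(i-1) predictor.
        acc = r[i] - np.dot(a[1:i], r[i - 1:0:-1])
        k = acc / error
        new_a = a.copy()
        new_a[i] = k
        new_a[1:i] = a[1:i] - k * a[i - 1:0:-1]
        a = new_a
        reflection[i - 1] = k
        error *= 1.0 - k * k         # error energy shrinks at every order
    return a[1:], error, reflection
```

The recursion needs on the order of p² operations instead of the p³ of a general linear solve, which is why it is the usual choice for the Toeplitz system that the autocorrelation method produces.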

LPC Parameters in Coding (vocoders): [Figure: two block diagrams. Full model: DT impulse generator (pitch period P) and glottal filter G(z) for voiced speech, white noise generator for unvoiced speech, V/UV switch, gain Θ0, vocal tract filter H(z), lip radiation filter R(z), speech signal s(n). Simplified model: DT impulse generator (pitch period P) for voiced speech, white noise generator for unvoiced speech, V/UV switch, gain Θ0, all-pole filter, speech signal s(n).]

Linear Prediction (Introduction): The object of linear prediction is to estimate the output sequence from a linear combination of input samples, past output samples, or both. The factors a(i) and b(j) in this combination are called predictor coefficients.

Linear Prediction (Introduction): Many systems of interest to us are describable by a linear, constant-coefficient difference equation. If Y(z)/X(z) = H(z), where H(z) is a ratio of polynomials N(z)/D(z), then the coefficients of N(z) and D(z) are the coefficients of the difference equation. Thus the predictor coefficients give us immediate access to the poles and zeros of H(z).
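The formulas from the two introduction slides are missing from the transcript. One consistent way to write them is sketched below, assuming that a(i) weights the past outputs, b(j) weights the inputs, and the orders are p and q. The sign in front of the a(i) terms is an assumption chosen to match the 1 - A(z) block that appears later in the slides; some textbooks use the opposite sign.

```latex
% General linear predictor (assumed sign convention)
\hat{y}(n) = \sum_{i=1}^{p} a(i) \, y(n-i) + \sum_{j=0}^{q} b(j) \, x(n-j)

% Corresponding transfer function
H(z) = \frac{Y(z)}{X(z)} = \frac{N(z)}{D(z)}
     = \frac{\sum_{j=0}^{q} b(j) \, z^{-j}}{1 - \sum_{i=1}^{p} a(i) \, z^{-i}}
```

The zeros of H(z) are the roots of N(z) and the poles are the roots of D(z), so both follow directly from the predictor coefficients.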

Linear Prediction (Types of System Model): There are two important variants. In the all-pole model (in statistics, the autoregressive (AR) model), the numerator N(z) is a constant. In the all-zero model (in statistics, the moving-average (MA) model), the denominator D(z) is equal to unity. The mixed pole-zero model is called the autoregressive moving-average (ARMA) model.

Linear Prediction (Derivation of LP equations): Given a zero-mean signal y(n), the AR model predicts each sample from the p previous samples, and the error is the difference between the actual sample and its prediction. To derive the predictor we use the orthogonality principle, which states that the desired coefficients are those that make the error orthogonal to the samples y(n-1), y(n-2), …, y(n-p).

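Linear Prediction (Derivation of LP equations): Thus we require that the average < > of the product of the error with each of the samples y(n-1), …, y(n-p) be zero. Interchanging the operations of averaging and summing, and representing < > by summing over n, we obtain a set of p linear equations in the predictor coefficients. The required predictors are found by solving these equations.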
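The derivation can be sketched as follows. The equations themselves are not preserved in the transcript, so this is a reconstruction under the assumption that the AR predictor is written with a plus sign, ŷ(n) = Σ a(i) y(n-i); with the opposite sign convention every a(i) below changes sign.

```latex
% AR prediction error
e(n) = y(n) - \sum_{i=1}^{p} a(i) \, y(n-i)

% Orthogonality of the error to each past sample
\langle e(n) \, y(n-j) \rangle = 0, \qquad j = 1, \dots, p

% Substituting e(n)
\langle y(n) \, y(n-j) \rangle - \sum_{i=1}^{p} a(i) \, \langle y(n-i) \, y(n-j) \rangle = 0

% Averaging replaced by summation over n
\sum_{i=1}^{p} a(i) \sum_{n} y(n-i) \, y(n-j) = \sum_{n} y(n) \, y(n-j),
\qquad j = 1, \dots, p
```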

Linear Prediction (Derivation of LP equations): The orthogonality principle also gives the resulting minimum error. We can minimize the error over all time, in which case the sums in the equations become autocorrelation values of the signal.
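A sketch of the missing expressions, assuming the error e(n) = y(n) - Σ a(i) y(n-i) and writing r_k for the autocorrelation at lag k (the symbol r_k is an assumption; the slide's own notation is lost):

```latex
% Minimum error: e(n) is orthogonal to the past samples, so only the y(n) term survives
E = \langle e^{2}(n) \rangle = \langle e(n) \, y(n) \rangle
  = \langle y^{2}(n) \rangle - \sum_{i=1}^{p} a(i) \, \langle y(n) \, y(n-i) \rangle

% Summing over all time gives autocorrelation values
r_k = \sum_{n=-\infty}^{\infty} y(n) \, y(n-k)

% Normal equations and minimum error in terms of r_k
\sum_{i=1}^{p} a(i) \, r_{|i-j|} = r_j, \qquad j = 1, \dots, p

E = r_0 - \sum_{i=1}^{p} a(i) \, r_i
```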

Linear Prediction (Applications): Autocorrelation matching: We have a signal y(n) with known autocorrelation. We model it with the AR system σ / (1 - A(z)).
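As a numerical illustration of autocorrelation matching, the sketch below fits an order-p AR model to a given autocorrelation sequence and checks that the impulse response of σ / (1 - A(z)) has the same autocorrelation at lags 0..p. It assumes A(z) = Σ a(i) z^-i and σ² equal to the minimum prediction error r_0 - Σ a(i) r_i; the function names and the example values are hypothetical.

```python
import numpy as np


def solve_normal_equations(r, order):
    """Directly solve sum_i a(i) r(|i-j|) = r(j), j = 1..p, and return (a, sigma^2)."""
    r = np.asarray(r, dtype=float)
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])
    sigma2 = r[0] - np.dot(a, r[1:order + 1])   # minimum prediction error
    return a, sigma2


def ar_impulse_response(a, sigma, length):
    """Impulse response of sigma / (1 - A(z)), truncated to `length` samples."""
    h = np.zeros(length)
    for n in range(length):
        acc = sigma if n == 0 else 0.0
        for i, a_i in enumerate(a, start=1):
            if n - i >= 0:
                acc += a_i * h[n - i]
        h[n] = acc
    return h


def model_autocorrelation(h, max_lag):
    """Autocorrelation of the (truncated) impulse response at lags 0..max_lag."""
    return np.array([np.dot(h[: len(h) - k], h[k:]) for k in range(max_lag + 1)])


# Example autocorrelation values chosen for illustration only.
r = np.array([1.0, 0.5, 0.2])
a, sigma2 = solve_normal_equations(r, 2)
h = ar_impulse_response(a, np.sqrt(sigma2), 200)
matched = model_autocorrelation(h, 2)
```

For this example the model poles lie well inside the unit circle, so 200 samples of the impulse response are enough for `matched` to agree with `r` to within rounding error.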

Linear Prediction (Order of Linear Prediction): The choice of predictor order depends on the analysis bandwidth. The rule of thumb is as follows. For a normal vocal tract, there is an average of about one formant per kilohertz of bandwidth. One formant requires a pair of complex-conjugate poles. Hence for every formant we require two predictor coefficients, or two coefficients per kilohertz of bandwidth.
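The rule of thumb can be turned into a small helper. This sketch assumes that the analysis bandwidth equals the Nyquist bandwidth (half the sampling rate), which the slide does not state explicitly; the function name is hypothetical.

```python
def lpc_order(sample_rate_hz, formants_per_khz=1.0):
    """Predictor order from the rule of thumb: two coefficients per formant.

    Assumes the analysis bandwidth is sample_rate_hz / 2.
    """
    bandwidth_khz = sample_rate_hz / 2.0 / 1000.0
    formants = formants_per_khz * bandwidth_khz
    return int(round(2 * formants))
```

For speech sampled at 8 kHz the bandwidth is 4 kHz, which gives about four formants and therefore an order of 8.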

Linear Prediction (AR Modeling of Speech Signal): True model: [Figure: block diagram with these labels: pitch; gain; DT impulse generator; glottal filter G(z); voiced; voiced volume velocity U(n); vocal tract filter H(z); LP filter R(z); V/U switch; uncorrelated noise generator; unvoiced; gain; speech signal s(n).]

Linear Prediction (AR Modeling of Speech Signal): Using LP analysis: [Figure: block diagram with these labels: pitch; gain estimate; DT impulse generator; voiced; white noise generator; unvoiced; V/U switch; all-pole (AR) filter H(z); speech signal s(n).]
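The simplified model can be sketched in code: an impulse train (voiced) or white noise (unvoiced) is scaled by a gain and passed through the all-pole filter. The function name, the argument names and the convention s(n) = Σ a(i) s(n-i) + gain · excitation(n) are assumptions made for this illustration.

```python
import numpy as np


def synthesize_frame(a, gain, length, pitch_period=None, rng=None):
    """Generate one frame from the simplified LP speech model.

    a            : predictor coefficients, a[0] multiplies s(n-1)
    gain         : excitation gain
    pitch_period : period in samples for voiced frames, None for unvoiced
    rng          : optional numpy random Generator for the noise source
    """
    if pitch_period is not None:
        # Voiced: discrete-time impulse train, one impulse every pitch period.
        excitation = np.zeros(length)
        excitation[::pitch_period] = 1.0
    else:
        # Unvoiced: white noise.
        rng = np.random.default_rng() if rng is None else rng
        excitation = rng.standard_normal(length)
    out = np.zeros(length)
    for n in range(length):
        acc = gain * excitation[n]
        for i, a_i in enumerate(a, start=1):
            if n - i >= 0:
                acc += a_i * out[n - i]
        out[n] = acc
    return out
```

A real synthesizer would also carry the filter state across frame boundaries; this sketch starts every frame from a zero state to stay short.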

Introduction to Vocoders: [Figure: the original speech signal s(n) enters the vocoder analysis stage, which passes V/UV, pitch, and filter parameters through a channel (or storage) to the vocoder synthesizer, which outputs the synthesized speech signal ŝ(n).] Besides estimating the vocal tract parameters, a vocoder needs to estimate the excitation. In early vocoders, this was achieved by estimating V/UV, pitch, and gain. More modern vocoders use more sophisticated estimation of the excitation, such as in CELP, where vector quantization is used.

Pitch Detection: Since the speech signal in voiced frames is quasi-periodic (and not fully periodic), pitch detection is not always easy. It is especially difficult in phonemes that manifest less periodic behavior. Some pitch detection methods: AMDF (Average Magnitude Difference Function); autocorrelation with center clipping.
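Both methods can be sketched in a few lines. The sketch below assumes a single voiced frame that is longer than the largest lag searched, a clipping threshold of 30% of the frame's peak magnitude (an illustrative choice, not a value from the slides), and hypothetical function names. Each lag is evaluated over the same number of samples so that the values are comparable across lags.

```python
import numpy as np


def center_clip(frame, fraction=0.3):
    """Zero out samples below the threshold and shift the rest toward zero."""
    frame = np.asarray(frame, dtype=float)
    threshold = fraction * np.max(np.abs(frame))
    clipped = np.zeros_like(frame)
    above = frame > threshold
    below = frame < -threshold
    clipped[above] = frame[above] - threshold
    clipped[below] = frame[below] + threshold
    return clipped


def amdf_pitch_lag(frame, min_lag, max_lag):
    """Pitch period (in samples) as the lag that minimizes the AMDF."""
    frame = np.asarray(frame, dtype=float)
    span = len(frame) - max_lag          # same number of terms for every lag
    lags = np.arange(min_lag, max_lag + 1)
    amdf = np.array([np.mean(np.abs(frame[:span] - frame[k:k + span])) for k in lags])
    return int(lags[np.argmin(amdf)])


def autocorr_pitch_lag(frame, min_lag, max_lag, clip_fraction=0.3):
    """Pitch period (in samples) as the autocorrelation peak of the center-clipped frame."""
    clipped = center_clip(frame, clip_fraction)
    span = len(clipped) - max_lag
    lags = np.arange(min_lag, max_lag + 1)
    acf = np.array([np.dot(clipped[:span], clipped[k:k + span]) for k in lags])
    return int(lags[np.argmax(acf)])
```

The pitch frequency is the sampling rate divided by the returned lag. Limiting the search to a plausible lag range helps avoid picking a multiple of the true period, which is a common failure of both methods.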