# Unit Review & Exam Preparation

M513 – Advanced DSP Techniques Unit Review & Exam Preparation

M513 – Main Topics Covered in the 2011/2012 academic year
1. Review of DSP Basics
2. Random Signal Processing
3. Optimal and Adaptive Filters
4. (PSD) Spectrum Estimation Techniques

(Exam questions will mainly come from parts 2 and 3, but good knowledge of part 1 is needed!)

Part 1 – Review of DSP Basics
DSP = Digital Signal Processing = Signal Analysis + Signal Processing, performed in the discrete-time domain.
- Fourier Transform family
- More general transform (z-transform)
- LTI systems and convolution
- Guide to LTI systems

Signal Analysis: to analyse the frequency content of a time-domain signal, we use the appropriate member of the Fourier transform family.

Fourier Transforms - Summary
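| Time domain | Periodic in time | Aperiodic in time |
|---|---|---|
| Continuous | Fourier Series (frequency: discrete, aperiodic) | Fourier Transform (frequency: continuous, aperiodic) |
| Discrete | Discrete Fourier Transform (frequency: discrete, periodic) | Discrete-Time Fourier Transform (frequency: continuous, periodic) |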

Fourier Transforms: the following analogies can be seen:
- Periodic in time ↔ Discrete in frequency
- Aperiodic in time ↔ Continuous in frequency
- Continuous in time ↔ Aperiodic in frequency
- Discrete in time ↔ Periodic in frequency

More general transforms
Two more transforms are introduced in order to generalise the Fourier transforms for both continuous and discrete-time signals. To understand their region of operation it is important to recognise that both the CTFT and the DTFT operate on only one limited part of the whole complex plane:
- The CTFT operates on the frequency axis, i.e. the line $\sigma = 0$ of the complex plane $s = \sigma + j\Omega$ (i.e. $s = j\Omega$).
- The DTFT operates on the frequency circle, i.e. the curve $r = 1$ of the complex plane $z = re^{j\omega}$ (i.e. $z = e^{j\omega}$).

From Laplace to Z-Transform
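Evaluate the Laplace transform of the sampled signal $x_s(t) = \sum_{n} x(nT)\,\delta(t - nT)$, where $T$ is the sampling period:

$$X_s(s) = \sum_{n=-\infty}^{\infty} x(nT)\, e^{-snT}$$

Substitute $z = e^{sT}$ to obtain the z-transform:

$$X(z) = \sum_{n=-\infty}^{\infty} x(n)\, z^{-n}$$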

From Laplace to Z-Transform
Consider again the substitution $z = e^{sT}$ with $s = \sigma + j\Omega$: $z = e^{\sigma T} e^{j\Omega T}$, so $|z| = e^{\sigma T}$, i.e. the left half of the s-plane ($\sigma < 0$) maps into the interior of the unit circle in the z-plane, $|z| < 1$.

j Axis in s to z Mapping S-plane Z-plane Im Im (jw) Re Re

Signal Processing
- Delay a signal
- Scale a signal
- Add two or more samples (from the same or different signals)

→ Signal Filtering → Convolution

Convolution gives the system input to system output relationship for LTI type systems (both DT and CT): $x(t) \rightarrow \text{System} \rightarrow y(t)$ and $x(n) \rightarrow \text{System} \rightarrow y(n)$.

Impulse Response of the System
Let $h(n)$ be the response of the system to a unit impulse input $\delta(n)$ (i.e. the impulse response of the system). We denote this as: $\delta(n) \rightarrow \text{LTI System} \rightarrow h(n)$

Time-invariance: for an LTI system, if $\delta(n) \rightarrow h(n)$ then $\delta(n-k) \rightarrow h(n-k)$
(this is the so-called time-invariance property of the system).

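Linearity implies the following system behaviour:
- $x(n) \rightarrow \text{LTI System} \rightarrow y(n)$
- $a\,x(n) \rightarrow \text{LTI System} \rightarrow a\,y(n)$
- $x_1(n) + x_2(n) \rightarrow \text{LTI System} \rightarrow y_1(n) + y_2(n)$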

Linearity and Time-Invariance
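We can now combine time-invariance and linearity:
- $\sum_k \delta(n-k) \rightarrow \text{LTI System} \rightarrow \sum_k h(n-k)$
- $\sum_k x(k)\,\delta(n-k) \rightarrow \text{LTI System} \rightarrow \sum_k x(k)\,h(n-k)$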

Convolution Sum: if $x(n) = \sum_k x(k)\,\delta(n-k)$ then $y(n) = \sum_k x(k)\,h(n-k)$,
i.e. the system output is the sum of many delayed impulse responses (the responses to the individual, scaled impulses that make up the whole DT input signal). This sum is called the CONVOLUTION SUM. Sometimes we use $*$ to denote the convolution operation, i.e. $y(n) = x(n) * h(n)$.
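A minimal sketch of the convolution sum as a superposition of scaled, delayed impulse responses. The sequences `x` and `h` are hypothetical values chosen for illustration, and both are assumed finite and causal.

```python
import numpy as np

def convolution_sum(x, h):
    """Evaluate y(n) = sum_k x(k) h(n-k) for finite causal sequences."""
    h = np.asarray(h, dtype=float)
    y = np.zeros(len(x) + len(h) - 1)
    for k, xk in enumerate(x):
        # each input sample x(k) launches a copy of h scaled by x(k), delayed by k
        y[k:k + len(h)] += xk * h
    return y

x = np.array([1.0, 2.0, 3.0])    # hypothetical input signal
h = np.array([0.5, 0.0, 0.75])   # hypothetical impulse response
y = convolution_sum(x, h)
```

The result can be compared against `np.convolve(x, h)`, which computes the same sum.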

Convolution Sum for CT: similarly, for continuous-time signals and systems (but a little more complicated):

$$x(t) = \int_{-\infty}^{\infty} x(\tau)\,\delta(t-\tau)\,d\tau \quad\Rightarrow\quad y(t) = \int_{-\infty}^{\infty} x(\tau)\,h(t-\tau)\,d\tau$$

The first expression describes the analogue (CT) input signal $x(t)$ as an integral (i.e. sum) of an infinite number of time-shifted and scaled impulse functions.

Convolution in t domain ↔ Multiplication in f domain but we also have Multiplication in t domain ↔ Convolution in f domain

Discrete LTI Systems in Two Domains
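Time domain: $x(n) \rightarrow h(n) \rightarrow y(n)$, with $y(n) = x(n) * h(n)$. z-domain: $X(z) \rightarrow H(z) \rightarrow Y(z)$, with $Y(z) = X(z)H(z)$. Here $h(n)$ is the impulse response and $H(z)$ the transfer function of the DT LTI system.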

Summary
- DT: $H(z)$ is the z-transform of the system impulse response, the system transfer function. $H(\omega)$ is the Discrete-Time Fourier Transform of the system impulse response, the system frequency response.
- CT: $H(s)$ is the Laplace transform of the system impulse response, the system transfer function. $H(\Omega)$ is the Fourier transform of the system impulse response, the system frequency response.

Guide to Discrete LTI Systems
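[Diagram: the four descriptions of a discrete LTI system and the operations that link them.]
- Impulse response $h(n)$ ↔ transfer function $H(z)$: ZT / IZT
- Impulse response $h(n)$ ↔ frequency response $H(\omega)$: DTFT / IDTFT
- Transfer function $H(z)$ → frequency response $H(\omega)$: substitute $z = e^{j\omega}$
- Transfer function $H(z)$ ↔ difference equation: IZT / ZT, including some mathematical manipulations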

Guide to Continuous LTI Systems
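[Diagram: the four descriptions of a continuous LTI system and the operations that link them.]
- Impulse response $h(t)$ ↔ transfer function $H(s)$: LT / ILT
- Impulse response $h(t)$ ↔ frequency response $H(\Omega)$: FT / IFT
- Transfer function $H(s)$ → frequency response $H(\Omega)$: substitute $s = j\Omega$
- Transfer function $H(s)$ ↔ differential equation: ILT / LT, including some mathematical manipulations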

Example(s) - Use guide to LTI systems to move between the various descriptions of the system.
E.g. Calculate transfer function and frequency response for the IIR filter given with the following difference equation.

Example(s) - Use guide to LTI systems to move between the various descriptions of the system.
Having obtained frequency response, system response to any frequency F1 can be easily calculated, e.g. for the ¼ of sampling frequency Fs we have: (note, in general this is a complex number so both phase and amplitude/gain can be calculated)

Example(s) - Use guide to LTI systems to move between the various descriptions of the system.
Opposite problem can also be easily solved, e.g. for the IIR filter with transfer function: find the corresponding difference equation to implement the system.
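The difference equations used in these examples are not reproduced in the transcript, so the following sketch uses a hypothetical first-order IIR filter, $y(n) = 0.5\,y(n-1) + x(n)$, purely to illustrate the route difference equation → $H(z)$ → $H(\omega)$ and the evaluation at $F_1 = F_s/4$ (i.e. $\omega = \pi/2$). The function name `freq_response` and the coefficient convention ($a_0 = 1$, feedback terms moved to the left-hand side) are assumptions.

```python
import numpy as np

def freq_response(b, a, omega):
    """H(omega) = B(e^{j omega}) / A(e^{j omega}) for
    sum_k a[k] y(n-k) = sum_k b[k] x(n-k), with a[0] = 1."""
    z_inv = np.exp(-1j * omega)
    num = sum(bk * z_inv**k for k, bk in enumerate(b))
    den = sum(ak * z_inv**k for k, ak in enumerate(a))
    return num / den

# Hypothetical filter: y(n) - 0.5 y(n-1) = x(n)  ->  H(z) = 1 / (1 - 0.5 z^-1)
b = [1.0]
a = [1.0, -0.5]

H_quarter = freq_response(b, a, np.pi / 2)   # F1 = Fs/4 corresponds to omega = pi/2
gain = np.abs(H_quarter)                     # amplitude response at Fs/4
phase = np.angle(H_quarter)                  # phase response at Fs/4 (radians)
```

For this assumed filter, $H(\pi/2) = 1/(1 + 0.5j) = 0.8 - 0.4j$, a complex number from which both gain and phase follow.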

Part 2 – Random Signals
- Random signals: unpredictable type of signals (… well, more or less)
- Moments of random signals: $m_x$, $r_{xx}(m)$
- Autocorrelation ↔ PSD
- Filtering random signals: spectral factorisation equations (in three domains)
- Deconvolution and inverse filtering
- Minimum and non-minimum phase systems/filters

Signal Classification
- Deterministic signals can be characterised by a mathematical equation.
- Random (nondeterministic, stochastic) signals cannot be characterised by a mathematical equation; they are usually characterised by their statistics.

Random signals can be further classified as:
- Stationary signals, if their statistics do not change with time
- Nonstationary signals, if their statistics change with time

Signal Classification
- Wide Sense Stationary (random) signals: random signals whose statistics are constant up to the 2nd order.
- Ergodic (random) signals: random signals whose statistics can be measured by time averaging rather than ensemble averaging (i.e. the expectation of an ergodic signal is its time average).

For simplicity, we study Wide-Sense Stationary (WSS) and ergodic signals.

1st Order Signal Statistics
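The mean value $m_x$ of the signal $x(n)$ is its 1st order statistic. The more general equation is $m_x(n) = E[x(n)]$, the expected value of $x(n)$ (E is the expectation operator). If $m_x$ is constant over time, we are talking about a stationary signal $x(n)$. If the signal is also ergodic, single-waveform (time) averaging can be used and there is no need for ensemble averages:

$$m_x = \frac{1}{N}\sum_{n=0}^{N-1} x(n)$$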

2nd Order Statistics: the autocovariance of the signal is a 2nd order signal statistic. It is calculated according to:

$$c_{xx}(k,l) = E\{[x(k) - m_x(k)]\,[x(l) - m_x(l)]^*\}$$

where $*$ denotes the complex conjugate in the case of complex signals. For equal lags, i.e. $k = l$, the autocovariance reduces to the variance.

2nd Order Statistics: the variance $\sigma_x^2$ of the signal is $\sigma_x^2 = c_{xx}(k,k) = E\{|x(k) - m_x|^2\}$.
Variance can be considered a measure of signal dispersion around its mean.

Analogies to Electrical signals
[Figure: two zero-mean signals with different variances.]
- $m_x$ – mean – DC component of the electrical signal
- $m_x^2$ – mean squared – DC power
- $E[x^2(n)]$ – mean square – total average power
- $\sigma_x^2$ – variance – AC power
- $\sigma_x$ – standard deviation – rms value

Autocorrelation: this is also a 2nd order statistic of the signal and is very similar (in some cases identical) to the autocovariance. The autocorrelation of the signal is the expected value of the product of the signal and its shifted version: $r_{xx}(k,l) = E[x(k)\,x^*(l)]$, which for stationary signals depends only on the lag $m = l - k$. Autocorrelation is a measure of signal predictability: a correlated signal is one that has redundancy (it is compressible, e.g. speech, audio or video signals).

Autocorrelation: for $m = l - k$ we sometimes use the notation $r_{xx}(m)$ or even $r_x(m)$ instead of $r_{xx}(k,l)$, i.e. $r_{xx}(k,l) = r_{xx}(m) = r_x(m)$.

Autocorrelation and Autocovariance
For zero-mean, stationary signals those two quantities are identical: $c_{xx}(m) = r_{xx}(m) - |m_x|^2 = r_{xx}(m)$, where $m_x = 0$. Also notice that the variance of a zero-mean signal then corresponds to the zero-lag autocorrelation: $\sigma_x^2 = r_{xx}(0)$.

Autocorrelation: two important autocorrelation properties:
- Conjugate symmetry: $r_{xx}(-m) = r_{xx}^*(m)$
- Maximum at zero lag: $r_{xx}(0) \ge |r_{xx}(m)|$

$r_{xx}(0)$ is essentially the signal power, so it must be at least as large as the magnitude of any other autocorrelation value of that signal (another way of looking at this property is to realise that a sample is best correlated with itself).

Example (Tutorial 1, Problems 2, 3)
Random phase family of sinusoids: $A$ and $\omega_k$ are fixed constants and $\theta$ is a uniformly distributed random variable (i.e. equally likely to have any value in the interval $-\pi$ to $\pi$). Prove the stationarity of this process, i.e.
a) Find the mean and variance values (should be constant).
b) Find the autocorrelation function (ACF) (should depend only on the time lag $m$, otherwise constant).

Example (Tutorial 1, Problems 2, 3)
Random phase family of sinusoids: $A$ and $\omega_k$ are fixed constants and $\theta$ is a uniformly distributed random variable (i.e. equally likely to have any value in the interval $-\pi$ to $\pi$). Discuss two approaches to calculate the ACF for this process. We can use: or we can go via:
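The defining equation of the process is not reproduced above, so the following numerical check assumes the common form $x(n) = A\cos(\omega_k n + \theta)$. Under that assumption the ensemble mean is zero and the ACF is $r_{xx}(m) = \frac{A^2}{2}\cos(\omega_k m)$, independent of $n$. The values of `A`, `wk`, the lag `m` and the ensemble size are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

A, wk = 2.0, 0.3 * np.pi          # hypothetical amplitude and frequency
n = np.arange(32)                 # time indices
theta = rng.uniform(-np.pi, np.pi, size=(20000, 1))  # one random phase per realisation

# Ensemble of realisations: rows are realisations, columns are time samples
X = A * np.cos(wk * n + theta)

# Ensemble mean at every time index (should be close to 0 for all n)
mean_est = X.mean(axis=0)

# Ensemble ACF estimate E[x(k) x(k+m)] at lag m, for every start index k
m = 3
acf_est = np.array([np.mean(X[:, k] * X[:, k + m]) for k in range(len(n) - m)])
acf_theory = 0.5 * A**2 * np.cos(wk * m)
```

If the process is stationary, `mean_est` is flat and `acf_est` takes (approximately) the same value for every start index `k`.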

Power Spectral Density (PSD)
The Power Spectral Density is the Discrete-Time Fourier Transform of the autocorrelation function:

$$P_{xx}(\omega) = \sum_{m=-\infty}^{\infty} r_{xx}(m)\,e^{-j\omega m}$$

The PSD contains power information (i.e. the distribution of signal power across the frequency spectrum) but has no phase information (the PSD is "phase blind"). The PSD is always real and non-negative.

White Noise: a white noise signal has a perfectly flat power spectrum (equal to the variance of the signal, $\sigma_w^2$). The autocorrelation of white noise is a unit impulse with amplitude $\sigma_w^2$, i.e. $r_{ww}(m) = \sigma_w^2\,\delta(m)$: white noise is a perfectly uncorrelated signal (not realisable in practice; we usually use pseudorandom noise with a PSD that is almost flat over a finite frequency range).
[Figure: $r_{ww}(m)$ versus lag $m$ and $R_{ww}(\omega)$ versus $\omega$, related by DFT/IDFT.]

Filtering Random Signals
The filter scales the mean value of the input signal ($m_x \rightarrow \text{digital filter} \rightarrow m_y$).
- Time domain: the scaling value is the sum of the impulse response, $m_y = m_x \sum_k h(k)$.
- Frequency domain: the scaling value is the frequency response of the filter at $\omega = 0$, $m_y = m_x H(0)$.

Filtering Random Signals
Cross-correlation between the filter input and output signals: $r_{yx}(k) = \sum_l h(l)\,r_x(k-l)$, or $r_{yx}(k) = r_x(k) * h(k)$.

Filtering Random Signals
Autocorrelation of the filter output: $r_y(n+k, n) = E[y(n+k)\,y^*(n)]$, using $y(n) = \sum_l h(l)\,x(n-l)$.

Filtering Random Signals
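The autocorrelation of the filter output therefore depends only on $k$, the difference between the indices $n+k$ and $n$, i.e. $r_y(k) = r_{yx}(k) * h^*(-k)$. Combining with $r_{yx}(k) = r_x(k) * h(k)$ we have: $r_y(k) = r_x(k) * h(k) * h^*(-k)$.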

Spectral Factorisation Equations
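[Diagram: $r_x(k) \rightarrow h(k) \rightarrow r_{yx}(k) \rightarrow h^*(-k) \rightarrow r_y(k)$], i.e. $r_y(k) = r_x(k) * h(k) * h^*(-k)$.
- Taking the DTFT of the above equation: $P_y(\omega) = P_x(\omega)\,|H(\omega)|^2$
- Taking the ZT of the above equation: $P_y(z) = P_x(z)\,H(z)\,H^*(1/z^*)$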

Filtering Random Signals
In terms of the z-transform: $P_y(z) = P_x(z)\,H(z)\,H^*(1/z^*)$. If $h(n)$ is real, $H(z) = H^*(z^*)$, so: $P_y(z) = P_x(z)\,H(z)\,H(1/z)$. This is a special case of spectral factorisation.

Example – Tutorial 2, Problem 2
A zero mean white noise signal x(n) is applied to an FIR filter with impulse response sequence {0.5, 0, 0.75}. Derive an expression for the PSD of the signal at the output of the filter.
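A sketch of how this problem can be checked numerically. With white-noise input of variance $\sigma_x^2$, the spectral factorisation equation gives $P_y(\omega) = \sigma_x^2\,|H(\omega)|^2$, and for $h = \{0.5, 0, 0.75\}$ expanding $|0.5 + 0.75e^{-j2\omega}|^2$ yields $\sigma_x^2\,(0.8125 + 0.75\cos 2\omega)$. The unit variance used below is an assumption; the problem statement does not give a value.

```python
import numpy as np

h = np.array([0.5, 0.0, 0.75])   # FIR impulse response from the problem statement
sigma2 = 1.0                     # assumed input noise variance (not given in the problem)

omega = np.linspace(0.0, np.pi, 256)

# Frequency response H(omega) = sum_k h(k) e^{-j omega k}
H = sum(hk * np.exp(-1j * omega * k) for k, hk in enumerate(h))

# Output PSD from the spectral factorisation equation
psd_numeric = sigma2 * np.abs(H) ** 2

# Closed form obtained by expanding |0.5 + 0.75 e^{-j 2 omega}|^2
psd_closed = sigma2 * (0.8125 + 0.75 * np.cos(2 * omega))
```

The two arrays agree to numerical precision, which confirms the closed-form expansion.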

Example - What about an IIR type filter and a coloured noise signal?
(not in Tutorial handouts) An IIR filter described by the following difference equation: is used to process a WSS signal with PSD: Find the PSD of the filter output. Solution: the PSD of the output is required, so the spectral factorisation equation in the frequency ($\omega$) domain can be used.

Example - What about an IIR type filter and a coloured noise signal?
(not in Tutorial handouts) First calculate the transfer function of the filter, then find the frequency response: … then apply spectral factorisation:

Inverse Filters: consider the deconvolution problem represented by the chain $x(n) \rightarrow H(z) \rightarrow y(n) \rightarrow H^{-1}(z) \rightarrow A\,x(n-d)$. Our task is to design a filter which will reconstruct the original input $x(n)$ from the observed output $y(n)$. The reconstruction may have an arbitrary gain $A$ and a delay of $d$, hence the definition of the required inverse filter is: $H(z)\,H^{-1}(z) = A\,z^{-d}$.

Inverse Filters: the inverse system is said to "equalise" the amplitude and phase response of $H(z)$, or to "deconvolve" the output $y(n)$ in order to reconstruct the input $x(n)$: $x(n) \rightarrow H(z) \rightarrow y(n) \rightarrow H^{-1}(z) \rightarrow A\,x(n-d)$.

Inverse Filters - Problem
If $H(z)$ is a non-minimum phase system, its zeros outside the unit circle become poles of $H^{-1}(z)$ outside the unit circle, and the (causal) inverse filter is unstable! Chain: $x(n) \rightarrow H(z) \rightarrow y(n) \rightarrow H^{-1}(z) \rightarrow A\,x(n-d)$.

Noise Whitening: with inverse filtering, $x(n)$ does not have to be a white random sequence, but the inverse filter $H_0^{-1}(z)$ has to reproduce the same sequence $x(n)$. For noise whitening, the input $x(n)$ has to be a white random sequence, and so does the output $u(n)$ of $H_0^{-1}(z)$, but the sequences $x(n)$ and $u(n)$ are not the same.
- Inverse filtering: $x(n) \rightarrow H_0(z) \rightarrow y(n) \rightarrow H_0^{-1}(z) \rightarrow x(n)$
- Noise whitening: $x(n) \rightarrow H_1(z) \rightarrow y(n) \rightarrow H_0^{-1}(z) \rightarrow u(n)$

Deconvolution using autocorrelation
Consider the filtering process again: The matrix equation to be solved in order to estimate the impulse response h(k) before attempting the deconvolution is given on the next slide: x(n) h(k) y(n)

Deconvolution using autocorrelation
using matrix notation: The aim is to obtain the coefficients $b(0), b(1), \ldots, b(L)$, which can be done by inverting the matrix $\mathbf{R}_{xx}$. This matrix is known as the autocorrelation matrix and is an important 2nd order characterisation of a random signal.

Deconvolution using autocorrelation
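The solution of the equation from the previous slide, $\mathbf{R}_{xx}\,\mathbf{b} = \mathbf{r}_{yx}$, is given by $\mathbf{b} = \mathbf{R}_{xx}^{-1}\,\mathbf{r}_{yx}$. It is important to note the structure of the autocorrelation matrix $\mathbf{R}_{xx}$.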

Toeplitz matrices: if $A_{i,j}$ is the element of the matrix in the i-th row and j-th column, then the matrix is Toeplitz if $A_{i,j} = a_{i-j}$, i.e. all elements along each diagonal are equal. Another important DSP operation, convolution, also has a strong relation to a Toeplitz-type matrix, called the convolution matrix.

Convolution matrix To form the convolution matrix we need to represent the convolution operation in vector form. For example, the output of the FIR filter of length N can be written as: where x is a vector of input samples to a filter and h is a vector of filter coefficients (impulse response in case of FIR filter) The above equation represents the case where x(n)=0 for n<0.
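A minimal sketch of one way to build a convolution matrix, assuming the convention $\mathbf{y} = \mathbf{X}\mathbf{h}$ where column $j$ of $\mathbf{X}$ holds the input delayed by $j$ samples and $x(n) = 0$ for $n < 0$. The helper name `convolution_matrix` and the example values are hypothetical.

```python
import numpy as np

def convolution_matrix(x, n_taps):
    """Toeplitz matrix X such that X @ h equals the full convolution of x and h,
    assuming x(n) = 0 for n < 0 (and beyond the end of the record)."""
    x = np.asarray(x, dtype=float)
    X = np.zeros((len(x) + n_taps - 1, n_taps))
    for j in range(n_taps):
        X[j:j + len(x), j] = x   # column j is x delayed by j samples
    return X

x = np.array([1.0, 2.0, 3.0, 4.0])   # hypothetical input samples
h = np.array([0.5, 0.0, 0.75])       # hypothetical FIR coefficients
X = convolution_matrix(x, len(h))
y = X @ h                            # filter output as a matrix-vector product
```

Every diagonal of `X` is constant, which is the Toeplitz property $A_{i,j} = a_{i-j}$.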

Decomposition of Autocorrelation Matrix
The Toeplitz structure of the autocorrelation matrix is used in its diagonalisation. This is the important process of decomposing the autocorrelation matrix in the form $\mathbf{R} = \mathbf{Q}\,\boldsymbol{\Lambda}\,\mathbf{Q}^H$, where:
- $\boldsymbol{\Lambda}$ – diagonal matrix containing the eigenvalues of $\mathbf{R}$
- $\mathbf{Q}$ – modal matrix containing the eigenvectors of $\mathbf{R}$ associated with the eigenvalues in $\boldsymbol{\Lambda}$

White noise: what would be the form of the autocorrelation matrix for a white noise signal? Assuming an ideal white noise sequence, i.e. a perfectly uncorrelated signal, its autocorrelation is a unit impulse with amplitude $\sigma_w^2$, $r_{ww}(m) = \sigma_w^2\,\delta(m)$. The autocorrelation matrix is in this case diagonal (all off-diagonal elements are zero): $\mathbf{R}_{ww} = \sigma_w^2\,\mathbf{I}$.

More on minimum and non-minimum phase systems
Non-minimum phase systems (sometimes also called mixed-phase systems) are systems with some of their zeros inside the unit circle and the remaining zeros outside the unit circle. If all of its zeros are outside the unit circle, a non-minimum phase system is called a maximum phase system. Minimum, non-minimum and maximum phase systems can also be recognised by their phase characteristics. The phase characteristic of a minimum phase system has zero net phase change between the frequencies $\omega = 0$ and $\omega = \pi$, while a non-minimum phase system has a non-zero net phase change between those frequencies.

More on minimum and non-minimum phase systems
A maximum phase system has the maximum phase change between the frequencies $\omega = 0$ and $\omega = \pi$ amongst all possible systems with the same amplitude response.

Example (Tutorial 2, Problem 4)
A zero-mean stationary white noise x(n) is applied to a filter with a transfer function: Find all filters that can produce the same PSD as the above filter. Are those filters minimum or maximum phase filters? Using spectral factorisation equation

Example (Tutorial 2, Problem 4)
The same PSD would be obtained using the filter $H_0(z)$:
- $H_0(z)$ has two zeros, at $z = 0.5$ and $z = 1/3$, both inside the unit circle, i.e. this is a minimum phase filter.
- The filter with the reciprocal zeros, at $z = 2$ and $z = 3$, both outside the unit circle, is a maximum phase filter.
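A numerical illustration of why filters with reciprocal zeros give the same output PSD. The transfer function from the problem is not reproduced in the transcript, so the sketch assumes $H_0(z) = (1 - 0.5z^{-1})(1 - \tfrac{1}{3}z^{-1})$, which matches the zeros quoted above; the gain is an assumption. Time-reversing the real coefficients reflects every zero to its reciprocal while leaving $|H(\omega)|$, and hence the PSD for white-noise input, unchanged.

```python
import numpy as np

# Assumed minimum phase filter with zeros at 0.5 and 1/3 (gain is an assumption)
h_min = np.convolve([1.0, -0.5], [1.0, -1.0 / 3.0])

# Reversing real FIR coefficients reflects the zeros to 1/0.5 = 2 and 1/(1/3) = 3
h_max = h_min[::-1]

omega = np.linspace(0.0, np.pi, 128)

def magnitude(h):
    """|H(omega)| on the grid `omega` for FIR coefficients h."""
    k = np.arange(len(h))
    return np.abs(np.exp(-1j * np.outer(omega, k)) @ h)

zeros_min = np.sort(np.roots(h_min).real)   # zeros of the minimum phase filter
zeros_max = np.sort(np.roots(h_max).real)   # zeros of the maximum phase filter
same_magnitude = np.allclose(magnitude(h_min), magnitude(h_max))
```

Reflecting only one of the two zeros would give the mixed-phase filters that share the same magnitude response.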

Part 3 – Optimal and Adaptive Digital Filters
Best filters for the task in hand:
- Wiener filter and equation
- Finding the minimum of the cost function (MSE) = MMSE
- Steepest Descent algorithm
- LMS and RLS algorithms
- Optimal and adaptive filter configurations (i.e. applications)

Optimal filters are the “best” filters for the particular task. We use knowledge about the signals to design those filters and apply them to the task. Adaptive Filters change their coefficients to improve their performance for the given task. They are not fixed and can therefore change their (statistical) properties over time. Adaptive Filters may not be optimal but are constantly striving to become optimal

Optimal (Wiener) Filter Design
System Identification problem: we want to estimate the impulse response $h(n)$ of the "unknown" discrete-time system, $x(n) \rightarrow [\,h(n) = ?\,] \rightarrow y(n)$. We can use the equation for the cross-correlation between the filter input and output to obtain the estimate of $h(n)$.

Optimal (Wiener) Filter Design
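- convolution form: $r_{yx}(k) = \sum_{l=0}^{L-1} h(l)\,r_{xx}(k-l)$
- matrix form: the same equations written out for $k = 0, 1, \ldots, L-1$, with $r_{xx}(k-l)$ as the elements of a Toeplitz matrix
- matrix form in short notation: $\mathbf{r}_{yx} = \mathbf{R}_{xx}\,\mathbf{h}$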

Optimal (Wiener) Filter Design
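matrix form in short notation: $\mathbf{r}_{yx} = \mathbf{R}_{xx}\,\mathbf{h}$, where
- $\mathbf{r}_{yx}$ – cross-correlation vector (between input and output signals)
- $\mathbf{R}_{xx}$ – autocorrelation matrix (of the input signal)
- $\mathbf{h}$ – estimated impulse response vector

From the above equation we can easily obtain the vector $\mathbf{h} = \mathbf{R}_{xx}^{-1}\,\mathbf{r}_{yx}$.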

Optimal (Wiener) Filter Design
The equation $\mathbf{h} = \mathbf{R}_{xx}^{-1}\,\mathbf{r}_{yx}$ is also known as the Wiener-Hopf equation. Using this equation we have estimated (or designed) a filter with an impulse response close (or equal) to the impulse response of the unknown system. This type of optimal filter is also known as the Wiener filter.
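A minimal sketch of system identification with the Wiener-Hopf equation, assuming ergodic signals so that correlations can be estimated by time averaging. The "unknown" system `h_true`, the record length and the helper `xcorr` are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

h_true = np.array([1.0, -0.5, 0.25])   # hypothetical "unknown" system
N, L = 50000, len(h_true)

x = rng.standard_normal(N)             # white noise input
y = np.convolve(x, h_true)[:N]         # observed output of the unknown system

def xcorr(a, b, lag):
    """Time-average estimate of E[a(n) b(n - lag)] for lag >= 0 (ergodicity assumed)."""
    return np.dot(a[lag:], b[:len(b) - lag]) / len(a)

r_xx = np.array([xcorr(x, x, k) for k in range(L)])   # input autocorrelation
r_yx = np.array([xcorr(y, x, k) for k in range(L)])   # output/input cross-correlation

# Toeplitz autocorrelation matrix with elements r_xx(|i - j|)
R_xx = np.array([[r_xx[abs(i - j)] for j in range(L)] for i in range(L)])

# Wiener-Hopf solution: R_xx h = r_yx
h_est = np.linalg.solve(R_xx, r_yx)
```

`np.linalg.solve` is used instead of forming the inverse explicitly, which is the usual numerical practice.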

Optimal (Wiener) Filter Design
We can approach the problem of designing the Wiener filter estimate of the unknown system in a slightly different way. A good estimate of the unknown filter impulse response $h(n)$ can be obtained if the difference (error) signal between the two outputs (real and estimated system) is minimal (ideally zero).
[Block diagram: $x(n)$ drives the filter producing $y(n)$; the summing junction forms $e(n) = d(n) - y(n)$.]

Optimal (Wiener) Filter Design
We use the following notation:
- $d(n)$ – output of the unknown system (desired signal)
- $y(n)$ – output of the system estimate
- $x(n)$ – input signal (same for both systems)
- $e(n)$ – error signal, $e(n) = d(n) - y(n)$

For $e(n) \rightarrow 0$, we expect to achieve a good estimate of the unknown system.
[Block diagram: $x(n)$ drives the filter producing $y(n)$; the summing junction forms $e(n) = d(n) - y(n)$.]

Optimal (Wiener) Filter Design
Wiener filter design is actually a much more general problem: the desired signal $d(n)$ does not have to be the output of the unknown system.
[Block diagram: $x(n)$ drives the filter producing $y(n)$; the summing junction forms $e(n) = d(n) - y(n)$.]

Optimal (Wiener) Filter Design
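Another Wiener filter estimation example:
[Block diagram: $d(n)$ passes through the signal distorting system and the noise $w(n)$ is added to give $x(n)$; the optimal filter $h(n)$ produces $y(n)$; the summing junction forms $e(n) = d(n) - y(n)$.]
$w(n)$ – noise signal
Task: design (determine) $h(n)$ in order to minimise the error $e(n)$.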

Optimal (Wiener) Filter Design
[Block diagram: $x(n)$ drives the filter producing $y(n)$; the summing junction forms $e(n) = d(n) - y(n)$.]
Rather than minimising the current value of the error signal $e(n)$, we can choose a more effective approach: minimise the expected value of the squared error, the mean square error (MSE) function. The function to be minimised (cost function) is therefore the MSE function, defined as $J = E[e^2(n)]$.

Mathematical Analysis
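[Block diagram: $x(n)$ drives the filter producing $y(n)$; the summing junction forms $e(n) = d(n) - y(n)$.]
- Filter output: $y(n) = \sum_{k=0}^{N-1} h(k)\,x(n-k)$
- Error signal: $e(n) = d(n) - y(n)$
- MSE (cost) function: $J = E[e^2(n)] = E\left[\left(d(n) - \sum_{k=0}^{N-1} h(k)\,x(n-k)\right)^2\right]$

We can try to minimise this expression, or switch to matrix/vector notation.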

Mathematical Analysis
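Using vector notation, with $\mathbf{h} = [h(0), \ldots, h(N-1)]^T$ and $\mathbf{x}(n) = [x(n), \ldots, x(n-N+1)]^T$: $y(n) = \mathbf{h}^T\mathbf{x}(n)$, $e(n) = d(n) - \mathbf{h}^T\mathbf{x}(n)$ and $J = E\left[(d(n) - \mathbf{h}^T\mathbf{x}(n))^2\right]$.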

Mathematical Analysis
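$$J = E[d^2(n)] - 2\,\mathbf{h}^T\mathbf{r}_{dx} + \mathbf{h}^T\mathbf{R}_{xx}\,\mathbf{h}$$

where $E[d^2(n)]$ is a scalar, $\mathbf{r}_{dx} = E[d(n)\,\mathbf{x}(n)]$ is the cross-correlation vector and $\mathbf{R}_{xx} = E[\mathbf{x}(n)\,\mathbf{x}^T(n)]$ is the autocorrelation matrix.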

Mathematical Analysis
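To find the minimum error, take the derivative with respect to the coefficients $h(k)$ and set it equal to zero: $\partial J / \partial \mathbf{h} = -2\,\mathbf{r}_{dx} + 2\,\mathbf{R}_{xx}\,\mathbf{h} = 0$. Solving for $\mathbf{h}$: $\mathbf{h}_{opt} = \mathbf{R}_{xx}^{-1}\,\mathbf{r}_{dx}$, the Wiener-Hopf equation again. The Wiener-Hopf equation therefore determines the set of optimal filter coefficients in the mean-square sense.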

Example (Tutorial 3, Problems 1 and 2)
Derive the Wiener-Hopf equation for the Wiener FIR filter working as a noise canceller. Detailed derivation of Wiener-Hopf equation is shown in Tutorial (3); for ANC application, we can start from

Example – Tutorial 3, Problems 1 and 2
Derivation of the Wiener-Hopf equation for the FIR noise canceller. or in matrix/vector form:

MSE Surface (Mean Square Error Surface)
[Figure: example MSE surface for a 2-weight Wiener filter, with the MMSE (Wiener optimum) marked at the bottom.]
$E[e^2(n)]$ represents the expected value of the squared filter error $e(n)$, i.e. the mean-square error (MSE). For a filter with N coefficients this is an N-dimensional surface with the Wiener-Hopf solution positioned at the bottom of the surface (i.e. this is the minimum error point). We can plot it for the case of a 2-coefficient filter (more than that is impossible to draw).

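MMSE: once the coefficients of the Wiener filter (i.e. the coordinates of the MMSE point) are known, the actual MMSE value is easy to calculate. We need to evaluate the cost function $J$ for $\mathbf{h} = \mathbf{h}_{opt}$, which gives $J_{min} = E[d^2(n)] - \mathbf{r}_{dx}^T\,\mathbf{h}_{opt}$.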
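A small numerical sketch of evaluating the MSE surface at the Wiener optimum, using the quadratic form $J(\mathbf{h}) = E[d^2(n)] - 2\,\mathbf{h}^T\mathbf{r}_{dx} + \mathbf{h}^T\mathbf{R}_{xx}\mathbf{h}$. The values of `R_xx`, `r_dx` and `sigma_d2` are hypothetical and are not taken from the tutorial problems.

```python
import numpy as np

# Hypothetical second-order statistics for a 2-coefficient filter
R_xx = np.array([[2.0, 1.0],
                 [1.0, 2.0]])     # autocorrelation matrix of the input
r_dx = np.array([1.0, 0.5])       # cross-correlation vector between d(n) and x(n)
sigma_d2 = 1.0                    # E[d^2(n)], power of the desired signal

def mse(h):
    """MSE cost J(h) = E[d^2] - 2 h^T r_dx + h^T R_xx h."""
    return sigma_d2 - 2.0 * h @ r_dx + h @ R_xx @ h

h_opt = np.linalg.solve(R_xx, r_dx)   # Wiener-Hopf solution
J_min = mse(h_opt)                    # MMSE: the cost evaluated at h_opt
J_min_short = sigma_d2 - r_dx @ h_opt # equivalent closed form for the MMSE
```

For these assumed numbers, `h_opt` is `[0.5, 0.0]` and the MMSE is `0.5`; any other coefficient vector gives a larger cost.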

Example (Tutorial 3, Problems 3 and 4)
Alternative derivation of the MMSE equation is shown in Tutorial 3, Problem 3 Use of both Wiener-Hopf and MMSE equations is demonstrated in Tutorial 3, Problem 4 Two-coefficient Wiener filter is used to filter zero-mean, unit variance noisy signal, v(n) uncorrelated with the desired signal d(n). Find: rdx, optimal solution (Wiener-Hopf) wopt and MMSE Jmin assuming:

Example (Tutorial 3, Problems 3 and 4)
MMSE

MMSE Another very important observation can be made after rearranging the basic equation for the error signal:

The Steepest Descent Algorithm
The Steepest Descent method iteratively estimates the solution to the Wiener-Hopf equation using gradient descent. This minimisation method finds a minimum by estimating the gradient of the MSE surface and taking a step in the opposite direction to the gradient. The basic equation in gradient descent is:

$$h_{n+1}(k) = h_n(k) - \mu\,\nabla_n(k)$$

where $\mu$ is the step size parameter and $\nabla_n$ is the gradient vector, which makes $h_{n+1}(k)$ approach $h_{opt}$.

The Steepest Descent Algorithm
Notice that the expression for the gradient has already been obtained in the process of calculating the Wiener filter coefficients, i.e. $\nabla_n = -2\,\mathbf{r}_{dx} + 2\,\mathbf{R}_{xx}\,\mathbf{h}_n$. This is a significant improvement in our search for more efficient solutions: the coefficients are now determined iteratively and no inverse of the autocorrelation matrix is needed.

The Steepest Descent Algorithm
We still need to estimate the autocorrelation matrix $\mathbf{R}_{xx}$ and the cross-correlation vector $\mathbf{r}_{dx}$ (for every iteration step!). Further simplification of the algorithm can be achieved by using the instantaneous estimates of $\mathbf{R}_{xx}$ and $\mathbf{r}_{dx}$.
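A minimal sketch of the steepest descent iteration, assuming the gradient $\nabla_n = -2\,\mathbf{r}_{dx} + 2\,\mathbf{R}_{xx}\mathbf{h}_n$ so that the update becomes $\mathbf{h}_{n+1} = \mathbf{h}_n + 2\mu\,(\mathbf{r}_{dx} - \mathbf{R}_{xx}\mathbf{h}_n)$. The statistics, the step size and the iteration count are hypothetical; the statistics are treated as known here, whereas in practice they would have to be estimated.

```python
import numpy as np

# Hypothetical (known) second-order statistics
R_xx = np.array([[2.0, 1.0],
                 [1.0, 2.0]])
r_dx = np.array([1.0, 0.5])

mu = 0.1                      # assumed step size; must satisfy 0 < mu < 1/lambda_max
h = np.zeros(2)               # start from zero coefficients

for _ in range(200):
    gradient = -2.0 * r_dx + 2.0 * R_xx @ h   # gradient of the MSE surface at h
    h = h - mu * gradient                     # step against the gradient

h_wiener = np.linalg.solve(R_xx, r_dx)        # direct Wiener-Hopf solution for comparison
```

The largest eigenvalue of this `R_xx` is 3, so `mu = 0.1` lies inside the stable range and the iteration converges to the Wiener solution without any matrix inversion.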

The LMS (Least Mean Squares) Algorithm for Adaptive Filtering

Example – Tutorial 4, Problem 3
4 coefficients LMS based FIR adaptive filter working in the system identification application trying to identify system with transfer function: Write the equations for the signals d(n) and e(n) and the update equation for each adaptive filter coefficient, i.e. w1(n)… w4(n).

Example – Tutorial 4, Problem 3
weights update equations: $w_i(n+1) = w_i(n) + 2\mu\,e(n)\,x(n-i)$, for $i = 0, 1, 2, 3$

Applications Before looking into details of Matlab implementation of LMS update algorithm, some practical applications for adaptive filters are considered first. Those are: System Identification Inverse System Estimation Adaptive Noise Cancellation Linear Prediction

Applications: System Identification
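[Block diagram: $x(n)$ feeds both the unknown system, giving $d(n)$, and the digital filter $h(n)$, giving $y(n)$; the summing junction forms $e(n) = d(n) - y(n)$, which drives the adaptive algorithm.]
Definitions of signals:
- $x(n)$ – input applied to the unknown system and the adaptive filter
- $y(n)$ – filter output
- $d(n)$ – system (desired) output
- $e(n)$ – estimation error

Purpose: identifying the response of the unknown system.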

Applications: System Identification
The unknown system is placed in parallel with the adaptive filter. Adaptive algorithm drives e(n) towards zero. When e(n) is very small, the adaptive filter response is close to the response of the unknown system. In this case the same input feeds both the adaptive filter and the unknown system. Using this configuration, adaptive filters can be used to identify an unknown system, such as the response of an unknown communications channel or the frequency response of an auditorium.

Applications: Inverse Estimation
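[Block diagram: the system is placed in series with the digital filter $h(n)$, whose output is $y(n)$; a delay block in the parallel path produces $d(n)$; the summing junction forms $e(n) = d(n) - y(n)$, which drives the adaptive algorithm.]
Definitions of signals:
- $x(n)$ – input applied to the system
- $y(n)$ – filter output
- $d(n)$ – desired output
- $e(n)$ – estimation error

Purpose: estimating the inverse of the system. The delay block ensures the causality of the estimated inverse.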

Applications: Inverse Estimation
By placing the unknown system in series with the adaptive filter, the filter adapts to become the inverse of the unknown system as $e(n)$ becomes very small. As shown in the figure, the process requires a delay inserted in the desired signal $d(n)$ path to keep the data at the summation point synchronised. Adding the delay keeps the system causal.

Applications: Noise Cancellation
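[Block diagram: the signal source plus noise gives $d(n) = s(n) + n(n)$; the noise source provides the reference $x(n)$ to the digital filter $h(n)$, whose output is $y(n)$; the summing junction forms $e(n) = d(n) - y(n)$, which drives the adaptive algorithm.]
Definitions of signals:
- $x(n)$ – noise (the so-called reference signal)
- $y(n)$ – noise estimate
- $d(n)$ – signal + noise
- $e(n)$ – signal estimate

Purpose: removing background noise from useful signals.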

Applications: Noise Cancellation
In noise cancellation configuration, adaptive filter removes noise from a signal. Desired signal combines noise and desired information. To remove the noise, a signal x(n) which represents noise that is correlated to the noise to remove from the desired signal is filtered through the adaptive filter. So long as the input noise to the filter remains correlated to the unwanted noise accompanying the desired signal, the adaptive filter adjusts its coefficients to reduce the value of the difference between y(n) and d(n), removing the noise and resulting in a clean signal in e(n). Output of the adaptive filter is the estimate of the noise signal contained in the desired signal d(n). Notice that in this application, the error signal actually converges to the input data signal s(n), rather than to zero.

Applications: Linear Predictor
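[Block diagram: the AR process output forms $d(n)$; a delayed copy $x(n)$ feeds the digital filter $h(n)$, whose output is $y(n)$; the summing junction forms $e(n) = d(n) - y(n)$, which drives the adaptive algorithm.]
Definitions of signals:
- $x(n)$ – signal to be predicted
- $y(n)$ – filter output (signal prediction)
- $d(n)$ – desired output
- $e(n)$ – estimation error

Purpose: estimating the future samples of the signal.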

Applications: Linear Predictor
Assuming that the signal $x(n)$:
- is periodic
- is steady or varies slowly over time

the adaptive filter can be used to predict the future values of the desired signal based on past values. When $x(n)$ is periodic and the filter is long enough to remember previous values, this structure, with the delay in the input signal, can perform the prediction. This configuration can also be used to remove a periodic signal from stochastic noise signals.

Example Have a look into Tutorials 3 and 4 for examples of each discussed configuration.

The LMS algorithm can easily be implemented in software. The main steps of this algorithm are:
1. Read in the next sample, $x(n)$, and perform the filtering operation with the current version of the coefficients.
2. Take the computed output and compare it with the expected output, i.e. calculate the error.
3. Update the coefficients (obtain the next set of coefficients) using the following computation: $h_{n+1}(k) = h_n(k) + 2\mu\,e(n)\,x(n-k)$.

This algorithm is performed in a loop so that with each new sample a new coefficient vector $h_{n+1}(k)$ is created. In this way, the filter coefficients change and adapt.

Before the LMS algorithm “kicks in” we also need to initialise filter coefficients; the safest option is to initialise them all to zero.
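The steps above can be sketched as follows for the system identification configuration. This is a minimal illustration rather than the module's Matlab implementation: the update is assumed to be $h_{n+1}(k) = h_n(k) + 2\mu\,e(n)\,x(n-k)$ (the factor of 2 is a convention), and the function name `lms_identify`, the "unknown" system `h_true`, the step size and the signal length are all hypothetical.

```python
import numpy as np

def lms_identify(x, d, n_taps, mu):
    """LMS adaptive FIR filter: returns the final coefficients and the error signal."""
    h = np.zeros(n_taps)                  # initialise all coefficients to zero
    e = np.zeros(len(x))
    for n in range(n_taps - 1, len(x)):
        x_vec = x[n - n_taps + 1:n + 1][::-1]   # [x(n), x(n-1), ..., x(n-N+1)]
        y = h @ x_vec                     # 1. filter with the current coefficients
        e[n] = d[n] - y                   # 2. compare with the desired output
        h = h + 2.0 * mu * e[n] * x_vec   # 3. update the coefficients
    return h, e

rng = np.random.default_rng(2)
h_true = np.array([0.8, -0.4, 0.2, 0.1])  # hypothetical unknown system
x = rng.standard_normal(5000)             # white noise excitation
d = np.convolve(x, h_true)[:len(x)]       # desired signal: unknown system output

h_lms, e = lms_identify(x, d, n_taps=4, mu=0.01)
```

With a noise-free desired signal the error decays towards zero and `h_lms` approaches `h_true`; a larger `mu` converges faster but risks instability.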

- PSD estimation of the observed signal
- Foetal ECG monitoring: cancelling of the maternal ECG
- Removal of mains interference in medical signals
- Radar signal processing
- Background noise removal
- RX-TX crosstalk reduction
- Adaptive jammer suppression
- Separation of speech from background noise
- Echo cancellation for speaker phones
- Beamforming

Part 4 PSD Estimation and Signal Modelling Techniques

Part 4 – PSD Estimation There are more ways to find PSD of the signal
- Nonparametric techniques (periodogram and correlogram)
- Parametric techniques (AR, MA and ARMA models)
- Yule-Walker equations and signal predictors

Approaches to PSD estimation
- Classical, non-parametric techniques, based on the Fourier transform:
  - robust, require no previous knowledge about the data
  - assume zero data values outside the data window, so results are distorted and resolution can be low
  - not suitable for short data records
- Modern, parametric techniques: include a priori model information concerning the spectrum to be estimated.
- Modern, non-parametric methods: use singular value decomposition (SVD) of the signal to separate correlated and uncorrelated signal components for an easier analysis.

Non-parametric PSD estimation techniques (DFT/FFT based)
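The PSD is estimated directly from the signal itself, with no previous knowledge about the signal.
- Periodogram, based on the following formula: $\hat{P}_{per}(\omega) = \frac{1}{N}\left|\sum_{n=0}^{N-1} x(n)\,e^{-j\omega n}\right|^2$
- Correlogram: $\hat{P}_{cor}(\omega) = \sum_{m=-(N-1)}^{N-1} \hat{r}_{xx}(m)\,e^{-j\omega m}$, where $\hat{r}_{xx}(m)$ is the estimate of the autocorrelation of the signal $x(n)$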

Periodogram and Correlogram
Note that since the periodogram is the DTFT of the (biased) autocorrelation estimate, the results obtained with those two estimators should coincide (note: not a strict mathematical derivation). Variations on the basic periodogram approach are usually used in practice.
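A numerical check of the equivalence, assuming the periodogram $\frac{1}{N}|X(\omega)|^2$ and a correlogram built from the biased autocorrelation estimate $\hat{r}_{xx}(m) = \frac{1}{N}\sum_n x(n)\,x(n+m)$ over all lags $|m| \le N-1$. The function names, the test signal and the frequency grid size are hypothetical.

```python
import numpy as np

def periodogram(x, nfft):
    """Periodogram (1/N) |X(omega)|^2 sampled at omega_k = 2 pi k / nfft."""
    return np.abs(np.fft.fft(x, nfft)) ** 2 / len(x)

def correlogram(x, nfft):
    """DTFT of the biased autocorrelation estimate, on the same frequency grid."""
    N = len(x)
    r = np.correlate(x, x, mode="full") / N        # lags -(N-1) ... (N-1)
    lags = np.arange(-(N - 1), N)
    omega = 2.0 * np.pi * np.arange(nfft) / nfft
    return (np.exp(-1j * np.outer(omega, lags)) @ r).real

rng = np.random.default_rng(4)
x = rng.standard_normal(64)        # hypothetical real-valued data record

P_per = periodogram(x, 256)
P_cor = correlogram(x, 256)
```

The two estimates differ once the autocorrelation is windowed or truncated, which is the starting point of the Blackman-Tukey method.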

Blackman-Tukey method
Since the correlation function at its extreme lag values is not reliable (fewer data points enter the computation), it is recommended to use lag values of about 30%-40% of the total length of the data. Blackman-Tukey is a windowed correlogram given by:

$$\hat{P}_{BT}(\omega) = \sum_{m=-(L-1)}^{L-1} w(m)\,\hat{r}_{xx}(m)\,e^{-j\omega m}$$

where $w(m)$ is a window with zero values for $|m| > L-1$; also, $L \ll N$.

Bartlett Method This is an improved periodogram method (note that previously discussed, Blackman-Tukey is a correlogram method) Bartlett’s method reduces the fluctuation of the periodogram by splitting up the available data of N observations into K=N/L subsections of L observations each. Spectral densities of produced K periodograms are then averaged.

Bartlett Method
[Figure: the data record is split into K segments of L samples each; periodograms 1 to K are computed, summed and divided by K to give the PSD estimate.]

Welch Method: Welch proposed a further modification to the Bartlett method and introduced overlapped and windowed data segments, defined as $x_i(n) = w(n)\,x(n + iD)$ for $n = 0, \ldots, M-1$ and $i = 0, \ldots, K-1$, where:
- $w(n)$ – window of length M
- $D$ – offset distance
- $K$ – number of sections that the sequence $x(n)$ is divided into

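Welch Method: the i-th periodogram is $\hat{P}_i(\omega) = \frac{1}{MU}\left|\sum_{n=0}^{M-1} x_i(n)\,e^{-j\omega n}\right|^2$, with window power normalisation $U = \frac{1}{M}\sum_{n=0}^{M-1} w^2(n)$, and the averaged periodogram is $\hat{P}_W(\omega) = \frac{1}{K}\sum_{i=0}^{K-1} \hat{P}_i(\omega)$.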
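A minimal sketch of Welch averaging, assuming windowed segments $x_i(n) = w(n)\,x(n + iD)$, per-segment periodograms normalised by $MU$ with $U = \frac{1}{M}\sum_n w^2(n)$, and a Hann window. The function name `welch_psd`, the window choice and the parameter values are assumptions for illustration.

```python
import numpy as np

def welch_psd(x, M, D):
    """Average of K windowed, overlapped periodograms (segment length M, offset D)."""
    w = np.hanning(M)                     # assumed window; any length-M window works
    U = np.mean(w ** 2)                   # window power normalisation
    K = (len(x) - M) // D + 1             # number of complete segments
    P = np.zeros(M)
    for i in range(K):
        segment = w * x[i * D:i * D + M]  # windowed i-th segment
        P += np.abs(np.fft.fft(segment)) ** 2 / (M * U)   # i-th periodogram
    return P / K, K

rng = np.random.default_rng(5)
x = rng.standard_normal(4096)             # unit-variance white noise test signal

P_welch, K = welch_psd(x, M=256, D=128)   # D = M/2 gives 50% overlap
```

For unit-variance white noise the true PSD is flat and equal to 1, so `P_welch` should fluctuate around 1; setting `D = M` with a rectangular window would reduce this to the Bartlett method.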

Welch Method
[Figure: overlapping segments 1 to K are taken from the data record, each offset by D samples; periodograms 1 to K are computed, summed and divided by K to give the PSD estimate.]

Modified Welch Method Data segments taken from the data record are progressively getting longer thus introducing a better frequency resolution. Due to an averaging procedure periodogram variance decreases and smoother periodograms are obtained

Modified (Symmetric) Welch Method
[Figure: segments 1 to K taken from the data record; periodograms 1 to K are computed, summed and divided by K to give the PSD estimate.]

Modified (Asymmetric) Welch Method
[Figure: segments 1 to K taken from the data record; periodograms 1 to K are computed, summed and divided by K to give the PSD estimate.]

Comparison of nonparametric PSD estimators
We use the quality factor Q to evaluate different nonparametric methods. This is the ratio of the square of the mean of the power spectral density estimate to its variance: $Q = \dfrac{\left(E[\hat{P}(\omega)]\right)^2}{\mathrm{var}[\hat{P}(\omega)]}$

Comparison of nonparametric PSD estimators
| Method | Conditions | Q | Comments |
|---|---|---|---|
| Periodogram | N→∞ | 1 | Inconsistent, independent of N |
| Bartlett | N, L→∞ | 1.11 NΔf | Quality improves with data length |
| Welch | N, L→∞, 50% overlap | 1.39 NΔf | Quality improves with data length |
| Blackman-Tukey | N, L→∞, triangular window | 2.34 NΔf | Quality improves with data length |

Δf is the 3 dB main lobe width of the associated window.

Parametric PSD estimation techniques (DFT based)
use a priori model information about the spectrum to be estimated

Steps for parametric spectrum estimation
1. Select a suitable model for the procedure. This step may be based on:
   - a priori knowledge of the physical mechanism that generates the random process
   - trial and error, by testing various parametric models

   (If the wrong model is selected, results can be worse than when using non-parametric methods for PSD estimation.)
2. Estimate the (p, q) order of the model (from the collected data and/or from a priori information).
3. Use the collected data to estimate the model parameters (coefficients).

Stochastic signal modelling

Deterministic signal modelling

Possible Models
- Most general model: Autoregressive Moving Average (ARMA), with a and b coefficients (i.e. IIR), also known as the pole-zero model
- Moving Average (MA), with b coefficients only (i.e. FIR), the all-zero model
- Autoregressive (AR), with a coefficients only, the all-pole model

[Block diagrams: $x(n)$ passes through the $b(n)$ and/or $a(n)$ sections to give $y(n)$ for each model.]

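Model Equations
- ARMA: $\sum_{k=0}^{p} a_k\,y(n-k) = \sum_{k=0}^{q} b_k\,x(n-k)$
- MA: $y(n) = \sum_{k=0}^{q} b_k\,x(n-k)$
- AR: $\sum_{k=0}^{p} a_k\,y(n-k) = b_0\,x(n)$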

Model Equations in z-domain (i.e. Model Transfer Functions)
ARMA apply z-transform

Model Equations in z-domain
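for $a_0 = 1$:
- ARMA: $H(z) = \dfrac{\sum_{k=0}^{q} b_k z^{-k}}{1 + \sum_{k=1}^{p} a_k z^{-k}}$
- MA: $H(z) = \sum_{k=0}^{q} b_k z^{-k}$
- AR: $H(z) = \dfrac{b_0}{1 + \sum_{k=1}^{p} a_k z^{-k}}$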

Model Equations in W-domain
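for $a_0 = 1$:
- ARMA: $H(\omega) = \dfrac{\sum_{k=0}^{q} b_k e^{-j\omega k}}{1 + \sum_{k=1}^{p} a_k e^{-j\omega k}}$
- MA: $H(\omega) = \sum_{k=0}^{q} b_k e^{-j\omega k}$
- AR: $H(\omega) = \dfrac{b_0}{1 + \sum_{k=1}^{p} a_k e^{-j\omega k}}$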

So how do we get the signal PSD from the estimated model?
If the white noise signal $w(n)$ is the input to our model (i.e. $x(n) = w(n)$), the output signal $y(n)$ is a WSS (wide sense stationary) signal with PSD given as $P_y(z) = \sigma_w^2\,H(z)\,H^*(1/z^*)$, or $P_y(\omega) = \sigma_w^2\,|H(\omega)|^2$.

PSD for ARMA modelled signal
using vector notation:

PSD for ARMA modelled signal
where H denotes Hermitian (transpose + complex conjugate) and:

PSD for AR & MA modelled signals
Similarly, for AR modelled signals we have: and for MA models:

Statistical Signal Modelling Yule-Walker Equations
In statistical signal modelling, the problem of determining the model coefficients boils down to solving a set of nonlinear equations called the Yule-Walker equations. The next couple of slides show how those equations are obtained, starting from the general expression for the ARMA model. Assuming $a_0 = 1$ we have: $y(n) + \sum_{k=1}^{p} a_k\,y(n-k) = \sum_{k=0}^{q} b_k\,x(n-k)$

Yule-Walker Equations
Multiplying both sides of the ARMA equation by $y(n-i)$ and taking the expectation we have: Since $x(n)$ and $y(n)$ are jointly wide sense stationary processes, we can rewrite the last part of this equation using the following reasoning:

Yule-Walker Equations
i.e.

Yule-Walker Equations
Further, for causal h(n) we can obtain the standard form of Yule-Walker equations Introducing we have:

Yule-Walker Equations
in matrix form:

Yule Walker Equations for AR model
in matrix form:
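A minimal sketch of AR parameter estimation via the Yule-Walker equations, assuming the AR model $y(n) + \sum_{k=1}^{p} a_k\,y(n-k) = w(n)$ with unit-variance white noise $w(n)$, for which the equations reduce to the linear system $\mathbf{R}_{yy}\,\mathbf{a} = -\mathbf{r}_{yy}$ with $\sigma_w^2 = r_{yy}(0) + \sum_k a_k\,r_{yy}(k)$. The model coefficients `a_true`, the record length and the use of biased time-average autocorrelation estimates are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical AR(2) model: y(n) - 0.9 y(n-1) + 0.5 y(n-2) = w(n)
a_true = np.array([-0.9, 0.5])
N, p = 20000, 2

w = rng.standard_normal(N)                # unit-variance white noise input
y = np.zeros(N)
for n in range(N):
    y[n] = w[n]
    for k in range(1, p + 1):
        if n - k >= 0:
            y[n] -= a_true[k - 1] * y[n - k]

# Biased time-average autocorrelation estimates r(0) ... r(p)
r = np.array([np.dot(y[m:], y[:N - m]) / N for m in range(p + 1)])

# Toeplitz autocorrelation matrix and Yule-Walker solution R a = -r
R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
a_hat = np.linalg.solve(R, -r[1:])
sigma2_hat = r[0] + a_hat @ r[1:]         # estimated input noise variance

# AR PSD estimate: sigma^2 / |1 + sum_k a_k e^{-j omega k}|^2
omega = np.linspace(0.0, np.pi, 256)
A = 1.0 + sum(ak * np.exp(-1j * omega * (k + 1)) for k, ak in enumerate(a_hat))
psd_ar = sigma2_hat / np.abs(A) ** 2
```

For the pure AR case the equations are linear in the coefficients, which is why a single call to `np.linalg.solve` suffices here.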