LECTURE 09: RECURSIVE LEAST SQUARES
ECE 8443 – Pattern Recognition / ECE 8423 – Adaptive Signal Processing
Objectives: Newton's Method; Application to LMS; Recursive Least Squares; Exponentially-Weighted RLS; Comparison to LMS
Resources: Wiki: Recursive Least Squares; Wiki: Newton's Method; IT: Recursive Least Squares; YE: Kernel-Based RLS
URL: .../publications/courses/ece_8423/lectures/current/lecture_09.ppt
MP3: .../publications/courses/ece_8423/lectures/current/lecture_09.mp3

ECE 8423: Lecture 09, Slide 1 – Newton's Method
The main challenge with the steepest-descent approach of the LMS algorithm is its slow and non-uniform convergence. Another concern is its use of a single, instantaneous point estimate of the gradient (we previously discussed a block estimation approach as an alternative). We can derive a more powerful iterative approach that uses all previous data and is based on Newton's method for finding the zeros of a function. Consider a function f(x) having a single zero. Start with an initial guess, x0. The next estimate, x1, is obtained by projecting the tangent to the curve down to where it crosses the x-axis: x1 = x0 − f(x0)/f'(x0). The general iterative formula is x_{n+1} = x_n − f(x_n)/f'(x_n).
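The equations on this slide were embedded as images in the original deck; the following is a minimal runnable sketch of the scalar Newton iteration just described (the example function and starting point are illustrative, not values from the lecture):

```python
def newton_root(f, f_prime, x0, tol=1e-10, max_iter=50):
    """Find a zero of f by Newton's method: x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / f_prime(x)   # tangent-line projection onto the x-axis
        x -= step
        if abs(step) < tol:        # stop when successive estimates agree
            break
    return x

# Example: the positive zero of f(x) = x^2 - 2 (i.e., sqrt(2)), starting at x0 = 1.
print(newton_root(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0))
```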

ECE 8423: Lecture 09, Slide 2 – Application to Adaptive Filtering
To apply this to the problem of least-squares minimization, we must find the zero of the gradient of the mean-squared error. Since the mean-squared error is a quadratic function of the filter coefficients, its gradient is linear, and hence convergence takes place in a single step. Recall our error function, the mean-squared error J = E[e^2(n)], where e(n) = d(n) − f^T x(n) and f denotes the adaptive filter coefficient vector. We find the optimal solution by equating the gradient of the error to zero, which yields the Wiener solution f_opt = R^{-1} g, where R is the autocorrelation matrix of the input and g is the cross-correlation vector between the input and the desired signal. The Newton algorithm scales the gradient by R^{-1} instead of taking a fixed step; substituting our expression for the gradient shows that the update reaches R^{-1} g in a single iteration. Note that this still requires estimates of the autocorrelation matrix and the gradient.
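The formulas referenced on this slide can be reconstructed from the standard quadratic-MSE argument; the sketch below uses the lecture's R and g, with f assumed as the symbol for the coefficient vector:

```latex
\begin{aligned}
J(\mathbf{f}) &= E\!\left[e^2(n)\right]
             = E\!\left[d^2(n)\right] - 2\,\mathbf{f}^T\mathbf{g} + \mathbf{f}^T\mathbf{R}\,\mathbf{f},\\
\nabla J(\mathbf{f}) &= 2\,\mathbf{R}\,\mathbf{f} - 2\,\mathbf{g}
  \;\;\Rightarrow\;\; \mathbf{f}_{\mathrm{opt}} = \mathbf{R}^{-1}\mathbf{g},\\
\text{Newton update:}\quad
\mathbf{f}(n+1) &= \mathbf{f}(n) - \tfrac{1}{2}\,\mathbf{R}^{-1}\,\nabla J\big(\mathbf{f}(n)\big)
               = \mathbf{f}(n) - \mathbf{R}^{-1}\big(\mathbf{R}\,\mathbf{f}(n) - \mathbf{g}\big)
               = \mathbf{R}^{-1}\mathbf{g}.
\end{aligned}
```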

ECE 8423: Lecture 09, Slide 3 – Estimating the Gradient
In practice, we can use an instantaneous estimate of the gradient, as we did for the LMS gradient, and substitute it into the Newton update. The noisy estimate of the gradient will produce excess mean-squared error. To combat this, we can introduce an adaptation constant, μ, that scales the correction term. Of course, convergence no longer occurs in one step; we are somewhat back to where we started with the iterative LMS algorithm, and we still have to worry about estimating the autocorrelation matrix. To compare this solution to the LMS algorithm, we can rewrite the update equation in terms of the error signal and take the expectation of both sides, invoking independence between the input and the coefficient vector. The mean of the updated coefficient vector is then a weighted average, with weights (1 − μ) and μ, of the previous value and the newest (Wiener) estimate; note that if μ = 1, the previous value is discarded entirely.
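A reconstruction of the estimated-gradient update and its mean behavior, using the same assumed symbols (the instantaneous gradient estimate is the usual −2e(n)x(n), and μ is the adaptation constant):

```latex
\begin{aligned}
\hat{\nabla} J(n) &= -2\,e(n)\,\mathbf{x}(n), \qquad
\mathbf{f}(n+1) = \mathbf{f}(n) - \tfrac{\mu}{2}\,\mathbf{R}^{-1}\hat{\nabla} J(n)
                = \mathbf{f}(n) + \mu\,\mathbf{R}^{-1} e(n)\,\mathbf{x}(n),\\
E[\mathbf{f}(n+1)] &= (1-\mu)\,E[\mathbf{f}(n)] + \mu\,\mathbf{R}^{-1}\mathbf{g}.
\end{aligned}
```

For μ = 1 the mean of the update equals the Wiener solution after a single step.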

ECE 8423: Lecture 09, Slide 4 – Analysis of Convergence
Once again we can define a coefficient error vector, the difference between the mean coefficient vector and the optimum solution. It obeys a first-order difference equation whose solution decays geometrically in (1 − μ). We can observe the following:
- The algorithm converges in the mean provided 0 < μ < 2 (so that |1 − μ| < 1).
- Convergence proceeds exponentially at a rate determined by (1 − μ).
- The convergence rate of each coefficient is identical and independent of the eigenvalue spread of the autocorrelation matrix, R.
The last point is a crucial difference between the Newton algorithm and LMS. We still need to worry about our estimate of the autocorrelation matrix, R, formed as a running sum of outer products of the input vectors, where we assume x(n) = 0 for n < 0. This estimate can be written as an update equation as a function of n.
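A reconstruction of the coefficient-error difference equation and the running autocorrelation estimate referred to above; the unnormalized sum is an assumption consistent with the rank-one update on the next slide:

```latex
\begin{aligned}
\mathbf{v}(n) &= E[\mathbf{f}(n)] - \mathbf{R}^{-1}\mathbf{g}, \qquad
\mathbf{v}(n+1) = (1-\mu)\,\mathbf{v}(n)
  \;\;\Rightarrow\;\; \mathbf{v}(n) = (1-\mu)^{n}\,\mathbf{v}(0),\\
\hat{\mathbf{R}}(n) &= \sum_{k=0}^{n} \mathbf{x}(k)\,\mathbf{x}^T(k)
  \;=\; \hat{\mathbf{R}}(n-1) + \mathbf{x}(n)\,\mathbf{x}^T(n).
\end{aligned}
```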

ECE 8423: Lecture 09, Slide 5 – Estimation of the Autocorrelation Matrix and Its Inverse
The effort to estimate the autocorrelation matrix and its inverse is still considerable. We can easily derive a rank-one update equation for the autocorrelation estimate. To reduce the computational complexity of the inverse, we can invoke the matrix inversion lemma and apply it to the update equation for the autocorrelation. Note that no matrix inversions are required: the only division is by a scalar. The computation is proportional to L^2 rather than the L^3 required for a direct inverse. The autocorrelation matrix is never explicitly inverted; its inverse estimate is simply updated.
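A minimal code sketch of the rank-one inverse update implied by the matrix inversion lemma (Sherman–Morrison form); P denotes the running estimate of the inverse autocorrelation matrix, and the data used in the final check are illustrative:

```python
import numpy as np

def update_inverse(P, x):
    """Update P ~ R^{-1} when R grows by the rank-one term x x^T.

    Sherman-Morrison: (R + x x^T)^{-1} = P - (P x x^T P) / (1 + x^T P x).
    Only matrix-vector products and a scalar division are needed (O(L^2)).
    """
    Px = P @ x                    # L-vector
    denom = 1.0 + x @ Px          # scalar, positive for positive-definite P
    return P - np.outer(Px, Px) / denom

# Quick consistency check against a direct inverse (sizes and data are illustrative).
rng = np.random.default_rng(0)
L = 4
R = np.eye(L)
P = np.linalg.inv(R)
for _ in range(10):
    x = rng.standard_normal(L)
    R = R + np.outer(x, x)
    P = update_inverse(P, x)
print(np.allclose(P, np.linalg.inv(R)))   # True
```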

ECE 8423: Lecture 09, Slide 6 – Summary of the Overall Algorithm
1) Initialize the coefficient vector and the estimate of the inverse autocorrelation matrix.
2) Iterate for n = 0, 1, …: update the inverse autocorrelation estimate, compute the error, and update the coefficients.
There are a few approaches to the initialization in step (1). The most straightforward is to initialize the autocorrelation estimate to σ²I, where σ² is chosen to be a small positive constant (and can often be estimated based on a priori knowledge of the environment). This approach has been superseded by recursive-in-time least-squares solutions, which we will study next.
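A compact sketch of the overall (LMS-Newton) procedure summarized on this slide, under the assumptions already noted: f is the coefficient vector, P the running estimate of the inverse autocorrelation matrix, mu the adaptation constant, and sigma2 the small initialization constant. The tap-delay-line construction of the input vector and the signal names are assumptions:

```python
import numpy as np

def lms_newton(x, d, L, mu=0.5, sigma2=0.01):
    """LMS-Newton adaptation: f(n+1) = f(n) + mu * P(n) * x(n) * e(n),
    with P(n) ~ R^{-1}(n) maintained by the rank-one (Sherman-Morrison) update."""
    f = np.zeros(L)
    P = np.eye(L) / sigma2            # inverse of the sigma^2 * I initialization
    e = np.zeros(len(x))
    for n in range(len(x)):
        # Most recent L input samples x(n), x(n-1), ..., with zeros for n < 0.
        xn = np.zeros(L)
        m = min(L, n + 1)
        xn[:m] = x[n::-1][:m]
        Px = P @ xn
        P = P - np.outer(Px, Px) / (1.0 + xn @ Px)   # update R^{-1} estimate
        e[n] = d[n] - f @ xn                          # output error
        f = f + mu * (P @ xn) * e[n]                  # Newton-type coefficient update
    return f, e
```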

ECE 8423: Lecture 09, Slide 7 – Recursive Least Squares (RLS)
Consider minimization of a finite-duration version of the error: the sum of squared errors over the samples observed up to time n. The objective of the RLS algorithm is to maintain a solution that is optimal at each iteration. Differentiation of the error leads to the normal equations, in which R and g are now deterministic, data-dependent quantities. Note that we can write recursive-in-time equations for R and g: each equals its previous value plus a term involving only the newest data. We seek solutions of the form f(n) = R^{-1}(n) g(n), and we can apply the matrix inversion lemma for computation of the inverse.
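A reconstruction of the finite-duration error and the recursions for R and g referenced above (deterministic sums over the observed data, symbols as assumed earlier):

```latex
\begin{aligned}
J(n) &= \sum_{k=0}^{n} e^2(k), \qquad e(k) = d(k) - \mathbf{f}^T\mathbf{x}(k),\\
\mathbf{R}(n) &= \sum_{k=0}^{n}\mathbf{x}(k)\,\mathbf{x}^T(k) = \mathbf{R}(n-1) + \mathbf{x}(n)\,\mathbf{x}^T(n),\\
\mathbf{g}(n) &= \sum_{k=0}^{n} d(k)\,\mathbf{x}(k) = \mathbf{g}(n-1) + d(n)\,\mathbf{x}(n),\qquad
\mathbf{f}(n) = \mathbf{R}^{-1}(n)\,\mathbf{g}(n).
\end{aligned}
```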

ECE 8423: Lecture 09, Slide 8 – Recursive Least Squares (Cont.)
Define an intermediate vector variable, formed by applying the previous inverse autocorrelation estimate to the new input vector, and an intermediate scalar variable, formed from that vector and the new input; together they define the gain vector. Define the a priori error as the error obtained using the old filter and the new data, e(n) = d(n) − f^T(n−1) x(n). Using this definition, we can rewrite the RLS update as the old coefficient vector plus the gain vector scaled by the a priori error.
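A reconstruction of the intermediate variables and the resulting update; the names z(n), q(n), and k(n) are assumptions, since the slide's own symbols are not recoverable from the transcript:

```latex
\begin{aligned}
\mathbf{z}(n) &= \mathbf{R}^{-1}(n-1)\,\mathbf{x}(n), \qquad
q(n) = \mathbf{x}^T(n)\,\mathbf{z}(n),\\
\mathbf{k}(n) &= \frac{\mathbf{z}(n)}{1 + q(n)}, \qquad
\mathbf{R}^{-1}(n) = \mathbf{R}^{-1}(n-1) - \mathbf{k}(n)\,\mathbf{z}^T(n),\\
e(n) &= d(n) - \mathbf{f}^T(n-1)\,\mathbf{x}(n), \qquad
\mathbf{f}(n) = \mathbf{f}(n-1) + \mathbf{k}(n)\,e(n).
\end{aligned}
```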

ECE 8423: Lecture 09, Slide 9 – Summary of the RLS Algorithm
1) Initialize the coefficient vector and the inverse autocorrelation estimate.
2) Iterate for n = 0, 1, …: compute the gain vector, the a priori error, the coefficient update, and the inverse autocorrelation update.
Compare this to the Newton method: the RLS algorithm can be expected to converge more quickly because of its use of an aggressive, adaptive step size.
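A minimal runnable sketch of the RLS recursion summarized on this slide, following the gain-vector form given above (signal names and the σ² initialization are assumptions). Choosing sigma2 small makes the initial inverse estimate large, reflecting low confidence in the initial coefficients:

```python
import numpy as np

def rls(x, d, L, sigma2=0.01):
    """Recursive least squares (no forgetting), gain-vector form.
    P is the running estimate of R^{-1}; sigma2 sets the diagonal initialization."""
    f = np.zeros(L)
    P = np.eye(L) / sigma2
    e = np.zeros(len(x))
    for n in range(len(x)):
        xn = np.zeros(L)
        m = min(L, n + 1)
        xn[:m] = x[n::-1][:m]          # current input vector, zeros for n < 0
        z = P @ xn                      # intermediate vector
        k = z / (1.0 + xn @ z)          # gain vector
        e[n] = d[n] - f @ xn            # a priori error (old filter, new data)
        f = f + k * e[n]                # coefficient update
        P = P - np.outer(k, z)          # inverse autocorrelation update
    return f, e
```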

ECE 8423: Lecture 09, Slide 10 – Exponentially-Weighted RLS Algorithm
We can define a weighted error function in which each squared error is scaled by a power of a forgetting factor, 0 < λ ≤ 1. This gives more weight to the most recent errors. The RLS algorithm can be modified accordingly:
1) Initialize the coefficient vector and the inverse autocorrelation estimate.
2) Iterate for n = 1, 2, …, using the λ-weighted versions of the gain and inverse autocorrelation updates.
RLS is computationally more complex than simple LMS because it is O(L^2) per iteration rather than O(L). In principle, convergence is independent of the eigenvalue structure of the input signal due to the premultiplication by the inverse of the autocorrelation matrix.
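A sketch of the exponentially-weighted variant, assuming the conventional forgetting factor λ (lam in the code); it reduces to the previous RLS sketch when lam = 1:

```python
import numpy as np

def ewrls(x, d, L, lam=0.99, sigma2=0.01):
    """Exponentially-weighted RLS sketch: errors are weighted by lam**(n-k),
    equivalent to R(n) = lam*R(n-1) + x(n)x(n)^T in the recursions."""
    f = np.zeros(L)
    P = np.eye(L) / sigma2
    e = np.zeros(len(x))
    for n in range(len(x)):
        xn = np.zeros(L)
        m = min(L, n + 1)
        xn[:m] = x[n::-1][:m]
        z = P @ xn
        k = z / (lam + xn @ z)          # gain with forgetting factor
        e[n] = d[n] - f @ xn            # a priori error
        f = f + k * e[n]
        P = (P - np.outer(k, z)) / lam  # weighted inverse-autocorrelation update
    return f, e
```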

ECE 8423: Lecture 09, Slide 11 – Example: LMS and RLS Comparison
An IID sequence, x(n), is input to a filter, and the adaptive filter attempts to identify it. Measurement noise was assumed to be zero-mean Gaussian with unit variance, with a gain such that the SNR was 40 dB. The norm of the coefficient error vector is plotted in the top figure for 1000 trials. The filter length, L, was set to 8, and the LMS adaptation constant, μ, was set to the largest value for which the LMS algorithm would give stable results; even so, the RLS algorithm still outperforms LMS. The lower figure corresponds to the same analysis with a different input sequence. Why is performance degraded in this case?
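The slide's exact unknown system, step size, and second input sequence are not recoverable from the transcript, so the following is only an illustrative reconstruction of the comparison experiment: system identification of an assumed 8-tap filter from an IID input at roughly 40 dB SNR, tracking the coefficient-error norm for LMS and RLS.

```python
import numpy as np

rng = np.random.default_rng(1)
L, N = 8, 500
h = rng.standard_normal(L)                 # assumed "unknown" system (illustrative)
x = rng.standard_normal(N)                 # IID input sequence

def regressor(x, n, L):
    """Most recent L samples of x, with zeros for n < 0."""
    xn = np.zeros(L)
    m = min(L, n + 1)
    xn[:m] = x[n::-1][:m]
    return xn

# Desired signal: system output plus noise scaled for ~40 dB SNR.
y = np.array([h @ regressor(x, n, L) for n in range(N)])
noise = rng.standard_normal(N)
noise *= np.sqrt(np.var(y) / np.var(noise) * 10 ** (-40 / 10))
d = y + noise

def lms_error(x, d, L, mu):
    f = np.zeros(L); err = []
    for n in range(len(x)):
        xn = regressor(x, n, L)
        f = f + mu * (d[n] - f @ xn) * xn
        err.append(np.linalg.norm(f - h))   # coefficient-error norm
    return np.array(err)

def rls_error(x, d, L, sigma2=0.01):
    f = np.zeros(L); P = np.eye(L) / sigma2; err = []
    for n in range(len(x)):
        xn = regressor(x, n, L)
        z = P @ xn
        k = z / (1.0 + xn @ z)
        f = f + k * (d[n] - f @ xn)
        P = P - np.outer(k, z)
        err.append(np.linalg.norm(f - h))
    return np.array(err)

print("final coefficient-error norm, LMS:", lms_error(x, d, L, mu=0.05)[-1])
print("final coefficient-error norm, RLS:", rls_error(x, d, L)[-1])
```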

ECE 8423: Lecture 09, Slide 12 – Summary
Introduced Newton's method as an alternative to simple LMS. Derived the update equations for this approach. Introduced the Recursive Least Squares (RLS) approach and an exponentially-weighted version of RLS. Briefly discussed convergence and computational complexity. Next: IIR adaptive filters.