Learning Using Augmented Error Criterion Yadunandana N. Rao Advisor: Dr. Jose C. Principe.

Presentation transcript:

Learning Using Augmented Error Criterion Yadunandana N. Rao Advisor: Dr. Jose C. Principe

2 Overview Linear Adaptive Systems Criterion Algorithm Topology MSE LMS/RLS FIR, IIR AEC Algorithms

3 Why another criterion? MSE gives biased parameter estimates with noisy data. [Figure: adaptive filter w with input x(n), desired signal d(n), error e(n), and noise terms v(n) and u(n).] T. Söderström, P. Stoica. “System Identification.” Prentice-Hall, London, United Kingdom, 1989.

4 Is the Wiener-MSE solution optimal? Assumptions: 1. v(n) and u(n) are uncorrelated with the input and the desired signal. 2. v(n) and u(n) are uncorrelated with each other. White input noise: $W = (R + \sigma^2 I)^{-1} P$, with $\sigma^2$ unknown. Colored input noise: $W = (R + V)^{-1} P$, with $V$ unknown. The solution will change with changing noise statistics.
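A small numeric check of the white-noise case above. This is only a sketch under assumed values: the 2-tap clean-input covariance `R`, the true weights `w_true`, and the noise variances are hypothetical and not taken from the slides. It solves $(R + \sigma^2 I) W = P$ and shows the weights moving away from `w_true` as $\sigma^2$ grows.

```python
# Sketch: bias of the Wiener-MSE solution under white input noise.
# All numbers below are illustrative assumptions, not values from the slides.


def wiener_with_input_noise(R, P, sigma2):
    """Solve (R + sigma2*I) W = P for a 2-tap filter using Cramer's rule."""
    a, b = R[0][0] + sigma2, R[0][1]
    c, d = R[1][0], R[1][1] + sigma2
    det = a * d - b * c
    return [(P[0] * d - b * P[1]) / det, (a * P[1] - c * P[0]) / det]


# Hypothetical clean-input covariance and true system weights.
R = [[1.0, 0.5], [0.5, 1.0]]
w_true = [1.0, -0.5]

# Cross-correlation P = R w_true (desired signal generated by the true system).
P = [sum(R[i][j] * w_true[j] for j in range(2)) for i in range(2)]

for sigma2 in (0.0, 0.1, 1.0):
    print(sigma2, wiener_with_input_noise(R, P, sigma2))
```

With $\sigma^2 = 0$ the true weights are recovered; with $\sigma^2 = 1$ (0 dB input SNR for this unit-power input) the solution is approximately [0.4, -0.1] instead of [1, -0.5].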

5 An example: input SNR = 0 dB. [Figure: filter taps.]

6 Existing solutions: Total Least Squares. Gives an exact unbiased estimate iff v(n) and u(n) are i.i.d. with equal variances! Input is noisy and desired is noise-free. Y.N. Rao, J.C. Principe. “Efficient Total Least Squares Method for System Modeling using Minor Component Analysis.” IEEE Workshop on Neural Networks for Signal Processing XII, 2002.

7 Existing solutions: Extended Total Least Squares. Gives an exact unbiased estimate with colored v(n) and u(n) iff the noise statistics are known! J. Mathews, A. Cichocki. “Total Least Squares Estimation.” Technical Report, University of Utah, USA and Brain Science Institute Riken, 2000.

8 Going beyond MSE - Motivation. Assumption: v(n) and u(n) are white. The input covariance matrix is $R = R_x + \sigma^2 I$. Only the diagonal terms are corrupted! We will exploit this fact.
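The diagonal-only corruption follows in one line. A sketch, assuming (the transcript does not preserve the definitions) that the observed input is $\tilde{x}(n) = x(n) + v(n)$, with $v(n)$ zero-mean, white, of variance $\sigma^2$, and uncorrelated with $x(n)$:

$$
E[\tilde{x}(n)\,\tilde{x}(n-k)] = E[x(n)\,x(n-k)] + E[v(n)\,v(n-k)] = r_x(k) + \sigma^2\,\delta(k)
$$

Only the lag-0 entries, which sit on the diagonal of the covariance matrix, pick up $\sigma^2$. Every entry at lag $k \neq 0$ is the same as for the clean input.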

9 Going beyond MSE - Motivation. $w$ = estimated weights (length L); $w_T$ = true weights (length M). If Δ ≥ L and $w = w_T$, then $\rho_e(\Delta) = 0$. J.C. Principe, Y.N. Rao, D. Erdogmus. “Error Whitening Wiener Filters: Theory and Algorithms.” Chapter 10, Least-Mean-Square Adaptive Filters, S. Haykin, B. Widrow (eds.), John Wiley, New York, 2003.
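Why the error correlation vanishes at lags Δ ≥ L can be sketched as follows. The assumptions here are not spelled out in the transcript: noisy input $\tilde{x}(n) = x(n) + v(n)$, noisy desired signal $\tilde{d}(n) = w_T^T \mathbf{x}(n) + u(n)$, equal model and system lengths (L = M), and white, mutually uncorrelated $v$ and $u$. With $w = w_T$, the clean signal cancels and the error contains noise only:

$$
e(n) = \tilde{d}(n) - w_T^T \tilde{\mathbf{x}}(n) = u(n) - w_T^T \mathbf{v}(n), \qquad \mathbf{v}(n) = [v(n), \ldots, v(n-L+1)]^T
$$

$$
\rho_e(\Delta) = E[e(n)\,e(n-\Delta)] = E[u(n)\,u(n-\Delta)] + w_T^T\,E[\mathbf{v}(n)\,\mathbf{v}^T(n-\Delta)]\,w_T = 0 \quad \text{for } \Delta \ge L
$$

The cross terms drop because $u$ and $v$ are uncorrelated, and the last expectation is zero because $\mathbf{v}(n)$ and $\mathbf{v}(n-\Delta)$ share no samples once Δ ≥ L.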

10 Augmented Error Criterion (AEC). Define the AEC as the sum of two terms: the MSE and an error penalty.

11 AEC can be interpreted as (β > 0): an error-constrained (penalty) MSE; an error smoothness constraint; joint MSE and error entropy.

12 From AEC to Error Whitening (β < 0): simultaneous minimization of MSE and maximization of error entropy. With β = -0.5, the AEC cost function reduces to the error correlation at the chosen lag. When J(w) = 0, the resulting w partially whitens the error signal and is unbiased (Δ > L) even with white noise!
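The reduction at β = -0.5 can be worked out in two lines. The AEC equation itself did not survive in this transcript, so the form below is an assumption based on the cited error-whitening papers: an MSE term plus β times the power of the error difference $\dot{e}(n) = e(n) - e(n-\Delta)$, with Δ denoting the lag.

$$
J(w) = E[e^2(n)] + \beta\,E[\dot{e}^2(n)], \qquad E[\dot{e}^2(n)] = 2E[e^2(n)] - 2E[e(n)\,e(n-\Delta)]
$$

$$
J(w) = (1 + 2\beta)\,E[e^2(n)] - 2\beta\,E[e(n)\,e(n-\Delta)] \;\;\xrightarrow{\;\beta = -1/2\;}\;\; J(w) = E[e(n)\,e(n-\Delta)] = \rho_e(\Delta)
$$

The second identity uses stationarity of the error. Under this assumed form, setting J(w) = 0 is the same as forcing the error autocorrelation at lag Δ to zero, which is why the search becomes root finding rather than minimization.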

13 Optimal AEC solution w * Irrespective of β, the stationary point of the AEC cost function is Choose a suitable lag L
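The stationary-point equation is missing from the transcript. A form consistent with the later slide that defines T = R + βS is $w^* = (R + \beta S)^{-1}(P + \beta Q)$, where $S$ and $Q$ are the covariance and cross-correlation of the differenced signals $\dot{x}(n) = x(n) - x(n-\Delta)$ and $\dot{d}(n) = d(n) - d(n-\Delta)$. The symbol $Q$ and this exact form are assumptions, not taken from the slides. The sketch below builds the noisy-data statistics analytically for a hypothetical 2-tap system and compares β = 0 (MSE) with β = -0.5 (EWC).

```python
# Sketch: closed-form AEC solution from theoretical second-order statistics.
# Assumed, not from the slides: clean-input autocorrelation r_x(k) = A**|k|,
# a 2-tap true system W_TRUE, white input noise of variance s2, lag DELTA.

A = 0.5                 # hypothetical input correlation coefficient
W_TRUE = [1.0, -0.5]    # hypothetical true weights
L = len(W_TRUE)         # filter length (sufficient order assumed)
DELTA = 2               # lag, chosen >= L


def r_clean(k):
    """Autocorrelation of the clean input at lag k."""
    return A ** abs(k)


def r_noisy(k, s2):
    """White input noise adds s2 at lag 0 only."""
    return r_clean(k) + (s2 if k == 0 else 0.0)


def c(m):
    """Cross-correlation E[x(n - m) d(n)] for d(n) = W_TRUE . x(n)."""
    return sum(W_TRUE[j] * r_clean(m - j) for j in range(L))


def solve2(T, b):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = T[0][0] * T[1][1] - T[0][1] * T[1][0]
    return [(b[0] * T[1][1] - T[0][1] * b[1]) / det,
            (T[0][0] * b[1] - T[1][0] * b[0]) / det]


def aec_solution(beta, s2):
    """Return w* = (R + beta*S)^-1 (P + beta*Q) computed from noisy-data statistics."""
    T = [[0.0] * L for _ in range(L)]
    rhs = [0.0] * L
    for i in range(L):
        for j in range(L):
            R_ij = r_noisy(i - j, s2)
            # S = E[xdot xdot^T] with xdot(n) = x(n) - x(n - DELTA)
            S_ij = (2 * R_ij - r_noisy(DELTA + j - i, s2)
                    - r_noisy(DELTA + i - j, s2))
            T[i][j] = R_ij + beta * S_ij
        P_i = c(i)  # white noise on the desired signal does not enter P or Q
        Q_i = 2 * c(i) - c(i - DELTA) - c(i + DELTA)
        rhs[i] = P_i + beta * Q_i
    return solve2(T, rhs)


print("MSE (beta = 0):   ", aec_solution(0.0, 1.0))
print("EWC (beta = -0.5):", aec_solution(-0.5, 1.0))
```

In this setup the noise variance cancels from T when β = -0.5 and Δ ≥ L, so the EWC solution equals `W_TRUE`, while the β = 0 solution is biased toward zero.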

14 In summary, AEC: β = 0 gives MSE (minimization); β = -0.5 gives EWC (root finding!); β > 0 gives AEC. [Figures: shape of the performance surface in each case.]

15 Searching for AEC-optimal w β>0

16 Searching for AEC-optimal w 2 β<0

17 Searching for AEC-optimal w β<0

18 Stochastic search: AEC-LMS. Problem: the stationary point for AEC with β < 0 can be a global minimum, a global maximum, or a saddle point. Theoretically, a saddle point is unstable, and an update with a single-sign step size can never converge to a saddle point. Use sign information.
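One way to use sign information is to switch between gradient descent and ascent according to the sign of the instantaneous cost, so that the weights are driven toward J(w) = 0. The update below is a sketch of that idea and is an assumed form; the slide's own update equation is not preserved in the transcript. The function name and arguments are hypothetical.

```python
# Sketch of a sign-switched stochastic update for the AEC (assumed form).
# w: current weights; x, x_lag: input vectors at times n and n - lag;
# d, d_lag: desired samples at times n and n - lag.


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def sign(v):
    return (v > 0) - (v < 0)


def aec_lms_step(w, x, x_lag, d, d_lag, beta=-0.5, eta=0.01):
    """One update: w + eta * sign(e^2 + beta*edot^2) * (e*x + beta*edot*xdot)."""
    e = d - dot(w, x)                # error at time n
    e_lag = d_lag - dot(w, x_lag)    # error at time n - lag, same weights
    e_dot = e - e_lag
    x_dot = [a - b for a, b in zip(x, x_lag)]
    # The sign of the instantaneous cost decides descent (+1) or ascent (-1).
    s = sign(e * e + beta * e_dot * e_dot)
    return [wi + eta * s * (e * xi + beta * e_dot * xdi)
            for wi, xi, xdi in zip(w, x, x_dot)]


w = aec_lms_step([0.0, 0.0], [1.0, 2.0], [0.0, 1.0], 1.0, 0.5, eta=0.1)
print(w)
```

With a fixed sign the same recursion would be plain stochastic gradient descent on the cost, which is the case the slide says cannot settle at a saddle point.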

19 Convergence in MS sense iff AEC-LMS: β = -0.5 Y.N. Rao, D. Erdogmus, G.Y. Rao, J.C. Principe. “Stochastic Error Whitening Algorithm for Linear Filter Estimation with Noisy Data.” Neural Networks, June 2003.

20 SNR: 10dB

21 Noisy system ID with EWC-LMS. Problem: given an input and output time series, estimate the parameters of the unknown system. Metric: Error norm =

24 Quasi-Newton AEC. Problem: the optimal solution requires matrix inversion. Solution: the matrices R and S are positive-definite and symmetric, and allow rank-1 recursion. Overall, T = R + βS has a rank-2 update.

25 Quasi-Newton AEC. T(n) = R(n) + βS(n). Use the Sherman-Morrison-Woodbury identity. Y.N. Rao, D. Erdogmus, G.Y. Rao, J.C. Principe. “Fast Error Whitening Algorithms for System Identification and Control.” IEEE Workshop on Neural Networks for Signal Processing XIII, September 2003.
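A minimal sketch of how the inverse of T can be tracked without re-inverting. It assumes the unnormalized recursion $T(n) = T(n-1) + x x^T + \beta\,\dot{x}\dot{x}^T$, which is my reading of "rank-1 recursion" for R and S and not a formula preserved in the transcript. It applies the rank-one Sherman-Morrison formula twice, which matches the rank-2 Woodbury update as long as the intermediate matrix stays invertible. Function names are hypothetical.

```python
# Sketch: recursive update of T^-1 for T = R + beta*S (assumed recursion).


def sherman_morrison(A_inv, u, c):
    """Return (A + c*u*u^T)^-1 given A_inv = A^-1, with A symmetric."""
    n = len(u)
    Au = [sum(A_inv[i][j] * u[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + c * sum(u[i] * Au[i] for i in range(n))
    # With beta < 0 the denominator can approach zero; a real implementation
    # would need to guard against that.
    return [[A_inv[i][j] - c * Au[i] * Au[j] / denom for j in range(n)]
            for i in range(n)]


def update_T_inverse(T_inv, x, x_dot, beta):
    """Apply the x*x^T term, then the beta*xdot*xdot^T term."""
    T_inv = sherman_morrison(T_inv, x, 1.0)
    return sherman_morrison(T_inv, x_dot, beta)


# Tiny check: T = I, x = [1, 0], xdot = [0, 1], beta = -0.5,
# so the updated T is diag(2, 0.5) and its inverse is diag(0.5, 2).
T_inv = update_T_inverse([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0], [0.0, 1.0], -0.5)
print(T_inv)
```

Each update costs on the order of L² operations for a length-L filter, compared with L³ for a fresh inversion at every sample.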

26 Quasi-Newton AEC Initialize c is a large positive constant Initialize At every iteration, compute

28 Quasi-Newton AEC analysis. Fact 1: Convergence is achieved in a finite number of steps. Fact 2: The estimation error covariance is bounded from above. Fact 3: The trace of the error covariance depends mainly on the smallest eigenvalue of R + βS. Y.N. Rao, D. Erdogmus, G.Y. Rao, J.C. Principe. “Fast Error Whitening Algorithms for System Identification and Control with Noisy Data.” NeuroComputing, to appear in 2004.

30 Minor Components based EWC. The vector x that minimizes [A; b^T][x^T; -1] is Formulate EWC using TLS principles.

31 Minor Components based EWC Augmented Data Matrix Optimal EWC solution Symmetric, indefinite matrix motivated from TLS

32 Minor Components based EWC. Problem: computing the eigenvector corresponding to the zero eigenvalue of an indefinite matrix. Inverse iteration: EWC-TLS. Y.N. Rao, D. Erdogmus, J.C. Principe. “Error Whitening Criterion for Adaptive Filtering: Theory and Algorithms.” IEEE Transactions on Signal Processing, to appear.
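Inverse iteration finds the eigenvector whose eigenvalue is closest to zero by repeatedly solving a linear system with the matrix and renormalizing. The sketch below shows the generic technique on a small hypothetical symmetric indefinite matrix; it is not the EWC augmented data matrix, which the transcript does not preserve. With estimated statistics the eigenvalue is only near zero, so the solve stays well defined. The last helper rescales the eigenvector so that its final entry is -1, following the [x; -1] convention of the TLS formulation.

```python
import math

# Sketch: inverse iteration for the eigenvalue of smallest magnitude.
# The example matrix M is hypothetical and used only for illustration.


def solve(M, b):
    """Solve M y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    A = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for k in range(col, n + 1):
                A[r][k] -= f * A[col][k]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(A[r][k] * y[k] for k in range(r + 1, n))
        y[r] = (A[r][n] - s) / A[r][r]
    return y


def inverse_iteration(M, iters=100):
    """Return (eigenvalue, unit eigenvector) for the eigenvalue nearest zero."""
    n = len(M)
    v = [1.0] * n
    for _ in range(iters):
        y = solve(M, v)                       # y = M^-1 v
        norm = math.sqrt(sum(t * t for t in y))
        v = [t / norm for t in y]
    Mv = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
    lam = sum(v[i] * Mv[i] for i in range(n))  # Rayleigh quotient
    return lam, v


def weights_from_eigvec(v):
    """Scale v to the form [x; -1] and return x."""
    return [-t / v[-1] for t in v[:-1]]


M = [[2.0, 1.0], [1.0, -1.0]]   # symmetric, indefinite
lam, v = inverse_iteration(M)
print(lam, v, weights_from_eigvec(v))
```

Convergence is geometric with ratio equal to the smallest eigenvalue magnitude divided by the next smallest, so it is fast when the target eigenvalue is well separated from the rest.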

33 Comparisons

34 Inverse control using EWC. [Figure: block diagram with reference model, adaptive controller, plant (model), and noise; AR plant, FIR model.]

36 Going beyond white noise: EWC can be extended to handle colored noise if: the noise correlation depth is known; the noise covariance structure is known. Otherwise the results will be biased by the noise terms. Exploit the fact that the output and desired signals have independent noise terms.

37 Modified cost function. N: filter length (assume sufficient order); e: error signal with noisy data; d: noisy desired signal; Δ: lags chosen (need many!). Y.N. Rao, D. Erdogmus, J.C. Principe. “Accurate Linear Parameter Estimation in Colored Noise.” International Conference on Acoustics, Speech and Signal Processing, May 2004.

38 Cost function: if the noise in the desired signal is white, the input noise drops out completely!

39 Optimal solution by root-finding. There is a single, unique solution of the equation.

40 Stochastic algorithm Asymptotically converges to the optimal solution iff

41 Local stability: 10 dB input SNR, 10 dB output SNR.

42 System ID in colored input noise: -10 dB input SNR and 10 dB output SNR (white noise).

43 Extensions to colored noise in desired signal If the noise in desired signal is colored, then Introduce a penalty term in the cost function such that the overall cost converges to

44 But, we do not know Introduce estimators of in the cost! Define The constants α and β are positive real numbers that control the stability

45 Gradients…

46 Parameter updates

47 Convergence: 0 dB SNR for both input and desired data.

48 Summary: Noise is everywhere. MSE is not optimal even for linear systems. The proposed AEC and its extensions handle noisy data. Simple online algorithms optimize the AEC.

49 Proposal-current. 1. Further analysis of AEC: multiple lags; faster algorithms using augmented Lagrangian; minimum-norm update rules; recursive EWC using minor components analysis. 2. Analysis of under-modeling and overestimation effects. 3. Proposed new methods for parameter estimation in colored noise. 4. Mathematical analysis of convergence. [Legend: work in progress; in dissertation.]

50 Proposal-current. 5. Application to the design of inverse controllers. 6. Application of the proposed criteria/algorithms: model-order estimation based on error correlations; model-order estimation using sparseness constraints; AEC with β > 0 for error smoothing. [Legend: work in progress; in dissertation.]

51 Future Thoughts: Complete analysis of the modified algorithm. Extensions to non-linear systems: difficult with global non-linear models; using multiple models? Unsupervised learning: robust subspace estimation; clustering? Other applications.

52 Selected publications Book chapter Error Whitening Wiener Filters: Theory and Algorithms (Chapter 10) in Least-Mean-Square Adaptive Filters, Haykin, Widrow (eds), Wiley, Sep Journal papers 1.Stochastic Error Whitening Algorithm for Linear Filter Estimation with Noisy Data Neural Networks, vol. 16, no. 5-6, pp , Jun Error Whitening Criterion for Adaptive Filtering – Theory and Algorithms IEEE Transactions on Signal Processing (to appear) 3. On Newton-type and Minor Components Based Learning Algorithms for Parameter Estimation with Noisy Data NeuroComputing, (invited, due March 2004)

53 Selected publications 4.Accurate Linear Parameter Estimation in Colored Noise (in preparation) Patents Error Whitening Criterion for Linear Parameter Estimation in White Noise ( submitted – Feb, 2003) 2.Algorithms for parameter estimation in colored noise using Augmented Lagrangian (in preparation) Conference papers 1.Fast Error Whitening Algorithms for System Identification and Control, Proceedings of NNSP’03, pp , Sep Error Whitening Criterion for Linear Filter Estimation, Proceedings of IJCNN’03, vol. 2, pp , Jul Accurate Linear Parameter Estimation in Colored Noise, ICASSP’04 (accepted)

54 Acknowledgements: Dr. Jose C. Principe, Dr. Deniz Erdogmus, Dr. Petre Stoica.

55 Thank You!