
1 Robust Nonparametric Regression by Controlling Sparsity
Gonzalo Mateos and Georgios B. Giannakis
ECE Department, University of Minnesota
Acknowledgments: NSF grants CCF-0830480, CCF-1016605, EECS-0824007, EECS-1002180
May 24, 2011

2 Nonparametric regression
Given a training data set T := {(x_i, y_i)}_{i=1}^n, function estimation allows predicting y at any new point x
• Estimate the unknown function f from T
If one trusts data more than any parametric model, then go nonparametric regression:
• f lives in a (possibly infinite-dimensional) space \mathcal{H} of "smooth" functions
Ill-posed problem
• Workaround: regularization [Tikhonov '77], [Wahba '90]
• \mathcal{H} an RKHS with reproducing kernel K(x, x') and norm \|\cdot\|_{\mathcal{H}}:
  \hat{f} = \arg\min_{f \in \mathcal{H}} \sum_{i=1}^n [y_i - f(x_i)]^2 + \mu \|f\|_{\mathcal{H}}^2
Our focus
• Nonparametric regression robust against outliers
• Robustness by controlling sparsity
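
A minimal sketch of the regularized RKHS estimator above, via kernel ridge regression in plain NumPy. The Gaussian kernel, bandwidth, and \mu value are illustrative assumptions, not choices made in the talk.

```python
import numpy as np

def gaussian_kernel(X1, X2, bw=0.5):
    # K[i, j] = exp(-||x1_i - x2_j||^2 / (2 bw^2)); bw is an assumed bandwidth
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * bw ** 2))

def krr_fit(X, y, mu=0.1, bw=0.5):
    # Representer theorem: f(x) = sum_j alpha_j K(x, x_j), and the
    # regularized LS cost is minimized by alpha = (K + mu I)^{-1} y.
    K = gaussian_kernel(X, X, bw)
    return np.linalg.solve(K + mu * np.eye(len(y)), y)

def krr_predict(Xnew, X, alpha, bw=0.5):
    # Evaluate the fitted f at new points through the kernel expansion
    return gaussian_kernel(Xnew, X, bw) @ alpha
```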

3 Our work in context
Robust nonparametric regression
• Huber's function [Zhu et al '08]
• No systematic way to select thresholds
Robustness and sparsity in linear (parametric) regression
• Huber's M-type estimator as Lasso [Fuchs '99]; contamination model
• Bayesian framework [Jin-Rao '10], [Mitra et al '10]; rigid choice of the tuning parameters
Noteworthy applications
• Load curve data cleansing [Chen et al '10]
• Spline-based PSD cartography [Bazerque et al '09]

4 Variational LTS
Least-trimmed squares (LTS) regression [Rousseeuw '87]; its variational (V)LTS counterpart:
(VLTS)  \hat{f} = \arg\min_{f \in \mathcal{H}} \sum_{i=1}^{s} r^2_{[i]}(f) + \mu \|f\|_{\mathcal{H}}^2
• r^2_{[i]}(f) is the i-th order statistic among the squared residuals r^2_1(f), ..., r^2_n(f), where r_i(f) := y_i - f(x_i)
• The n - s largest residuals are discarded
Q: How should we go about minimizing (VLTS)? It is nonconvex; existence of minimizer(s)?
A: Try all \binom{n}{s} subsamples of size s, solve each, and pick the best
• Simple but intractable beyond small problems
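
The brute-force answer, sketched below for concreteness: enumerate every size-s subsample, fit the regularized estimator on it, and keep the fit with the smallest cost. It reuses gaussian_kernel from the sketch above, and the \binom{n}{s} loop is exactly why this is intractable beyond small problems.

```python
from itertools import combinations
import numpy as np

def vlts_brute_force(X, y, s, mu=0.1, bw=0.5):
    # Enumerate all C(n, s) subsamples; fit regularized LS on each kept
    # subset and score it by its regularized (trimmed) cost.
    n = len(y)
    best = (np.inf, None, None)                   # (cost, alpha, kept indices)
    for idx in map(list, combinations(range(n), s)):
        K = gaussian_kernel(X[idx], X[idx], bw)   # helper from the sketch above
        alpha = np.linalg.solve(K + mu * np.eye(s), y[idx])
        resid = y[idx] - K @ alpha
        cost = (resid ** 2).sum() + mu * alpha @ K @ alpha
        if cost < best[0]:
            best = (cost, alpha, idx)
    return best
```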

5 Modeling outliers
Outlier variables: o_i \neq 0 if y_i is an outlier, o_i = 0 otherwise
Model: y_i = f(x_i) + o_i + \varepsilon_i, i = 1, ..., n
• Nominal data obey y_i = f(x_i) + \varepsilon_i; outliers something else
Remarks
• Both f and o := [o_1, ..., o_n]' are unknown
• If outliers are sporadic, then vector o is sparse!
Natural (but intractable) nonconvex estimator:
\min_{f \in \mathcal{H}, o} \sum_{i=1}^n [y_i - f(x_i) - o_i]^2 + \mu \|f\|_{\mathcal{H}}^2  s.t.  \|o\|_0 \le n - s
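
To make the contamination model concrete, the snippet below generates data obeying y_i = f(x_i) + o_i + \varepsilon_i with a sparse o. The sine test function, noise level, and outlier magnitudes are assumed purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_out = 100, 5                          # sporadic outliers -> sparse o
X = rng.uniform(0.0, 1.0, (n, 1))
f_true = np.sin(2 * np.pi * X[:, 0])       # stand-in for the unknown f
eps = 0.1 * rng.standard_normal(n)         # nominal noise
o = np.zeros(n)                            # outlier variables, o_i = 0 for inliers
out_idx = rng.choice(n, size=n_out, replace=False)
o[out_idx] = rng.uniform(2.0, 4.0, n_out) * rng.choice([-1.0, 1.0], n_out)
y = f_true + o + eps                       # y_i = f(x_i) + o_i + eps_i
```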

6 VLTS as sparse regression
Lagrangian form:
(P0)  \min_{f \in \mathcal{H}, o} \sum_{i=1}^n [y_i - f(x_i) - o_i]^2 + \mu \|f\|_{\mathcal{H}}^2 + \lambda_0 \|o\|_0
Proposition 1: If \{\hat{f}, \hat{o}\} solves (P0) with \lambda_0 chosen s.t. \|\hat{o}\|_0 = n - s, then \hat{f} solves (VLTS) too.
The equivalence
• Formally justifies the regression model and its estimator (P0)
• Ties sparse regression with robust estimation
• Tuning parameter \lambda_0 controls sparsity in o, i.e., the number of outliers

7 Just relax!
(P0) is NP-hard; relax \|o\|_0 to the convex \ell_1 norm:
(P1)  \min_{f \in \mathcal{H}, o} \sum_{i=1}^n [y_i - f(x_i) - o_i]^2 + \mu \|f\|_{\mathcal{H}}^2 + \lambda_1 \|o\|_1
• (P1) convex, and thus efficiently solved
• Role of the sparsity-controlling parameter \lambda_1 is central
Q: Does (P1) yield robust estimates \hat{f}?
A: Yes! Huber's estimator is a special case: minimizing (P1) over o reduces the loss to Huber's function with threshold \lambda_1/2
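
A quick numerical check of that Huber connection, a standard identity sketched here rather than taken from the slides: for a fixed residual r, the inner minimization \min_o (r - o)^2 + \lambda_1 |o| is solved by soft-thresholding, and its optimal value equals Huber's loss with threshold \lambda_1/2.

```python
import numpy as np

def soft(z, t):
    # Soft-thresholding operator S_t(z)
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def huber(r, k):
    # Huber's loss: quadratic for |r| <= k, linear beyond
    return np.where(np.abs(r) <= k, r ** 2, 2 * k * np.abs(r) - k ** 2)

lam = 1.0
r = np.linspace(-3, 3, 601)
o_star = soft(r, lam / 2)                        # minimizer over o
cost = (r - o_star) ** 2 + lam * np.abs(o_star)  # optimal inner cost
assert np.allclose(cost, huber(r, lam / 2))      # equals Huber's loss
```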

8 Alternating minimization
(P1) is jointly convex in f and o; an AM solver alternates between
• f-update: kernel ridge regression with outlier-compensated data, \alpha[k] = (K + \mu I_n)^{-1}(y - o[k-1])
• o-update: soft-thresholding of the residuals, o_i[k] = S_{\lambda_1/2}(y_i - \hat{f}[k](x_i))
Remarks
• A single Cholesky factorization of K + \mu I_n suffices
• Soft-thresholding is closed form per coordinate
• Reveals the intertwining between
 – Outlier identification
 – Function estimation with outlier-compensated data
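
A minimal sketch of that AM loop (NumPy/SciPy; K is the kernel matrix, and \mu, \lambda_1, and the iteration count are assumed tuning values): the factorization of K + \mu I is computed once and reused across every f-update.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def am_solver(K, y, mu, lam, n_iter=100):
    # Alternating minimization for (P1):
    #   f-update: ridge regression on outlier-compensated data y - o
    #   o-update: coordinate-wise soft-thresholding at lam / 2
    n = len(y)
    cho = cho_factor(K + mu * np.eye(n))   # factor once, reuse each iteration
    o = np.zeros(n)
    for _ in range(n_iter):
        alpha = cho_solve(cho, y - o)      # f-update
        r = y - K @ alpha                  # residuals w.r.t. current fit
        o = np.sign(r) * np.maximum(np.abs(r) - lam / 2, 0.0)  # o-update
    return alpha, o                        # f(x) = sum_j alpha_j K(x, x_j)
```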

9 Lassoing outliers
Alternative to AM: minimize f out of (P1) and solve a Lasso [Tibshirani '94] in o
Proposition 2: With R defined by R'R = \mu (K + \mu I_n)^{-1}, (P1) is equivalent to the Lasso
 \hat{o} = \arg\min_{o} \|R(y - o)\|_2^2 + \lambda_1 \|o\|_1
Minimizers of (P1) are fully determined by \hat{o}, w/ \hat{f} obtained by ridge regression on the outlier-compensated data y - \hat{o}
Enables effective methods to select \lambda_1
• Lasso solvers return the entire robustification path (RP)
• Cross-validation (CV) fails with multiple outliers [Hampel '86]
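
A sketch of this reduction as reconstructed above (the Cholesky-factor construction and the scikit-learn scaling are my reading, not verbatim from the slides): build R from \mu(K + \mu I)^{-1} and hand the problem to an off-the-shelf Lasso solver.

```python
import numpy as np
from sklearn.linear_model import Lasso

def lasso_outliers(K, y, mu, lam):
    # (P1) as a Lasso in o: min_o ||R(y - o)||_2^2 + lam ||o||_1,
    # where R'R = mu (K + mu I)^{-1}.
    n = len(y)
    M = mu * np.linalg.inv(K + mu * np.eye(n))
    R = np.linalg.cholesky(M).T            # upper factor, so R'R = M
    # sklearn's Lasso minimizes (1/(2n)) ||b - A w||^2 + alpha ||w||_1,
    # hence alpha = lam / (2n) matches the objective above.
    model = Lasso(alpha=lam / (2 * n), fit_intercept=False, max_iter=100000)
    model.fit(R, R @ y)                    # design A = R, response b = R y
    o_hat = model.coef_
    alpha_hat = np.linalg.solve(K + mu * np.eye(n), y - o_hat)  # recover f
    return alpha_hat, o_hat
```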

10 Robustification paths
Lasso path of solutions \hat{o}(\lambda_1) is piecewise linear
• LARS returns the whole RP [Efron '03]
• Same cost as a single LS fit
Lasso is simple in the scalar case
• Coordinate descent is fast! [Friedman '07]
• Exploits warm starts, sparsity
• Other solvers: SpaRSA [Wright et al '09], SPAMS [Mairal et al '10]
Leverage these solvers: consider a 2-D grid
• N_\mu values of \mu
• For each \mu, N_\lambda values of \lambda_1
(Figure: Lasso coefficient paths vs. \lambda_1)
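
One way to trace the RP over that 2-D grid, sketched with scikit-learn's lasso_path standing in for LARS/SpaRSA/SPAMS (the grids and the use of this particular solver are assumptions):

```python
import numpy as np
from sklearn.linear_model import lasso_path

def robustification_paths(K, y, mu_grid, n_lambdas=100):
    # For each mu on the grid, sweep lambda_1 and record o_hat(lambda_1).
    n = len(y)
    paths = {}
    for mu in mu_grid:
        M = mu * np.linalg.inv(K + mu * np.eye(n))
        R = np.linalg.cholesky(M).T
        # Coordinate descent with warm starts over the whole path
        alphas, coefs, _ = lasso_path(R, R @ y, n_alphas=n_lambdas)
        paths[mu] = (2 * n * alphas, coefs)  # rescale to lambda_1 = 2n alpha
    return paths
```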

11 Selecting \lambda_1 and \mu
Relies on the RP and knowledge of the data model
• Number of outliers known: from the RP, obtain the range of \lambda_1 s.t. \|\hat{o}\|_0 equals the number of outliers. Discard the (known) outliers, and use CV to determine \mu
• Variance \sigma^2 of the nominal noise known: from the RP, for each grid pair \{\mu_j, \lambda_k\} obtain an entry of the sample variance matrix as the variance of the residuals over the points with \hat{o}_i = 0. The best \{\mu^*, \lambda^*\} are s.t. this entry is closest to \sigma^2
• Variance of the nominal noise unknown: replace \sigma^2 above with a robust estimate, e.g., the median absolute deviation (MAD)
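
A sketch of the known-variance rule (my rendering of the criterion above; the inlier masking and the ddof choice are assumptions): compute the sample variance of the residuals over the points flagged as inliers at each grid pair, and keep the pair whose value is closest to \sigma^2.

```python
import numpy as np

def select_params(paths, K, y, sigma2):
    # paths[mu] = (lambdas, coefs) as returned by robustification_paths()
    n = len(y)
    best, best_gap = None, np.inf
    for mu, (lambdas, coefs) in paths.items():
        for j, lam in enumerate(lambdas):
            o = coefs[:, j]
            inliers = o == 0.0                     # points not flagged as outliers
            if inliers.sum() < 2:
                continue
            alpha = np.linalg.solve(K + mu * np.eye(n), y - o)
            resid = (y - K @ alpha)[inliers]
            gap = abs(resid.var(ddof=1) - sigma2)  # sample-variance match
            if gap < best_gap:
                best, best_gap = (mu, lam), gap
    return best                                    # (mu*, lambda*)
```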

12 Nonconvex regularization
Nonconvex penalty terms approximate \|o\|_0 better than \|o\|_1 in (P0)
Options: SCAD [Fan-Li '01], or sum-of-logs [Candes et al '08]: \sum_i \log(|o_i| + \delta)
Iterative linearization-minimization of the penalty around the current iterate yields a sequence of weighted \ell_1 problems
Remarks
• Initialize with the (P1) solution, and use \mu^* and \lambda^*
• Bias reduction (cf. adaptive Lasso [Zou '06])
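
A sketch of that refinement as iteratively reweighted \ell_1 on top of the AM loop (the weighting rule follows [Candes et al '08]; \delta and the iteration counts are assumed values):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def refined_am(K, y, mu, lam, delta=1e-2, n_outer=5, n_inner=50):
    # Linearizing sum_i log(|o_i| + delta) around the previous iterate
    # gives weighted-l1 subproblems with weights w_i = 1/(|o_i| + delta).
    n = len(y)
    cho = cho_factor(K + mu * np.eye(n))
    o, w = np.zeros(n), np.ones(n)   # in practice, start o at the (P1) solution
    for _ in range(n_outer):
        for _ in range(n_inner):     # AM with per-coordinate thresholds
            alpha = cho_solve(cho, y - o)
            r = y - K @ alpha
            o = np.sign(r) * np.maximum(np.abs(r) - lam * w / 2, 0.0)
        w = 1.0 / (np.abs(o) + delta)  # reweight around the current iterate
    return alpha, o
```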

13 Robust thin-plate splines
Specialize to thin-plate splines [Duchon '77], [Wahba '80]
• The smoothing penalty (integrated squared second derivatives) is only a seminorm in this case
• Still, Proposition 2 holds for an appropriate factor R
Solution:
• Radial basis function K(r) = r^2 \log r
• Augment the kernel expansion w/ a member of the nullspace of the penalty (an affine function)
• Given \hat{o}, the unknowns are found in closed form
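
A minimal 2-D thin-plate spline sketch (the r^2 log r basis and the affine nullspace term are standard; the side conditions and the small ridge \mu on the kernel block are my assumptions for a well-posed solve):

```python
import numpy as np

def tps_kernel(X1, X2):
    # Thin-plate radial basis K(r) = r^2 log r, with K(0) = 0
    r2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    with np.errstate(divide="ignore", invalid="ignore"):
        K = 0.5 * r2 * np.log(r2)          # r^2 log r = (1/2) r^2 log r^2
    return np.nan_to_num(K)                # patch the r = 0 entries

def tps_fit(X, y, mu=1e-3):
    # f(x) = sum_i beta_i K(||x - x_i||) + a0 + a'x, solved in closed form
    # under the side conditions P' beta = 0 with P = [1, X].
    n = X.shape[0]
    K = tps_kernel(X, X)
    P = np.hstack([np.ones((n, 1)), X])
    m = P.shape[1]
    A = np.block([[K + mu * np.eye(n), P],
                  [P.T, np.zeros((m, m))]])
    sol = np.linalg.solve(A, np.concatenate([y, np.zeros(m)]))
    return sol[:n], sol[n:]                # beta, affine coefficients (a0, a)
```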

14 Simulation setup
True function f_o: a Gaussian mixture
Data
• Training set T: noisy samples of f_o at examples x_i drawn i.i.d.
• Nominal: y_i = f_o(x_i) + \varepsilon_i w/ \varepsilon_i i.i.d. (\sigma known)
• Outliers: y_i drawn i.i.d. from a separate distribution for the contaminated indices
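
A sketch of one such setup for the 2-D thin-plate case (every specific here, the mixture centers and widths, sample sizes, noise level, and outlier range, is an assumed stand-in for values not recoverable from the slide):

```python
import numpy as np

rng = np.random.default_rng(1)

def f_o(X):
    # Assumed two-bump Gaussian mixture surface on [0, 1]^2
    g1 = np.exp(-((X - [0.3, 0.3]) ** 2).sum(-1) / 0.02)
    g2 = np.exp(-((X - [0.7, 0.7]) ** 2).sum(-1) / 0.02)
    return g1 + g2

n, n_out = 200, 10
X = rng.uniform(0.0, 1.0, (n, 2))              # examples x_i i.i.d. uniform
y = f_o(X) + 0.1 * rng.standard_normal(n)      # nominal data, sigma assumed
out = rng.choice(n, size=n_out, replace=False)
y[out] = rng.uniform(-2.0, 2.0, n_out)         # outliers, range assumed
```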

15 Robustification paths
Grid parameters: a grid of \mu values and, for each, a grid of \lambda_1 values
Paths obtained using SpaRSA [Wright et al '09]
(Figure: entries of \hat{o} along the robustification path; outlier vs. inlier trajectories)

16 Results
(Figures: true function, nonrobust predictions, robust predictions, and refined predictions)
Effectiveness in rejecting outliers is apparent

17 Generalization capability
Figures of merit
• Training error: average squared error over the training set T
• Test error: average squared error over an independent test set
In all cases, 100% outlier identification success rate
Nonconvex refinement leads to consistently lower test error

18 Load curve data cleansing
Load curve: electric power consumption recorded periodically
• Reliable data: key to realizing the smart grid vision
• B-splines for load curve prediction and denoising [Chen et al '10]
Deviations from nominal models (outliers) due to
• Faulty meters, communication errors
• Unscheduled maintenance, strikes, sporting events
(Figure: Uruguay's aggregate power consumption (MW))

19 Real data tests
(Figures: nonrobust predictions, robust predictions, and refined predictions)

20 Concluding summary
Robust nonparametric regression
• VLTS as \ell_0-(pseudo)norm regularized regression (NP-hard)
• Convex relaxation leads to a variational M-type estimator, solvable as a Lasso
Controlling sparsity amounts to controlling the number of outliers
• Sparsity-controlling role of \lambda_1 is central
• Selection of \lambda_1 using the Lasso robustification paths
• Different options dictated by the available knowledge of the data model
Refinement via nonconvex penalty terms
• Bias reduction and improved generalization capability
Real data tests for load curve cleansing

