Active Set Support Vector Regression

Presentation on theme: "Active Set Support Vector Regression"— Presentation transcript:

1 Active Set Support Vector Regression
David R. Musicant, Alexander Feinberg
NIPS 2001 Workshop on New Directions in Kernel-Based Learning Methods
Friday, December 7, 2001
Carleton College

2 Active Set Support Vector Regression
Fast algorithm that utilizes an active set method
Requires no specialized solvers or software tools, apart from a freely available equation solver
Inverts a matrix of the order of the number of features (in the linear case)
Guaranteed to converge in a finite number of iterations

3 The Regression Problem
“Close points” may be wrong due to noise only
The line should be influenced by “real” data, not by noise
Ignore errors from those points which are close!

4 Measuring Regression Error
Given m points in n-dimensional space R^n, represented by an m × n matrix A
Associated with each point A_i is an observation y_i
Consider a plane to fit the data, and a “tube” of width ε around the data
Measure the error made outside the tube (here e denotes a vector of ones)
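In standard SVR notation, the error outside the tube can be written as follows (a hedged reconstruction, not the slide's own equation; ξ is a slack vector and the max is taken componentwise):

```latex
\text{error} \;=\; \sum_{i=1}^{m} \max\bigl(\,|A_i w + b - y_i| - \varepsilon,\; 0\,\bigr)
\;=\; e^{\top}\xi,
\qquad \xi = \max\bigl(\,|Aw + eb - y| - \varepsilon e,\; 0\,\bigr).
```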

5 Support Vector Regression
Traditional support vector regression:
Minimize the error made outside of the tube
Regularize the fitted plane by minimizing the norm of w
The parameter C balances two competing goals
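For reference, a standard way to write the traditional formulation described above (a reconstruction in common SVR notation, not the slide's own equation; ξ and ξ* are slack vectors for errors above and below the tube):

```latex
\min_{w,\,b,\,\xi,\,\xi^{*}} \;\; C\,e^{\top}(\xi + \xi^{*}) + \tfrac{1}{2}\,w^{\top}w
\quad \text{s.t.} \quad
Aw + eb - y \le \varepsilon e + \xi,\;\;
y - Aw - eb \le \varepsilon e + \xi^{*},\;\;
\xi,\ \xi^{*} \ge 0.
```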

6 Our reformulation
Allow the regression error to contribute in a quadratic fashion, instead of linearly
Regularize the regression plane with respect to location (b) in addition to orientation (w)
Non-negativity constraints for the slack variables are no longer necessary
Terms of the objective: plane “orientation” (w), regression error, plane “location” (b)
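A plausible way to write the reformulation under the same notation (quadratic slack penalty, b² added to the regularizer, non-negativity on the slacks dropped); this is a sketch consistent with the bullets above, not a verbatim copy of the slide:

```latex
\min_{w,\,b,\,\xi,\,\xi^{*}} \;\;
\frac{C}{2}\left(\xi^{\top}\xi + {\xi^{*}}^{\top}\xi^{*}\right)
+ \frac{1}{2}\left(w^{\top}w + b^{2}\right)
\quad \text{s.t.} \quad
Aw + eb - y \le \varepsilon e + \xi,\;\;
y - Aw - eb \le \varepsilon e + \xi^{*}.
```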

7 Wolfe Dual Formulation
The dual formulation can be represented as a minimization over the dual variables, with non-negativity constraints only, but with a “nasty” objective function.
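Taking the Wolfe dual of a primal of the form above, with dual variables α, β ≥ 0 for the two constraint sets, yields an objective of roughly this shape (a hedged reconstruction consistent with the simplification on the next slide, not the slide's exact formula):

```latex
\min_{\alpha,\,\beta \,\ge\, 0}\;\;
\frac{1}{2}(\alpha-\beta)^{\top}\!\left(AA^{\top} + ee^{\top}\right)(\alpha-\beta)
+ \frac{1}{2C}\left(\alpha^{\top}\alpha + \beta^{\top}\beta\right)
- y^{\top}(\alpha-\beta)
+ \varepsilon\, e^{\top}(\alpha+\beta).
```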

8 Simpler Dual Formulation
At optimality, the dual variables satisfy a complementarity condition. Add this condition as a constraint, and simplify the objective (I = identity matrix). The complementarity condition is introduced purely to simplify the objective function; the only constraints are non-negativity and complementarity.
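Concretely, the complementarity condition is presumably α_i β_i = 0 for every i; adding it lets the two separate quadratic terms in α and β collapse into a single quadratic in α − β, which appears to be the simplification meant here (again a hedged reconstruction):

```latex
\min_{\substack{\alpha,\,\beta \,\ge\, 0 \\ \alpha_i \beta_i \,=\, 0}}\;\;
\frac{1}{2}(\alpha-\beta)^{\top} M\, (\alpha-\beta)
- y^{\top}(\alpha-\beta)
+ \varepsilon\, e^{\top}(\alpha+\beta),
\qquad
M = \frac{I}{C} + AA^{\top} + ee^{\top}.
```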

9 Active Set Algorithm: Idea
Partition the dual variables into nonbasic variables and basic variables.
The algorithm is an iterative method:
Choose a working set of variables, corresponding to active constraints, to be nonbasic
Choose the variables so as to preserve complementarity
Calculate the global minimum over the basic variables
Appropriately update the working set
The goal is to find the appropriate working set; once it is found, the global minimum over the basic variables is the solution to the problem.
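A schematic sketch of this loop, applied to a generic bound-constrained quadratic program min ½uᵀMu − qᵀu subject to u ≥ 0. The function name and the initial working-set heuristic are illustrative, not the authors' code, and the fallback safeguards described on later slides are omitted:

```python
# Schematic sketch of the iteration described above, NOT the authors' code:
# repeatedly minimize over the basic (free) variables with the nonbasic
# variables fixed at zero, project onto u >= 0, and update the working set.
import numpy as np

def basic_active_set_sketch(M, q, max_iter=100, tol=1e-10):
    m = len(q)
    u = np.zeros(m)
    basic = q > 0                       # heuristic initial working set
    for _ in range(max_iter):
        u_new = np.zeros(m)
        if basic.any():
            # global minimum of the objective restricted to the basic variables
            u_new[basic] = np.linalg.solve(M[np.ix_(basic, basic)], q[basic])
        u_new = np.maximum(u_new, 0.0)  # project onto the feasible region u >= 0
        new_basic = u_new > 0           # nonbasic = variables held at their bound
        if np.array_equal(new_basic, basic) and np.linalg.norm(u_new - u) < tol:
            break
        u, basic = u_new, new_basic
    return u
```

The loop stops when the working set no longer changes and the iterate is stationary, mirroring the idea that once the right working set is found, the minimum over the basic variables is the solution.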

10 Active Set Algorithm: Basics
Notation: at each iteration, the basic and nonbasic sets are redefined.

11 Active Set Algorithm: Basics
Restricted to an active set, the optimization problem simplifies.
The complementarity constraint is implicit in the choice of basic and nonbasic sets.
Find the global minimum over the basic set, then project.
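In generic terms, if the simplified dual is written as min ½uᵀMu − qᵀu over u ≥ 0 (with q collecting the linear terms), then fixing the nonbasic variables at zero leaves an unconstrained solve over the basic block B, followed by a projection back onto the non-negative orthant; a hedged sketch:

```latex
\min_{u_B}\; \tfrac{1}{2}\, u_B^{\top} M_{BB}\, u_B - q_B^{\top} u_B
\;\;\Longrightarrow\;\;
u_B = M_{BB}^{-1}\, q_B,
\qquad u_N = 0,
\qquad u \leftarrow \max(u,\, 0).
```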

12 Active Set Algorithm: Basics
Converting back from u gives the solution in the original variables.
When computing M^{-1}, we use the Sherman-Morrison-Woodbury identity.
To restate: like ASVM, the basic ASVR approach finds the minimum over a set of basic variables, then projects onto the feasible region. This differs from other active set methods, which “backtrack” onto the feasible region.
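An illustrative sketch (not the authors' code) of the Sherman-Morrison-Woodbury identity as it is used here: with H = [A e] of size m × (n+1), the m × m inverse can be obtained by solving only an (n+1) × (n+1) system.

```python
# Sherman-Morrison-Woodbury identity for M = I/C + H H^T with H = [A e]:
#   (I/C + H H^T)^{-1} = C * (I - H (I/C + H^T H)^{-1} H^T),
# so the only linear solve is (n+1) x (n+1).
import numpy as np

def smw_inverse(H, C):
    m, k = H.shape                        # k = n + 1 in the linear case
    small = np.eye(k) / C + H.T @ H       # small (n+1) x (n+1) matrix
    return C * (np.eye(m) - H @ np.linalg.solve(small, H.T))

# quick numerical check against the direct m x m inverse
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 5))
H = np.hstack([A, np.ones((200, 1))])     # H = [A e]
C = 10.0
M = np.eye(200) / C + H @ H.T
assert np.allclose(smw_inverse(H, C), np.linalg.inv(M))
```

In practice one would apply M^{-1} to vectors rather than form it explicitly; the point of the identity is that the only matrix actually inverted has the order of the number of features.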

13 Graphical Comparison
[Figure: two panels, “Basic ASVR Step” and “Standard Active Set Approach,” each showing the feasible region, the initial point, the projection, and the minimum.]

14 Some additional details
When the basic ASVR step fails to make progress, we fall back on the standard active set approach. When we no longer make any progress on the active set, we free all variables and use a gradient projection step. Note: This step may violate complementarity! Complementarity can be immediately restored with a shift.
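For concreteness, a generic gradient projection step on the non-negatively constrained dual objective f takes the form below; the particular step-length rule used by ASVR is not specified on this slide:

```latex
u \;\leftarrow\; \max\!\bigl(u - t\,\nabla f(u),\; 0\bigr), \qquad t > 0.
```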

15 Preserving Complementarity
Suppose there exists an index i at which complementarity is violated. Define a shift for the pair of dual variables at i and redefine them accordingly. Then all terms of the objective function above remain fixed, apart from the last term, which is reduced further. The shift restores complementarity and improves the objective.
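A hedged reconstruction of the shift, consistent with the simplified dual on slide 8: if α_i and β_i are both positive, subtract their common part. Then α − β is unchanged, so the quadratic and linear terms are fixed, while the last term ε eᵀ(α+β) drops by 2εδ_i:

```latex
\delta_i = \min(\alpha_i,\, \beta_i),
\qquad \alpha_i \leftarrow \alpha_i - \delta_i,
\qquad \beta_i \leftarrow \beta_i - \delta_i.
```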

16 Experiments
Compared ASVR and its formulation with the standard formulation via SVMTorch and mySVM; measured generalization error and running time
mySVM experiments were completed only on the smallest dataset, as it ran much more slowly than SVMTorch
Used tuning methods to find appropriate values for C
A synthetic dataset was generated for the largest test
All experiments were run on a 700 MHz Pentium III Xeon with 2 GB of available memory, Red Hat Linux 6.2, and egcs C++
Data was in core for these experiments; the algorithm can easily be extended to larger datasets
Convergence is guaranteed in a finite number of iterations

17 Experiments on Public Datasets
(*) indicates that we stopped tuning early due to long running times. The more we improved generalization error, the longer SVMTorch took to run. ASVR has comparable test error to SVMTorch, and runs dramatically faster on the larger examples.

18 Experiment on Massive Dataset
SVMTorch did not terminate after several hours on this dataset, under a variety of parameter settings.

19 Conclusions
Conclusions:
ASVR is an active set method that requires no external optimization tools apart from a linear equation solver
It performs competitively with other well-known SVR tools (linear kernels)
Only a single matrix inversion in n+1 dimensions (where n is usually small) is required
Future work:
Out-of-core implementation
Parallel processing of data
Kernel implementation
Integrating reduced SVM or other methods for reducing the number of columns in the kernel matrix

