Day 17: Duality and Nonlinear SVM Kristin P. Bennett Mathematical Sciences Department Rensselaer Polytechnic Institute.


2 Day 17: Duality and Nonlinear SVM Kristin P. Bennett Mathematical Sciences Department Rensselaer Polytechnic Institute

3 Best Linear Separator: Supporting Plane Method. Maximize the distance between two parallel supporting planes. This distance is the "margin."
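The margin formula on this slide is cut off (the equation was an image in the original deck); under the standard supporting-plane notation it is presumably:

```latex
w \cdot x = b + 1, \qquad w \cdot x = b - 1, \qquad
\text{Margin} = \frac{2}{\|w\|}
```

Maximizing the margin is then equivalent to minimizing \(\|w\|^2\) subject to the two planes supporting their classes.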

4 Soft Margin SVM Just add non-negative error vector z.
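The slide's formula is an image; the standard soft-margin primal that "just adds a non-negative error vector z" is presumably:

```latex
\min_{w,\,b,\,z}\;\; \frac{1}{2}\|w\|^2 + C \sum_i z_i
\quad \text{s.t.} \quad y_i\,(w \cdot x_i - b) \ge 1 - z_i,\;\; z_i \ge 0
```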

5 Method 2: Find Closest Points in Convex Hulls (the closest points c and d).

6 Plane Bisects Closest Points c and d.

7 Find using quadratic program Many existing and new QP solvers.
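As an illustration of the closest-points QP on slides 5–7, here is a sketch that solves it with a general-purpose solver. The toy data, the parameterization by convex-combination weights, and the use of scipy's SLSQP are my own choices, not from the slides:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy data: two linearly separable classes.
A = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # class +1
B = np.array([[3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])  # class -1
nA, nB = len(A), len(B)

def objective(u):
    # c and d are convex combinations of the points in each class.
    c = u[:nA] @ A
    d = u[nA:] @ B
    return np.sum((c - d) ** 2)  # squared distance between the hulls

# Convex-combination constraints: weights are nonnegative and sum to one.
cons = [{"type": "eq", "fun": lambda u: np.sum(u[:nA]) - 1.0},
        {"type": "eq", "fun": lambda u: np.sum(u[nA:]) - 1.0}]
bounds = [(0.0, 1.0)] * (nA + nB)
u0 = np.full(nA + nB, 1.0 / 3.0)  # feasible starting point

res = minimize(objective, u0, bounds=bounds, constraints=cons)
c, d = res.x[:nA] @ A, res.x[nA:] @ B
w = c - d              # normal of the separating plane
mid = (c + d) / 2.0    # the plane bisects the segment from d to c
```

The separating plane is then {x : w·x = w·mid}, bisecting the closest points as on slide 6.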

8 Dual of Closest Points Method is Support Plane Method Solution only depends on support vectors:

9 One bad example? Convex Hulls Intersect! Same argument won’t work.

10 Don’t trust a single point! Each point must depend on at least two actual data points.

11 Depend on >= two points Each point must depend on at least two actual data points.

15 Final Reduced/Robust Set. Each point must depend on at least two actual data points. This is called the Reduced Convex Hull.

16 Reduced Convex Hulls Don't Intersect. Reduce each hull by adding an upper bound D on the weights.
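In symbols (a hedged reconstruction; the slides give the formula as an image), the reduced convex hull of points \(x_i\) with bound \(D\) is:

```latex
\mathrm{RCH}(D) = \Big\{ \textstyle\sum_i \alpha_i x_i \;:\; \sum_i \alpha_i = 1,\;\; 0 \le \alpha_i \le D \Big\}
```

With \(D \le 1/2\), every point of the reduced hull must combine at least two data points (more generally, at least \(\lceil 1/D \rceil\)), which is exactly the "depend on at least two points" requirement on slides 10–15.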

17 Find Closest Points Then Bisect. No change except for D. D determines the number of support vectors.

18 Dual of Closest Points Method is Soft Margin Method Solution only depends on support vectors:

19 What will linear SVM do?

20 Linear SVM Fails

21 High Dimensional Mapping trick http://www.slideshare.net/ankitksharma/svm-37753690


23 Nonlinear Classification: Map to a Higher Dimensional Space. IDEA: map each point to a higher dimensional feature space and construct a linear discriminant in that space. The Dual SVM becomes:

24 Kernel Calculates Inner Product
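Slide 24's claim that the kernel calculates the inner product in feature space can be checked numerically. The degree-2 feature map and homogeneous polynomial kernel below are a standard illustrative pair, not taken from the slides:

```python
import numpy as np

def phi(x):
    # Explicit degree-2 feature map for 2-D input: R^2 -> R^3.
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2.0) * x1 * x2, x2 ** 2])

def kernel(x, y):
    # Homogeneous polynomial kernel of degree 2.
    return (x @ y) ** 2

rng = np.random.default_rng(0)
x, y = rng.normal(size=2), rng.normal(size=2)

# K(x, y) equals the inner product of the mapped points,
# without ever forming phi explicitly in the SVM.
lhs = phi(x) @ phi(y)
rhs = kernel(x, y)
```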

25 Final Classification via Kernels The Dual SVM becomes:

26 Generalized Inner Product by Hilbert-Schmidt Kernels (Courant and Hilbert 1953), for certain Φ and K, e.g. There are also kernels for nonvector data like strings, histograms, DNA, …

27 Final SVM Algorithm: solve the dual SVM QP, recover the primal variable b, classify new x. Solution only depends on support vectors:
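The recipe on this slide — solve the dual QP, recover b, classify via support vectors — can be sketched with scikit-learn, which solves the dual internally. The data set, the gamma value, and the reconstruction from SVC's fitted attributes are my own illustration:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=40, centers=2, random_state=0)
gamma = 0.5
clf = SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X, y)

def rbf(a, b):
    # RBF kernel K(a, b) = exp(-gamma * ||a - b||^2).
    return np.exp(-gamma * np.sum((a - b) ** 2))

# f(x) = sum over support vectors of (alpha_i * y_i) K(x_i, x) + b;
# sklearn stores the products alpha_i * y_i in dual_coef_.
x_new = X[0]
f = sum(coef * rbf(sv, x_new)
        for coef, sv in zip(clf.dual_coef_[0], clf.support_vectors_)) \
    + clf.intercept_[0]
```

The classification of x_new is then sign(f), depending only on the support vectors, exactly as the slide states.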

28 SVM AMPL DUAL MODEL


30 S5: Recall linear solution

31 RBF results on Sample Data

32 Have to pick parameters: Effect of C

33 Effect of RBF parameter
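Slides 32–33 show that the results depend on C and the RBF parameter. A sketch of scanning both (the data set, grid values, and cross-validation setup are my own choices, not from the slides):

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# Small C / small gamma tend to underfit; very large values of
# either can overfit, so both must be tuned together.
for C in [0.1, 1.0, 100.0]:
    for gamma in [0.01, 1.0, 100.0]:
        acc = cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=5).mean()
        print(f"C={C:<6} gamma={gamma:<6} cv accuracy={acc:.2f}")
```

In practice this grid search (often over logarithmic scales) is how C and the kernel parameter are chosen.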

34 General Kernel Methodology: pick a learning task; start with a linear function and data; define a loss function; define regularization; formulate the optimization problem in dual/inner-product space; construct an appropriate kernel; solve the problem in dual space.

35 Extensions: Many Inference Tasks. Regression; one-class classification, novelty detection; ranking; clustering; multi-task learning; learning kernels; Canonical Correlation Analysis; Principal Component Analysis.

36 Algorithm Types: general-purpose solvers (CPLEX by ILOG, the Matlab optimization toolkit); special-purpose solvers that exploit the structure of the problem. The best linear SVM solvers take time linear in the number of training data points; the best kernel SVM solvers take time quadratic in the number of training data points. Good news: since the problem is convex, the algorithm doesn't really matter as long as it is solvable.

37 Hallelujah! Generalization theory and practice meet General methodology for many types of inference problems Same Program + New Kernel = New method No problems with local minima Few model parameters. Avoids overfitting Robust optimization methods. Applicable to non-vector problems. Easy to use and tune Successful Applications BUT…

38 Catches. Will SVMs beat my best hand-tuned method Z on problem X? Do SVMs scale to massive datasets? How to choose C and the kernel? How to transform data? How to incorporate domain knowledge? How to interpret results? Are linear methods enough?

