Slide 1: 3.6 Support Vector Machines (K. M. Koo)

Slide 2: Goal of SVM - Find the Maximum Margin

Goal: find a separating hyperplane with maximum margin.
Margin: the minimum distance between a separating hyperplane and the sets of points of \(\omega_1\) or \(\omega_2\).

Slide 3: Goal of SVM - Find the Maximum Margin

Assume that the classes \(\omega_1\) and \(\omega_2\) are linearly separable. Among all separating hyperplanes, find the one with maximum margin.

Slide 4: Calculating the margin

For a separating hyperplane \(g(x) = w^T x + w_0 = 0\), \(w\) and \(w_0\) are not uniquely determined: any rescaling \(a w,\ a w_0\) with \(a > 0\) describes the same hyperplane. Under the constraint that \(g(x) = 1\) for the nearest point(s) of \(\omega_1\) and \(g(x) = -1\) for the nearest point(s) of \(\omega_2\), \(w\) and \(w_0\) are uniquely determined.

Slide 5: Calculating the margin

The distance between a point \(x\) and the hyperplane \(g(x) = 0\) is given by
\[ z = \frac{|g(x)|}{\lVert w \rVert}. \]
Thus, with the scaling above, the margin is given by
\[ \frac{1}{\lVert w \rVert} + \frac{1}{\lVert w \rVert} = \frac{2}{\lVert w \rVert}. \]
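A quick numeric check of the distance formula (the numbers are arbitrary, chosen only for illustration):

```python
import numpy as np

# Distance of a point x from the hyperplane g(x) = w^T x + w_0 = 0,
# computed as |g(x)| / ||w||. All numbers are made up.
w, w_0 = np.array([2.0, 1.0]), -1.0
x = np.array([3.0, 0.5])
g = w @ x + w_0
print("distance =", abs(g) / np.linalg.norm(w))   # 5.5 / sqrt(5) ~ 2.46
```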

Slide 6: Optimizing the margin

Maximizing the margin \(2 / \lVert w \rVert\) is equivalent to minimizing \(\lVert w \rVert\).

Slide 7: Optimizing the margin

Therefore, the separating hyperplane with maximal margin is the separating hyperplane with minimum \(\lVert w \rVert\):
\[ \min_{w,\, w_0}\ J(w) = \tfrac{1}{2}\lVert w \rVert^2 \quad \text{subject to}\quad y_i(w^T x_i + w_0) \ge 1,\ \ i = 1, \dots, N, \]
where \(y_i = +1\) for \(x_i \in \omega_1\) and \(y_i = -1\) for \(x_i \in \omega_2\). This is an optimization problem with inequality constraints.
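A minimal numeric sketch of this primal problem (the toy data, variable names, and solver choice are my own, not the slides'), using SciPy's SLSQP solver, which accepts inequality constraints directly:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical linearly separable toy data.
X = np.array([[2.0, 2.0], [2.0, 0.0], [-1.0, 0.0], [-1.0, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# Decision variables packed as theta = [w_1, w_2, w_0].
def cost(theta):
    w = theta[:2]
    return 0.5 * np.dot(w, w)                 # J(w) = (1/2)||w||^2

# One inequality constraint per sample: y_i (w^T x_i + w_0) - 1 >= 0.
cons = [{"type": "ineq",
         "fun": lambda t, xi=xi, yi=yi: yi * (t[:2] @ xi + t[2]) - 1.0}
        for xi, yi in zip(X, y)]

res = minimize(cost, x0=np.zeros(3), constraints=cons, method="SLSQP")
w, w_0 = res.x[:2], res.x[2]
print("w =", w, " w_0 =", w_0, " margin =", 2.0 / np.linalg.norm(w))
```

For this toy set the solver finds \(w \approx (2/3,\ 0)\) and \(w_0 \approx -1/3\), i.e. a margin of 3.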

Slide 8: Optimization with constraints (illustration: a cost function and its constraint, comparing the minimum value under an equality constraint with that under an inequality constraint).

Slide 9: Lagrange Multipliers

An optimization problem under constraints can be solved by the method of Lagrange multipliers. Let \(f, g\colon \mathbb{R}^n \to \mathbb{R}\) be real-valued \(C^1\) functions, let \(c \in \mathbb{R}\), and let \(S\) be the level set for \(g\) with value \(c\), i.e. \(S = \{x : g(x) = c\}\). Assume \(\nabla g(x_0) \neq 0\). If \(f\) restricted to \(S\) has a local minimum or maximum on \(S\) at \(x_0\), which is called a critical point of \(f\) on \(S\), then there is a real number \(\lambda\), called a Lagrange multiplier, such that
\[ \nabla f(x_0) = \lambda\, \nabla g(x_0). \]

Slide 10: The Method of Lagrange Multipliers

To find the extrema of \(f\) subject to \(g(x) = c\), solve the system \(\nabla f(x) = \lambda \nabla g(x)\), \(g(x) = c\), for \(x\) and \(\lambda\).

Slide 11: Lagrange Multipliers

The Lagrangian is obtained as follows.
For equality constraints \(g_i(x) = 0\):
\[ L(x, \lambda) = f(x) + \sum_i \lambda_i\, g_i(x). \]
For inequality constraints \(g_i(x) \ge 0\):
\[ L(x, \lambda) = f(x) - \sum_i \lambda_i\, g_i(x), \quad \lambda_i \ge 0. \]
In our case the constraints are inequality constraints: \(y_i(w^T x_i + w_0) - 1 \ge 0\).

Slide 12: Convexity

A subset \(S \subseteq \mathbb{R}^n\) is convex iff for any \(x, y \in S\), the line segment joining \(x\) and \(y\) is also a subset of \(S\), i.e. for any \(t \in [0, 1]\),
\[ t\,x + (1 - t)\,y \in S. \]
A real-valued function \(f\) on \(S\) is convex iff for any two points \(x, y \in S\) and for any \(t \in [0, 1]\),
\[ f(t\,x + (1 - t)\,y) \le t\,f(x) + (1 - t)\,f(y). \]
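A quick numeric spot-check (not a proof) of this convexity inequality for the SVM cost \(f(w) = \tfrac{1}{2}\lVert w \rVert^2\), on random points of my choosing:

```python
import numpy as np

# Spot-check f(t*x + (1-t)*y) <= t*f(x) + (1-t)*f(y) for the SVM cost.
rng = np.random.default_rng(0)
f = lambda w: 0.5 * np.dot(w, w)

for _ in range(1000):
    x, y = rng.normal(size=2), rng.normal(size=2)
    t = rng.uniform()
    assert f(t * x + (1 - t) * y) <= t * f(x) + (1 - t) * f(y) + 1e-12
print("convexity inequality held on all random samples")
```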

Slide 13: Convexity (illustrations: a convex set and a non-convex set; a convex function, a concave function, and a function that is neither convex nor concave).

Slide 14: Convex Optimization

An optimization problem is said to be convex iff the cost function as well as the constraints are convex. The optimization problem for SVM is convex. The solution to a convex problem, if it exists, is unique; that is, there is no local optimum. For a convex optimization problem, the KKT (Karush-Kuhn-Tucker) conditions are necessary and sufficient for the solution.

Slide 15: KKT (Karush-Kuhn-Tucker) conditions

1. The gradient of the Lagrangian with respect to the original variables is 0.
2. The original constraints are satisfied.
3. The multipliers of the inequality constraints are nonnegative.
4. (Complementary KKT condition) The product of each multiplier and its constraint equals 0.

For convex optimization problems, conditions 1-4 are necessary and sufficient for the solution.

Slide 16: KKT conditions for the optimization of the margin

Recall the problem: minimize \(J(w) = \tfrac{1}{2}\lVert w \rVert^2\) subject to \(y_i(w^T x_i + w_0) \ge 1\). The KKT conditions read:
\[ \frac{\partial}{\partial w} L(w, w_0, \lambda) = 0 \tag{3.62} \]
\[ \frac{\partial}{\partial w_0} L(w, w_0, \lambda) = 0 \tag{3.63} \]
\[ \lambda_i \ge 0, \quad i = 1, \dots, N \tag{3.64} \]
\[ \lambda_i\,[\,y_i(w^T x_i + w_0) - 1\,] = 0, \quad i = 1, \dots, N \tag{3.65} \]
where the Lagrangian is
\[ L(w, w_0, \lambda) = \tfrac{1}{2} w^T w - \sum_{i=1}^{N} \lambda_i\,[\,y_i(w^T x_i + w_0) - 1\,]. \tag{3.66} \]

Slide 17: KKT conditions for the optimization of the margin

Combining (3.66) with (3.62) and (3.63) gives
\[ w = \sum_{i=1}^{N} \lambda_i y_i x_i \tag{3.67} \]
\[ \sum_{i=1}^{N} \lambda_i y_i = 0. \tag{3.68} \]

Slide 18: Remarks - support vectors

The optimal solution \(w\) is a linear combination of the feature vectors that are associated with \(\lambda_i \neq 0\); these are called support vectors. Support vectors are associated with the active constraints
\[ y_i(w^T x_i + w_0) = 1, \]
i.e. they lie on one of the two hyperplanes bordering the margin, \(w^T x + w_0 = \pm 1\).

Slide 19: Remarks - support vectors

The resulting hyperplane classifier is insensitive to the number and position of the non-support vectors; a short sketch illustrating this follows.
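A short scikit-learn sketch (toy data of my own; `SVC` with a linear kernel and a large C to approximate the hard margin) showing that \(w\) is built from the support vectors alone:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical separable toy data; the third and sixth points are far
# from the boundary and should not become support vectors.
X = np.array([[2.0, 2.0], [2.0, 0.0], [5.0, 5.0],
              [-1.0, 0.0], [-1.0, -2.0], [-5.0, -5.0]])
y = np.array([1, 1, 1, -1, -1, -1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)   # large C ~ hard margin

# dual_coef_ holds lambda_i * y_i for the support vectors only, so
# w = sum_i lambda_i y_i x_i can be rebuilt from them alone.
print("support vectors:\n", clf.support_vectors_)
print("w from support vectors:", clf.dual_coef_ @ clf.support_vectors_)
print("w from sklearn:        ", clf.coef_)
```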

Slide 20: Remarks - computation of \(w_0\)

\(w_0\) can be implicitly obtained from any single condition satisfying strict complementarity (i.e. with \(\lambda_i \neq 0\)):
\[ \lambda_i\,[\,y_i(w^T x_i + w_0) - 1\,] = 0,\ \ \lambda_i \neq 0 \;\Rightarrow\; w_0 = y_i - w^T x_i. \]
In practice, \(w_0\) is computed as an average value obtained using all conditions of this type, i.e. averaging over the support vectors.

Slide 21: Remark - the optimal hyperplane is unique

The optimal hyperplane classifier of a support vector machine is unique, under two conditions:
1. the cost function is convex;
2. the inequality constraints consist of linear functions, so they are convex as well.

(Recall: an optimization problem is said to be convex iff the target (or cost) function as well as the constraints are convex; the optimization problem for SVM is convex, and the solution to a convex problem, if it exists, is unique, i.e. there is no local optimum.)

Slide 22: Computing the optimal Lagrange multipliers

The optimization problem belongs to the convex programming family of problems (it is a convex optimization problem). It can be solved by considering the so-called Lagrangian duality, and it can be stated equivalently by its Wolfe dual representation form.

Slide 23: Wolfe dual representation form

\[ \max_{w,\, w_0,\, \lambda}\ L(w, w_0, \lambda) \quad \text{subject to}\quad \frac{\partial L}{\partial w} = 0,\ \ \frac{\partial L}{\partial w_0} = 0,\ \ \lambda \ge 0. \]
Substituting (3.67) and (3.68) eliminates \(w\) and \(w_0\) and leaves a quadratic program in \(\lambda\) alone:
\[ \max_{\lambda \ge 0}\ \Big[\sum_{i=1}^{N} \lambda_i - \tfrac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} \lambda_i \lambda_j y_i y_j x_i^T x_j\Big] \quad \text{subject to}\quad \sum_{i=1}^{N} \lambda_i y_i = 0. \]

Slide 24: Computing the optimal Lagrange multipliers

Once the optimal Lagrange multipliers have been computed, the optimal hyperplane is obtained:
\[ w = \sum_{i=1}^{N} \lambda_i y_i x_i \tag{3.75} \]
\[ \lambda_i\,[\,y_i(w^T x_i + w_0) - 1\,] = 0,\ \ \lambda_i \neq 0 \;\Rightarrow\; w_0. \tag{3.76} \]
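A numeric sketch of this whole pipeline with SciPy (toy data and tolerances are my own): solve the dual quadratic program for \(\lambda\), then recover \(w\) via (3.75) and \(w_0\) by averaging over the support vectors as in slide 20:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical separable toy data (same as in the primal sketch above).
X = np.array([[2.0, 2.0], [2.0, 0.0], [-1.0, 0.0], [-1.0, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
N = len(y)
K = (y[:, None] * X) @ (y[:, None] * X).T     # K_ij = y_i y_j x_i^T x_j

# Minimize the negative of the dual objective.
neg_dual = lambda lam: 0.5 * lam @ K @ lam - lam.sum()

res = minimize(neg_dual, np.zeros(N), method="SLSQP",
               bounds=[(0.0, None)] * N,                 # lambda_i >= 0
               constraints={"type": "eq", "fun": lambda l: l @ y})
lam = res.x
w = (lam * y) @ X                                        # eq. (3.75)
sv = lam > 1e-6                                          # support vectors
w_0 = np.mean(y[sv] - X[sv] @ w)                         # slide 20 average
print("lambda =", lam.round(4), "\nw =", w, " w_0 =", w_0)
```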

Slide 25: Remarks

The cost function does not depend explicitly on the dimensionality of the input space; the training vectors enter only through the inner products \(x_i^T x_j\). This allows for efficient generalizations to the case of nonlinearly separable classes. Although the resulting optimal hyperplane is unique, there is no guarantee about the uniqueness of the Lagrange multipliers.

Slide 26: Simple example

Consider a two-class classification task that consists of a small set of labelled points, write down its Lagrangian function, and check the KKT conditions.

Slide 27: Simple example (continued)

Via Lagrangian duality, optimize subject to the equality constraint \(\sum_i \lambda_i y_i = 0\). Result: more than one solution for the Lagrange multipliers; a sketch reproducing this non-uniqueness follows.
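The slide's own points are not recoverable here, so the sketch below uses four symmetric points of my choosing; it verifies that two different multiplier vectors both satisfy the KKT conditions yet yield one and the same hyperplane:

```python
import numpy as np

# Assumed symmetric toy points: class +1: (1,1), (1,-1);
# class -1: (-1,1), (-1,-1). The optimal hyperplane is x_1 = 0.
X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]], dtype=float)
y = np.array([1, 1, -1, -1], dtype=float)

def check(lam):
    """Verify the KKT conditions and return (w, w_0) for multipliers lam."""
    assert np.all(lam >= 0)                       # (3.64)
    assert abs(lam @ y) < 1e-12                   # (3.68)
    w = (lam * y) @ X                             # (3.67)
    w_0 = 1.0 - w @ X[0]        # constraint active at x_1 for this set
    margins = y * (X @ w + w_0)
    assert np.all(margins >= 1 - 1e-12)           # original constraints
    assert np.allclose(lam * (margins - 1), 0)    # complementary slackness
    return w, w_0

# Two different multiplier vectors, one and the same optimal hyperplane.
for lam in (np.array([0.25, 0.25, 0.25, 0.25]),
            np.array([0.40, 0.10, 0.40, 0.10])):
    print("lambda =", lam, "-> (w, w_0) =", check(lam))
```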

Slide 28: SVM for Non-separable Classes

In the non-separable case, the training feature vectors belong to one of the following three categories:
1. vectors that fall outside the band and are correctly classified: \(y_i(w^T x_i + w_0) \ge 1\);
2. vectors that fall inside the band and are correctly classified: \(0 \le y_i(w^T x_i + w_0) < 1\);
3. vectors that are misclassified: \(y_i(w^T x_i + w_0) < 0\).

Slide 29: SVM for Non-separable Classes

All three cases can be treated under a single type of constraint by introducing slack variables \(\xi_i\):
\[ y_i(w^T x_i + w_0) \ge 1 - \xi_i, \quad \xi_i \ge 0. \]
The first category corresponds to \(\xi_i = 0\), the second to \(0 < \xi_i \le 1\), and the third to \(\xi_i > 1\).

Slide 30: SVM for Non-separable Classes

The goal is to make the margin as large as possible while keeping the number of points with \(\xi_i > 0\) as small as possible:
\[ J(w, w_0, \xi) = \tfrac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{N} I(\xi_i), \qquad I(\xi_i) = \begin{cases} 1, & \xi_i > 0, \\ 0, & \xi_i = 0. \end{cases} \tag{3.79} \]
(3.79) is intractable because \(I(\cdot)\) is a discontinuous function.

Slide 31: SVM for Non-separable Classes

As is common in such cases, we choose to optimize a closely related cost function:
\[ J(w, w_0, \xi) = \tfrac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{N} \xi_i \quad \text{subject to}\quad y_i(w^T x_i + w_0) \ge 1 - \xi_i,\ \ \xi_i \ge 0. \]
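At the optimum the slacks take the value \(\xi_i = \max(0,\ 1 - y_i(w^T x_i + w_0))\), so the same cost can be minimized directly as a hinge-loss objective. A small subgradient-descent sketch of that reformulation (the overlapping toy data, step size, and iteration count are my own; the result is approximate):

```python
import numpy as np

# Hypothetical overlapping data: one point of each class sits on the
# wrong side, so the classes are not linearly separable.
X = np.array([[1.0, 1.0], [1.0, -1.0], [-0.5, 0.0],
              [-1.0, 1.0], [-1.0, -1.0], [0.5, 0.0]])
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
C, lr = 1.0, 0.01
w, w_0 = np.zeros(2), 0.0

for _ in range(5000):
    margins = y * (X @ w + w_0)
    viol = margins < 1                 # points with positive slack
    # Subgradient of (1/2)||w||^2 + C * sum_i max(0, 1 - margin_i).
    gw = w - C * (y[viol, None] * X[viol]).sum(axis=0)
    gw0 = -C * y[viol].sum()
    w, w_0 = w - lr * gw, w_0 - lr * gw0

print("w =", w.round(3), " w_0 =", round(w_0, 3))
print("slacks =", np.maximum(0, 1 - y * (X @ w + w_0)).round(2))
```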

Slide 32: SVM for Non-separable Classes

This leads to the Lagrangian
\[ L(w, w_0, \xi, \lambda, \mu) = \tfrac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{N} \xi_i - \sum_{i=1}^{N} \mu_i \xi_i - \sum_{i=1}^{N} \lambda_i\,[\,y_i(w^T x_i + w_0) - 1 + \xi_i\,]. \]

Slide 33: SVM for Non-separable Classes

The corresponding KKT conditions:
\[ \frac{\partial L}{\partial w} = 0 \;\Rightarrow\; w = \sum_{i=1}^{N} \lambda_i y_i x_i \tag{3.85} \]
\[ \frac{\partial L}{\partial w_0} = 0 \;\Rightarrow\; \sum_{i=1}^{N} \lambda_i y_i = 0 \tag{3.86} \]
\[ \frac{\partial L}{\partial \xi_i} = 0 \;\Rightarrow\; C - \mu_i - \lambda_i = 0 \tag{3.87} \]
\[ \lambda_i\,[\,y_i(w^T x_i + w_0) - 1 + \xi_i\,] = 0 \tag{3.88} \]
\[ \mu_i \xi_i = 0 \tag{3.89} \]
\[ \mu_i \ge 0,\ \ \lambda_i \ge 0, \quad i = 1, \dots, N \tag{3.90} \]

Slide 34: SVM for Non-separable Classes

The associated Wolfe dual representation now becomes
\[ \max_{w,\, w_0,\, \xi,\, \lambda,\, \mu}\ L(w, w_0, \xi, \lambda, \mu) \quad \text{subject to}\quad w = \sum_{i=1}^{N} \lambda_i y_i x_i,\ \ \sum_{i=1}^{N} \lambda_i y_i = 0,\ \ C - \mu_i - \lambda_i = 0,\ \ \lambda_i, \mu_i \ge 0. \]

Slide 35: SVM for Non-separable Classes

This is equivalent to
\[ \max_{\lambda}\ \Big[\sum_{i=1}^{N} \lambda_i - \tfrac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} \lambda_i \lambda_j y_i y_j x_i^T x_j\Big] \quad \text{subject to}\quad 0 \le \lambda_i \le C,\ \ \sum_{i=1}^{N} \lambda_i y_i = 0. \]
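The only change from the separable dual is the box constraint \(0 \le \lambda_i \le C\), so the earlier SciPy dual sketch carries over with new bounds (same hypothetical overlapping toy set as above; the two points on the wrong side end up at the bound \(\lambda_i = C\)):

```python
import numpy as np
from scipy.optimize import minimize

X = np.array([[1.0, 1.0], [1.0, -1.0], [-0.5, 0.0],
              [-1.0, 1.0], [-1.0, -1.0], [0.5, 0.0]])
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
C, N = 1.0, len(y)
K = (y[:, None] * X) @ (y[:, None] * X).T

res = minimize(lambda lam: 0.5 * lam @ K @ lam - lam.sum(), np.zeros(N),
               method="SLSQP",
               bounds=[(0.0, C)] * N,               # 0 <= lambda_i <= C
               constraints={"type": "eq", "fun": lambda l: l @ y})
lam = res.x
w = (lam * y) @ X                                   # eq. (3.85)
# w_0 from the margin support vectors (0 < lambda_i < C, hence xi_i = 0);
# this toy set is chosen so that such multipliers exist.
on_margin = (lam > 1e-6) & (lam < C - 1e-6)
w_0 = np.mean(y[on_margin] - X[on_margin] @ w)
print("lambda =", lam.round(3), "\nw =", w.round(3), " w_0 =", round(w_0, 3))
```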

Slide 36: Remarks - differences from the linearly separable case

The Lagrange multipliers \(\lambda_i\) now need to be bounded above by \(C\). The slack variables \(\xi_i\) and their associated Lagrange multipliers \(\mu_i\) do not enter into the problem explicitly; they are reflected indirectly through \(C\).

Slide 37: Remarks - the M-class problem

SVM for the M-class problem: design M separating hyperplanes \(g_m(x) = w_m^T x + w_{m0}\) so that hyperplane \(m\) separates class \(\omega_m\) from all the others (\(g_m(x) > 0\) for \(x \in \omega_m\), \(g_m(x) < 0\) otherwise), and assign a new \(x\) to the class \(\omega_m\) whose \(g_m(x)\) is largest. A sketch of this one-vs-rest scheme follows.
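A compact sketch of the one-vs-rest scheme (synthetic 3-class data of my own, with scikit-learn's linear `SVC` as the per-class binary classifier):

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic 3-class toy data: one Gaussian blob per class.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 3.0], [3.0, -2.0], [-3.0, -2.0]])
X = np.vstack([c + rng.normal(scale=0.8, size=(20, 2)) for c in centers])
t = np.repeat([0, 1, 2], 20)

# One hyperplane per class: class m against all the others.
models = [SVC(kernel="linear", C=1.0).fit(X, (t == m).astype(int))
          for m in range(3)]

# g_m(x) = w_m^T x + w_m0; assign x to the class with the largest g_m(x).
g = np.stack([m.decision_function(X) for m in models], axis=1)
pred = g.argmax(axis=1)
print("training accuracy:", (pred == t).mean())
```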

