Exploiting Duality (Particularly the dual of SVM) M. Pawan Kumar VISUAL GEOMETRY GROUP.

1 Exploiting Duality (Particularly the dual of SVM) M. Pawan Kumar VISUAL GEOMETRY GROUP

2 PART I: General duality theory (Basics of Mathematical Optimization; The algebra; The geometry; Examples). PART II: Solving the SVM dual (General Decomposition Algorithm; Good Working Set; Implementation Details).

3 Mathematical Optimization: $\min_x f_0(x)$ s.t. $f_i(x) \le 0$, $h_i(x) = 0$, with objective function $f_0$, inequality constraints $f_i(x) \le 0$ and equality constraints $h_i(x) = 0$. $x$ is a feasible point $\Leftrightarrow$ $f_i(x) \le 0$, $h_i(x) = 0$. $x$ is a strictly feasible point $\Leftrightarrow$ $f_i(x) < 0$, $h_i(x) = 0$. Feasible region: the set of all feasible points.

4 Convex Optimization: $\min_x f_0(x)$ s.t. $f_i(x) \le 0$, $h_i(x) = 0$ (objective function, inequality constraints, equality constraints), where the objective function is convex and the feasible region is convex. Convex set? Convex function?

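5 Convex Set: the line segment with endpoints $x_1$ and $x_2$ is the set of points $c x_1 + (1 - c) x_2$, $c \in [0,1]$.

6 Convex Set: a set is convex if, for all line segments with endpoints $x_1$, $x_2$ in the set, all points on the line segment lie within the set.

7 Non-Convex Set [figure: a non-convex set with points $x_1$ and $x_2$]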

8 Examples of Convex Sets: the line segment between $x_1$ and $x_2$.

9 Examples of Convex Sets: the line through $x_1$ and $x_2$.

10 Examples of Convex Sets: hyperplane $a^T x - b = 0$.

11 Examples of Convex Sets: halfspace $a^T x - b \le 0$.

12 Examples of Convex Sets: second-order cone $\|x\| \le t$ [figure axes: $x_1$, $x_2$, $t$].

13 Operations that Preserve Convexity: intersection (example: polyhedron / polytope).

14 Operations that Preserve Convexity: intersection.

15 Operations that Preserve Convexity: affine transformation $x \mapsto Ax + b$.

16 Convex Function [figure: plot of $f(x)$ against $x$ with points $x_1$ and $x_2$ marked]: the blue point always lies above the red point.

17 Convex Function: $f(c x_1 + (1 - c) x_2) \le c f(x_1) + (1 - c) f(x_2)$ for all $x_1$, $x_2$ in the domain and $c \in [0,1]$. The domain of $f(\cdot)$ has to be convex.
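
The inequality on this slide can be probed numerically. The following is a minimal sketch, not part of the original slides: the helper name `violates_convexity`, the sampling interval and the tolerance are all assumptions made for illustration. A random search can only disprove convexity (by finding a counterexample); finding none is evidence, not proof.

```python
import random


def violates_convexity(f, lo, hi, trials=10000, tol=1e-9, seed=0):
    """Search for x1, x2, c that break f(c x1 + (1-c) x2) <= c f(x1) + (1-c) f(x2).

    Returns a counterexample (x1, x2, c) if one is found, otherwise None.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x1, x2 = rng.uniform(lo, hi), rng.uniform(lo, hi)
        c = rng.random()  # c in [0, 1)
        lhs = f(c * x1 + (1 - c) * x2)
        rhs = c * f(x1) + (1 - c) * f(x2)
        if lhs > rhs + tol:
            return (x1, x2, c)
    return None


convex_result = violates_convexity(lambda x: x * x, -5.0, 5.0)    # x^2 is convex
concave_result = violates_convexity(lambda x: -x * x, -5.0, 5.0)  # -x^2 is concave
print("x^2 counterexample:", convex_result)
print("-x^2 counterexample found:", concave_result is not None)
```

The interval `[lo, hi]` plays the role of the convex domain required by the slide.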

18 Convex Function: $f(c x_1 + (1 - c) x_2) \le c f(x_1) + (1 - c) f(x_2)$. If $f(\cdot)$ is convex, then $-f(\cdot)$ is concave.

19 Convex Function: for once-differentiable functions, $f(y) + \nabla f(y)^T (x - y) \le f(x)$, i.e. the tangent at $(y, f(y))$ lies below the graph of $f$. For twice-differentiable functions, $\nabla^2 f(x) \succeq 0$.

20 Convex Function and Convex Sets: the epigraph of a convex function is a convex set.

21 Examples of Convex Functions: linear function $a^T x$; p-norm functions $(|x_1|^p + |x_2|^p + \dots + |x_n|^p)^{1/p}$, $p \ge 1$; quadratic functions $x^T Q x$, $Q \succeq 0$.

22 Operations that Preserve Convexity: non-negative weighted sum $w_1 f_1(x) + w_2 f_2(x) + \dots$, $w_i \ge 0$. Example: $x^T Q x + a^T x + b$, $Q \succeq 0$.

23 Operations that Preserve Convexity: pointwise maximum $\max\{f_1(x), f_2(x), \dots\}$. Correspondingly, the pointwise minimum of concave functions is concave.

24 Convex Optimization: $\min_x f_0(x)$ s.t. $f_i(x) \le 0$, $h_i(x) = 0$ (objective function, inequality constraints, equality constraints). The objective function is convex; the feasible region is convex.

25 PART I: General duality theory (Basics of Mathematical Optimization; The algebra; The geometry; Examples). PART II: Solving the SVM dual (General Decomposition Algorithm; Good Working Set; Implementation Details).

26 Lagrangian: for $\min_x f_0(x)$ s.t. $f_i(x) \le 0$, $h_i(x) = 0$, the Lagrangian is $L(x, \lambda, \nu) = f_0(x) + \sum_i \lambda_i f_i(x) + \sum_i \nu_i h_i(x)$, with $\lambda_i \ge 0$.

27 Lagrangian Dual: $g(\lambda, \nu) = \min_{x \in D} L(x, \lambda, \nu)$, where $L(x, \lambda, \nu) = f_0(x) + \sum_i \lambda_i f_i(x) + \sum_i \nu_i h_i(x)$, $\lambda_i \ge 0$, and $x$ belongs to $D$, the intersection of the domains of $f_0$, $f_i$ and $h_i$.

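28 Lagrangian Dual: $g(\lambda, \nu) = \min_x \left( f_0(x) + \sum_i \lambda_i f_i(x) + \sum_i \nu_i h_i(x) \right)$, $\lambda_i \ge 0$. This is a pointwise minimum of functions that are affine (hence concave) in $(\lambda, \nu)$, so the dual function is concave.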

29 Lagrangian Dual: $p^* = \min_x f_0(x)$ s.t. $f_i(x) \le 0$, $h_i(x) = 0$, and $p^* \ge g(\lambda, \nu) = \min_x \left( f_0(x) + \sum_i \lambda_i f_i(x) + \sum_i \nu_i h_i(x) \right)$ for all $(\lambda, \nu)$ with $\lambda_i \ge 0$.

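30 The Dual Problem: the lower bound $g(\lambda, \nu)$ could be far from $p^*$. Best lower bound? $d^* = \max_{\lambda \ge 0, \nu} \min_x \left( f_0(x) + \sum_i \lambda_i f_i(x) + \sum_i \nu_i h_i(x) \right)$, where the inner minimization over $x$ is easy to obtain. Duality gap: $p^* - d^* \ge 0$.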

31 The Geometric Interpretation: $G = \{ (u, v, t) = (f_i(x), h_i(x), f_0(x)) : x \in D \}$. [Figure: the set $G$ drawn in the $(u, t)$ plane, with $p^*$ marked on the $t$ axis.]

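32 The Geometric Interpretation: $(\lambda, \nu, 1)^T (u, v, t) \ge g(\lambda, \nu)$ for all $(u, v, t) \in G$. [Figure: the set $G$ in the $(u, t)$ plane, with $p^*$, $g(\lambda)$ and $d^*$ marked on the $t$ axis.]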

33 The Duality Gap: $p^* = \min_x f_0(x)$ s.t. $f_i(x) \le 0$, $h_i(x) = 0$, and $d^* = \max_{\lambda \ge 0, \nu} \min_x \left( f_0(x) + \sum_i \lambda_i f_i(x) + \sum_i \nu_i h_i(x) \right)$, with $p^* \ge d^*$.

34 The Duality Gap: $p^* - d^*$ is the duality gap. Weak duality: $p^* - d^* \ge 0$. Strong duality: $p^* - d^* = 0$.
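
A tiny worked example may make weak and strong duality concrete. The sketch below is an illustration that is not in the original slides: the toy problem $\min x^2$ s.t. $1 - x \le 0$, the grid over $\lambda$ and all names are assumptions. For this problem the Lagrangian $x^2 + \lambda (1 - x)$ is minimized at $x = \lambda / 2$, so the dual function has a closed form.

```python
def f0(x):
    return x * x          # objective


def f1(x):
    return 1.0 - x        # inequality constraint f1(x) <= 0, i.e. x >= 1


def g(lam):
    """Dual function: minimize L(x, lam) = f0(x) + lam * f1(x) over x.

    Setting dL/dx = 2x - lam = 0 gives the minimizer x = lam / 2.
    """
    x = lam / 2.0
    return f0(x) + lam * f1(x)


p_star = f0(1.0)  # the primal optimum is attained at x = 1

# Evaluate the dual function on a grid of multipliers lam >= 0.
lams = [k / 100.0 for k in range(0, 501)]
d_star, best_lam = max((g(lam), lam) for lam in lams)

print("p* =", p_star)
print("d* =", d_star, "attained at lambda =", best_lam)
print("weak duality holds on grid:", all(g(lam) <= p_star + 1e-12 for lam in lams))
```

Every grid value of `g` is a lower bound on `p_star`, and the best bound closes the gap, which is what strong duality predicts for this convex problem with a strictly feasible point (for example $x = 2$).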

35 Slater's Condition: the problem is convex and there exists a strictly feasible point. This is sufficient for strong duality, and it is taken care of by most solvers.

36 At Strong Duality: $f_0(x^*) = g(\lambda^*, \nu^*) = \min_x \left( f_0(x) + \sum_i \lambda_i^* f_i(x) + \sum_i \nu_i^* h_i(x) \right) \le f_0(x^*) + \sum_i \lambda_i^* f_i(x^*) + \sum_i \nu_i^* h_i(x^*) \le f_0(x^*)$. The inequalities hold with equality, so $x^*$ minimizes the Lagrangian at $(\lambda^*, \nu^*)$.

37 At Strong Duality: $f_0(x^*) = g(\lambda^*, \nu^*) = \min_x \left( f_0(x) + \sum_i \lambda_i^* f_i(x) + \sum_i \nu_i^* h_i(x) \right) \le f_0(x^*) + \sum_i \lambda_i^* f_i(x^*) + \sum_i \nu_i^* h_i(x^*) \le f_0(x^*)$. The inequalities hold with equality, so $\lambda_i^* f_i(x^*) = 0$.

38 KKT Conditions: primal feasible: $f_i(x^*) \le 0$, $h_i(x^*) = 0$. Dual feasible: $\lambda_i^* \ge 0$. Complementary slackness: $\lambda_i^* f_i(x^*) = 0$. Stationarity: $\nabla f_0(x^*) + \sum_i \lambda_i^* \nabla f_i(x^*) + \sum_i \nu_i^* \nabla h_i(x^*) = 0$. Necessary conditions for strong duality.

39 KKT Conditions: primal feasible: $f_i(x^*) \le 0$, $h_i(x^*) = 0$. Dual feasible: $\lambda_i^* \ge 0$. Complementary slackness: $\lambda_i^* f_i(x^*) = 0$. Stationarity: $\nabla f_0(x^*) + \sum_i \lambda_i^* \nabla f_i(x^*) + \sum_i \nu_i^* \nabla h_i(x^*) = 0$. Necessary and sufficient for convex problems.
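
The four KKT conditions can be checked mechanically for a candidate pair $(x, \lambda)$. The sketch below is illustrative only: the toy problem $\min x_1^2 + x_2^2$ s.t. $1 - x_1 - x_2 \le 0$ and the function name `kkt_residuals` are assumptions, not something taken from the slides. Each returned value is a violation measure that is zero when the corresponding condition holds.

```python
def kkt_residuals(x, lam):
    """KKT residuals for: min x1^2 + x2^2  s.t.  f1(x) = 1 - x1 - x2 <= 0."""
    f1 = 1.0 - x[0] - x[1]
    grad_f0 = (2.0 * x[0], 2.0 * x[1])
    grad_f1 = (-1.0, -1.0)
    stationarity = max(abs(grad_f0[k] + lam * grad_f1[k]) for k in range(2))
    return {
        "primal_feasibility": max(f1, 0.0),     # f1(x) <= 0
        "dual_feasibility": max(-lam, 0.0),     # lam >= 0
        "complementary_slackness": abs(lam * f1),
        "stationarity": stationarity,           # gradient of the Lagrangian
    }


optimal = kkt_residuals((0.5, 0.5), 1.0)       # the optimum of this toy problem
not_optimal = kkt_residuals((1.0, 0.0), 1.0)   # feasible, but not stationary

print(optimal)
print(not_optimal)
```

Because the toy problem is convex, a point with all four residuals equal to zero is optimal, in line with the "necessary and sufficient" statement on the slide.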

40 PART I: General duality theory (Basics of Mathematical Optimization; The algebra; The geometry; Examples). PART II: Solving the SVM dual (General Decomposition Algorithm; Good Working Set; Implementation Details).

41 Linear Program: $\min_x c^T x$ s.t. $A x = b$, $x \ge 0$.
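
As a worked instance of the general recipe, the dual of this linear program can be derived in a few lines. The derivation below is a standard textbook computation added for illustration. It assumes the constraint $x \ge 0$ is written as $-x \le 0$ with multiplier $\lambda \ge 0$, and $A x - b = 0$ with multiplier $\nu$.

```latex
\begin{align*}
L(x, \lambda, \nu) &= c^T x - \lambda^T x + \nu^T (A x - b)
                    = -b^T \nu + (c + A^T \nu - \lambda)^T x, \\
g(\lambda, \nu) &= \min_x L(x, \lambda, \nu) =
  \begin{cases}
    -b^T \nu & \text{if } c + A^T \nu - \lambda = 0, \\
    -\infty  & \text{otherwise},
  \end{cases} \\
d^* &= \max_{\nu} \; -b^T \nu \quad \text{s.t.} \quad A^T \nu + c \ge 0 .
\end{align*}
```

The last line eliminates $\lambda = c + A^T \nu$ using $\lambda \ge 0$, so the dual of a linear program is again a linear program.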

42 QCQP: $\min_x \frac{1}{2} x^T P_0 x + q_0^T x + r_0$ s.t. $\frac{1}{2} x^T P_i x + q_i^T x + r_i \le 0$.

43 Entropy Maximization: $\min_x \sum_i x_i \log(x_i)$ s.t. $A x \le b$, $\sum_i x_i = 1$.

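44 The SVM Framework: points $X = \{x_i\}$, labels $y = \{y_i\}$, $y_i \in \{-1, +1\}$. Separating hyperplane $w^T x + b = 0$, with margin $2 / \|w\|$. $\min_{w, b, \xi} \frac{1}{2} w^T w + C \sum_i \xi_i$ s.t. $y_i (w^T x_i + b) \ge 1 - \xi_i$, $\xi_i \ge 0$. This is a convex quadratic program.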

45 The SVM Dual: $\min_\alpha \frac{1}{2} \alpha^T Q \alpha - \alpha^T 1$ s.t. $\alpha^T y = 0$, $0 \le \alpha \le C 1$, where $Q_{ij} = y_i y_j x_i^T x_j = y_i y_j k(x_i, x_j)$.
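
To make the notation concrete, the following sketch builds $Q$ for a two-point data set and evaluates the dual objective and its constraints. It is an illustration under stated assumptions: the data, the linear kernel and the helper names (`build_Q`, `dual_objective`, `is_feasible`) are hypothetical and not taken from the slides.

```python
def linear_kernel(a, b):
    """k(a, b) = a^T b."""
    return sum(ai * bi for ai, bi in zip(a, b))


def build_Q(X, y, kernel=linear_kernel):
    """Q_ij = y_i y_j k(x_i, x_j)."""
    n = len(X)
    return [[y[i] * y[j] * kernel(X[i], X[j]) for j in range(n)] for i in range(n)]


def dual_objective(alpha, Q):
    """(1/2) alpha^T Q alpha - alpha^T 1."""
    n = len(alpha)
    quad = sum(alpha[i] * Q[i][j] * alpha[j] for i in range(n) for j in range(n))
    return 0.5 * quad - sum(alpha)


def is_feasible(alpha, y, C, tol=1e-9):
    """Check alpha^T y = 0 and 0 <= alpha_i <= C."""
    balanced = abs(sum(a * yi for a, yi in zip(alpha, y))) <= tol
    in_box = all(-tol <= a <= C + tol for a in alpha)
    return balanced and in_box


# Two points on the x-axis, one per class.
X = [(1.0, 0.0), (-1.0, 0.0)]
y = [1, -1]
Q = build_Q(X, y)
alpha = [0.5, 0.5]

print("Q =", Q)
print("feasible:", is_feasible(alpha, y, C=10.0))
print("dual objective:", dual_objective(alpha, Q))
```

For this data $w = \sum_i \alpha_i y_i x_i = (1, 0)$, so the primal value $\frac{1}{2} w^T w = 0.5$ matches the negated dual objective, as expected when the duality gap is zero.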

46 PART I: General duality theory (Basics of Mathematical Optimization; The algebra; The geometry; Examples). PART II: Solving the SVM dual (General Decomposition Algorithm; Good Working Set; Implementation Details).

47 The SVM Dual: $\min_\alpha \frac{1}{2} \alpha^T Q \alpha - \alpha^T 1$ s.t. $\alpha^T y = 0$, $0 \le \alpha \le C 1$. Choose $q$ variables. Fix the rest. Change the unfixed variables, satisfying the constraints, to decrease the objective function (a small problem). Repeat. Open questions: what is the minimum $q$? Until when do we repeat? What is the best set $B$?
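
The loop described on this slide can be sketched for the smallest useful case, $q = 2$: one equality constraint means a single variable cannot move alone. This is a minimal sketch under assumptions, not the implementation discussed in the talk. The pair selection used here (the most violating pair), the stopping tolerance `eps` and every name are choices made for illustration. With two variables the small problem has a closed-form solution: a clipped one-dimensional Newton step.

```python
def solve_svm_dual(Q, y, C, eps=1e-6, max_iter=10000):
    """Decomposition with q = 2 for min 1/2 a^T Q a - a^T 1, y^T a = 0, 0 <= a <= C."""
    n = len(y)
    alpha = [0.0] * n
    grad = [-1.0] * n  # gradient Q alpha - 1 at alpha = 0
    for _ in range(max_iter):
        # Variables that can move along d = +y_t (up) or d = -y_t (down) inside the box.
        up = [t for t in range(n)
              if (y[t] > 0 and alpha[t] < C) or (y[t] < 0 and alpha[t] > 0)]
        down = [t for t in range(n)
                if (y[t] > 0 and alpha[t] > 0) or (y[t] < 0 and alpha[t] < C)]
        if not up or not down:
            break
        i = min(up, key=lambda t: y[t] * grad[t])    # will move along +y_i
        j = max(down, key=lambda t: y[t] * grad[t])  # will move along -y_j
        violation = y[j] * grad[j] - y[i] * grad[i]
        if violation < eps:  # no pair gives a first-order decrease: stop
            break
        # Curvature of the objective along the direction (d_i, d_j) = (y_i, -y_j).
        curv = Q[i][i] + Q[j][j] - 2.0 * y[i] * y[j] * Q[i][j]
        bound_i = C - alpha[i] if y[i] > 0 else alpha[i]
        bound_j = alpha[j] if y[j] > 0 else C - alpha[j]
        step = min(violation / max(curv, 1e-12), bound_i, bound_j)
        new_i = alpha[i] + y[i] * step
        new_j = alpha[j] - y[j] * step
        if step == bound_i:  # land exactly on the box boundary
            new_i = C if y[i] > 0 else 0.0
        if step == bound_j:
            new_j = 0.0 if y[j] > 0 else C
        di, dj = new_i - alpha[i], new_j - alpha[j]
        alpha[i], alpha[j] = new_i, new_j
        for t in range(n):  # keep the gradient in sync with alpha
            grad[t] += Q[t][i] * di + Q[t][j] * dj
    return alpha


# Four 1-D points with a linear kernel: Q_ij = y_i y_j x_i x_j.
xs = [2.0, 1.0, -1.0, -2.0]
ys = [1, 1, -1, -1]
Q4 = [[ys[i] * ys[j] * xs[i] * xs[j] for j in range(4)] for i in range(4)]
alpha4 = solve_svm_dual(Q4, ys, C=10.0)
print("alpha =", alpha4)
```

In this example only the two points closest to the boundary ($x = 1$ and $x = -1$) are expected to end up with non-zero multipliers.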

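48 KKT Conditions: for $\min_\alpha \frac{1}{2} \alpha^T Q \alpha - \alpha^T 1$ s.t. $\alpha^T y = 0$, $0 \le \alpha \le C 1$, introduce the multipliers $\lambda^{eq}$ (equality constraint), $\lambda_i^{lo}$ (lower bounds) and $\lambda_i^{up}$ (upper bounds). Then $-1 + Q \alpha + \lambda^{eq} y - \lambda^{lo} + \lambda^{up} = 0$, $\lambda_i^{lo} \alpha_i = 0$, $\lambda_i^{up} (\alpha_i - C) = 0$, $\lambda_i^{lo} \ge 0$, $\lambda_i^{up} \ge 0$, where $g(\alpha) = Q \alpha$.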

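49 KKT Conditions: $-1 + g(\alpha) + \lambda^{eq} y - \lambda^{lo} + \lambda^{up} = 0$, $\lambda_i^{lo} \alpha_i = 0$, $\lambda_i^{up} (\alpha_i - C) = 0$, $\lambda_i^{lo} \ge 0$, $\lambda_i^{up} \ge 0$. For all $0 < \alpha_i < C$: $-1 + g_i(\alpha) + \lambda^{eq} y_i = 0$. For all $\alpha_i = 0$: $-1 + g_i(\alpha) + \lambda^{eq} y_i - \lambda_i^{lo} = 0$. For all $\alpha_i = C$: $-1 + g_i(\alpha) + \lambda^{eq} y_i + \lambda_i^{up} = 0$.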

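50 KKT Conditions: $-1 + g(\alpha) + \lambda^{eq} y - \lambda^{lo} + \lambda^{up} = 0$, $\lambda_i^{lo} \alpha_i = 0$, $\lambda_i^{up} (\alpha_i - C) = 0$, $\lambda_i^{lo} \ge 0$, $\lambda_i^{up} \ge 0$, where $g_i(\alpha) = y_i \sum_j \alpha_j y_j k(x_i, x_j)$. Incremental update: $g_i(\alpha^t) = g_i(\alpha^{t-1}) + y_i \sum_{j \in B} (\alpha_j^t - \alpha_j^{t-1}) y_j k(x_i, x_j)$. What is the best set of $q$ variables (the working set)?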
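
The incremental formula matters because only the $q$ variables in $B$ change per iteration, so updating $g$ needs $q$ kernel columns rather than all $n$. The sketch below compares the incremental update with a full recomputation. The small kernel matrix, the multiplier values and the function names are assumptions made for illustration.

```python
def full_gradient(alpha, y, K):
    """g_i(alpha) = y_i * sum_j alpha_j y_j k(x_i, x_j), computed from scratch."""
    n = len(alpha)
    return [y[i] * sum(alpha[j] * y[j] * K[i][j] for j in range(n)) for i in range(n)]


def update_gradient(g_prev, alpha_prev, alpha_new, y, K, B):
    """Incremental update that only touches the working-set indices in B."""
    return [
        g_prev[i] + y[i] * sum((alpha_new[j] - alpha_prev[j]) * y[j] * K[i][j] for j in B)
        for i in range(len(g_prev))
    ]


# Hypothetical 3-point kernel matrix and labels.
K = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]
y = [1, -1, 1]
alpha_prev = [0.5, 1.0, 0.5]
alpha_new = [0.75, 1.0, 0.25]  # only indices 0 and 2 changed; alpha^T y stays 0
B = [0, 2]

g_prev = full_gradient(alpha_prev, y, K)
g_incremental = update_gradient(g_prev, alpha_prev, alpha_new, y, K, B)
g_full = full_gradient(alpha_new, y, K)
print(g_incremental)
print(g_full)
```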

51 PART I: General duality theory (Basics of Mathematical Optimization; The algebra; The geometry; Examples). PART II: Solving the SVM dual (General Decomposition Algorithm; Good Working Set; Implementation Details).

52 Working Set: $g_i(\alpha) = y_i \sum_j \alpha_j y_j k(x_i, x_j)$. Let $d$ be a feasible direction of descent, with $\alpha^t = \alpha^{t-1} + d$. Choose the steepest descent direction under the first-order approximation of the objective, $(-1 + g(\alpha^{t-1}))^T d$.

53 Working Set: $\min_d (-1 + g(\alpha^{t-1}))^T d$ s.t. $y^T d = 0$; $d_i \ge 0$ if $\alpha_i^{t-1} = 0$; $d_i \le 0$ if $\alpha_i^{t-1} = C$; $-1 \le d_i \le 1$; $\mathrm{Card}\{ d_i : d_i \ne 0 \} = q$.

54 Working Set: let $s_i = y_i (-1 + g_i(\alpha^{t-1}))$. Sort the variables according to decreasing values of $s_i$. Choose $q/2$ variables from the top, taking $i$ if $0 < \alpha_i^{t-1} < C$ or if $d_i = -y_i$ satisfies feasibility of direction. Choose $q/2$ variables from the bottom, taking $i$ if $0 < \alpha_i^{t-1} < C$ or if $d_i = y_i$ satisfies feasibility of direction.
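
The selection rule on this slide translates almost directly into code. The following is a sketch under assumptions: the function name `select_working_set`, the helper `can_move` and the example data are hypothetical, and ties are broken by index order simply because Python's sort is stable. A free variable ($0 < \alpha_i < C$) can move in either direction, so it passes both feasibility checks automatically.

```python
def select_working_set(alpha, g, y, C, q):
    """Pick q/2 indices from the top and q/2 from the bottom of the sorted s_i.

    s_i = y_i * (-1 + g_i). Top indices will use d_i = -y_i, bottom indices d_i = +y_i.
    """
    n = len(alpha)
    s = [y[i] * (-1.0 + g[i]) for i in range(n)]
    order = sorted(range(n), key=lambda i: s[i], reverse=True)  # decreasing s_i

    def can_move(i, d):
        # Increasing alpha_i needs room below C; decreasing needs room above 0.
        return alpha[i] < C if d > 0 else alpha[i] > 0

    top = [i for i in order if can_move(i, -y[i])][: q // 2]
    taken = set(top)
    bottom = [i for i in reversed(order)
              if i not in taken and can_move(i, y[i])][: q // 2]
    return top, bottom


# At the start alpha = 0, so g = Q alpha = 0 and s_i = -y_i.
y = [1, 1, -1, -1]
alpha = [0.0, 0.0, 0.0, 0.0]
g = [0.0, 0.0, 0.0, 0.0]
top, bottom = select_working_set(alpha, g, y, C=10.0, q=2)
print("top (d_i = -y_i):", top)
print("bottom (d_i = +y_i):", bottom)
```

Since $y_i d_i = -1$ for each top index and $+1$ for each bottom index, picking equally many from both ends keeps $y^T d = 0$.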

55 Working Set: $\min_d (-1 + g(\alpha^{t-1}))^T d$ s.t. $y^T d = 0$; $d_i \ge 0$ if $\alpha_i^{t-1} = 0$; $d_i \le 0$ if $\alpha_i^{t-1} = C$; $-1 \le d_i \le 1$; $\mathrm{Card}\{ d_i : d_i \ne 0 \} = q$.

56 PART I: General duality theory (Basics of Mathematical Optimization; The algebra; The geometry; Examples). PART II: Solving the SVM dual (General Decomposition Algorithm; Good Working Set; Implementation Details).

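57 Shrinking: for all $0 < \alpha_i < C$: $-1 + g_i(\alpha) + \lambda^{eq} y_i = 0$. For all $\alpha_i = 0$: $-1 + g_i(\alpha) + \lambda^{eq} y_i - \lambda_i^{lo} = 0$. For all $\alpha_i = C$: $-1 + g_i(\alpha) + \lambda^{eq} y_i + \lambda_i^{up} = 0$. If $\lambda_i^{lo} > 0$ or $\lambda_i^{up} > 0$ for $n$ consecutive iterations, drop $\alpha_i$ from the problem (temporarily).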
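
One way to turn this heuristic into code is sketched below. It is an assumption-laden illustration rather than the implementation from the talk: $\lambda^{eq}$ is estimated by averaging $y_i (1 - g_i)$ over the free variables (which follows from the first KKT case), and the function name, the counter list and the tolerance are hypothetical.

```python
def shrink_step(alpha, g, y, C, counters, active, n_consec, tol=1e-8):
    """Update per-variable counters and return the indices that stay active.

    counters[i] counts consecutive iterations in which the bound multiplier of
    variable i (lambda_lo at alpha_i = 0, lambda_up at alpha_i = C) was positive.
    """
    free = [i for i in active if tol < alpha[i] < C - tol]
    if not free:
        return list(active)  # no free variable, so no estimate of lambda_eq
    # For free variables: -1 + g_i + lambda_eq * y_i = 0, hence lambda_eq = y_i (1 - g_i).
    lam_eq = sum(y[i] * (1.0 - g[i]) for i in free) / len(free)
    still_active = []
    for i in active:
        r = -1.0 + g[i] + lam_eq * y[i]
        if alpha[i] <= tol:
            mult = r       # lambda_lo
        elif alpha[i] >= C - tol:
            mult = -r      # lambda_up
        else:
            mult = 0.0     # free variable: both bound multipliers are zero
        counters[i] = counters[i] + 1 if mult > tol else 0
        if counters[i] < n_consec:
            still_active.append(i)
    return still_active


# Hypothetical state at the optimum of a 4-point problem (g = Q alpha).
y = [1, 1, -1, -1]
alpha = [0.0, 0.5, 0.5, 0.0]
g = [2.0, 1.0, 1.0, 2.0]
counters = [0, 0, 0, 0]
active = [0, 1, 2, 3]
after_first = shrink_step(alpha, g, y, 10.0, counters, active, n_consec=2)
after_second = shrink_step(alpha, g, y, 10.0, counters, after_first, n_consec=2)
print(after_first, after_second)
```

Because the drop is temporary, a full solver would re-check the shrunk variables against the KKT conditions before declaring convergence.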

58 Caching: kernel evaluations can be expensive. Cache them in a least-recently-used manner. Choose $q'$ of the working-set variables from those whose kernel values are available in the cache.
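
A least-recently-used cache of kernel rows can be sketched with the standard library. The class below is illustrative only: the name `KernelRowCache`, the row-level granularity and the hit/miss counters are assumptions, and a real solver would typically bound the cache by memory rather than by row count.

```python
from collections import OrderedDict


class KernelRowCache:
    """Cache rows k(x_i, .) of the kernel matrix, evicting the least recently used."""

    def __init__(self, X, kernel, max_rows):
        self.X = X
        self.kernel = kernel
        self.max_rows = max_rows
        self._rows = OrderedDict()  # index -> row, oldest entry first
        self.hits = 0
        self.misses = 0

    def row(self, i):
        if i in self._rows:
            self._rows.move_to_end(i)  # mark as most recently used
            self.hits += 1
            return self._rows[i]
        self.misses += 1
        row = [self.kernel(self.X[i], xj) for xj in self.X]
        self._rows[i] = row
        if len(self._rows) > self.max_rows:
            self._rows.popitem(last=False)  # evict the least recently used row
        return row

    def cached_indices(self):
        return list(self._rows)


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


X = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
cache = KernelRowCache(X, dot, max_rows=2)
for i in (0, 1, 0, 2):
    cache.row(i)
print("cached rows:", cache.cached_indices())
print("hits:", cache.hits, "misses:", cache.misses)
```

The `cached_indices` method is what a working-set selector could consult when preferring variables whose kernel rows are already available.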

59 Results Those who have used SVM light : You know that it works very well. Those who haven’t used SVM light : It works very well. See paper. Download.

60 Questions???

