Published by Katelynn Baskett. Modified about 1 year ago.

1
A KTEC Center of Excellence
Pattern Analysis using Convex Optimization: Part 2 of Chapter 7 Discussion
Presenter: Brian Quanz

2
About today’s discussion…
- Last time: discussed convex optimization.
- Today: will apply what we learned to 4 pattern analysis problems given in the book:
  (1) Smallest enclosing hypersphere (one-class SVM)
  (2) SVM classification
  (3) Support vector regression (SVR)
  (4) On-line classification and regression

3
About today’s discussion…
- This time, for the most part:
  - Describe the problems
  - Derive the solutions ourselves on the board!
  - Apply convex optimization knowledge to solve them
- Mostly board work today

4
Recall: KKT Conditions
- What we will use:
- Key points to remember from ch. 7:
  - Complementary slackness -> sparse dual representation
  - Convexity -> efficient global solution

5
Novelty Detection: Hypersphere
- Training data: learn the support of the distribution
- Capture it with a hypersphere
- Points outside are ‘novel’, ‘abnormal’, or ‘anomalous’
- Smaller sphere = more fine-tuned novelty detection

6
1st: Smallest Enclosing Hypersphere
- Given:
- Find the center, c, of the smallest hypersphere containing S

7
S.E.H. Optimization Problem
- O.P.:
- Let’s solve it using the Lagrangian and the KKT conditions, and discuss
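The equations on this slide were not captured in the transcript. As a sketch only, assuming the usual kernel setting (a feature map \phi with kernel \kappa and \ell training points; these symbols are assumptions, not taken from the slide), the problem and the dual that the Lagrangian yields can be written as:

```latex
\begin{aligned}
&\min_{c,\,r}\; r^2 \quad\text{s.t.}\quad \|\phi(x_i)-c\|^2 \le r^2,\quad i=1,\dots,\ell \\[4pt]
&L(c,r,\alpha) = r^2 + \sum_{i=1}^{\ell}\alpha_i\left(\|\phi(x_i)-c\|^2 - r^2\right),\qquad \alpha_i \ge 0 \\[4pt]
&\frac{\partial L}{\partial r} = 2r\Bigl(1-\sum_i\alpha_i\Bigr) = 0 \;\Rightarrow\; \sum_i\alpha_i = 1,\qquad
\frac{\partial L}{\partial c} = 2\sum_i\alpha_i\bigl(c-\phi(x_i)\bigr) = 0 \;\Rightarrow\; c = \sum_i\alpha_i\phi(x_i) \\[4pt]
&\max_{\alpha}\; W(\alpha) = \sum_i\alpha_i\,\kappa(x_i,x_i) - \sum_{i,j}\alpha_i\alpha_j\,\kappa(x_i,x_j)
\quad\text{s.t.}\quad \sum_i\alpha_i = 1,\ \alpha_i\ge 0
\end{aligned}
```

Complementary slackness, \alpha_i(\|\phi(x_i)-c\|^2 - r^2) = 0, means only points on the surface of the sphere can have \alpha_i > 0, which is the sparse dual representation recalled earlier.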

8
Cheat

9
S.E.H.: Solution
- H(x) = 1 if x >= 0, 0 otherwise
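A minimal sketch of how such a solution could be computed and used, not the method from the slides: it assumes the dual "maximize sum_i a_i k(x_i,x_i) - a^T K a subject to sum_i a_i = 1, a_i >= 0" and solves it with a simple Frank-Wolfe loop (a swapped-in solver chosen for brevity). The names `fit_hypersphere` and `novelty` are hypothetical, and the radius is taken as the largest training distance from the fitted center so that all training points are enclosed.

```python
import numpy as np

def fit_hypersphere(K, max_iter=20000, tol=1e-9):
    """Approximate the smallest enclosing hypersphere from a kernel matrix K.

    Frank-Wolfe ascent with exact line search on
    W(a) = sum_i a_i K_ii - a^T K a over the probability simplex.
    """
    n = K.shape[0]
    diag = np.diag(K).copy()
    alpha = np.full(n, 1.0 / n)            # feasible start: sums to 1, nonnegative
    for _ in range(max_iter):
        grad = diag - 2.0 * (K @ alpha)    # gradient of W at alpha
        i = int(np.argmax(grad))           # farthest training point from the center
        direction = -alpha.copy()
        direction[i] += 1.0                # e_i - alpha
        gap = float(grad @ direction)      # Frank-Wolfe optimality gap
        if gap <= tol:
            break
        curv = float(direction @ K @ direction)
        step = 1.0 if curv <= 0.0 else min(1.0, gap / (2.0 * curv))
        alpha += step * direction          # stays inside the simplex
    center_norm2 = float(alpha @ K @ alpha)             # ||c||^2
    dist2 = diag - 2.0 * (K @ alpha) + center_norm2     # squared distances to c
    return alpha, float(dist2.max()), center_norm2

def novelty(k_xx, k_x_train, alpha, r2, center_norm2):
    """H(||phi(x) - c||^2 - r^2), with H(z) = 1 if z >= 0 and 0 otherwise."""
    dist2 = k_xx - 2.0 * float(k_x_train @ alpha) + center_norm2
    return 1 if dist2 - r2 >= 0 else 0

# Usage with a linear kernel (an assumption made for the example).
X = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 0.5]])
alpha, r2, center_norm2 = fit_hypersphere(X @ X.T)
x_new = np.array([5.0, 5.0])
is_novel = novelty(x_new @ x_new, X @ x_new, alpha, r2, center_norm2)
```

Frank-Wolfe converges slowly near the optimum, so for anything beyond a toy problem a proper QP solver would be the more sensible choice.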

10
Theorem: bound on the false positive rate

11
Soft hypersphere: a hypersphere that contains only some of the data
- Balance missing some points against reducing the radius
- Robustness: a single point could throw off the hard solution
- Introduce slack variables (a recurring approach)
- Slack is 0 within the sphere, squared distance outside

12
Hypersphere optimization problem
- Now with a trade-off between the radius and the training point error:
- Let’s derive the solution again
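The slide's formula is missing from the transcript. A sketch of the usual soft formulation, assuming slack variables \xi_i, a trade-off parameter C, and the same \phi and \kappa notation as an illustration (none of these symbols are confirmed by the slide):

```latex
\begin{aligned}
&\min_{c,\,r,\,\xi}\; r^2 + C\sum_{i=1}^{\ell}\xi_i
\quad\text{s.t.}\quad \|\phi(x_i)-c\|^2 \le r^2 + \xi_i,\quad \xi_i \ge 0 \\[4pt]
&L = r^2 + C\sum_i\xi_i + \sum_i\alpha_i\left(\|\phi(x_i)-c\|^2 - r^2 - \xi_i\right) - \sum_i\beta_i\xi_i,
\qquad \alpha_i,\beta_i \ge 0 \\[4pt]
&\frac{\partial L}{\partial \xi_i} = C - \alpha_i - \beta_i = 0 \;\Rightarrow\; 0 \le \alpha_i \le C \\[4pt]
&\max_{\alpha}\; W(\alpha) = \sum_i\alpha_i\,\kappa(x_i,x_i) - \sum_{i,j}\alpha_i\alpha_j\,\kappa(x_i,x_j)
\quad\text{s.t.}\quad \sum_i\alpha_i = 1,\ 0 \le \alpha_i \le C
\end{aligned}
```

Under these assumptions the only change from the hard version is the upper bound C on each \alpha_i, which limits how much any single point can pull the center.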

13
Cheat

14
Soft hypersphere solution

15
Linear Kernel Example

16
Similar theorem

17
Remarks
- If the data lies in a subspace of the feature space:
  - The hypersphere overestimates the support in the perpendicular directions
  - Can use kernel PCA (next week's discussion)
- If the data is normalized (k(x,x) = 1):
  - Corresponds to separating the data from the origin with a hyperplane

18
Maximal Margin Classifier
- Data and linear classifier
- Hinge loss, margin gamma
- Linearly separable if
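The separability condition itself was not captured in the transcript. As an illustrative sketch, assuming labels in {-1, +1}, a linear function g(x) = <w, x> + b, and the margin-based hinge loss (gamma - y g(x))_+, separability with margin gamma amounts to every loss being zero. The function names are hypothetical.

```python
import numpy as np

def margin_slack(w, b, X, y, gamma):
    """Hinge-type loss (gamma - y * g(x))_+ for g(x) = <w, x> + b."""
    g = X @ w + b
    return np.maximum(0.0, gamma - y * g)

def separates_with_margin(w, b, X, y, gamma):
    """True if every training point attains functional margin at least gamma."""
    return bool(np.all(margin_slack(w, b, X, y, gamma) == 0.0))

# Two points on opposite sides of the hyperplane x_1 = 0.
X = np.array([[2.0, 0.0], [-2.0, 0.0]])
y = np.array([1.0, -1.0])
w = np.array([1.0, 0.0])
ok_small = separates_with_margin(w, 0.0, X, y, gamma=1.0)
ok_large = separates_with_margin(w, 0.0, X, y, gamma=3.0)
```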

19
Margin Example

20
Typical formulation
- The typical formulation fixes gamma (the functional margin) to 1 and allows the norm of w to vary; since scaling doesn’t affect the decision, the margin is proportional to 1/norm(w).
- Here we fix the norm of w and vary the functional margin gamma.

21
Hard Margin SVM
- Arrive at the optimization problem
- Let’s solve
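The optimization problem on this slide is not in the transcript. One way to write it under the fixed-norm convention described earlier, offered as a sketch with assumed notation (\phi, \kappa, multipliers \alpha_i and \lambda):

```latex
\begin{aligned}
&\min_{w,\,b,\,\gamma}\; -\gamma
\quad\text{s.t.}\quad y_i\bigl(\langle w,\phi(x_i)\rangle + b\bigr) \ge \gamma,\ i=1,\dots,\ell,
\qquad \|w\|^2 = 1 \\[4pt]
&L = -\gamma - \sum_i\alpha_i\Bigl[y_i\bigl(\langle w,\phi(x_i)\rangle + b\bigr) - \gamma\Bigr] + \lambda\bigl(\|w\|^2 - 1\bigr),
\qquad \alpha_i \ge 0 \\[4pt]
&\partial_w L = 0 \Rightarrow w = \tfrac{1}{2\lambda}\sum_i\alpha_i y_i\phi(x_i),\qquad
\partial_\gamma L = 0 \Rightarrow \sum_i\alpha_i = 1,\qquad
\partial_b L = 0 \Rightarrow \sum_i\alpha_i y_i = 0 \\[4pt]
&\max_{\alpha}\; W(\alpha) = -\sum_{i,j}\alpha_i\alpha_j y_i y_j\,\kappa(x_i,x_j)
\quad\text{s.t.}\quad \sum_i y_i\alpha_i = 0,\ \sum_i\alpha_i = 1,\ \alpha_i \ge 0,
\qquad \gamma^* = \sqrt{-W(\alpha^*)}
\end{aligned}
```

Substituting the stationarity conditions and optimizing over \lambda gives L = -\bigl(\sum_{i,j}\alpha_i\alpha_j y_i y_j\kappa(x_i,x_j)\bigr)^{1/2}, which is why maximizing W is equivalent and why the margin is recovered as the square root of -W.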

22
Cheat

23
Solution
- Recall:

24
Example with Gaussian kernel

25
Soft Margin Classifier
- Non-separable data: introduce slack variables as before
- Trade the margin off against the 1-norm of the error vector

26
Solve Soft Margin SVM
- Let’s solve it!

27
Soft Margin Solution

28
Soft Margin Example

29
Support Vector Regression
- Similar idea to classification, except turned inside-out
- Epsilon-insensitive loss instead of hinge loss
- Ridge regression: squared-error loss

30
Support Vector Regression
- But we want to encourage sparseness
- Need inequalities: epsilon-insensitive loss

31
Epsilon-insensitive loss
- Defines a band around the function within which the loss is 0
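A small sketch of the loss, assuming the common definition max(0, |y - g(x)| - eps) for the linear version and its square for the quadratic version. The function name and the `quadratic` flag are illustrative choices.

```python
import numpy as np

def eps_insensitive(y_true, y_pred, eps, quadratic=False):
    """Loss that is zero inside the band |y - g(x)| <= eps and grows outside it."""
    excess = np.maximum(0.0, np.abs(y_true - y_pred) - eps)
    return excess ** 2 if quadratic else excess

# Residuals of 0.05, 0.5 and 1.0 against a band of half-width 0.1.
y_true = np.array([1.0, 1.0, 1.0])
y_pred = np.array([1.05, 1.5, 0.0])
linear_loss = eps_insensitive(y_true, y_pred, eps=0.1)
```

Points inside the band contribute no loss, which is what makes sparse dual solutions possible, in contrast to the squared-error loss of ridge regression.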

32
SVR (linear epsilon-insensitive loss)
- Opt. problem:
- Let’s solve again

33
SVR Dual and Solution
- Dual problem

34
Online
- So far, batch: all data processed at once
- Many tasks require data to be processed one example at a time from the start
- Learner:
  - Makes a prediction
  - Gets feedback (the correct value)
  - Updates
- Conservative: only updates if the loss is non-zero

35
Simple On-line Alg.: Perceptron
- Thresholded linear function
- At t+1 the weight is updated if there was an error
- Dual update rule: If

36
Algorithm Pseudocode
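The pseudocode image is not in the transcript. A minimal sketch of a dual (kernel) perceptron, assuming labels in {-1, +1}, no bias term, and the standard mistake-driven rule "if y_i * sum_j alpha_j y_j k(x_j, x_i) <= 0 then alpha_i += 1". The function names are hypothetical and this is not necessarily the slide's exact algorithm.

```python
import numpy as np

def kernel_perceptron(K, y, max_epochs=100):
    """Dual perceptron: alpha_i counts the mistakes made on example i."""
    n = len(y)
    alpha = np.zeros(n)
    updates = 0
    for _ in range(max_epochs):
        mistakes = 0
        for i in range(n):
            # g(x_i) = sum_j alpha_j y_j k(x_j, x_i); conservative: update only on error
            if y[i] * np.dot(alpha * y, K[:, i]) <= 0:
                alpha[i] += 1.0
                mistakes += 1
        updates += mistakes
        if mistakes == 0:      # a full pass without errors: converged
            break
    return alpha, updates

def predict(alpha, y, k_train_x):
    """Thresholded function: +1 if sum_j alpha_j y_j k(x_j, x) > 0, else -1."""
    return 1 if np.dot(alpha * y, k_train_x) > 0 else -1

# Linearly separable toy data with a linear kernel (assumed for the example).
X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
alpha, updates = kernel_perceptron(X @ X.T, y)
label = predict(alpha, y, X @ np.array([3.0, 3.0]))
```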

37
Novikoff Theorem
- Convergence bound for the hard-margin case
- If the training points are contained in a ball of radius R around the origin
- w*: hard-margin SVM with no bias and geometric margin gamma
- Initial weight:
- Number of updates bounded by:

38
Proof
- From 2 inequalities:
- Putting these together we have:
- Which leads to the bound:
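The inequalities themselves are missing from the transcript. The standard argument, sketched here under the assumptions w_0 = 0, \|w^*\| = 1, y_i\langle w^*,\phi(x_i)\rangle \ge \gamma, \|\phi(x_i)\| \le R, and the update w_{t+1} = w_t + y_i\phi(x_i) applied only when y_i\langle w_t,\phi(x_i)\rangle \le 0 (with t counting updates), runs as follows:

```latex
\begin{aligned}
&\langle w_{t+1}, w^*\rangle = \langle w_t, w^*\rangle + y_i\langle\phi(x_i), w^*\rangle \ge \langle w_t, w^*\rangle + \gamma
\;\Rightarrow\; \langle w_t, w^*\rangle \ge t\gamma \\[4pt]
&\|w_{t+1}\|^2 = \|w_t\|^2 + 2y_i\langle w_t,\phi(x_i)\rangle + \|\phi(x_i)\|^2 \le \|w_t\|^2 + R^2
\;\Rightarrow\; \|w_t\|^2 \le tR^2 \\[4pt]
&t\gamma \le \langle w_t, w^*\rangle \le \|w_t\|\,\|w^*\| \le \sqrt{t}\,R
\;\Rightarrow\; t \le \left(\frac{R}{\gamma}\right)^2
\end{aligned}
```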

39
Kernel Adatron
- Simple modification to the perceptron; models the hard-margin SVM with 0 threshold
- When alpha stops changing, either alpha is positive and the right term is 0, or the right term is negative

40
Kernel Adatron – Soft Margin
- 1-norm soft margin version: add an upper bound (C) on the values of alpha
- 2-norm soft margin version: add a constant to the diagonal of the kernel matrix
SMO
- To allow a variable threshold, updates must be made on a pair of examples at once; this results in SMO
- The rate of convergence of both algorithms is sensitive to the order of updates
- Good heuristics exist, e.g. first choose the points that most violate the conditions
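A hedged sketch of a kernel Adatron loop covering the hard-margin case and both soft-margin modifications listed above. It assumes labels in {-1, +1}, zero threshold, and the update alpha_i += eta * (1 - y_i * sum_j alpha_j y_j k(x_j, x_i)) followed by clipping at 0. The step size eta = 1 / K_ii is an assumption made here (it requires K_ii > 0), and the parameter names `C` and `diag_shift` are hypothetical.

```python
import numpy as np

def kernel_adatron(K, y, C=None, diag_shift=0.0, epochs=1000):
    """Coordinate-wise ascent on the SVM dual with zero threshold.

    C          -- optional upper bound on alpha (1-norm soft margin)
    diag_shift -- constant added to the kernel diagonal (2-norm soft margin)
    """
    n = len(y)
    K = K + diag_shift * np.eye(n)
    alpha = np.zeros(n)
    for _ in range(epochs):
        for i in range(n):
            margin = y[i] * np.dot(alpha * y, K[:, i])
            alpha[i] += (1.0 - margin) / K[i, i]   # assumed step size 1 / K_ii
            alpha[i] = max(alpha[i], 0.0)          # keep alpha nonnegative
            if C is not None:
                alpha[i] = min(alpha[i], C)        # box constraint for 1-norm version
    return alpha

# Separable toy data with a linear kernel (assumed for the example).
X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
K = X @ X.T
alpha_hard = kernel_adatron(K, y)
w_hard = (alpha_hard * y) @ X          # primal weights, available for a linear kernel
alpha_soft = kernel_adatron(K, y, C=0.01)
```

At a fixed point each example either has alpha_i > 0 with margin exactly 1, or alpha_i = 0 with margin at least 1, matching the stopping condition described for the hard-margin case.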

41
On-line regression
- Also works for the regression case
- Basic gradient ascent with additional constraints

42
Online SVR

43
Questions, comments?
