Support Vector Machine (SVM) Based on Nello Cristianini presentation


1 Support Vector Machine (SVM) Based on Nello Cristianini's presentation: http://www.support-vector.net/tutorial.html

2 Basic Idea Use a Linear Learning Machine (LLM). Overcome the linearity constraint: map the data non-linearly to a higher dimension. Select between hyperplanes using the margin as a criterion: generalization depends on the margin.

3 General idea Original Problem Transformed Problem

4 Kernel Based Algorithms Two separate components: Learning algorithm: operates in an embedded space. Kernel function: performs the embedding.

5 Basic Example: Kernel Perceptron Hyperplane classification: f(x) = <w,x> + b; h(x) = sign(f(x)). Perceptron algorithm: Sample: (x_i, t_i), t_i ∈ {-1,+1}. IF t_i f(x_i) ≤ 0 THEN /* error */ w_{k+1} = w_k + t_i x_i; k = k+1.
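The update rule above can be sketched in a few lines of Python (the toy data and epoch cap are illustrative assumptions, not from the slides):

```python
import numpy as np

def perceptron(X, t, epochs=100):
    """Primal perceptron: update w whenever a point is misclassified."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x_i, t_i in zip(X, t):
            if t_i * (w @ x_i + b) <= 0:   # mistake: t_i f(x_i) <= 0
                w += t_i * x_i             # w_{k+1} = w_k + t_i x_i
                b += t_i
                errors += 1
        if errors == 0:                    # converged: no mistakes in a pass
            break
    return w, b

# Linearly separable toy data, labels in {-1, +1}
X = np.array([[2.0, 2.0], [1.0, 3.0], [-2.0, -1.0], [-1.0, -3.0]])
t = np.array([1, 1, -1, -1])
w, b = perceptron(X, t)
assert all(t_i * (w @ x_i + b) > 0 for x_i, t_i in zip(X, t))
```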

6 Recall Margin of a hyperplane w; mistake bound (Novikoff: on data with margin γ contained in a ball of radius R, the perceptron makes at most (R/γ)² mistakes).

7 Observations The solution is a linear combination of inputs: w = Σ a_i t_i x_i, where a_i ≥ 0. Mistake driven: only points on which we make a mistake influence w! Support vectors: the points with non-zero a_i.

8 Dual representation Rewrite the basic function: f(x) = <w,x> + b = Σ a_i t_i <x_i, x> + b, with w = Σ a_i t_i x_i. Change the update rule: IF t_j (Σ a_i t_i <x_i, x_j> + b) ≤ 0 THEN a_j = a_j + 1. Observation: the data appear only inside inner products!
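A minimal sketch of the dual update, where the data enter only through kernel evaluations (the toy data and `kernel` argument are illustrative assumptions):

```python
import numpy as np

def dual_perceptron(X, t, kernel, epochs=100):
    """Dual perceptron: mistake counts a_i instead of a weight vector."""
    n = len(X)
    a = np.zeros(n)
    b = 0.0
    # Precompute the kernel matrix K_ij = K(x_i, x_j)
    K = np.array([[kernel(xi, xj) for xj in X] for xi in X])
    for _ in range(epochs):
        errors = 0
        for j in range(n):
            f_j = np.sum(a * t * K[:, j]) + b   # f(x_j) in dual form
            if t[j] * f_j <= 0:                 # mistake on x_j
                a[j] += 1                       # a_j = a_j + 1
                b += t[j]
                errors += 1
        if errors == 0:
            break
    return a, b

# With a linear kernel this reproduces the primal perceptron's behaviour
X = np.array([[2.0, 2.0], [1.0, 3.0], [-2.0, -1.0], [-1.0, -3.0]])
t = np.array([1, 1, -1, -1])
a, b = dual_perceptron(X, t, kernel=lambda u, v: u @ v)
f = lambda x: np.sum(a * t * np.array([x_i @ x for x_i in X])) + b
assert all(t_j * f(x_j) > 0 for x_j, t_j in zip(X, t))
```

Swapping the `kernel` argument for a non-linear kernel is all it takes to learn non-linear separations, which is the point of the dual form.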

9 Limitations of the Perceptron Only linear separations. Converges only for linearly separable data. Defined only on vectorial data.

10 The idea of a Kernel Embed the data into a different space, possibly of higher dimension, where they become linearly separable. Original Problem Transformed Problem

11 Kernel Mapping We only need to compute inner products. Mapping: M(x). Kernel: K(x,y) = <M(x), M(y)>. The dimensionality of M(x) is unimportant! We only need to compute K(x,y). To use it in the embedded space: replace <x,y> by K(x,y).

12 Example x = (x_1, x_2); z = (z_1, z_2); K(x,z) = (<x,z>)² = (x_1 z_1 + x_2 z_2)² = x_1² z_1² + 2 x_1 x_2 z_1 z_2 + x_2² z_2² = <M(x), M(z)>, with M(x) = (x_1², √2 x_1 x_2, x_2²).
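This identity is easy to verify numerically; a sketch with illustrative points (the specific x and z are assumptions):

```python
import numpy as np

def M(x):
    # Explicit embedding for K(x,z) = (<x,z>)^2 in two dimensions
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2) * x1 * x2, x2 * x2])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

k_direct = (x @ z) ** 2       # kernel evaluated in the original space
k_embedded = M(x) @ M(z)      # inner product after the explicit embedding
assert np.isclose(k_direct, k_embedded)
```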

13 Polynomial Kernel Original Problem Transformed Problem

14 Kernel Matrix K_ij = K(x_i, x_j): symmetric and positive semi-definite for any valid kernel.
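A quick NumPy sketch that builds a kernel matrix and checks these properties (the random data and the Gaussian kernel with σ² = 1 are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))

# Gram (kernel) matrix for the Gaussian kernel: K_ij = exp(-||x_i - x_j||^2 / 2)
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq_dists / 2.0)

# A valid kernel matrix is symmetric positive semi-definite
assert np.allclose(K, K.T)
eigvals = np.linalg.eigvalsh(K)
assert eigvals.min() > -1e-9
```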

15 Examples of Basic Kernels Polynomial: K(x,z) = (<x,z>)^d. Gaussian: K(x,z) = exp{-||x-z||² / 2σ²}.

16 Kernels: Closure Properties K(x,z) = K_1(x,z) + c (for c ≥ 0); K(x,z) = c·K_1(x,z) (for c ≥ 0); K(x,z) = K_1(x,z)·K_2(x,z); K(x,z) = K_1(x,z) + K_2(x,z). Create new kernels from basic ones!
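Each closure rule can be sanity-checked numerically: the resulting kernel matrix must stay positive semi-definite. A sketch assuming NumPy, with illustrative random data:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(15, 2))

K1 = X @ X.T                                  # linear kernel matrix
sq = ((X[:, None] - X[None, :]) ** 2).sum(-1)
K2 = np.exp(-sq / 2.0)                        # Gaussian kernel matrix

def is_psd(K):
    """True if the (symmetrized) matrix has no significantly negative eigenvalue."""
    return np.linalg.eigvalsh((K + K.T) / 2).min() > -1e-9

c = 3.0
assert is_psd(K1 + c)      # K1(x,z) + c, c >= 0
assert is_psd(c * K1)      # c * K1(x,z)
assert is_psd(K1 * K2)     # elementwise (Schur) product: K1(x,z) * K2(x,z)
assert is_psd(K1 + K2)     # sum of kernels
```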

17 Support Vector Machines Linear Learning Machines (LLM) that use the dual representation and work in the kernel-induced feature space: f(x) = Σ a_i t_i K(x_i, x) + b. Which hyperplane should we select?

18 Generalization of SVM PAC theory: error = O(VC-dim / m). Problem: VC-dim >> m. No preference between consistent hyperplanes.

19 Margin based bounds H: basic hypothesis class. conv(H): finite convex combinations of H. D: distribution over X × {+1,-1}. S: sample of size m drawn from D.

20 Margin based bounds THEOREM: for every f in conv(H), the true error is bounded by the fraction of the sample with margin below γ plus a term that shrinks with the margin γ and the sample size m, independently of the dimension.

21 Maximal Margin Classifier Maximizes the margin, minimizing the overfitting due to hyperplane selection. Increases the margin rather than reducing dimensionality.

22 SVM: Support Vectors

23 Margins Functional margin: min_i t_i f(x_i). Geometric margin: min_i t_i f(x_i) / ||w||.

24 Main trick in SVM Insist on a functional margin of at least 1: the support vectors have functional margin exactly 1, so the geometric margin is 1 / ||w||.

25 SVM criteria Find a hyperplane (w,b) that minimizes ||w||² = <w,w> subject to: for all i, t_i (<w, x_i> + b) ≥ 1.

26 Quadratic Programming Quadratic goal function, linear constraints, a unique optimum, and polynomial-time algorithms.

27 Dual Problem Maximize W(a) = Σ_i a_i - 1/2 Σ_{i,j} a_i t_i a_j t_j K(x_i, x_j) subject to Σ_i a_i t_i = 0 and a_i ≥ 0.
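This dual is a standard quadratic program that off-the-shelf solvers handle; as a sketch, assuming scikit-learn is installed (the toy data and the large C used to approximate a hard margin are illustrative assumptions), its SVC solver exposes the support vectors and the dual coefficients a_i t_i:

```python
import numpy as np
from sklearn.svm import SVC

# Linearly separable toy data, labels in {-1, +1}
X = np.array([[2.0, 2.0], [1.0, 3.0], [3.0, 1.0],
              [-2.0, -1.0], [-1.0, -3.0], [-3.0, -2.0]])
t = np.array([1, 1, 1, -1, -1, -1])

# A very large C approximates the hard-margin maximal margin classifier
clf = SVC(kernel="linear", C=1e6).fit(X, t)

# clf.support_ indexes the support vectors; clf.dual_coef_ holds a_i t_i
assert (clf.predict(X) == t).all()
assert len(clf.support_) < len(X)   # only the margin points get non-zero a_i
```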

28 Applications: Text Classify a text into given categories: sports, news, business, science, … Feature space: bag of words: a huge, sparse vector!

29 Applications: Text Practicalities: M_w(x) = tf_w · log(idf_w) / K, where tf_w = term frequency of w in the text, idf_w = inverse document frequency = (# documents) / (# documents containing w), and K normalizes the vector. Inner products of sparse vectors are cheap. SVM finds a hyperplane in "document space".
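The weighting above can be sketched in plain Python (the three toy documents are illustrative assumptions, and the normalizer K is omitted for brevity):

```python
import math
from collections import Counter

docs = [
    "stocks rally as markets rise",
    "team wins the championship game",
    "markets fall on weak earnings",
]

N = len(docs)
df = Counter()                       # document frequency of each word
for d in docs:
    df.update(set(d.split()))

def tfidf(doc):
    """Sparse vector with M_w = tf_w * log(N / df_w), per the slide's weighting."""
    tf = Counter(doc.split())
    return {w: tf[w] * math.log(N / df[w]) for w in tf}

v = tfidf(docs[0])
assert v["rally"] > 0                # word unique to one document: high idf
assert v["markets"] < v["rally"]     # "markets" occurs in 2 of 3 docs: lower idf
```

Because each document touches only a few words of the vocabulary, the resulting dicts are sparse, and the inner product between two documents only iterates over their shared keys.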

