Presentation is loading. Please wait.

Presentation is loading. Please wait.

Support Vector Machines Part 2. Recap of SVM algorithm Given training set S = {(x 1, y 1 ), (x 2, y 2 ),..., (x m, y m ) | (x i, y i )   n  {+1, -1}

Similar presentations


Presentation on theme: "Support Vector Machines Part 2. Recap of SVM algorithm Given training set S = {(x 1, y 1 ), (x 2, y 2 ),..., (x m, y m ) | (x i, y i )   n  {+1, -1}"— Presentation transcript:

1 Support Vector Machines Part 2

2 Recap of SVM algorithm Given training set S = {(x 1, y 1 ), (x 2, y 2 ),..., (x m, y m ) | (x i, y i )   n  {+1, -1} 1.Choose a cheap-to-compute kernel function k(x,z) 2. Apply quadratic programming procedure (using the kernel function k) to find {  i }and bias b, where  i ≠ 0 only if x i is a support vector.

3 3.Now, given a new instance, x, find the classification of x by computing

4 Clarifications from last time

5 http://nlp.stanford.edu/IR-book/html/htmledition/img1260.png Without changing the problem, we can rescale our data to set a = 1 Length of the margin margin

6 w is perpendicular to decision boundary

7 More on Kernels So far we’ve seen kernels that map instances in  n to instances in  z where z > n. One way to create a kernel: Figure out appropriate feature space Φ(x), and find kernel function k which defines inner product on that space. More practically, we usually don’t know appropriate feature space Φ(x). What people do in practice is either: 1.Use one of the “classic” kernels (e.g., polynomial), or 2.Define their own function that is appropriate for their task, and show that it qualifies as a kernel.

8 How to define your own kernel Given training data (x 1, x 2,..., x n ) Algorithm for SVM learning uses kernel matrix (also called Gram matrix): We can choose some function k, and compute the kernel matrix K using the training data. We just have to guarantee that our kernel defines an inner product on some feature space. Not as hard as it sounds.

9 What counts as a kernel? Mercer’s Theorem: If the kernel matrix K is “symmetric positive definite”, it defines a kernel on the training data, that is, it defines an inner product in some feature space. We don’t even have to know what that feature space is! It can have a huge number of dimensions.

10

11

12 In-class exercises Note for part (c):


Download ppt "Support Vector Machines Part 2. Recap of SVM algorithm Given training set S = {(x 1, y 1 ), (x 2, y 2 ),..., (x m, y m ) | (x i, y i )   n  {+1, -1}"

Similar presentations


Ads by Google