Support Vector Machines and Kernel Methods Machine Learning March 25, 2010.


1 Support Vector Machines and Kernel Methods Machine Learning March 25, 2010

2 Last Time Recap of Support Vector Machines

3 Kernel Methods Points that are not linearly separable in 2 dimensions might be linearly separable in 3.

4 Kernel Methods Points that are not linearly separable in 2 dimensions might be linearly separable in 3.

5 Kernel Methods We will look at a way to add dimensionality to the data in order to make it linearly separable. In the extreme, we can construct a dimension for each data point, but this may lead to overfitting.

6 Remember the Dual? Primal Dual

7 Basis of Kernel Methods The decision process doesn’t depend directly on the dimensionality of the data, so we can map the data to a higher-dimensional space. Note: data points only appear within a dot product. The objective function is based on the dot product of data points – not the data points themselves.

8 Basis of Kernel Methods Since data points only appear within a dot product, we can map to another space by replacing that dot product. The objective function is based on the dot product of data points – not the data points themselves.
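
The equations on these slides did not survive extraction. As a sketch only, assuming the standard hard-margin SVM dual with labels t_i in {-1, +1} and a feature map ϕ (this notation is an assumption, not taken from the slides), the replacement swaps each dot product of data points for a dot product of mapped points:

```latex
% Dual objective in the original space (assumed standard form)
\tilde{L}(\alpha) = \sum_{i} \alpha_i
  - \frac{1}{2} \sum_{i} \sum_{j} \alpha_i \alpha_j t_i t_j \,(x_i \cdot x_j)
\qquad \text{s.t. } \alpha_i \ge 0,\; \sum_{i} \alpha_i t_i = 0

% Same objective after replacing x_i \cdot x_j with \phi(x_i) \cdot \phi(x_j)
\tilde{L}(\alpha) = \sum_{i} \alpha_i
  - \frac{1}{2} \sum_{i} \sum_{j} \alpha_i \alpha_j t_i t_j \,\bigl(\phi(x_i) \cdot \phi(x_j)\bigr)
```

Nothing else in the objective changes, which is why the mapping can be introduced without touching the optimization procedure.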

9 Kernels The objective function is based on a dot product of data points, rather than the data points themselves. We can represent this dot product as a kernel – kernel function, kernel matrix. The dimensionality of K(x_i, x_j) is finite (if large) and unrelated to the dimensionality of x.
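
A minimal sketch of the kernel (Gram) matrix, assuming a plain linear kernel and three made-up 3-dimensional points; the function names are illustrative. The matrix is N x N for N points, whatever the dimensionality of each x:

```python
def linear_kernel(x, z):
    """Ordinary dot product, used here as the simplest possible kernel."""
    return sum(a * b for a, b in zip(x, z))


def gram_matrix(points, kernel):
    """Return the N x N matrix with entries K[i][j] = kernel(x_i, x_j)."""
    return [[kernel(xi, xj) for xj in points] for xi in points]


# Three illustrative points in 3 dimensions (not from the slides).
points = [(1.0, 0.0, 2.0), (0.0, 1.0, 1.0), (1.0, 1.0, 0.0)]
K = gram_matrix(points, linear_kernel)

for row in K:
    print(row)
```

Adding more features to each point would change the entries of K but not its 3 x 3 shape.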

10 Kernels Kernels are a mapping

11 Kernels Gram Matrix: Consider the following Kernel:

12 Kernels Gram Matrix: Consider the following Kernel:

13 Kernels In general we don’t need to know the form of ϕ. Just specifying the kernel function is sufficient. A good kernel: computing K(x_i, x_j) is cheaper than computing ϕ(x_i).
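
To make this concrete, here is a small sketch using an assumed example kernel, K(x, z) = (x · z)^2 on 2-dimensional inputs, whose feature map happens to be known: ϕ(x) = (x1^2, sqrt(2)·x1·x2, x2^2). This kernel is chosen for illustration and is not necessarily the one shown on the slides. Both routes give the same value, but the kernel never builds ϕ:

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def phi(x):
    """Explicit feature map for the assumed kernel (x . z)^2 in 2 dimensions."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def quadratic_kernel(x, z):
    """Same quantity computed without ever constructing phi."""
    return dot(x, z) ** 2


x, z = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(z))      # map first, then take the dot product
implicit = quadratic_kernel(x, z)   # kernel evaluated in the input space

print(explicit, implicit)
```

For higher polynomial degrees or higher input dimension the explicit map grows quickly, while the kernel stays a single dot product and a power.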

14 Kernels Valid Kernels: – Symmetric – Must be decomposable into ϕ functions. Harder to show; equivalent to the Gram matrix being positive semi-definite (psd). Determining psd: – all eigenvalues are non-negative – a sufficient (but not necessary) check: each diagonal entry is at least the sum of the absolute values of the off-diagonal entries in its row.
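
A hedged sketch of both checks, restricted to symmetric 2 x 2 matrices so the eigenvalues have a closed form (the helper names and test matrices are illustrative). It also shows why the diagonal test is only a sufficient condition: a matrix can be psd without being diagonally dominant.

```python
import math


def eigenvalues_2x2_symmetric(m):
    """Closed-form eigenvalues of a symmetric 2 x 2 matrix [[a, b], [b, d]]."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    return (mean - radius, mean + radius)


def is_psd_2x2(m, tol=1e-12):
    """psd means every eigenvalue is non-negative (up to rounding)."""
    return min(eigenvalues_2x2_symmetric(m)) >= -tol


def is_diagonally_dominant(m):
    """Each diagonal entry >= sum of absolute off-diagonal entries in its row."""
    return all(
        row[i] >= sum(abs(v) for j, v in enumerate(row) if j != i)
        for i, row in enumerate(m)
    )


dominant = [[2.0, 1.0], [1.0, 3.0]]      # passes both checks
not_dominant = [[1.0, 2.0], [2.0, 4.0]]  # eigenvalues 0 and 5: psd, not dominant
indefinite = [[1.0, 2.0], [2.0, 1.0]]    # eigenvalues -1 and 3: not psd

for name, m in [("dominant", dominant), ("not_dominant", not_dominant),
                ("indefinite", indefinite)]:
    print(name, is_psd_2x2(m), is_diagonally_dominant(m))
```

For larger Gram matrices a numerical eigenvalue routine would replace the closed form; the logic of the check stays the same.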

15 Kernels Given valid kernels K(x,z) and K’(x,z), more kernels can be made from them: – cK(x,z) for c > 0 – K(x,z)+K’(x,z) – K(x,z)K’(x,z) – exp(K(x,z)) – …and more
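
These closure rules can be sketched as small combinators that take kernel functions and return new ones. The helper names below are hypothetical, and the linear kernel is used only as a convenient valid starting point:

```python
import math


def linear(x, z):
    return sum(a * b for a, b in zip(x, z))


def scale(kernel, c):
    """c * K(x, z), valid for c > 0."""
    return lambda x, z: c * kernel(x, z)


def add(k1, k2):
    """K(x, z) + K'(x, z)."""
    return lambda x, z: k1(x, z) + k2(x, z)


def multiply(k1, k2):
    """K(x, z) * K'(x, z)."""
    return lambda x, z: k1(x, z) * k2(x, z)


def exponentiate(kernel):
    """exp(K(x, z))."""
    return lambda x, z: math.exp(kernel(x, z))


quadratic = multiply(linear, linear)
combined = add(scale(linear, 0.5), exponentiate(quadratic))

x, z = (1.0, 0.0), (0.5, 0.5)
print(combined(x, z), combined(z, x))  # a valid kernel must be symmetric
```

Checking symmetry on one pair of points is only a sanity check, not a proof of validity; the guarantee comes from the closure rules themselves.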

16 Incorporating Kernels in SVMs Optimize the α_i’s and the bias w.r.t. the kernel. Decision function:
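
The decision function itself is missing from the transcript. A minimal sketch, assuming the usual kernelised form f(x) = sign(sum_i α_i t_i K(x_i, x) + b) with labels t_i in {-1, +1}; the function name and the two-point toy problem are illustrative, not from the slides:

```python
def linear_kernel(x, z):
    return sum(a * b for a, b in zip(x, z))


def decision(x, support_vectors, labels, alphas, bias, kernel):
    """Classify x using only kernel evaluations against the support vectors."""
    score = bias + sum(
        a * t * kernel(sv, x)
        for sv, t, a in zip(support_vectors, labels, alphas)
    )
    return 1 if score >= 0 else -1


# Toy problem: two points, one per class. With a linear kernel the
# max-margin solution has alpha = 0.25 for both points and bias = 0.
support_vectors = [(1.0, 1.0), (-1.0, -1.0)]
labels = [1, -1]
alphas = [0.25, 0.25]
bias = 0.0

print(decision((2.0, 0.5), support_vectors, labels, alphas, bias, linear_kernel))
print(decision((-3.0, 1.0), support_vectors, labels, alphas, bias, linear_kernel))
```

Swapping `linear_kernel` for any other valid kernel changes the decision boundary without changing this function.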

17 Some popular kernels Polynomial Kernel Radial Basis Functions String Kernels Graph Kernels

18 Polynomial Kernels The dot product in the mapped space is a polynomial power of the original dot product. If c is large, the focus is on linear terms; if c is small, the focus is on higher-order terms. Very fast to calculate.
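
The kernel formula on this slide was lost in extraction. A sketch assuming the common form K(x, z) = (x · z + c)^d, with d = 2 so the expansion (x · z)^2 + 2c(x · z) + c^2 can be inspected term by term; the helper `degree2_terms` is hypothetical and exists only to show how c shifts weight between the terms:

```python
def dot(x, z):
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, c=1.0, degree=2):
    """Assumed form: (x . z + c) ** degree."""
    return (dot(x, z) + c) ** degree


def degree2_terms(x, z, c):
    """Split the degree-2 kernel into (quadratic, linear, constant) parts."""
    s = dot(x, z)
    return (s * s, 2.0 * c * s, c * c)


x, z = (1.0, 2.0), (3.0, -1.0)
small_c = degree2_terms(x, z, c=0.1)   # quadratic part outweighs the linear part
large_c = degree2_terms(x, z, c=10.0)  # linear part outweighs the quadratic part

print(polynomial_kernel(x, z))
print(small_c, large_c)
```

The cost is one dot product plus a power, regardless of how many monomials the implicit feature space contains.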

19 Radial Basis Functions The inner product of two points is related to the distance in space between the two points. This amounts to placing a bump on each point.
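
The formula is not in the transcript. A sketch assuming the common Gaussian form K(x, z) = exp(-||x - z||^2 / (2σ^2)), where the width parameter `sigma` is an assumption:

```python
import math


def rbf_kernel(x, z, sigma=1.0):
    """Gaussian RBF: 1 when x == z, decaying toward 0 as the points move apart."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


origin = (0.0, 0.0)
near = (0.5, 0.0)
far = (3.0, 4.0)

print(rbf_kernel(origin, origin))
print(rbf_kernel(origin, near), rbf_kernel(origin, far))
```

A smaller `sigma` makes each bump narrower, so only very close points count as similar.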

20 String kernels Not a Gaussian, but still a legitimate kernel: – K(s,s’) = difference in length – K(s,s’) = count of different letters – K(s,s’) = minimum edit distance. Kernels allow for infinite dimensional inputs. – The kernel is a FUNCTION defined over the input space. We don’t need to specify the input space exactly. We don’t need to manually encode the input.

21 Graph Kernels Define the kernel function based on graph properties. – These properties must be computable in poly-time: walks of length < k, paths, spanning trees, cycles. Kernels allow us to incorporate knowledge about the input without direct “feature extraction”. – Just similarity in some space.

22 Where else can we apply Kernels? Anywhere that the dot product of x is used in an optimization. Perceptron:
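
The perceptron equations are missing from the transcript. A minimal sketch of a kernel perceptron, assuming the dual form in which each training point keeps a mistake count α_i and the score is sum_i α_i t_i K(x_i, x); the XOR-style data and the quadratic kernel are illustrative choices, not from the slides:

```python
def quadratic_kernel(x, z):
    return sum(a * b for a, b in zip(x, z)) ** 2


def score(x, points, labels, alphas, kernel):
    """Dual-form perceptron score: only kernel evaluations, no weight vector."""
    return sum(a * t * kernel(p, x) for a, t, p in zip(alphas, labels, points))


def train_kernel_perceptron(points, labels, kernel, max_epochs=10):
    alphas = [0] * len(points)
    for _ in range(max_epochs):
        mistakes = 0
        for i, (x, t) in enumerate(zip(points, labels)):
            if t * score(x, points, labels, alphas, kernel) <= 0:
                alphas[i] += 1  # record a mistake on point i
                mistakes += 1
        if mistakes == 0:
            break  # every training point is classified correctly
    return alphas


def predict(x, points, labels, alphas, kernel):
    return 1 if score(x, points, labels, alphas, kernel) > 0 else -1


# XOR-like data: not linearly separable in the original 2 dimensions.
points = [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]
labels = [1, 1, -1, -1]

alphas = train_kernel_perceptron(points, labels, quadratic_kernel)
print(alphas, [predict(p, points, labels, alphas, quadratic_kernel) for p in points])
```

With a linear kernel the same loop would never reach zero mistakes on this data; the kernel swap is the only change needed to separate it.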

23 Kernels in Clustering In clustering, it’s very common to define cluster similarity by the distance between points – k-nn (k-means). This distance can be replaced by a kernel. We’ll return to this more in the section on unsupervised techniques.
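
One way to see the replacement, as a sketch: the squared distance between two mapped points expands into kernel evaluations only, ||ϕ(x) - ϕ(z)||^2 = K(x,x) - 2K(x,z) + K(z,z). The function names below are illustrative, and the RBF form is an assumed example kernel:

```python
import math


def linear_kernel(x, z):
    return sum(a * b for a, b in zip(x, z))


def rbf_kernel(x, z, sigma=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_sq_distance(x, z, kernel):
    """Squared distance in the kernel's feature space, via kernel values only."""
    return kernel(x, x) - 2.0 * kernel(x, z) + kernel(z, z)


x, z = (1.0, 2.0), (4.0, 6.0)

# With the linear kernel this reduces to the ordinary squared Euclidean distance.
print(kernel_sq_distance(x, z, linear_kernel))
# With another kernel, the same formula measures distance in that feature space.
print(kernel_sq_distance(x, z, rbf_kernel))
```

A distance-based clustering routine can call `kernel_sq_distance` wherever it would otherwise compute a Euclidean distance.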

24 Bye Next time – Logistic Regression

