Presentation on theme: "Support Vector Machine (SVM) Presented by Robert Chen."— Presentation transcript:

1 Support Vector Machine (SVM) Presented by Robert Chen

2 Introduction A high-level explanation of SVMs. An SVM is a way to classify data; here we are interested in text classification.

3 What is an SVM “In essence, an SVM is a mathematical entity, an algorithm (or recipe) for maximizing a particular mathematical function with respect to a given collection of data.” — William S. Noble

4 What is an SVM It is a computer algorithm that learns from the training data we provide in order to categorize new data in future cases. An SVM can't cluster data, it can only classify it; we use SVD to cluster the data.
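The train-then-classify workflow described above can be sketched in a few lines. This is an illustration using scikit-learn (not part of the original slides), and the toy spam/ham sentences are invented for the example:

```python
# Minimal sketch of the SVM text-classification workflow, assuming
# scikit-learn is available; documents and labels are made up.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

train_texts = ["cheap pills online", "meeting at noon",
               "win money now", "lunch tomorrow?"]
train_labels = ["spam", "ham", "spam", "ham"]

vec = TfidfVectorizer()
X = vec.fit_transform(train_texts)     # documents -> feature vectors

clf = SVC(kernel="linear")             # a linear SVM classifier
clf.fit(X, train_labels)               # learn from the training data

# Categorize a new, unseen document
pred = clf.predict(vec.transform(["win cheap money"]))[0]
print(pred)
```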

5 SVM hyperplanes 1) Separating hyperplane – 1-D, 2-D, 3-D 2) Maximum-margin hyperplane – separates the classes while maintaining the maximal distance from any of the given expression profiles 3) Soft-margin hyperplane – generalized optimal hyperplane (the name used in Vapnik's book)

6 Soft-Margin Hyperplane Allows some outlier data points to push their way through the margin of the separating hyperplane without affecting the final result. “Soft margin parameter specifies a trade-off between hyperplane violations and the size of the margin.” — W. Noble
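The trade-off quoted above corresponds to the `C` parameter in scikit-learn's `SVC` (an illustration with invented 1-D data, not from the slides): a small `C` tolerates margin violations by outliers and keeps the margin wide, a large `C` penalizes them and shrinks the margin.

```python
# Sketch of the soft-margin trade-off; the data points are arbitrary.
import numpy as np
from sklearn.svm import SVC

# Two well-separated clusters plus one positive outlier at 3.5
X = np.array([[0.0], [1.0], [2.0], [8.0], [9.0], [10.0], [3.5]])
y = np.array([-1, -1, -1, 1, 1, 1, 1])

hard = SVC(kernel="linear", C=1e6).fit(X, y)  # near-hard margin: honors outlier
soft = SVC(kernel="linear", C=0.1).fit(X, y)  # soft margin: lets outlier violate

# margin width = 2 / |w|; the soft margin is wider
hard_margin = 2 / abs(hard.coef_[0, 0])
soft_margin = 2 / abs(soft.coef_[0, 0])
print(hard_margin, soft_margin)
```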

7 Soft-Margin Hyperplane Suggested by Corinna Cortes and Vladimir Vapnik in 1995. Won the 2008 ACM Paris Kanellakis Award.

8 Kernel function A mathematical solution to determining the hyperplane when: 1) there is no clear boundary; 2) a soft margin doesn't help.

9 Kernel Function Projects data from a low-dimensional space to a high-dimensional space. We then project the SVM hyperplane found in that space back to a lower, drawable space such as 2-D. Kernels with a very high dimension can result in the SVM overfitting the data.
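A standard way to see the low-to-high projection idea at work (a scikit-learn sketch, not part of the slides): concentric circles cannot be split by any line in 2-D, but an RBF-kernel SVM separates them by implicitly working in a higher-dimensional space.

```python
# Non-linearly-separable 2-D data: a linear SVM fails, an RBF SVM succeeds.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear_acc = SVC(kernel="linear").fit(X, y).score(X, y)  # near chance level
rbf_acc = SVC(kernel="rbf").fit(X, y).score(X, y)        # near perfect
print(linear_acc, rbf_acc)
```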

10 Types of Kernels linear: K(x_i, x_j) = x_i^T x_j polynomial: K(x_i, x_j) = (γ x_i^T x_j + r)^d, γ > 0 radial basis function (RBF): K(x_i, x_j) = exp(−γ ||x_i − x_j||^2), γ > 0 sigmoid: K(x_i, x_j) = tanh(γ x_i^T x_j + r)
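The four kernels above can be written out directly (a NumPy sketch; the sample vectors and the values of γ, r, d are arbitrary illustrations):

```python
# Evaluating each kernel on two sample vectors.
import numpy as np

xi = np.array([1.0, 2.0])
xj = np.array([0.5, -1.0])
gamma, r, d = 0.5, 1.0, 3

linear_k     = xi @ xj                                    # x_i^T x_j
polynomial_k = (gamma * (xi @ xj) + r) ** d               # (γ x_i^T x_j + r)^d
rbf_k        = np.exp(-gamma * np.linalg.norm(xi - xj) ** 2)
sigmoid_k    = np.tanh(gamma * (xi @ xj) + r)

print(linear_k, polynomial_k, rbf_k, sigmoid_k)
```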

11 Notes radial basis function (RBF): K(x_i, x_j) = exp(−γ ||x_i − x_j||^2), γ > 0 A radial basis function (RBF) kernel is equivalent to mapping the data into an infinite-dimensional Hilbert space.

12 Example Data Set: 1-dimensional set Class, x_1 +1, 0 −1, 1 −1, 2 +1, 3 Φ(x_1) = (x_1, x_1^2)
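This 1-D set is not linearly separable on the line (the −1 points sit between the +1 points), but the mapping Φ(x) = (x, x²) makes it separable in 2-D, where the slides' hyperplane −3x₁ + x₂ + 1 = 0 splits the classes. A NumPy sketch (not part of the slides):

```python
# Mapping the 1-D example data into 2-D with Φ(x) = (x, x^2) and checking
# it against the hyperplane w = (-3, 1), b = 1 derived later in the deck.
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
labels = np.array([1, -1, -1, 1])

phi = np.column_stack([x, x ** 2])   # Φ(x) = (x, x^2)
w, b = np.array([-3.0, 1.0]), 1.0

scores = phi @ w + b                 # signed distance scaled by ||w||
print(scores)                        # each score matches its label's sign
```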

13 Support Vectors w · x + b = +1 (positive labels) (1) w · x + b = −1 (negative labels) (2) w · x + b = 0 (hyperplane) (3) Any vectors lying on expressions (1) or (2) are support vectors.

14 Importance of Support Vectors in SVM The complexity of an SVM depends on the number of support vectors rather than on the dimensionality of the feature space.

15 Positive labels w_1 x_1 + w_2 x_2 + b = +1 For the point (0, 0): w_1·0 + w_2·0 + b = +1 For the point (3, 9): w_1·3 + w_2·9 + b = +1

16 Negative labels For the point (1, 1): w_1·1 + w_2·1 + b = −1 For the point (2, 4): w_1·2 + w_2·4 + b = −1 Solving the four equations gives w_1 = −3, w_2 = 1, b = 1
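The four support-vector equations from slides 15 and 16 form a linear system in (w₁, w₂, b); solving it numerically reproduces the values above. A NumPy sketch (not part of the slides):

```python
# Each row is [x1, x2, 1] for the points (0,0), (3,9), (1,1), (2,4);
# the right-hand side is the label (+1 or -1).
import numpy as np

A = np.array([[0.0, 0.0, 1.0],
              [3.0, 9.0, 1.0],
              [1.0, 1.0, 1.0],
              [2.0, 4.0, 1.0]])
rhs = np.array([1.0, 1.0, -1.0, -1.0])

w1, w2, b = np.linalg.lstsq(A, rhs, rcond=None)[0]
print(w1, w2, b)   # ≈ -3, 1, 1
```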

17 Hyperplane w_1 x_1 + w_2 x_2 + b = 0 −3x_1 + x_2 + 1 = 0 x_2 = 3x_1 − 1 Points on the hyperplane (x_1, x_2): (0, −1), (1, 2), (2, 5), (3, 8)

18 Maximum-Margin Hyperplane margin = 2/sqrt(w · w) = 2/sqrt((−3)^2 + 1^2) = 2/sqrt(10) ≈ 0.632456

19 Recommended Article “What is a support vector machine?” by William S. Noble

20 Recommended Article “Support Vector Machines for Text Categorization” by A. Basu, C. Watters, and M. Shepherd, Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada B3H 1W5, {basu | watters | shepherd}@cs.dal.ca

21 Recommended Book The Nature of Statistical Learning Theory By Vladimir N. Vapnik

22 Library doesn’t have this book. Author: Thorsten Joachims

23 Thank you Questions? Comments?

24 Multiclass SVM Multiclass ranking SVMs, in which one SVM decision function attempts to classify all classes. One-against-all classification, in which there is one binary SVM for each class to separate members of that class from members of other classes. Pairwise classification, in which there is one binary SVM for each pair of classes to separate members of one class from members of the other.
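Two of the three strategies above are available off the shelf in scikit-learn (an illustration, not from the slides): one-against-all via `OneVsRestClassifier` and pairwise via `OneVsOneClassifier`, each wrapping one binary SVM per class or per class pair.

```python
# One-vs-rest trains one binary SVM per class; one-vs-one trains one per
# pair of classes. For the 3-class iris data both happen to need 3 SVMs
# (3 classes, and C(3,2) = 3 pairs).
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)   # 3 classes

ovr = OneVsRestClassifier(SVC(kernel="linear")).fit(X, y)
ovo = OneVsOneClassifier(SVC(kernel="linear")).fit(X, y)

print(len(ovr.estimators_), len(ovo.estimators_))
```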


