Support Vector Machines

(Graphic generated with the Lucent Technologies 2-D Pattern Recognition Demonstration Applet at http://svm.research.bell-labs.com/SVT/SVMsvt.html)

Separating Line (or Hyperplane)
[figure: points from class 1 and class -1 separated by a line]
Goal: Find the best line (or hyperplane) to separate the training data. How do we formalize this? In two dimensions, the equation of the line is given by w1 x1 + w2 x2 = γ (for example, the line 2 x1 + x2 = 4 has w = (2, 1) and γ = 4). Better notation for n dimensions: w · x = γ, where w is the normal vector and γ is a scalar offset.

The Simple Classifier
Points that fall on the right of the line are classified as "1"; points that fall on the left are classified as "-1". In other words, a point x gets the label sign(w · x − γ). Therefore: using the training set, find a hyperplane (line) so that w · x_i > γ whenever y_i = 1 and w · x_i < γ whenever y_i = −1. This is a perceptron! How can we improve on the perceptron?
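
Below is a minimal sketch of this simple classifier in Python; the weight vector w and offset γ are the illustrative values from the previous slide, not learned from any data.

```python
import numpy as np

# Illustrative (not learned) parameters: the line 2*x1 + 1*x2 = 4.
w = np.array([2.0, 1.0])   # normal vector of the separating line
gamma = 4.0                # offset

def classify(x):
    # Points with w . x > gamma fall on one side ("1"),
    # points with w . x < gamma on the other ("-1").
    return 1 if w @ x > gamma else -1

print(classify(np.array([3.0, 3.0])))  # -> 1
print(classify(np.array([0.0, 0.0])))  # -> -1
```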

Finding the Best Plane
Not all planes are equal. Which of the two planes shown is better? [figure: two candidate separating planes over the same data] Both planes accurately classify the training set, but the green plane is the better choice, since it is further away from the data and therefore more likely to do well on future test data.

Separating the Planes
Construct the bounding planes: draw two planes parallel to the classification plane, w · x = γ + 1 and w · x = γ − 1, and push them as far apart as possible until they hit data points. The classification plane whose bounding planes are furthest apart is the best one.

Recap: Finding the Best Plane (Details)
All points in class 1 should be to the right of bounding plane 1: w · x_i ≥ γ + 1. All points in class -1 should be to the left of bounding plane -1: w · x_i ≤ γ − 1. Pick y_i to be +1 or −1 depending on the classification; then the above two inequalities can be written as one: y_i (w · x_i − γ) ≥ 1. The distance between the bounding planes should be maximized, and that distance is given by 2 / ||w||.
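
The slides state the distance formula without proof; a standard one-line derivation, added here for completeness, is:

```latex
% Take a point x_0 on the lower bounding plane, so w \cdot x_0 = \gamma - 1,
% and move along the unit normal w/\|w\| until the upper plane is reached:
\[
w \cdot \left(x_0 + t\,\frac{w}{\|w\|}\right) = \gamma + 1
\;\Longleftrightarrow\; (\gamma - 1) + t\,\|w\| = \gamma + 1
\;\Longleftrightarrow\; t = \frac{2}{\|w\|}.
\]
```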

The Optimization Problem
Maximizing the distance 2 / ||w|| is the same as minimizing ||w||, so the previous slide can be rewritten as: minimize (1/2) ||w||² such that y_i (w · x_i − γ) ≥ 1 for all training points i. This is a mathematical program: an optimization problem subject to constraints. More specifically, it is a quadratic program. There are high-powered software tools for solving this kind of problem (both commercial and academic), so no special algorithms are necessary (in theory...). Just enter this problem and the associated data into a quadratic programming solver (like CPLEX), and let it find an answer.
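
As a minimal sketch, here is the same quadratic program stated in cvxpy, an open-source modeling tool, rather than CPLEX; the four toy training points are made up for illustration.

```python
import numpy as np
import cvxpy as cp

# Toy linearly separable data: two points per class (illustrative values).
X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, 0.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

w = cp.Variable(2)      # normal vector of the separating plane
gamma = cp.Variable()   # offset

# Maximize the margin 2/||w|| by minimizing ||w||^2 / 2,
# subject to y_i (w . x_i - gamma) >= 1 for every training point.
constraints = [cp.multiply(y, X @ w - gamma) >= 1]
problem = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(w)), constraints)
problem.solve()

print("w =", w.value, " gamma =", gamma.value)
```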

Data Which Is Not Linearly Separable
What if a separating plane does not exist? [figure: two overlapping classes, with a misclassified point marked as an error] Find the plane that maximizes the margin and minimizes the errors on the training points. Take the original inequality and add a nonnegative slack variable ξ_i to measure the error: y_i (w · x_i − γ) ≥ 1 − ξ_i, with ξ_i ≥ 0.

The Support Vector Machine
Push the planes apart and minimize the error at the same time: minimize (1/2) ||w||² + C Σ_i ξ_i such that y_i (w · x_i − γ) ≥ 1 − ξ_i and ξ_i ≥ 0 for all i. C is a positive number chosen to balance these two goals. This problem is called a Support Vector Machine, or SVM. The SVM is one of many techniques for doing supervised machine learning; others include neural networks, decision trees, and k-nearest neighbor.
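
A hedged sketch of the same soft-margin problem using scikit-learn, whose SVC class solves this optimization internally; the toy data and the value of C are illustrative, not from the slides.

```python
import numpy as np
from sklearn.svm import SVC

# Toy data with one point, (0.5, 0.5), pushed toward the other class.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, 0.0], [0.5, 0.5]])
y = np.array([1, 1, -1, -1, -1])

# C trades margin width against training error: a larger C penalizes
# errors more heavily, a smaller C prefers a wider margin.
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)
print(clf.predict([[1.0, 2.0]]))
```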

Terminology
Those points that touch a bounding plane, or lie on the wrong side of it, are called support vectors. If all the data points except the support vectors were removed, the solution would turn out the same. The SVM solution is mathematically analogous to a system in force and torque equilibrium, with the support vectors "supporting" the separating plane (hence the name).
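
Continuing the scikit-learn sketch from the previous slide (still an assumed setup, not the original slides' CPLEX workflow), the fitted model exposes exactly these points:

```python
import numpy as np
from sklearn.svm import SVC

# Same toy data as in the previous sketch (illustrative values).
X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, 0.0], [0.5, 0.5]])
y = np.array([1, 1, -1, -1, -1])
clf = SVC(kernel="linear", C=1.0).fit(X, y)

# The support vectors are the points touching a bounding plane or lying on
# the wrong side of it; refitting on these points alone yields the same plane.
print(clf.support_vectors_)
print(clf.n_support_)   # number of support vectors in each class
```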

Research: Solving Massive SVMs
The standard SVM is solved using a canned quadratic programming (QP) solver. Problem: standard tools bring all the data into memory, so if the dataset is bigger than memory, you are out of luck. How do other supervised learning techniques handle data that does not fit in memory? Why not use virtual memory, and let the operating system manage which data the QP solver is using? Answer: the QP solver accesses data in a random, not sequential, fashion, so the cost of paging data in and out of memory is enormous.

What About Nonlinear Surfaces?
Some datasets may not be best separated by a plane. How can we build nonlinear separating surfaces? Simple method: map the data into a higher-dimensional space, and do the same thing we have already done. (Graphic generated with the Lucent Technologies 2-D Pattern Recognition Demonstration Applet at http://svm.research.bell-labs.com/SVT/SVMsvt.html)

Finding Nonlinear Surfaces
How do we modify the algorithm to find nonlinear surfaces? First idea (simple and effective): map each data point into a higher-dimensional space, and find a linear fit there. Example: to find a quadratic surface for a point x = (x1, x2), map it to new coordinates such as (x1², x2², x1 x2, x1, x2), and use these new coordinates in the regular linear SVM. A plane in this quadratic space is equivalent to a quadratic surface in our original space.
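
A small sketch of this explicit-mapping idea; the quadratic coordinates follow the mapping above, and the toy dataset (a class defined by distance from the origin) is made up to show a case where no line works in 2-D but a plane works in the lifted space.

```python
import numpy as np
from sklearn.svm import SVC

def quadratic_map(X):
    # Lift each 2-D point (x1, x2) to (x1^2, x2^2, x1*x2, x1, x2).
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([x1**2, x2**2, x1 * x2, x1, x2])

# Class +1 inside a circle, class -1 outside: not linearly separable in 2-D,
# but x1^2 + x2^2 < 0.5 is a plane in the lifted coordinates.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.where((X**2).sum(axis=1) < 0.5, 1, -1)

clf = SVC(kernel="linear", C=1.0).fit(quadratic_map(X), y)
print(clf.score(quadratic_map(X), y))   # near-perfect on the training set
```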

Problem & Solution
If the dimensionality of the mapped space is high, there are lots of calculations to do: for a high-degree polynomial space, the number of coordinate combinations explodes, and all these calculations must be done for every training point and again for each testing point. Infinite-dimensional spaces are outright impossible to compute in explicitly. Nonlinear surfaces can nevertheless be used without these problems through the use of a kernel function, which evaluates inner products in the mapped space directly from the original coordinates. Demonstration: http://svm.cs.rhul.ac.uk/pagesnew/GPat.shtml
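
A small numeric sketch of why this works, assuming the common degree-2 polynomial kernel k(x, z) = (x · z + 1)² (this particular kernel is chosen for illustration, not stated on the slide): the kernel computed from the original 2-D coordinates equals an inner product in a 6-dimensional space, so the lifted coordinates never have to be built.

```python
import numpy as np

def phi(x):
    # Explicit feature map matching the kernel k(x, z) = (x . z + 1)^2.
    x1, x2 = x
    return np.array([x1**2, x2**2,
                     np.sqrt(2) * x1 * x2,
                     np.sqrt(2) * x1,
                     np.sqrt(2) * x2,
                     1.0])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])
print((x @ z + 1) ** 2)   # kernel, computed in the original 2-D space
print(phi(x) @ phi(z))    # the same value, via the explicit 6-D map
```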

Example: Checkerboard
[figure: the checkerboard dataset, points of two classes arranged in alternating squares]

5-Nearest Neighbor
[figure: decision regions produced by a 5-nearest-neighbor classifier on the checkerboard]

Sixth-Degree Polynomial Kernel
[figure: decision regions produced by an SVM with a sixth-degree polynomial kernel on the checkerboard]