CS 1699: Intro to Computer Vision Support Vector Machines Prof. Adriana Kovashka University of Pittsburgh October 29, 2015.

Slides:



Advertisements
Similar presentations
Lecture 9 Support Vector Machines
Advertisements

ECG Signal processing (2)
Image classification Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?
Support Vector Machine & Its Applications Mingyue Tan The University of British Columbia Nov 26, 2004 A portion (1/3) of the slides are taken from Prof.
SVM - Support Vector Machines A new classification method for both linear and nonlinear data It uses a nonlinear mapping to transform the original training.
An Introduction of Support Vector Machine
Appearance-based recognition & detection II Kristen Grauman UT-Austin Tuesday, Nov 11.

Pattern Recognition and Machine Learning
An Introduction of Support Vector Machine
Support Vector Machines
Quiz 1 on Wednesday ~20 multiple choice or short answer questions
Support vector machine
Machine learning continued Image source:
LOGO Classification IV Lecturer: Dr. Bo Yuan
LPP-HOG: A New Local Image Descriptor for Fast Human Detection Andy Qing Jun Wang and Ru Bo Zhang IEEE International Symposium.
Classification and Decision Boundaries
Discriminative and generative methods for bags of features
Support Vector Machine
Image classification Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?
Support Vector Machines (SVMs) Chapter 5 (Duda et al.)
University of Texas at Austin Machine Learning Group Department of Computer Sciences University of Texas at Austin Support Vector Machines.
1 Classification: Definition Given a collection of records (training set ) Each record contains a set of attributes, one of the attributes is the class.
Support Vector Machines Kernel Machines
Sketched Derivation of error bound using VC-dimension (1) Bound our usual PAC expression by the probability that an algorithm has 0 error on the training.
Support Vector Machines and Kernel Methods
Support Vector Machines
Review Rong Jin. Comparison of Different Classification Models  The goal of all classifiers Predicating class label y for an input x Estimate p(y|x)
Discriminative and generative methods for bags of features
An Introduction to Support Vector Machines Martin Law.
Machine learning Image source:
Classification III Tamara Berg CS Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell,
Ch. Eick: Support Vector Machines: The Main Ideas Reading Material Support Vector Machines: 1.Textbook 2. First 3 columns of Smola/Schönkopf article on.
Step 3: Classification Learn a decision rule (classifier) assigning bag-of-features representations of images to different classes Decision boundary Zebra.
This week: overview on pattern recognition (related to machine learning)
Classification 2: discriminative models
Support Vector Machine & Image Classification Applications
CS 8751 ML & KDDSupport Vector Machines1 Support Vector Machines (SVMs) Learning mechanism based on linear programming Chooses a separating plane based.
CSE 473/573 Computer Vision and Image Processing (CVIP) Ifeoma Nwogu Lecture 24 – Classifiers 1.
Support Vector Machines Mei-Chen Yeh 04/20/2010. The Classification Problem Label instances, usually represented by feature vectors, into one of the predefined.
1 SUPPORT VECTOR MACHINES İsmail GÜNEŞ. 2 What is SVM? A new generation learning system. A new generation learning system. Based on recent advances in.
INTRODUCTION TO ARTIFICIAL INTELLIGENCE Massimo Poesio LECTURE: Support Vector Machines.
计算机学院 计算感知 Support Vector Machines. 2 University of Texas at Austin Machine Learning Group 计算感知 计算机学院 Perceptron Revisited: Linear Separators Binary classification.
An Introduction to Support Vector Machine (SVM) Presenter : Ahey Date : 2007/07/20 The slides are based on lecture notes of Prof. 林智仁 and Daniel Yeung.
SVM Support Vector Machines Presented by: Anas Assiri Supervisor Prof. Dr. Mohamed Batouche.
Classifiers Given a feature representation for images, how do we learn a model for distinguishing features from different classes? Zebra Non-zebra Decision.
An Introduction to Support Vector Machines (M. Law)
Lecture 27: Recognition Basics CS4670/5670: Computer Vision Kavita Bala Slides from Andrej Karpathy and Fei-Fei Li
An Introduction to Support Vector Machine (SVM)
Linear Models for Classification
CS 1699: Intro to Computer Vision Bias-Variance Trade-off + Other Models and Problems Prof. Adriana Kovashka University of Pittsburgh November 3, 2015.
University of Texas at Austin Machine Learning Group Department of Computer Sciences University of Texas at Austin Support Vector Machines.
Support vector machine LING 572 Fei Xia Week 8: 2/23/2010 TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: A 1.
Dec 21, 2006For ICDM Panel on 10 Best Algorithms Support Vector Machines: A Survey Qiang Yang, for ICDM 2006 Panel Partially.
Support Vector Machines. Notation Assume a binary classification problem. –Instances are represented by vector x   n. –Training examples: x = (x 1,
CS 2750: Machine Learning The Bias-Variance Tradeoff Prof. Adriana Kovashka University of Pittsburgh January 13, 2016.
Support Vector Machines Reading: Ben-Hur and Weston, “A User’s Guide to Support Vector Machines” (linked from class web page)
CS 2750: Machine Learning Support Vector Machines Prof. Adriana Kovashka University of Pittsburgh February 17, 2016.
An Introduction of Support Vector Machine In part from of Jinwei Gu.
A Brief Introduction to Support Vector Machine (SVM) Most slides were from Prof. A. W. Moore, School of Computer Science, Carnegie Mellon University.
An Introduction of Support Vector Machine Courtesy of Jinwei Gu.
Support Vector Machine & Its Applications. Overview Intro. to Support Vector Machines (SVM) Properties of SVM Applications  Gene Expression Data Classification.
Non-separable SVM's, and non-linear classification using kernels Jakob Verbeek December 16, 2011 Course website:
Support Vector Machine Slides from Andrew Moore and Mingyue Tan.
Lecture IX: Object Recognition (2)
Neural networks and support vector machines
Object detection as supervised classification
CS 2770: Computer Vision Intro to Visual Recognition
CS 2750: Machine Learning Support Vector Machines
COSC 4335: Other Classification Techniques
Presentation transcript:

CS 1699: Intro to Computer Vision Support Vector Machines Prof. Adriana Kovashka University of Pittsburgh October 29, 2015

Homework HW3: due Nov. 3 – Note: You need to perform k-means on features from multiple frames joined together, not one k- means per frame! HW4: due Nov. 24 HW5: half-length, still 10% of grade, out Nov. 24, due Dec. 10

Plan for today Support vector machines Bias-variance tradeoff Scene recognition: Spatial pyramid matching

The machine learning framework y = f(x) Training: given a training set of labeled examples {(x 1,y 1 ), …, (x N,y N )}, estimate the prediction function f by minimizing the prediction error on the training set Testing: apply f to a never before seen test example x and output the predicted value y = f(x) outputprediction function image feature Slide credit: L. Lazebnik

Nearest neighbor classifier f(x) = label of the training example nearest to x All we need is a distance function for our inputs No training required! Test example Training examples from class 1 Training examples from class 2 Slide credit: L. Lazebnik

K-Nearest Neighbors classification k = 5 Slide credit: D. Lowe For a new point, find the k closest points from training data Labels of the k points “vote” to classify If query lands here, the 5 NN consist of 3 negatives and 2 positives, so we classify it as negative. Black = negative Red = positive

Scene Matches [Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.] Slides: James Hays

Nearest Neighbors: Pros and Cons NN pros: +Simple to implement +Decision boundaries not necessarily linear +Works for any number of classes +Nonparametric method NN cons: –Need good distance function –Slow at test time (large search problem to find neighbors) –Storage of data Adapted from L. Lazebnik

Discriminative classifiers Learn a simple function of the input features that correctly predicts the true labels on the training set Training Goals 1.Accurate classification of training data 2.Correct classifications are confident 3.Classification function is simple Slide credit: D. Hoiem

Linear classifier Find a linear function to separate the classes f(x) = sgn(w 1 x 1 + w 2 x 2 + … + w D x D ) = sgn(w  x) Svetlana Lazebnik

Lines in R 2 Let Kristen Grauman

Lines in R 2 Let Kristen Grauman

Lines in R 2 Let Kristen Grauman

Lines in R 2 Let distance from point to line Kristen Grauman

Lines in R 2 Let distance from point to line Kristen Grauman

Linear classifiers Find linear function to separate positive and negative examples Which line is best? C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998A Tutorial on Support Vector Machines for Pattern Recognition

Support vector machines Discriminative classifier based on optimal separating line (for 2d case) Maximize the margin between the positive and negative training examples C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998A Tutorial on Support Vector Machines for Pattern Recognition

Support vector machines Want line that maximizes the margin. Margin Support vectors C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998A Tutorial on Support Vector Machines for Pattern Recognition For support, vectors, wx+b=-1 wx+b=0 wx+b=1

Support vector machines Want line that maximizes the margin. Support vectors For support, vectors, wx+b=-1 wx+b=0 wx+b=1 Distance between point and line: For support vectors: C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998A Tutorial on Support Vector Machines for Pattern Recognition Margin

Support vector machines Want line that maximizes the margin. Margin Support vectors For support, vectors, wx+b=-1 wx+b=0 wx+b=1 Distance between point and line: Therefore, the margin is 2 / ||w|| C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998A Tutorial on Support Vector Machines for Pattern Recognition

Finding the maximum margin line 1.Maximize margin 2/||w|| 2.Correctly classify all training data points: Quadratic optimization problem: Minimize Subject to y i (w·x i +b) ≥ 1 One constraint for each training point. Note sign trick. C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998A Tutorial on Support Vector Machines for Pattern Recognition

Finding the maximum margin line Solution: Support vector Learned weight C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998A Tutorial on Support Vector Machines for Pattern Recognition

Finding the maximum margin line Solution: b = y i – w·x i (for any support vector) Classification function: Notice that it relies on an inner product between the test point x and the support vectors x i (Solving the optimization problem also involves computing the inner products x i · x j between all pairs of training points) If f(x) < 0, classify as negative, otherwise classify as positive. C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998A Tutorial on Support Vector Machines for Pattern Recognition

Datasets that are linearly separable work out great: But what if the dataset is just too hard? We can map it to a higher-dimensional space: 0x 0 x 0 x x2x2 Nonlinear SVMs Andrew Moore

Φ: x → φ(x) General idea: the original input space can always be mapped to some higher-dimensional feature space where the training set is separable: Andrew Moore Nonlinear SVMs

Nonlinear kernel: Example Consider the mapping x2x2 Svetlana Lazebnik

The “Kernel Trick” The linear classifier relies on dot product between vectors K(x i, x j ) = x i · x j If every data point is mapped into high-dimensional space via some transformation Φ: x i → φ(x i ), the dot product becomes: K(x i, x j ) = φ(x i ) · φ(x j ) A kernel function is similarity function that corresponds to an inner product in some expanded feature space The kernel trick: instead of explicitly computing the lifting transformation φ(x), define a kernel function K such that: K(x i, x j ) = φ(x i ) · φ(x j ) Andrew Moore

Examples of kernel functions Linear: Gaussian RBF: Histogram intersection: Andrew Moore

Allowing misclassifications Maximize marginMinimize misclassification Slack variable The w that minimizes… Misclassification cost # data samples

What about multi-class SVMs? Unfortunately, there is no “definitive” multi-class SVM formulation In practice, we have to obtain a multi-class SVM by combining multiple two-class SVMs One vs. others – Training: learn an SVM for each class vs. the others – Testing: apply each SVM to the test example, and assign it to the class of the SVM that returns the highest decision value One vs. one – Training: learn an SVM for each pair of classes – Testing: each learned SVM “votes” for a class to assign to the test example Svetlana Lazebnik

SVMs for recognition 1.Define your representation for each example. 2.Select a kernel function. 3.Compute pairwise kernel values between labeled examples 4.Use this “kernel matrix” to solve for SVM support vectors & weights. 5.To classify a new example: compute kernel values between new input and support vectors, apply weights, check sign of output. Kristen Grauman

Example: learning gender with SVMs Moghaddam and Yang, Learning Gender with Support Faces, TPAMI Moghaddam and Yang, Face & Gesture Kristen Grauman

Training examples: –1044 males –713 females Experiment with various kernels, select Gaussian RBF Learning gender with SVMs Kristen Grauman

Support Faces Moghaddam and Yang, Learning Gender with Support Faces, TPAMI 2002.

Gender perception experiment: How well can humans do? Subjects: –30 people (22 male, 8 female) –Ages mid-20’s to mid-40’s Test data: –254 face images (6 males, 4 females) –Low res and high res versions Task: –Classify as male or female, forced choice –No time limit Moghaddam and Yang, Face & Gesture 2000.

Gender perception experiment: How well can humans do? Error

Human vs. Machine SVMs performed better than any single human test subject, at either resolution Kristen Grauman

Hardest examples for humans Moghaddam and Yang, Face & Gesture 2000.

SVMs: Pros and cons Pros Many publicly available SVM packages: or use built-in Matlab version (but slower) Kernel-based framework is very powerful, flexible Often a sparse set of support vectors – compact at test time Work very well in practice, even with very small training sample sizes Cons No “direct” multi-class SVM, must combine two-class SVMs Can be tricky to select best kernel function for a problem Computation, memory –During training time, must compute matrix of kernel values for every pair of examples –Learning can take a very long time for large-scale problems Adapted from Lana Lazebnik