Trading Convexity for Scalability Marco A. Alvarez CS7680 Department of Computer Science Utah State University.



Paper
Collobert, R., Sinz, F., Weston, J., and Bottou, L. Trading convexity for scalability. In Proceedings of the 23rd International Conference on Machine Learning (Pittsburgh, Pennsylvania, June 2006). ICML '06. ACM Press, New York, NY.

Introduction
Previously in machine learning:
- MLPs have a non-convex cost function
  - Difficult to optimize
  - Yet they work efficiently in practice
- SVMs are defined by a convex function
  - Easier optimization (algorithms)
  - Unique solution (we can write theorems)
Goal of the paper:
- Sometimes non-convexity has benefits
  - Faster training and testing (fewer support vectors)
- Non-convex SVMs (faster and sparser)
- Fast transductive SVMs

From SVM
Decision function
Primal formulation
- Minimize ||w|| so that the margin is maximized
- w is a combination of a small number of data points (sparsity)
- The decision boundary is determined by the support vectors
Dual formulation (with constraints)
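The formulas for this slide are not preserved in the transcript. As a reference point, the standard soft-margin SVM can be written as below. The symbols (\Phi, b, \xi_i, C, \alpha_i, K) follow common textbook notation and are assumptions here, not necessarily the notation used on the slide.

```latex
% Decision function
f(x) = w \cdot \Phi(x) + b

% Primal formulation
\min_{w, b, \xi} \ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{L} \xi_i
\quad \text{s.t.} \quad y_i f(x_i) \ge 1 - \xi_i, \quad \xi_i \ge 0

% Dual formulation
\max_{\alpha} \ \sum_{i=1}^{L} \alpha_i
  - \frac{1}{2} \sum_{i,j=1}^{L} \alpha_i \alpha_j y_i y_j K(x_i, x_j)
\quad \text{s.t.} \quad \sum_{i=1}^{L} \alpha_i y_i = 0, \quad 0 \le \alpha_i \le C

% Sparsity: only support vectors have \alpha_i > 0
w = \sum_{i=1}^{L} \alpha_i y_i \Phi(x_i)
```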

SVM problem
- The number of support vectors increases linearly with L, the number of training examples
- Cost attributed to one example (x, y): the hinge loss H_1(y f(x)) = max(0, 1 - y f(x))

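Ramp Loss Function
Given: R_s(z) = H_1(z) - H_s(z), where H_s(z) = max(0, s - z) and z = y f(x)
Figure regions: Outliers (z < s), Non SV (z > 1)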
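To make the difference from the hinge loss concrete, here is a minimal sketch that evaluates both losses on a few margin values z = y f(x). The value s = -1 is chosen only for illustration; it is an assumption, not a setting taken from the slides.

```python
def hinge(z, s=1.0):
    """Shifted hinge loss H_s(z) = max(0, s - z), with z = y * f(x)."""
    return max(0.0, s - z)


def ramp(z, s=-1.0):
    """Ramp loss R_s(z) = H_1(z) - H_s(z).

    The loss is flat (equal to 1 - s) for z < s, so badly misclassified
    points (outliers) stop contributing a gradient.
    """
    return hinge(z, 1.0) - hinge(z, s)


# Compare the two losses across the outlier, margin, and non-SV regions.
for z in (-3.0, -1.0, 0.0, 1.0, 2.0):
    print(f"z={z:+.1f}  hinge={hinge(z):.1f}  ramp={ramp(z):.1f}")
```

The hinge loss keeps growing as z becomes more negative, while the ramp loss is capped at 1 - s. Points in the flat regions (z < s or z > 1) have zero slope, which is why they do not need to be kept as support vectors.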

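Concave-Convex Procedure (CCCP)
- Given a cost function J(θ), decompose it into a convex part and a concave part: J(θ) = J_vex(θ) + J_cav(θ)
- At each iteration, replace the concave part by its tangent at the current point and minimize the resulting convex function
- The cost is guaranteed to decrease at each iteration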
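The procedure can be illustrated on a toy one-dimensional problem. The cost J(x) = x^4 - 2x^2 below is a hypothetical example chosen because each convex subproblem has a closed-form solution; it is not the SVM objective from the paper. Its convex part is x^4 and its concave part is -2x^2.

```python
import math


def cost(x):
    """Toy non-convex cost: convex part x**4 plus concave part -2*x**2."""
    return x**4 - 2.0 * x**2


def cccp(x0, iterations=30):
    """Run CCCP on the toy cost and return the sequence of iterates."""
    xs = [x0]
    for _ in range(iterations):
        # Slope of the concave part -2*x**2 at the current iterate.
        slope = -4.0 * xs[-1]
        # Minimize the convex surrogate x**4 + slope * x.
        # Setting its derivative to zero gives 4*x**3 = -slope.
        target = -slope / 4.0
        xs.append(math.copysign(abs(target) ** (1.0 / 3.0), target))
    return xs


trajectory = cccp(0.5)
costs = [cost(x) for x in trajectory]
print("final x:", trajectory[-1])
print("final cost:", costs[-1])
```

Starting from x = 0.5, the iterates move toward the local minimum at x = 1, and the cost never increases from one iteration to the next. A negative starting point would converge to the other minimum at x = -1, which shows that CCCP finds a local, not a global, solution.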

Using the Ramp Loss

CCCP for Ramp Loss

Results

Speedup

Time and Number of SVs

Transductive SVMs

Loss Function Cost to be minimized:

Balancing Constraint Necessary for TSVMs
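The formulas for the transductive cost and the balancing constraint are not preserved in the transcript. The following is a reconstruction based on the formulation in the cited paper, and should be read as an assumption about what the slides showed. Here L is the number of labeled examples, U the number of unlabeled examples, C and C^* are regularization constants, H_1 is the hinge loss, and R_s is the ramp loss. Each unlabeled example appears twice, once with each label.

```latex
% Cost to be minimized (unlabeled points duplicated with y = +1 and y = -1)
J(\theta) = \frac{1}{2}\|w\|^2
  + C \sum_{i=1}^{L} H_1\big(y_i f_\theta(x_i)\big)
  + C^{*} \sum_{i=L+1}^{L+2U} R_s\big(y_i f_\theta(x_i)\big)

% with y_i = +1 for L+1 \le i \le L+U, \quad y_i = -1 for L+U+1 \le i \le L+2U,
% and x_i = x_{i-U} for L+U+1 \le i \le L+2U

% Balancing constraint: the mean output on unlabeled data
% matches the mean label on labeled data
\frac{1}{U} \sum_{i=L+1}^{L+U} f_\theta(x_i) = \frac{1}{L} \sum_{i=1}^{L} y_i
```

Without the balancing constraint, a transductive SVM can assign every unlabeled point to a single class and still achieve a large margin, which is why the slide calls it necessary.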

Results

Training Time

Quadratic Fit