Concave Minimization for Support Vector Machine Classifiers


Concave Minimization for Support Vector Machine Classifiers
Unlabeled Data Classification & Data Selection
Glenn Fung and O. L. Mangasarian

Part 1: Unlabeled Data Classification
Given a large unlabeled dataset:
- Use a k-median clustering algorithm to select a small (5% to 10%) representative sample.
- The representative sample is labeled by an expert or oracle.
- The combined labeled-unlabeled dataset is classified by a semi-supervised support vector machine.
- Test set correctness is within 5.2% of a linear support vector machine trained on the entire dataset labeled by an expert.

Part 2: Data Selection for Support Vector Machine Classifiers
- Extract a minimal set of data points from a given dataset.
- The minimal set is used to generate a Minimal Support Vector Machine (MSVM) classifier.
- The MSVM classifier is as good as or better than one obtained by training on the entire dataset.
- Feature selection is incorporated into the procedure to obtain a minimal set of input features.
- Data reduction is as high as 81%, and averaged 66%, over seven public datasets.

SVM: Linear Support Vector Machine
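The slide's formula was not transcribed; a plausible reconstruction of the standard linear SVM it refers to, in the notation of the accompanying papers (A holds the training points as rows, D is the diagonal matrix of +1/-1 labels, e is a vector of ones):

\[
\min_{w,\gamma,y}\ \nu\, e^\top y + \tfrac{1}{2}\|w\|_2^2
\quad\text{s.t.}\quad D(Aw - e\gamma) + y \ge e,\quad y \ge 0,
\]

where the plane \(w^\top x = \gamma\) separates the two classes, \(y\) is the slack (error) vector, and \(\nu > 0\) trades margin width against training error.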

1-norm Linear SVM
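Again the formula itself is missing from the transcript; the 1-norm variant replaces both norms of the standard formulation, yielding a linear program (this is the well-known formulation of Bradley and Mangasarian, which these slides appear to use):

\[
\min_{w,\gamma,y}\ \nu\, e^\top y + \|w\|_1
\quad\text{s.t.}\quad D(Aw - e\gamma) + y \ge e,\quad y \ge 0.
\]

Besides being solvable by any LP solver, the \(\|w\|_1\) term tends to zero out components of \(w\), which is what Part 2 exploits for feature selection.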

Unlabeled Data Classification
Given a completely unlabeled large dataset, labeling points by an expert or an oracle is costly. Two questions arise:
- How to choose a small subset for labeling?
- How to combine labeled and unlabeled data?
Answers:
- Use k-median clustering to select "representative" points to be labeled.
- Use a semi-supervised SVM to obtain a classifier based on both the labeled and the unlabeled data.

Unlabeled Data Classification (flow)
Unlabeled Data Set -> k-Median clustering -> Chosen Data and Remaining Data
Chosen Data -> Expert -> Labeled Data
Labeled Data + Remaining Data -> Semi-supervised SVM -> Separating Plane

K-Median Clustering Algorithm
Given m data points, find k clusters such that the sum of the 1-norm distances from each point to its closest cluster center is minimized, as formalized below.
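In symbols, with data points \(x_1,\dots,x_m\) and cluster centers \(c_1,\dots,c_k\):

\[
\min_{c_1,\dots,c_k}\ \sum_{i=1}^{m}\ \min_{1\le l\le k}\ \|x_i - c_l\|_1 .
\]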

K-Median Clustering Algorithm
Starting from k initial centers, alternate two steps until the assignment no longer changes:
1. Cluster assignment: assign each point to the closest center in the 1-norm distance.
2. Center update: recompute each center as the coordinate-wise median of the points assigned to it.
Each step decreases the objective, so the algorithm terminates in a finite number of iterations.
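A minimal Python sketch of this alternating scheme (not the authors' code; the random initialization and the stopping rule are simple choices of my own):

import numpy as np

def k_median(X, k, max_iter=100, seed=0):
    """Alternating k-median clustering with 1-norm distances.

    Assignment step: each point joins its nearest center in the 1-norm.
    Update step: each center becomes the coordinate-wise median of its
    cluster, which minimizes the within-cluster sum of 1-norm distances.
    """
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(max_iter):
        # (m, k) matrix of 1-norm distances from every point to every center
        dists = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for l in range(k):
            members = X[labels == l]
            if len(members):                   # keep old center if cluster empties
                new_centers[l] = np.median(members, axis=0)
        if np.allclose(new_centers, centers):  # assignment has stabilized
            break
        centers = new_centers
    return centers, labels

# The centers (or the data points nearest to them) are the "chosen data"
# handed to the expert for labeling.
X = np.random.default_rng(1).normal(size=(200, 5))
centers, labels = k_median(X, k=10)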

Semi-supervised SVM (S3VM)
Given a dataset consisting of labeled (+1, -1) points and unlabeled points, classify the data into two classes as follows: assign each unlabeled point to a class (+1 or -1) so as to maximize the distance between the bounding planes obtained by a linear SVM applied to the entire dataset.

Formulation
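The formulation itself did not survive transcription. A plausible reconstruction, following Fung and Mangasarian's concave S3VM: each labeled point (row of \(A\), label in \(D\)) keeps the usual slack \(y\), while each unlabeled point \(x_j\) gets two slacks \(r_j\) and \(s_j\), one per candidate class, and only the smaller of the two is penalized:

\[
\min_{w,\gamma,y,r,s}\ \nu\, e^\top y + \nu \sum_j \min(r_j, s_j) + \|w\|_1
\]
subject to
\[
D(Aw - e\gamma) + y \ge e,\quad y \ge 0,
\]
\[
w^\top x_j - \gamma + r_j \ge 1,\quad r_j \ge 0,\qquad
-(w^\top x_j - \gamma) + s_j \ge 1,\quad s_j \ge 0 .
\]

Whichever of \(r_j\), \(s_j\) is smaller at the solution indicates the class assigned to the unlabeled point (the exact weighting of the unlabeled term may differ from the original slide).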

S3VM: A Concave Approach
The min(r_j, s_j) term in the objective function is concave, because it is the minimum of two linear functions. A local solution is therefore obtained by solving a succession of linear programs, typically 4 to 7 of them, as sketched below.
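A generic sketch of this successive-linearization loop, applied to a toy concave objective rather than the full S3VM linear program (scipy.optimize.linprog stands in for whatever LP solver was actually used):

import numpy as np
from scipy.optimize import linprog

def sla(c1, c2, bounds, A_ub=None, b_ub=None, max_iter=20, tol=1e-8):
    """Successive linearization for min f(x) = min(c1.x, c2.x) over a polyhedron.

    f is concave (a minimum of linear functions), so at each iterate we replace
    it by its active linear piece and solve the resulting LP; the objective
    decreases monotonically and the loop stops at a vertex that is a local
    solution.
    """
    x = linprog(c1, A_ub=A_ub, b_ub=b_ub, bounds=bounds).x  # a feasible vertex
    f = min(c1 @ x, c2 @ x)
    for _ in range(max_iter):
        g = c1 if c1 @ x <= c2 @ x else c2                  # active piece at x
        x_new = linprog(g, A_ub=A_ub, b_ub=b_ub, bounds=bounds).x
        f_new = min(c1 @ x_new, c2 @ x_new)
        if f_new >= f - tol:                                # no more progress
            break
        x, f = x_new, f_new
    return x, f

# toy instance: minimize min(x0 - x1, -x0 + 2*x1) over the unit box
x, f = sla(np.array([1.0, -1.0]), np.array([-1.0, 2.0]), bounds=[(0, 1), (0, 1)])
print(x, f)  # a vertex of the box achieving f = -1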

S3VM: Graphical Example
[Figure: separating triangles from circles; hollow shapes are labeled data, solid shapes are unlabeled data; one panel shows the plane found by the SVM, the other the plane found by the S3VM.]

Numerical Tests

Part 2: Data Selection for Support Vector Machine Classifiers (flow)
Labeled dataset -> 1-norm SVM feature selection -> Smaller dimension dataset -> Support vector suppression (MSVM) -> Separating surface

Support Vectors
The support vectors are the training points whose multipliers u_i are positive in the SVM solution; they alone determine the separating plane, so the dataset can be reduced to them without changing the classifier.

Feature Selection using 1-norm Linear SVM
Because the objective contains the term ||w||_1, minimizing it with even a small weight on that term drives many components of w exactly to zero; the corresponding input features can then be discarded.
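A rough illustration of this effect using scikit-learn's L1-penalized linear SVM; note that it minimizes a squared hinge loss with an L1 penalty, an analogue of, not identical to, the linear program above:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# 100 features, only 5 informative; the L1 penalty should zero out most weights
X, y = make_classification(n_samples=200, n_features=100, n_informative=5,
                           random_state=0)
clf = LinearSVC(penalty="l1", loss="squared_hinge", dual=False, C=0.1)
clf.fit(X, y)
kept = np.flatnonzero(np.abs(clf.coef_.ravel()) > 1e-8)
print(f"{len(kept)} of {X.shape[1]} features kept:", kept)

With a strong penalty (small C), most of the 100 weights typically come out exactly zero, leaving a handful of selected features.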

Motivation for the Minimal Support Vector Machine (MSVM)
Suppression of the error term y:
- Minimizes the number of misclassified points.
- Works remarkably well computationally.
- Reduces the positive components of the multiplier u, and hence the number of support vectors.

MSVM Formulation
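The formulation on this slide was not transcribed. A schematic reconstruction consistent with the motivation above: replace the error term \(e^\top y\) of the 1-norm SVM by the step term \(e^\top (y)_*\), where \(((y)_*)_i = 1\) if \(y_i > 0\) and \(0\) otherwise, so that the number of misclassified points, rather than their total error, is minimized:

\[
\min_{w,\gamma,y}\ \nu\, e^\top (y)_* + \|w\|_1
\quad\text{s.t.}\quad D(Aw - e\gamma) + y \ge e,\quad y \ge 0.
\]

The discontinuous step function is approximated by a smooth concave exponential and minimized by the same successive-linearization scheme as in Part 1; as the previous slide notes, this suppression also reduces the positive components of the multiplier u, and hence the number of support vectors retained.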

Numerical Tests

Conclusions: Unlabeled Data Classification
- A fast, finite, linear-programming-based approach to semi-supervised support vector machines was proposed for classifying large datasets that are mostly unlabeled.
- Totally unlabeled datasets were classified by labeling a small percentage of cluster representatives by an expert, then classifying with a semi-supervised SVM.
- Test set correctness was within 5.2% of a linear SVM trained on the entire dataset labeled by an expert.

Conclusions: Data Selection for SVM Classifiers
- The Minimal SVM (MSVM) extracts a minimal subset of the data that suffices to classify the entire dataset.
- MSVM maintains or improves generalization relative to classifiers trained on the entire dataset.
- Data reduction was as high as 81%, and averaged 66%, over seven public datasets.
Future work:
- MSVM is a promising tool for incremental algorithms.
- Improve chunking algorithms with MSVM.
- Nonlinear MSVM has strong potential for time and storage reduction.