1 Kernel-based data fusion: discussion of a paper by G. Lanckriet

2 Paper

3 Overview
- Problem: aggregation of heterogeneous data
- Idea: different types of data are represented by different kernels
- Question: how can the different kernels be combined in an elegant and efficient way?
- Solution: a linear combination of kernels, with weights found by semidefinite programming (SDP)
- Application: recognition of ribosomal and membrane proteins

4 Linear combination of kernels

K = Σ_i μ_i K_i   (μ_i: weight, K_i: kernel)

- The resulting kernel K is positive definite, since x^T K x = Σ_i μ_i x^T K_i x > 0 for all x ≠ 0, provided μ_i > 0 and x^T K_i x > 0
- Elegant aggregation of heterogeneous data
- More efficient than training individual SVMs
- KCCA uses an unweighted sum over the individual kernels
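To make the combination concrete, here is a minimal NumPy sketch (not from the paper; the Gram matrices and weights are invented placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100  # number of proteins (toy size)

def random_psd(n):
    """Random symmetric PSD matrix, a stand-in for a real Gram matrix."""
    A = rng.normal(size=(n, n))
    return A @ A.T

# Three hypothetical precomputed kernels (e.g. sequence, PPI, expression).
K_seq, K_ppi, K_exp = random_psd(n), random_psd(n), random_psd(n)

# Non-negative weights mu_i; the weighted sum is again PSD because
# x^T (sum_i mu_i K_i) x = sum_i mu_i x^T K_i x >= 0.
mu = np.array([0.5, 0.3, 0.2])
K = mu[0] * K_seq + mu[1] * K_ppi + mu[2] * K_exp

# Sanity check: the smallest eigenvalue should be (numerically) >= 0.
print(np.linalg.eigvalsh(K).min())
```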

5 Support Vector Machine

min over w, b, ξ:  (1/2) ||w||^2 + C Σ_i ξ_i
subject to:  y_i (w^T x_i + b) ≥ 1 - ξ_i  and  ξ_i ≥ 0

- ||w||^2: squared norm of the weight vector (minimizing it maximizes the margin)
- C Σ_i ξ_i: penalty term on the slack variables ξ_i
- w^T x + b = 0: the separating hyperplane
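For reference, a soft-margin SVM with a precomputed Gram matrix can be trained directly in scikit-learn; a small sketch on toy data:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 100

# Toy Gram matrix (stand-in for a combined kernel K) and random labels.
A = rng.normal(size=(n, n))
K = A @ A.T
y = rng.choice([-1, 1], size=n)

# kernel='precomputed' accepts the n x n Gram matrix directly, so any
# kernel combination can be plugged into a standard SVM.
clf = SVC(C=1.0, kernel="precomputed")
clf.fit(K, y)

# For prediction, pass the kernel between test and training points:
# an m x n matrix with entries K(x_test_i, x_train_j).
print(clf.score(K, y))  # training accuracy on the toy data
```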

6 Dual form

max over α:  Σ_i α_i - (1/2) Σ_{i,j} α_i α_j y_i y_j K(x_i, x_j)
subject to:  Σ_i α_i y_i = 0  and  0 ≤ α_i ≤ C

- Maximization instead of minimization
- Equality constraints
- Lagrange multipliers α instead of w, b, ξ
- A quadratic program (QP): the objective is quadratic and convex because the kernel matrix is positive definite, and each α_i is a scalar ≥ 0
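The dual fits cvxopt's standard QP form; a hedged sketch on toy data (the linear kernel and labels are made up for the demo):

```python
import numpy as np
from cvxopt import matrix, solvers

rng = np.random.default_rng(0)
n = 50
X = rng.normal(size=(n, 2))
y = np.sign(X[:, 0] + X[:, 1])           # toy labels
K = X @ X.T + 1e-8 * np.eye(n)           # linear kernel, small ridge for stability
C = 1.0

# SVM dual as a QP in cvxopt's standard form:
#   min (1/2) a^T P a + q^T a   s.t.   G a <= h,   A a = b
P = matrix(np.outer(y, y) * K)
q = matrix(-np.ones(n))
G = matrix(np.vstack([-np.eye(n), np.eye(n)]))    # -a <= 0 and a <= C
h = matrix(np.hstack([np.zeros(n), C * np.ones(n)]))
A = matrix(y.reshape(1, -1))
b = matrix(0.0)

solvers.options["show_progress"] = False
alpha = np.array(solvers.qp(P, q, G, h, A, b)["x"]).ravel()
print("support vectors:", np.sum(alpha > 1e-6))
```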

7 Inserting the linear combination

Substituting K = Σ_i μ_i K_i into the dual makes the weights μ_i optimization variables alongside α.

- The combined kernel must lie within the cone of positive semidefinite matrices
- A fixed trace, trace(K) = c, avoids the trivial solution of scaling all weights up
- The resulting optimization problem looks ugly, but it can be rewritten as a semidefinite program

8 Cone and other stuff

- Positive semidefinite: a symmetric matrix A is positive semidefinite if x^T A x ≥ 0 for all x
- Positive semidefinite cone: the set of all symmetric positive semidefinite matrices of a particular dimension (a cone, because any non-negative scaling of a PSD matrix is again PSD)
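Membership in the PSD cone can be checked numerically via eigenvalues; a small sketch:

```python
import numpy as np

def is_psd(A, tol=1e-10):
    """Check x^T A x >= 0 for all x by testing that the symmetric
    matrix A has no negative eigenvalues (up to a tolerance)."""
    A = np.asarray(A)
    if not np.allclose(A, A.T):
        return False
    return np.linalg.eigvalsh(A).min() >= -tol

print(is_psd(np.array([[2.0, 1.0], [1.0, 2.0]])))   # True
print(is_psd(np.array([[1.0, 2.0], [2.0, 1.0]])))   # False (eigenvalue -1)
```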

9 Semidefinite program (SDP)

The kernel learning problem becomes an SDP: an optimization problem with positive semidefinite constraints on matrix variables (K in the PSD cone), plus the fixed-trace constraint trace(K) = c that avoids the trivial solution.
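To show the flavor of such an SDP, here is a sketch in cvxpy that learns kernel weights under the PSD-cone and fixed-trace constraints. Note the objective is a simple kernel-target alignment surrogate, not the paper's margin-based objective (which adds a Schur-complement constraint), and all data is invented:

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, m = 30, 3

def random_psd(n):
    """Random symmetric PSD matrix as a stand-in for a real Gram matrix."""
    A = rng.normal(size=(n, n))
    return A @ A.T

K_list = [random_psd(n) for _ in range(m)]
y = rng.choice([-1.0, 1.0], size=n)

mu = cp.Variable(m)                                # kernel weights (may be negative)
K = sum(mu[i] * K_list[i] for i in range(m))

constraints = [
    K >> 0,              # K must stay inside the PSD cone
    cp.trace(K) == n,    # fixed trace avoids the trivial scaling solution
]
# Linear surrogate objective: alignment of K with the label matrix y y^T.
objective = cp.Maximize(cp.trace(K @ np.outer(y, y)))

cp.Problem(objective, constraints).solve()
print("learned kernel weights:", np.round(mu.value, 3))
```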

10 Dual form

- Restricting the weights to μ_i ≥ 0 turns the SDP into a quadratically constrained quadratic program (QCQP): a QP with an additional quadratic constraint
- QCQPs can be solved more efficiently than SDPs (O(n^3) versus O(n^4.5))
- Both are solved with interior point methods

11 Interior point algorithm

Linear program:  maximize c^T x  subject to  Ax ≤ b,  x ≥ 0

- The classical simplex method follows the edges of the feasible polyhedron
- Interior point methods walk through the interior of the feasible region
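As a concrete toy LP, SciPy's linprog can be asked to use an interior point solver ("highs-ipm"); since linprog minimizes, the objective is negated to maximize (problem data invented):

```python
import numpy as np
from scipy.optimize import linprog

# Toy LP: maximize c^T x subject to A x <= b, x >= 0.
c = np.array([3.0, 2.0])
A = np.array([[1.0, 1.0],
              [2.0, 1.0]])
b = np.array([4.0, 5.0])

# "highs-ipm" selects the HiGHS interior point method rather than
# the simplex variant ("highs-ds").
res = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None), (0, None)],
              method="highs-ipm")
print("optimum x:", res.x, "objective:", -res.fun)  # expect x = (1, 3), value 9
```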

12 Application

Recognition of ribosomal and membrane proteins in yeast.

3 types of data:
- Amino acid sequences
- Protein-protein interactions (PPI)
- mRNA expression profiles

7 kernels:
- Empirical kernel map -> sequence homology: BLAST (B), Smith-Waterman (SW), Pfam
- FFT -> sequence hydropathy: KD hydropathy profiles, padding, low-pass filter, FFT, RBF
- Interaction kernel (LI) -> PPI
- Diffusion (D) -> PPI (see the sketch below)
- RBF (E) -> gene expression
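To illustrate one of these kernels: a common form of the diffusion kernel on a PPI graph is K = exp(-βL), with L the graph Laplacian (the Kondor-Lafferty construction; the paper's exact parameters may differ). A minimal sketch on a made-up adjacency matrix:

```python
import numpy as np
from scipy.linalg import expm

# Toy undirected PPI graph on 5 proteins (adjacency matrix).
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)

L = np.diag(A.sum(axis=1)) - A   # graph Laplacian
beta = 0.5                       # diffusion parameter (assumed value)

# Diffusion kernel: matrix exponential of -beta * L. Entries measure
# similarity in terms of random-walk diffusion through the graph.
K = expm(-beta * L)
print(np.round(K, 3))
```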

13 Results

- The combination of kernels performs better than any individual kernel
- Gene expression (E) is most important for ribosomal protein recognition
- PPI (D) is most important for membrane protein recognition

14 Results

- Only a small improvement compared to uniform weights (all weights = 1)
- SDP is robust in the presence of noise
- Open question: how does SDP perform versus kernel weights derived from the accuracy of the individual SVMs?
- Membrane protein recognition: the comparison methods use sequence information only; TMHMM was designed for topology prediction and was not trained on yeast only

15 Why is this cool? Everything you ever dreamed of:

- Optimization of C is included (in the 2-norm soft margin SVM the slack penalty appears as a ridge 1/C on the kernel diagonal, so it can be learned like a kernel weight)
- Hyperkernels (optimize the kernel itself)
- Transduction (learn from labeled and unlabeled samples in polynomial time)
- SDP has many applications (graph theory, combinatorial optimization, …)

16 Literature

- Learning the kernel matrix with semidefinite programming. G. R. G. Lanckriet et al., 2004
- Kernel-based data fusion and its application to protein function prediction in yeast. G. R. G. Lanckriet et al., 2004
- Machine learning using Hyperkernels. C. S. Ong, A. J. Smola, 2003
- Semidefinite optimization. M. J. Todd, 2001

17 Software

- SeDuMi (SDP)
- Mosek (QCQP; Java, C++; commercial)
- YALMIP (Matlab)
- …