Integration II: Prediction. Kernel-based data integration, SVMs and the kernel “trick”, multiple-kernel learning, applications: protein function prediction.

Presentation transcript:

Integration II: Prediction

Kernel-based data integration
SVMs and the kernel “trick”
Multiple-kernel learning
Applications
– Protein function prediction
– Clinical prognosis

SVMs These are expression measurements from two genes for two populations (cancer types). The goal is to define a cancer-type classifier. One type of classifier is a “hyperplane” that separates measurements from the two cancer types: e.g. a one-dimensional hyperplane (a line) for two-dimensional data, or a two-dimensional hyperplane (a plane) for three-dimensional data. [Noble, Nat. Biotechnology, 2006]
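As an illustration (not from the slides), a minimal scikit-learn sketch of such a hyperplane classifier on made-up two-gene expression values:

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic two-gene expression values: rows are samples, columns are genes.
X = np.array([[0.2, 0.3], [0.4, 0.1], [0.3, 0.5],   # cancer type A
              [1.1, 1.3], [1.4, 1.0], [1.2, 1.5]])  # cancer type B
y = np.array([0, 0, 0, 1, 1, 1])

# A linear SVM fits a separating hyperplane (here, a line in 2-D).
clf = SVC(kernel="linear").fit(X, y)
print(clf.predict([[0.25, 0.4], [1.3, 1.2]]))  # expected: [0 1]
```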

SVMs Suppose the measurements are separable: there exists a hyperplane that separates the two types. Then there are infinitely many separating hyperplanes. Which one should we use? The maximum-margin hyperplane; equivalently, the minimizer of \|w\|^2 subject to y_i(w \cdot x_i + b) \ge 1 for all samples i. [Noble, Nat. Biotechnology, 2006]

SVMs Which hyperplane to use? In reality: the minimizer of a trade-off between (1) classification error and (2) margin size, \min_{w,b} \sum_i \mathrm{loss}(y_i, w \cdot x_i + b) + \lambda \|w\|^2, where the first term is the loss and the second the margin penalty.
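A hedged illustration of this trade-off: in scikit-learn's parametrization the constant C scales the loss term, so small C favors a wide margin and large C favors low training error (the data below are invented):

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.5],   # class 0
              [2.0, 2.2], [3.0, 3.0], [3.0, 1.5]])  # class 1
y = np.array([0, 0, 0, 1, 1, 1])

# C controls the error/margin trade-off; the geometric margin is 2/||w||.
for C in (0.1, 100.0):
    w = SVC(kernel="linear", C=C).fit(X, y).coef_[0]
    print(f"C={C}: margin width = {2 / np.linalg.norm(w):.3f}")
```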

SVMs The primal problem: \min_{w,b,\xi} \tfrac{1}{2}\|w\|^2 + C \sum_i \xi_i subject to y_i(w \cdot x_i + b) \ge 1 - \xi_i, \xi_i \ge 0. The dual problem: \max_\alpha \sum_i \alpha_i - \tfrac{1}{2}\sum_{i,j} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) subject to 0 \le \alpha_i \le C, \sum_i \alpha_i y_i = 0. Note that the dual depends on the data only through the inner products x_i \cdot x_j.

SVMs What is K? The kernel matrix: each entry K_{ij} = x_i \cdot x_j is the inner product of a pair of samples. One interpretation: sample similarity. The measurements are completely described by K.

SVMs Implication: non-linearity is obtained by appropriately defining the kernel matrix K. E.g. the quadratic kernel k(x, z) = (x \cdot z)^2, whose implicit feature space contains all degree-2 monomials.
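A small check of the kernel trick (my own sketch): for the homogeneous quadratic kernel in two dimensions, the kernel value equals the inner product of an explicit degree-2 feature map, so the mapping never has to be computed:

```python
import numpy as np

def quad_kernel(x, z):
    # k(x, z) = (x . z)^2
    return np.dot(x, z) ** 2

def phi(x):
    # Explicit feature map for 2-D inputs: (x1^2, x2^2, sqrt(2)*x1*x2)
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2) * x[0] * x[1]])

x, z = np.array([1.0, 2.0]), np.array([3.0, 0.5])
print(quad_kernel(x, z))   # 16.0
print(phi(x) @ phi(z))     # 16.0 -- identical, without mapping explicitly
```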

SVMs Another implication: no need for measurement vectors at all; all that is required is a similarity between samples. E.g. string kernels, which measure similarity between sequences directly.
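For instance (a toy sketch, not a kernel used in the slides): a 2-mer “spectrum” kernel compares sequences by their shared substrings, and scikit-learn's SVC accepts the resulting Gram matrix directly via kernel="precomputed":

```python
import numpy as np
from collections import Counter
from sklearn.svm import SVC

def spectrum_kernel(s, t, k=2):
    # Inner product of k-mer count vectors: a simple string kernel.
    cs = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    ct = Counter(t[i:i + k] for i in range(len(t) - k + 1))
    return sum(cs[m] * ct[m] for m in cs)

seqs = ["AAAG", "AAGA", "CCTC", "CTCC"]  # invented sequences, two classes
y = [0, 0, 1, 1]
K = np.array([[spectrum_kernel(s, t) for t in seqs] for s in seqs], float)

clf = SVC(kernel="precomputed").fit(K, y)
print(clf.predict(K))  # classification from similarities alone
```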

Protein Structure Prediction [Figure: protein sequence → sequence similarity → protein structure]

Kernel-based data fusion Core idea: use different kernels for different genomic data sources. A linear combination of kernel matrices is itself a kernel (under certain conditions: non-negative weights suffice).

Kernel-based data fusion Kernel to use in prediction: K = \sum_i \mu_i K_i, a weighted combination of the per-source kernel matrices K_i with weights \mu_i \ge 0.
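A quick numerical check (my own sketch): any combination of Gram matrices with non-negative weights has non-negative eigenvalues, i.e. it is again a valid kernel:

```python
import numpy as np

rng = np.random.default_rng(0)
A, B = rng.normal(size=(5, 3)), rng.normal(size=(5, 4))
K1, K2 = A @ A.T, B @ B.T   # two Gram matrices (PSD by construction),
                            # standing in for two genomic data sources
mu = (0.7, 0.3)             # non-negative weights (fixed here, not learned)
K = mu[0] * K1 + mu[1] * K2

# Smallest eigenvalue is >= 0 up to round-off, so K is a valid kernel.
print(np.linalg.eigvalsh(K).min())
```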

Kernel-based data fusion In general, the task is to estimate the SVM decision function along with the coefficients \mu_i of the kernel combination. This is a well-studied type of optimization problem (a semidefinite program).
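Solving the full semidefinite program is beyond a short example; as a crude, hedged stand-in, the weight of a two-kernel combination can be chosen by cross-validated accuracy (pick_weight is a hypothetical helper, not the method of the slides):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def pick_weight(K1, K2, y, grid=np.linspace(0.0, 1.0, 11)):
    # Scan mu in [0, 1]; SVC with a precomputed kernel is sliced correctly
    # by scikit-learn's cross-validation for pairwise estimators.
    scores = [cross_val_score(SVC(kernel="precomputed"),
                              mu * K1 + (1 - mu) * K2, y, cv=3).mean()
              for mu in grid]
    return grid[int(np.argmax(scores))]
```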

Kernel-based data fusion [figure]

The same idea applied to cancer classification from expression and proteomic data

Kernel-based data fusion
Prostate cancer dataset:
– 55 samples
– Expression from microarray
– Copy number variants
Outcomes predicted:
– Grade, stage, metastasis, recurrence

Kernel-based data fusion [figure]