STOR 892 Object Oriented Data Analysis Radial Distance Weighted Discrimination Jie Xiong Advised by Prof. J.S. Marron Department of Statistics and Operations.


STOR 892 Object Oriented Data Analysis Radial Distance Weighted Discrimination Jie Xiong Advised by Prof. J.S. Marron Department of Statistics and Operations Research UNC-Chapel Hill

Outline
Preliminaries
– Support Vector Machine (SVM) and Distance Weighted Discrimination (DWD)
Data Object, which motivates the development of Radial DWD.
– An important application: ‘Virus Hunting’
– High Dimension Low Sample Size (HDLSS) characteristics
Radial DWD optimization
Real data and simulation study
Outline - Preliminaries - Data Object - Radial DWD - Simulation and real data

Preliminaries
Binary classification
– Use “training data” from Class +1 and Class -1
– Develop a “rule” for assigning new data to a class
– Canonical examples include disease diagnosis based on measurements
– Think of splitting the data space between the 2 classes with a classification boundary
Simplest case: a linear hyperplane

Optimization Viewpoint
Formulate an optimization problem, based on:
– Data (feature) vectors
– Class labels
– Normal vector
– Location (determines intercept)
– Residuals (right side)
– Residuals (wrong side)
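The ingredients listed above can be made concrete with a small sketch. The symbols on the slide did not survive, so the names here (`X`, `y`, `w`, `b`, `residuals`) and the toy numbers are assumptions chosen for illustration. The residual of a point is taken to be its signed distance to the hyperplane, positive on the right side and negative on the wrong side.

```python
import numpy as np

def residuals(X, y, w, b):
    """Signed distance y_i * (w . x_i + b) / ||w|| of each point to the hyperplane."""
    w = np.asarray(w, dtype=float)
    return y * (X @ w + b) / np.linalg.norm(w)

# Toy data in 2 dimensions (made up for illustration).
X = np.array([[2.0, 0.0], [3.0, 1.0], [-2.0, 0.0], [1.0, 0.0]])
y = np.array([1, 1, -1, -1])          # class labels

w = np.array([1.0, 0.0])              # normal vector (hypothetical choice)
b = 0.0                               # location / intercept (hypothetical choice)

r = residuals(X, y, w, b)
right_side = r[r > 0]                 # points on the correct side of the boundary
wrong_side = r[r <= 0]                # points on the wrong side of the boundary
```

Here the last point has label -1 but lies on the +1 side, so it is the only one with a negative residual.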

Preliminaries
SVM and DWD
– Both are binary classifiers that separate the 2 classes using a hyperplane
– DWD is designed for High Dimension Low Sample Size (HDLSS) data: it avoids data piling and generalizes better

Optimal Direction

Support Vector Machine Direction

Distance Weighted Discrimination
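For reference, the two optimization problems are commonly written as follows. The slides' own formulas were lost with the images, so this is the usual textbook form rather than a transcription, and the symbols ($w$ normal vector, $\beta$ intercept, $\xi_i$ wrong-side penalties, $r_i$ residuals, $C$ penalty constant) are assumed notation.

```latex
% Soft-margin SVM: maximize the margin, penalize violations
\min_{w,\beta,\xi}\ \tfrac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\xi_i
\quad\text{s.t.}\quad y_i\,(w^\top x_i + \beta) \ge 1 - \xi_i,\quad \xi_i \ge 0 .

% DWD: every data point contributes through its inverse residual
\min_{w,\beta,\xi}\ \sum_{i=1}^{n}\frac{1}{r_i} + C\sum_{i=1}^{n}\xi_i
\quad\text{s.t.}\quad r_i = y_i\,(w^\top x_i + \beta) + \xi_i \ge 0,\quad
\|w\| \le 1,\quad \xi_i \ge 0 .
```

In the SVM problem only the points on or inside the margin affect the solution, while in DWD all points enter the objective, with closer points weighted more heavily.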

Data Objects
Introduce ‘Virus Hunting’ using DNA sequence-alignment data.
– DNA sequence and alignment
– Data vector and the normalization used
– HDLSS data geometry: data on simplex


Figure labels: virus sequence, reference (HSV-1), short reads


Data Objects
Data on simplex
– Let d be the dimension of the data space
– A vector (x1,…,xd) with non-negative entries adding up to 1 is on the unit simplex of dimension (d-1)
– (1/d,…,1/d) is the center of the unit simplex
– (0,…,1,…,0) is one of the vertices
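A short sketch of these facts, assuming that the normalization divides a non-negative vector by its total (the slides do not state the normalization explicitly). The function name `to_simplex` and the example counts are hypothetical.

```python
import numpy as np

def to_simplex(counts):
    """Scale a non-negative vector so that its entries add up to 1."""
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    if np.any(counts < 0) or total <= 0:
        raise ValueError("need non-negative entries with a positive total")
    return counts / total

d = 4
x = to_simplex([3, 0, 1, 4])          # hypothetical counts, one per coordinate
center = np.full(d, 1.0 / d)          # (1/d, ..., 1/d)
vertex = np.eye(d)[0]                 # (1, 0, ..., 0)

dist_to_center = np.linalg.norm(x - center)
vertex_to_center = np.linalg.norm(vertex - center)   # equals sqrt(1 - 1/d)
```

Every vertex is at distance sqrt(1 - 1/d) from the center, which tends to 1 as d grows, and no point on the simplex is farther from the center than a vertex.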

Data Objects
Introduce ‘Virus Hunting’ using DNA sequence-alignment data.
– DNA sequence and alignment
– Data vector and the normalization used
– HDLSS data geometry
  Data points on the unit simplex
  Position and distances.

Data Objects
What can we say about linear classifiers?
– When the dimension is low, the training data may not be linearly separable
– Under HDLSS, the training data are very often linearly separable (see Ahn and Marron 2010); however, classification of new samples can be very poor.
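The HDLSS point can be illustrated with a small simulation. This is only a sketch on pure-noise Gaussian data, not the slides' data. When the dimension d exceeds the sample size n, a direction `w` solving `X w = y` exists, so even random labels are separated perfectly in training, while accuracy on new samples stays near chance. All names and sizes below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 200                          # low sample size, high dimension

X_train = rng.standard_normal((n, d))
y_train = rng.choice([-1, 1], size=n)   # labels carry no signal at all

# Minimum-norm solution of X w = y; it exists because X has full row rank.
w, *_ = np.linalg.lstsq(X_train, y_train.astype(float), rcond=None)

train_acc = np.mean(np.sign(X_train @ w) == y_train)

X_test = rng.standard_normal((1000, d))
y_test = rng.choice([-1, 1], size=1000)
test_acc = np.mean(np.sign(X_test @ w) == y_test)
```

Each training point projects onto `w` at its own label value (+1 or -1), so all points of a class pile up at a single value, which is the data piling that DWD is designed to avoid.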

What can we say about this testing sample?

1. Visualize using the distance to the center of the sphere, in high dimension cases.
2. Inside or outside the sphere?

Data points: x1, …, xi, …, xn
Class labels: y1, …, yi, …, yn, each +1 or -1
O: center of the sphere
R: radius of the sphere
The distance of a data point xi to the center is ||xi - O||
The distance of a data point xi to the sphere is | ||xi - O|| - R |
The objective is to minimize the sum of the inverse distances to the sphere
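The quantities on this slide can be computed directly. The sketch below reads the objective as the sum of reciprocal distances to the sphere, as in linear DWD. It assumes that Class +1 lies inside the sphere and Class -1 outside, which the transcript does not state, and it leaves out any penalty for points on the wrong side. The function names and toy data are hypothetical.

```python
import numpy as np

def radial_distances(X, y, center, radius):
    """Distance of each point to the center, and its signed distance to the sphere."""
    dist_to_center = np.linalg.norm(X - center, axis=1)
    # Assumed convention: +1 inside, -1 outside, so correctly placed
    # points get a positive signed distance.
    signed_to_sphere = y * (radius - dist_to_center)
    return dist_to_center, signed_to_sphere

def inverse_distance_objective(signed_to_sphere):
    """Sum of reciprocal distances; defined only when every point is on its correct side."""
    if np.any(signed_to_sphere <= 0):
        raise ValueError("a point lies on the sphere or on the wrong side")
    return np.sum(1.0 / signed_to_sphere)

# Toy data in 2 dimensions (made up for illustration).
X = np.array([[1.0, 0.0], [0.0, 0.5], [3.0, 0.0], [0.0, 4.0]])
y = np.array([1, 1, -1, -1])
O = np.zeros(2)                        # center of the sphere
R = 2.0                                # radius of the sphere

dist, signed = radial_distances(X, y, O, R)
objective = inverse_distance_objective(signed)
```

A fitting procedure would search over `O` and `R` to make this objective small. The sketch only evaluates it for one fixed sphere.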

Radial DWD
Radial DWD Optimization

Simulation and real data
Real ‘Virus Hunting’ data.
– HSV1
Simulated data using Dirichlet distribution.
– Compare Radial DWD with some alternatives: MD, SVM, DWD, LASSO, kernel SVM

Simulation and real data
Real ‘Virus Hunting’ data.
– HSV1 positives in training
– HSV1 negatives in training
– HSV1 related samples in testing (human and other species)
– Unrelated samples in testing
Simulated data using Dirichlet distribution.
– Compare Radial DWD with some alternatives: MD, SVM, DWD, LASSO, kernel SVM

Real ‘Virus Hunting’ data.
– HSV1 positives in training
– HSV1 negatives in training
– HSV1 related samples in testing (human and other species)
– Unrelated samples in testing


Simulation and real data
Real ‘Virus Hunting’ data.
– HSV1
Simulated data using Dirichlet distribution.
– Dirichlet distribution: a 2-dimensional simplex case using Dirichlet(a1,a2,a3) with a1=a2=a3=a.
– Compare Radial DWD with some alternatives: MD, SVM, DWD, LASSO, kernel SVM
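A minimal sketch of drawing such data, assuming NumPy's `Generator.dirichlet` as the sampler. The parameter values, sample size and helper name are illustrative choices, not the settings used in the study. It compares how far samples fall from the center (1/3, 1/3, 1/3) for a small and a large value of a.

```python
import numpy as np

rng = np.random.default_rng(0)
center = np.full(3, 1.0 / 3.0)         # center of the 2-dimensional simplex

def mean_dist_to_center(a, n=2000):
    """Average distance to the simplex center for n draws from Dirichlet(a, a, a)."""
    pts = rng.dirichlet([a, a, a], size=n)
    return np.linalg.norm(pts - center, axis=1).mean()

samples = rng.dirichlet([1.0, 1.0, 1.0], size=5)   # a = 1: uniform on the simplex

spread_small_a = mean_dist_to_center(0.2)    # mass pushed toward the vertices
spread_large_a = mean_dist_to_center(10.0)   # mass concentrated near the center
```

Small a pushes the samples toward the vertices and large a pulls them toward the center, so a controls the radial position of a simulated class.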

The density of Dirichlet(a,a,a)
