Survival-Time Classification of Breast Cancer Patients and Chemotherapy Yuh-Jye Lee, Olvi Mangasarian & W. H. Wolberg UW Madison & UCSD La Jolla Computational and Applied Mathematics Seminar.

Similar presentations
Optimization in Data Mining Olvi L. Mangasarian with G. M. Fung, J. W. Shavlik, Y.-J. Lee, E.W. Wild & Collaborators at ExonHit – Paris University of Wisconsin.
ECG Signal processing (2)
Christoph F. Eick Questions and Topics Review Dec. 10, Compare AGNES /Hierarchical clustering with K-means; what are the main differences? 2. K-means.
Image classification Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?
Support Vector Machine Classification Computation & Informatics in Biology & Medicine Madison Retreat, November 15, 2002 Olvi L. Mangasarian with G. M.
Pattern Recognition and Machine Learning
Support Vector Machines
Machine learning continued Image source:
The Disputed Federalist Papers : SVM Feature Selection via Concave Minimization Glenn Fung and Olvi L. Mangasarian CSNA 2002 June 13-16, 2002 Madison,
Y.-J. Lee, O. L. Mangasarian & W.H. Wolberg
Support Vector Machines and Kernel Methods
The value of kernel function represents the inner product of two training points in feature space Kernel functions merge two steps 1. map input data from.
1-norm Support Vector Machines Good for Feature Selection  Solve the quadratic program for some : min s. t.,, denotes where or membership. Equivalent.
MMLD1 Support Vector Machines: Hype or Hallelujah? Kristin Bennett Math Sciences Dept Rensselaer Polytechnic Inst.
Kernel Technique Based on Mercer’s Condition (1909)
Dual Problem of Linear Program subject to Primal LP Dual LP subject to ※ All duality theorems hold and work perfectly!
Support Vector Classification (Linearly Separable Case, Primal) The hyperplanethat solves the minimization problem: realizes the maximal margin hyperplane.
Proximal Support Vector Machine Classifiers KDD 2001 San Francisco August 26-29, 2001 Glenn Fung & Olvi Mangasarian Data Mining Institute University of.
Reduced Support Vector Machine
Classification Problem 2-Category Linearly Separable Case A- A+ Malignant Benign.
Binary Classification Problem Learn a Classifier from the Training Set
Unconstrained Optimization Problem
Bioinformatics Challenge  Learning in very high dimensions with very few samples  Acute leukemia dataset: 7129 # of gene vs. 72 samples  Colon cancer.
What is Learning All about ?  Get knowledge of by study, experience, or being taught  Become aware by information or from observation  Commit to memory.
Support Vector Machines Exercise solutions Ata Kaban The University of Birmingham.
Survival-Time Classification of Breast Cancer Patients DIMACS Workshop on Data Mining and Scalable Algorithms August 22-24, Rutgers University Y.-J.
Breast Cancer Diagnosis A discussion of methods Meena Vairavan.
Mathematical Programming in Support Vector Machines
Incremental Support Vector Machine Classification Second SIAM International Conference on Data Mining Arlington, Virginia, April 11-13, 2002 Glenn Fung.
Breast Cancer Diagnosis via Linear Hyper-plane Classifier Presented by Joseph Maalouf December 14, 2001 December 14, 2001.
Efficient Model Selection for Support Vector Machines
Feature Selection in Nonlinear Kernel Classification Olvi Mangasarian & Edward Wild University of Wisconsin Madison Workshop on Optimization-Based Data.
Feature Selection in Nonlinear Kernel Classification Olvi Mangasarian Edward Wild University of Wisconsin Madison.
The Disputed Federalist Papers: Resolution via Support Vector Machine Feature Selection Olvi Mangasarian UW Madison & UCSD La Jolla Glenn Fung Amazon Inc.,
Support Vector Machines in Data Mining AFOSR Software & Systems Annual Meeting Syracuse, NY June 3-7, 2002 Olvi L. Mangasarian Data Mining Institute University.
1 SUPPORT VECTOR MACHINES İsmail GÜNEŞ. 2 What is SVM? A new generation learning system. A new generation learning system. Based on recent advances in.
Knowledge-Based Breast Cancer Prognosis Olvi Mangasarian UW Madison & UCSD La Jolla Edward Wild UW Madison Computation and Informatics in Biology and Medicine.
Kernel Methods A B M Shawkat Ali 1 2 Data Mining ¤ DM or KDD (Knowledge Discovery in Databases) Extracting previously unknown, valid, and actionable.
Nonlinear Data Discrimination via Generalized Support Vector Machines David R. Musicant and Olvi L. Mangasarian University of Wisconsin - Madison
Privacy-Preserving Support Vector Machines via Random Kernels Olvi Mangasarian UW Madison & UCSD La Jolla Edward Wild UW Madison November 14, 2015 TexPoint.
Mathematical Programming in Data Mining Author: O. L. Mangasarian Advisor: Dr. Hsu Graduate: Yan-Cheng Lin.
CS 478 – Tools for Machine Learning and Data Mining SVM.
Support Vector Machine Data Mining Olvi L. Mangasarian with Glenn M. Fung, Jude W. Shavlik & Collaborators at ExonHit – Paris Data Mining Institute University.
RSVM: Reduced Support Vector Machines Y.-J. Lee & O. L. Mangasarian First SIAM International Conference on Data Mining Chicago, April 6, 2001 University.
CS558 Project Local SVM Classification based on triangulation (on the plane) Glenn Fung.
Support vector machine LING 572 Fei Xia Week 8: 2/23/2010 TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: A 1.
Feature Selection in k-Median Clustering Olvi Mangasarian and Edward Wild University of Wisconsin - Madison.
Data Mining via Support Vector Machines Olvi L. Mangasarian University of Wisconsin - Madison IFIP TC7 Conference on System Modeling and Optimization Trier.
Nonlinear Knowledge in Kernel Approximation Olvi Mangasarian UW Madison & UCSD La Jolla Edward Wild UW Madison.
Nonlinear Knowledge in Kernel Machines Olvi Mangasarian UW Madison & UCSD La Jolla Edward Wild UW Madison Data Mining and Mathematical Programming Workshop.
Proximal Plane Classification KDD 2001 San Francisco August 26-29, 2001 Glenn Fung & Olvi Mangasarian Second Annual Review June 1, 2001 Data Mining Institute.
Privacy-Preserving Support Vector Machines via Random Kernels Olvi Mangasarian UW Madison & UCSD La Jolla Edward Wild UW Madison March 3, 2016 TexPoint.
Generalization Error of pac Model  Let be a set of training examples chosen i.i.d. according to  Treat the generalization error as a r.v. depending on.
Machine Learning and Data Mining: A Math Programming- Based Approach Glenn Fung CS412 April 10, 2003 Madison, Wisconsin.
Incremental Reduced Support Vector Machines Yuh-Jye Lee, Hung-Yi Lo and Su-Yun Huang National Taiwan University of Science and Technology and Institute.
Minimal Kernel Classifiers Glenn Fung Olvi Mangasarian Alexander Smola Data Mining Institute University of Wisconsin - Madison Informs 2002 San Jose, California,
Classification of Breast Cancer Cells Using Artificial Neural Networks and Support Vector Machines Emmanuel Contreras Guzman.
Classification via Mathematical Programming Based Support Vector Machines Glenn M. Fung Computer Sciences Dept. University of Wisconsin - Madison November.
A Brief Introduction to Support Vector Machine (SVM) Most slides were from Prof. A. W. Moore, School of Computer Science, Carnegie Mellon University.
Knowledge-Based Nonlinear Support Vector Machine Classifiers Glenn Fung, Olvi Mangasarian & Jude Shavlik COLT 2003, Washington, DC. August 24-27, 2003.
Glenn Fung, Murat Dundar, Bharat Rao and Jinbo Bi
Computer Sciences Dept. University of Wisconsin - Madison
Concave Minimization for Support Vector Machine Classifiers
University of Wisconsin - Madison
Minimal Kernel Classifiers
Presentation transcript:

Survival-Time Classification of Breast Cancer Patients and Chemotherapy Yuh-Jye Lee, Olvi Mangasarian & W. H. Wolberg UW Madison & UCSD La Jolla Computational and Applied Mathematics Seminar April 19, 2005

Breast Cancer Estimates (American Cancer Society & World Health Organization)
 Breast cancer is the most common cancer among women in the US.
 212,930 new cases of breast cancer are estimated by the ACS to occur in the US in 2005: 211,240 in women and 1,690 in men.
 40,870 deaths from breast cancer are estimated to occur in the US in 2005: 40,410 among women and 460 among men.
 WHO estimates: more than 1.2 million people worldwide were diagnosed with breast cancer in 2001, and 0.5 million died from breast cancer in 2000.

Key Objective
 Identify breast cancer patients for whom chemotherapy prolongs survival time
 Main difficulty: Comparative tests cannot be carried out on human subjects; similar patients must be treated similarly
 Our approach: Classify patients into Good, Intermediate & Poor groups such that:
 The Good group does not need chemotherapy
 The Intermediate group benefits from chemotherapy
 The Poor group is not likely to benefit from chemotherapy

Outline
 Tools used
 Support vector machines (linear & nonlinear SVMs) for feature selection & classification
 Clustering (the k-Median algorithm, not k-Means)
 Cluster into Good, Intermediate & Poor classes
 Cluster no-chemo patients into 2 groups: good & poor
 Cluster chemo patients into 2 groups: good & poor
 Generate three final classes:
 Good class (good group from the no-chemo clustering)
 Poor class (poor group from the chemo clustering)
 Intermediate class: remaining patients (chemo & no-chemo)
 Generate survival curves for the three classes
 Use SSVM to classify new patients into one of the above three classes
 Data description

Cell Nuclei of a Fine Needle Aspirate

Thirty Cytological Features Collected at Diagnosis Time

Two Histological Features Collected at Surgery Time

Breast Cancer Diagnosis Based on 3 FNA Features: 97% ten-fold cross-validation correctness on 780 patients (494 benign, 286 malignant). Research by Mangasarian, Street & Wolberg.
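
A minimal sketch of that kind of ten-fold cross-validation, using scikit-learn rather than the authors' original code; the file names, column layout, and choice of LinearSVC are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# X: 780 patients x 3 selected FNA features; y: +1 benign, -1 malignant
X = np.loadtxt("fna_features.csv", delimiter=",")  # hypothetical file
y = np.loadtxt("fna_labels.csv", delimiter=",")    # hypothetical file

clf = make_pipeline(StandardScaler(), LinearSVC())
scores = cross_val_score(clf, X, y, cv=10)         # ten-fold cross-validation
print(f"mean correctness: {scores.mean():.1%}")
```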

1-Norm Support Vector Machines: Maximize the Margin between Bounding Planes
[Figure: the two bounding planes of the separating surface, with the point sets A+ and A- on either side]

Support Vector Machine Algebra of the 2-Category Linearly Separable Case
 Given m points in n-dimensional space, represented by an m-by-n matrix $A$
 Membership of each point $A_i$ in class +1 or -1 specified by an m-by-m diagonal matrix $D$ with +1 & -1 entries
 Separate by two bounding planes, $x'w = \gamma + 1$ and $x'w = \gamma - 1$:
$$A_i w \ge \gamma + 1 \;\text{ for }\; D_{ii} = +1, \qquad A_i w \le \gamma - 1 \;\text{ for }\; D_{ii} = -1$$
 More succinctly: $D(Aw - e\gamma) \ge e$, where $e$ is a vector of ones
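
A small numpy illustration (mine, not from the slides) of this notation: rows of A are the points, D carries the ±1 labels, and a separating plane (w, γ) must satisfy D(Aw − eγ) ≥ e. The data and plane below are made up.

```python
import numpy as np

A = np.array([[3.0, 4.0],   # two class +1 points
              [4.0, 3.0],
              [0.0, 1.0],   # two class -1 points
              [1.0, 0.0]])
d = np.array([1, 1, -1, -1])
D = np.diag(d)
e = np.ones(len(d))

w, gamma = np.array([1.0, 1.0]), 4.0  # a hand-picked separating plane
print(D @ (A @ w - e * gamma) >= e)   # all True: both bounding planes hold
```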

Feature Selection Using 1-Norm Linear SVM: Classification Based on Lymph Node Status
 Feature selection by the 1-norm SVM linear program:
$$\min_{w,\gamma,y} \; \nu e'y + \|w\|_1 \quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e, \; y \ge 0,$$
where $D_{ii} = +1$ denotes Lymph node > 0 and $D_{ii} = -1$ denotes Lymph node = 0
 Features selected by the above SVM: 6 out of 31:
 5 out of 30 cytological features that describe nuclear size, shape and texture from the fine needle aspirate
 Tumor size from surgery
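
A sketch of this linear program with scipy.optimize.linprog; this is my reconstruction of the standard 1-norm SVM formulation, not the authors' code. The 1-norm of w is linearized with auxiliary variables t ≥ |w|.

```python
import numpy as np
from scipy.optimize import linprog

def svm_1norm(A, d, nu=1.0):
    """1-norm SVM: min nu*e'y + ||w||_1  s.t.  D(Aw - e*gamma) + y >= e, y >= 0.
    Variable vector z = [w (n), t (n), gamma (1), y (m)] with t >= |w|."""
    m, n = A.shape
    DA = d[:, None] * A                      # rows of A scaled by the labels
    c = np.concatenate([np.zeros(n), np.ones(n), [0.0], nu * np.ones(m)])
    # -D(Aw - e*gamma) - y <= -e
    G1 = np.hstack([-DA, np.zeros((m, n)), d[:, None], -np.eye(m)])
    #  w - t <= 0  and  -w - t <= 0  together enforce t >= |w|
    G2 = np.hstack([np.eye(n), -np.eye(n), np.zeros((n, 1)), np.zeros((n, m))])
    G3 = np.hstack([-np.eye(n), -np.eye(n), np.zeros((n, 1)), np.zeros((n, m))])
    G = np.vstack([G1, G2, G3])
    h = np.concatenate([-np.ones(m), np.zeros(2 * n)])
    bounds = ([(None, None)] * n + [(0, None)] * n
              + [(None, None)] + [(0, None)] * m)
    res = linprog(c, A_ub=G, b_ub=h, bounds=bounds)
    w, gamma = res.x[:n], res.x[2 * n]
    return w, gamma                          # nonzero w_j mark selected features

# toy usage: only the first feature separates the classes
A = np.array([[2.0, 0.1], [1.8, -0.2], [-2.1, 0.3], [-1.9, 0.0]])
d = np.array([1.0, 1.0, -1.0, -1.0])
print(svm_1norm(A, d))
```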

Features Selected by Support Vector Machine

Nonlinear SVM for Classifying New Patients
 Linear SVM (linear separating surface $x'w = \gamma$) as the LP:
$$\min_{w,\gamma,y} \; \nu e'y + \|w\|_1 \quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e, \; y \ge 0$$
 By QP duality: $w = A'Du$. Maximizing the margin in the "dual space" gives:
$$\min_{u,\gamma,y} \; \nu e'y + \|u\|_1 \quad \text{s.t.} \quad D(AA'Du - e\gamma) + y \ge e, \; y \ge 0$$
 Replace $AA'$ by a nonlinear kernel $K(A,A')$:
$$\min_{u,\gamma,y} \; \nu e'y + \|u\|_1 \quad \text{s.t.} \quad D(K(A,A')Du - e\gamma) + y \ge e, \; y \ge 0$$

The Nonlinear Classifier
 The nonlinear classifier: $K(x', A')Du = \gamma$
 where $K$ is a nonlinear kernel, e.g. the Gaussian (radial basis) kernel:
$$K(A, A')_{ij} = e^{-\mu \|A_i - A_j\|^2}, \quad i, j = 1, \dots, m$$
 The $ij$-entry of $K(A, A')$ represents the "similarity" between the data points $A_i$ and $A_j$
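
A short numpy sketch (assumed details, not the authors' code) of the Gaussian kernel and the resulting decision rule: a new point x is assigned class sign(K(x', A')Du − γ), using multipliers u and threshold γ from the kernelized LP above.

```python
import numpy as np

def gaussian_kernel(A, B, mu=0.1):
    """K(A, B)_ij = exp(-mu * ||A_i - B_j||^2) for rows A_i of A, B_j of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-mu * sq_dists)

def classify(x, A, d, u, gamma, mu=0.1):
    """Nonlinear SVM decision for one new point x; d holds the +/-1 labels."""
    k = gaussian_kernel(x[None, :], A, mu)    # 1 x m row of similarities
    score = (k @ (d * u)).item() - gamma      # K(x', A') D u compared to gamma
    return 1 if score >= 0 else -1
```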

Clustering in Data Mining: General Objective
 Given: A dataset of m points in n-dimensional real space
 Problem: Extract hidden distinct properties by clustering the dataset into k clusters

Concave Minimization Formulation of the 1-Norm Clustering Problem (k-Median)
 Given: A set of m points in $R^n$ represented by the matrix $A \in R^{m \times n}$, and a number $k$ of desired clusters
 Find: Cluster centers $C_\ell$, $\ell = 1, \dots, k$, that minimize the sum of 1-norm distances of each point $A_i$ to its closest cluster center:
$$\min_{C_1, \dots, C_k} \; \sum_{i=1}^{m} \; \min_{\ell = 1, \dots, k} \|A_i - C_\ell\|_1$$
 Objective function: A sum of m minima of linear functions, hence piecewise-linear concave
 Difficulty: Minimizing a general piecewise-linear concave function over a polyhedral set is NP-hard
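
A one-function numpy sketch (mine, not from the paper) of this objective: the sum over all points of the 1-norm distance to the nearest center.

```python
import numpy as np

def kmedian_objective(A, C):
    """A: m x n data points; C: k x n centers.
    Returns the sum of m minima of 1-norm point-to-center distances."""
    dists = np.abs(A[:, None, :] - C[None, :, :]).sum(axis=2)  # m x k
    return dists.min(axis=1).sum()
```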

Clustering via Finite Concave Minimization
 Minimize the sum of 1-norm distances between each data point $A_i$ and the closest cluster center $C_\ell$
 Equivalent bilinear reformulation:
$$\min_{C, D, T} \; \sum_{i=1}^{m} \sum_{\ell=1}^{k} t_{i\ell} \, e' d_{i\ell} \quad \text{s.t.} \quad -d_{i\ell} \le A_i - C_\ell \le d_{i\ell}, \quad \sum_{\ell=1}^{k} t_{i\ell} = 1, \; t_{i\ell} \ge 0$$

K-Median Clustering Algorithm (Finite Termination at a Local Solution)
Step 0 (Initialization): Pick 2 initial cluster centers as the medians of Good1 (Lymph = 0 & Tumor < 2) and Poor1 (Lymph >= 5 or Tumor >= 4)
Step 1 (Cluster Assignment): Assign points to the cluster with the nearest cluster center in 1-norm
Step 2 (Center Update): Recompute the location of the center of each cluster as the cluster median (the point closest to all cluster points in 1-norm)
Step 3 (Stopping Criterion): Stop if the cluster centers are unchanged; else go to Step 1
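
A compact reimplementation sketch of these steps (illustrative, not the authors' code); the componentwise median is the 1-norm analogue of the mean used by k-means.

```python
import numpy as np

def kmedian(A, centers, max_iter=100):
    """A: m x n data; centers: k x n initial centers (e.g. the medians of
    Good1 and Poor1). Returns the final centers and cluster assignments."""
    C = centers.copy()
    for _ in range(max_iter):
        # Step 1: assign each point to the nearest center in the 1-norm
        dists = np.abs(A[:, None, :] - C[None, :, :]).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Step 2: recompute each center as its cluster's componentwise median
        new_C = np.array([np.median(A[labels == l], axis=0)
                          for l in range(len(C))])
        # Step 3: stop as soon as the centers no longer move
        if np.allclose(new_C, C):
            break
        C = new_C
    return C, labels
```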

Feature Selection & Initial Cluster Centers
 6 out of 31 features selected by the 1-norm SVM separating lymph node positive (Lymph > 0) from lymph node negative (Lymph = 0)
 Apply the k-Median algorithm in the 6-dimensional input space
 Initial cluster centers used: medians of Good1 & Poor1
 Good1: Patients with Lymph = 0 AND Tumor < 2
 Poor1: Patients with Lymph > 4 OR Tumor >= 4 (a typical indicator for chemotherapy)
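
A sketch of forming those two initial centers; the arrays `lymph` and `tumor` are hypothetical names, the thresholds follow the Good1/Poor1 rules above, and X is assumed to hold the 6 selected features.

```python
import numpy as np

def initial_centers(X, lymph, tumor):
    """Medians of the Good1 and Poor1 patient subsets, one center each."""
    good1 = X[(lymph == 0) & (tumor < 2)]
    poor1 = X[(lymph > 4) | (tumor >= 4)]
    return np.vstack([np.median(good1, axis=0), np.median(poor1, axis=0)])
```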

Overall Clustering Process
 253 patients (113 NoChemo, 140 Chemo)
 Compute the initial cluster centers using the 6 selected features:
 Good1 (Lymph = 0 AND Tumor < 2): compute its median
 Poor1 (Lymph >= 5 OR Tumor >= 4): compute its median
 Cluster the 113 NoChemo patients with the k-Median algorithm (initial centers: medians of Good1 & Poor1): 69 NoChemo Good, 44 NoChemo Poor
 Cluster the 140 Chemo patients with the k-Median algorithm (same initial centers): 67 Chemo Good, 73 Chemo Poor
 Final classes: Good = the 69 NoChemo Good patients; Poor = the 73 Chemo Poor patients; Intermediate = the remaining patients (44 NoChemo Poor + 67 Chemo Good)
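
An end-to-end sketch of that grouping logic, reusing the `kmedian` and `initial_centers` snippets above (the names and the cluster-index convention are my assumptions: cluster 0 starts from the Good1 median, cluster 1 from the Poor1 median).

```python
import numpy as np

def assign_groups(X, lymph, tumor, chemo):
    """chemo: boolean array, True for treated patients. Returns per-patient
    'good' / 'intermediate' / 'poor' labels following the slide's scheme."""
    C0 = initial_centers(X, lymph, tumor)
    groups = np.empty(len(X), dtype=object)
    for treated in (False, True):
        idx = np.where(chemo == treated)[0]
        _, cl = kmedian(X[idx], C0)             # cluster 0 ~ good, 1 ~ poor
        for i, c in zip(idx, cl):
            if not treated and c == 0:
                groups[i] = "good"              # good no-chemo cluster
            elif treated and c == 1:
                groups[i] = "poor"              # poor chemo cluster
            else:
                groups[i] = "intermediate"      # everyone else
    return groups
```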

Survival Curves for Good, Intermediate & Poor Groups (Nonlinear SSVM for New Patients)
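
A hedged sketch of producing Kaplan-Meier survival curves like those on this and the following slides, using the `lifelines` package (my choice; the slides do not say what tool produced the plots).

```python
import matplotlib.pyplot as plt
from lifelines import KaplanMeierFitter

def plot_survival(time, event, groups):
    """time: months of follow-up; event: 1 if death observed, 0 if censored;
    groups: 'good' / 'intermediate' / 'poor' label for each patient."""
    ax = plt.subplot(111)
    for g in ("good", "intermediate", "poor"):
        mask = groups == g
        KaplanMeierFitter().fit(time[mask], event[mask], label=g) \
                           .plot_survival_function(ax=ax)
    ax.set_xlabel("Survival time (months)")
    ax.set_ylabel("Fraction surviving")
    plt.show()
```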

Survival Curves for Intermediate Group: Split by Chemo & NoChemo

Survival Curves for Overall Patients: With & Without Chemotherapy

Survival Curves for Intermediate Group Split by Lymph Node & Chemotherapy

Survival Curves for Overall Patients Split by Lymph Node Positive & Negative

Conclusion
 Used five cytological features & tumor size to cluster breast cancer patients into 3 groups:
 Good: no chemotherapy recommended
 Intermediate: chemotherapy likely to prolong survival
 Poor: chemotherapy may or may not enhance survival
 The 3 groups have very distinct survival curves
 First categorization of a breast cancer group for which chemotherapy enhances longevity
 An SVM-based procedure assigns new patients to one of the above three survival groups

Talk & Paper Available on the Web
 Y.-J. Lee, O. L. Mangasarian & W. H. Wolberg, "Survival-Time Classification of Breast Cancer Patients", Computational Optimization and Applications, Volume 25, 2003.