Survival-Time Classification of Breast Cancer Patients
DIMACS Workshop on Data Mining and Scalable Algorithms, August 22-24, 2001, Rutgers University
Y.-J. Lee, O. L. Mangasarian & W. H. Wolberg
Second Annual Review, June 1, 2001
Data Mining Institute, University of Wisconsin - Madison

American Cancer Society 2001 Breast Cancer Estimates
- Breast cancer, the most common cancer among women, is the second leading cause of cancer deaths in women (after lung cancer)
- 192,200 new cases of breast cancer in women will be diagnosed in the United States
- 40,600 deaths will occur from breast cancer (40,200 among women, 400 among men) in the United States
- According to the World Health Organization, more than 1.2 million people will be diagnosed with breast cancer this year worldwide

Key Objective
- Identify breast cancer patients for whom adjuvant chemotherapy prolongs survival time
- Main difficulty: Cannot carry out comparative tests on human subjects; similar patients must be treated similarly
- Our approach: Classify patients into Good, Intermediate & Poor groups
- Classification based on: 5 cytological features plus tumor size
- Groups characterized by: Tumor size & lymph node status

Principal Results for 253 Breast Cancer Patients
- All 69 patients in the Good group:
  - Had the best survival rate
  - Had no chemotherapy
- All 73 patients in the Poor group:
  - Had the worst survival rate
  - Had chemotherapy
- For the 121 patients in the Intermediate group:
  - The 67 patients who had chemotherapy had a better survival rate than the 44 patients who did not
- This last result reverses the role of chemotherapy seen in the overall population and in the Good & Poor groups

Outline
- Tools used:
  - Support vector machines (SVMs): feature selection & classification
  - Clustering: k-Median (k-Mean fails!)
- Cluster chemo patients into chemo-good & chemo-poor
- Cluster no-chemo patients into no-chemo-good & no-chemo-poor
- Three final classes:
  - Good = No-chemo good
  - Poor = Chemo poor
  - Intermediate = Remaining patients
- Generate survival curves for the three classes
- Use SVM to classify new patients into one of the above three classes

Support Vector Machines Used in this Work
- Feature selection: 1-norm linear SVM, $\min_{w,\gamma,y} \nu e^\top y + \|w\|_1$ s.t. $D(Aw - e\gamma) + y \ge e$, $y \ge 0$, where $D_{ii} = +1$ denotes Lymph node > 0 and $D_{ii} = -1$ denotes Lymph node = 0
- 6 out of 31 features selected:
  - 5 out of 30 cytological features, describing nuclear size, shape and texture
  - Tumor size
- Classification: Use SSVMs (smooth support vector machines) with a Gaussian kernel
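
As a concrete illustration, here is a minimal NumPy/SciPy sketch of the 1-norm SVM linear program above. The data matrix `A`, label vector `d`, and the weight `nu` are placeholders, and the variable splitting $|w| \le s$ is a standard LP reformulation, not something taken from the slides:

```python
import numpy as np
from scipy.optimize import linprog

def one_norm_svm(A, d, nu=1.0):
    """1-norm SVM: min nu*e'y + ||w||_1  s.t.  D(Aw - e*gamma) + y >= e, y >= 0.

    A : (m, n) data matrix, d : (m,) labels in {+1, -1}, nu : error weight.
    Solved as an LP over variables z = [w, gamma, y, s] with |w| <= s.
    """
    m, n = A.shape
    DA = d[:, None] * A  # rows of A signed by their labels (D @ A)
    c = np.concatenate([np.zeros(n + 1), nu * np.ones(m), np.ones(n)])
    # margin constraints rewritten as: -(DA)w + d*gamma - y <= -e
    A1 = np.hstack([-DA, d[:, None], -np.eye(m), np.zeros((m, n))])
    # linearize |w| <= s:  w - s <= 0  and  -w - s <= 0
    A2 = np.hstack([np.eye(n), np.zeros((n, m + 1)), -np.eye(n)])
    A3 = np.hstack([-np.eye(n), np.zeros((n, m + 1)), -np.eye(n)])
    res = linprog(
        c,
        A_ub=np.vstack([A1, A2, A3]),
        b_ub=np.concatenate([-np.ones(m), np.zeros(2 * n)]),
        bounds=[(None, None)] * (n + 1) + [(0, None)] * (m + n),
        method="highs",
    )
    w, gamma = res.x[:n], res.x[n]
    return w, gamma, np.flatnonzero(np.abs(w) > 1e-8)  # nonzero weights = selected features
```

The zero components of `w` drop the corresponding features; the slides report 6 of 31 features surviving this step.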

Clustering in Data Mining: General Objective
- Given: A dataset of m points in n-dimensional real space
- Problem: Extract hidden distinct properties by clustering the dataset

Concave Minimization Formulation of the Clustering Problem
- Given: A set of m points in $R^n$, represented by the rows $A_i$ of a matrix $A \in R^{m \times n}$, and a number $k$ of desired clusters
- Problem: Determine cluster centers $C_\ell \in R^n$, $\ell = 1, \dots, k$, such that the sum over the points of the minimum 1-norm distance between each point $A_i$ and the centers is minimized
- Objective: A sum of m minima of linear functions, hence piecewise-linear concave
- Difficulty: Minimizing a general piecewise-linear concave function over a polyhedral set is NP-hard

Clustering via Concave Minimization
- Minimize the sum of 1-norm distances between each data point $A_i$ and the closest cluster center: $\min_{C_1,\dots,C_k} \sum_{i=1}^m \min_{\ell=1,\dots,k} \|A_i^\top - C_\ell\|_1$
- Reformulation as a constrained problem: $\min_{C,T} \sum_{i=1}^m \min_{\ell=1,\dots,k} e^\top T_{i\ell}$ s.t. $-T_{i\ell} \le A_i^\top - C_\ell \le T_{i\ell}$, $i = 1,\dots,m$, $\ell = 1,\dots,k$
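
For concreteness, the clustering objective above can be evaluated in a few lines of NumPy; the array names `X` (points as rows) and `centers` are placeholders:

```python
import numpy as np

def one_norm_clustering_objective(X, centers):
    """Sum over points of the minimum 1-norm distance to any center."""
    # dists[i, l] = ||X[i] - centers[l]||_1
    dists = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2)
    return dists.min(axis=1).sum()  # sum of m minima of linear functions
```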

Finite k-Median Clustering Algorithm (Minimizing a Piecewise-Linear Concave Function)
Step 0 (Initialization): Given k initial cluster centers (different initial centers will lead to different clusters)
Step 1 (Cluster Assignment): Assign each point to the cluster with the nearest center in the 1-norm
Step 2 (Center Update): Recompute the center of each cluster as the cluster median (the point closest to all cluster points in the 1-norm)
Step 3 (Stopping Criterion): Stop if the cluster centers are unchanged, else go to Step 1
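
A minimal NumPy sketch of these steps, assuming the points are the rows of `X` and `centers` holds the k initial centers; the empty-cluster guard and iteration cap are defensive additions, not part of the slide:

```python
import numpy as np

def k_median(X, centers, max_iter=100):
    """k-Median clustering in the 1-norm (Steps 1-3 above)."""
    centers = centers.astype(float).copy()
    for _ in range(max_iter):
        # Step 1: assign each point to the nearest center in the 1-norm
        labels = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2).argmin(axis=1)
        # Step 2: recompute each center as the coordinate-wise median of its
        # cluster (which minimizes the sum of 1-norm distances to the points)
        new_centers = np.array([
            np.median(X[labels == j], axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        # Step 3: stop once the centers no longer move
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```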

Clustering Process: Feature Selection & Initial Cluster Centers
- 6 out of 31 features selected by a linear SVM separating lymph-node-positive patients (Lymph > 0) from lymph-node-negative patients (Lymph = 0)
- k-Median algorithm performed in the 6-dimensional feature space
- Initial cluster centers used: medians of Good1 & Poor1
  - Good1: Patients with Lymph = 0 AND Tumor < 2
  - Poor1: Patients with Lymph >= 5 OR Tumor >= 4 (typical indicator for chemotherapy)

Clustering Process (flow diagram)
253 patients (113 NoChemo, 140 Chemo)
- Compute initial cluster centers using the 6 selected features:
  - Good1: Lymph = 0 AND Tumor < 2; compute its median
  - Poor1: Lymph >= 5 OR Tumor >= 4; compute its median
- Cluster the 113 NoChemo patients with the k-Median algorithm (initial centers: medians of Good1 & Poor1) → 69 NoChemo Good, 44 NoChemo Poor
- Cluster the 140 Chemo patients with the k-Median algorithm (same initial centers) → 67 Chemo Good, 73 Chemo Poor
- Final classes: Good = 69 NoChemo Good; Poor = 73 Chemo Poor; Intermediate = 44 NoChemo Poor + 67 Chemo Good
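
Putting the pieces together, a sketch of this clustering pipeline, reusing the `k_median` function from the sketch above; `X` (the 6 selected features per patient), `lymph`, `tumor`, and the boolean array `chemo` are hypothetical inputs:

```python
import numpy as np

def cluster_patients(X, lymph, tumor, chemo):
    """Cluster NoChemo and Chemo patients separately from the Good1/Poor1 seeds."""
    centers = np.vstack([
        np.median(X[(lymph == 0) & (tumor < 2)], axis=0),   # Good1 seed median
        np.median(X[(lymph >= 5) | (tumor >= 4)], axis=0),  # Poor1 seed median
    ])
    _, nochemo_labels = k_median(X[~chemo], centers)  # 0 = Good cluster, 1 = Poor cluster
    _, chemo_labels = k_median(X[chemo], centers)
    return nochemo_labels, chemo_labels
```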

Survival Curves for Good, Intermediate & Poor Groups
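
The slides show these curves as figures. Survival curves of this kind are typically Kaplan-Meier estimates, which can be sketched with the `lifelines` package (an assumption; the slides do not name a tool, and `time`, `event`, and `group` are hypothetical per-patient arrays):

```python
import matplotlib.pyplot as plt
from lifelines import KaplanMeierFitter

def plot_survival_curves(time, event, group):
    """One Kaplan-Meier survival curve per patient group."""
    ax = plt.subplot(111)
    kmf = KaplanMeierFitter()
    for g in ("Good", "Intermediate", "Poor"):
        mask = group == g
        kmf.fit(time[mask], event_observed=event[mask], label=g)
        kmf.plot_survival_function(ax=ax)
    ax.set_xlabel("Time")
    ax.set_ylabel("Fraction surviving")
    plt.show()
```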

Survival Curves for Intermediate Group: Split by Chemo & NoChemo

Survival Curves for All Patients Split by Chemo & NoChemo

Survival Curves for Intermediate Group Split by Lymph Node & Chemotherapy

Survival Curves for All Patients Split by Lymph Node Positive & Negative

Nonlinear SVM Classifier: 82.7% Tenfold Test Correctness
[Flow diagram. Four groups from the clustering result: Good, Intermediate (ChemoGood), Intermediate (NoChemoPoor) & Poor. SVM training superclasses: Good2 = Good & ChemoGood, Poor2 = NoChemoPoor & Poor. For a new patient x, LI(x) & CI(x) are computed and SVMs assign the final Good / Intermediate / Poor label.]
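
As a rough stand-in for the smooth SVM (SSVM) with Gaussian kernel used in this work, a standard RBF-kernel SVM with tenfold cross-validation can be sketched with scikit-learn (an assumption; the slides use SSVM, not scikit-learn, and `X`, `y`, and the hyperparameters are placeholders):

```python
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def tenfold_correctness(X, y, C=1.0, gamma="scale"):
    """Tenfold test correctness of an RBF-kernel SVM (stand-in for SSVM)."""
    clf = SVC(kernel="rbf", C=C, gamma=gamma)  # Gaussian kernel classifier
    scores = cross_val_score(clf, X, y, cv=10)  # one accuracy per fold
    return scores.mean()  # the slides report 82.7% for their classifier
```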

Conclusion
- Used five features from a fine needle aspirate plus tumor size to cluster breast cancer patients into 3 groups:
  - Good: no chemotherapy recommended
  - Intermediate: chemotherapy likely to prolong survival
  - Poor: chemotherapy may or may not enhance survival
- The 3 groups have very distinct survival curves
- First categorization of a breast cancer group for which chemotherapy enhances longevity
- Prescribed an SVM classification procedure to classify new patients into one of the above three groups

Simplest Support Vector Machine: Linear Surface Maximizing the Margin
[Figure: a separating plane maximizing the margin between the two classes A+ and A-]
