Classification of Breast Cancer Cells Using Artificial Neural Networks and Support Vector Machines Emmanuel Contreras Guzman.

1 Classification of Breast Cancer Cells Using Artificial Neural Networks and Support Vector Machines Emmanuel Contreras Guzman

2 The Motivation
- Breast cancer is the second most deadly type of cancer in women worldwide.
  - 1.3 million women are diagnosed worldwide.
  - Nearly half a million women die from this disease each year.
  - Very curable if diagnosed early.
- Cervical cancer is the third most deadly type of cancer in women worldwide.
  - Half a million women are diagnosed.
  - 250,000 women die from this disease each year.
  - Also very curable if diagnosed early.

3 The Data Set
Breast Cancer Wisconsin (Original) Data Set
- 699 samples collected from a minimally invasive fine-needle aspirate (FNA).
- 458 benign (65.5%) and 241 malignant (34.5%).
- 9 features (each on a scale from 1 to 10):
  - Clump thickness
  - Uniformity of cell size
  - Uniformity of cell shape
  - Marginal adhesion
  - Single epithelial cell size
  - Bare nuclei
  - Bland chromatin
  - Normal nucleoli
  - Mitoses
Cervical Cancer Data
- 5 features:
  - Amount of cytoplasm
  - Nuclei count
  - Nuclei shape
  - Nuclei texture
  - Nuclei area

4 Pre-Processing
Unknown Samples
- 16 incomplete data samples (missing the bare nuclei value)
- Samples used for analysis: 683
Normalization
- Normalize values to the range 0 to 1
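The two pre-processing steps can be sketched in a few lines of Python. This is only an illustration: the original analysis used a MATLAB driver program, and the slide does not say how the normalization was done. The helper names, the "?" marker for missing values, and the per-feature min-max scaling are all assumptions.

```python
# Hypothetical helpers for illustration; not the presentation's MATLAB code.
MISSING = "?"  # assumption: a missing feature value is marked with "?"

def drop_incomplete(rows):
    """Keep only rows in which every feature value is present."""
    return [row for row in rows if MISSING not in row]

def min_max_normalize(rows):
    """Rescale each feature column to the range [0, 1] (assumed min-max scaling)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]

# Toy rows with three features on the 1-10 scale (made-up values, not real samples).
raw = [
    ["5", "1", "1"],
    ["10", "?", "4"],  # incomplete sample, dropped
    ["1", "10", "7"],
    ["3", "4", "10"],
]
complete = [[float(v) for v in row] for row in drop_incomplete(raw)]
normalized = min_max_normalize(complete)
```

Applied to the real data, the first step would reduce the 699 samples to the 683 used for analysis.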

5 Artificial Neural Network Analysis
- MATLAB driver program
- Network configurations:
  - 1, 2 or 3 hidden layers
  - Each layer with 3, 5 or 7 perceptrons
  - 70% for training
  - 15% for testing
  - 15% for validation
Training and transfer functions used:
- Scaled Conjugate Gradient (a training algorithm)
- Logistic
- Tan-Sigmoid
Network retrained 50 times on a random sample drawn with replacement.
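The configuration sweep and the 70/15/15 split can be sketched as follows. This is a hedged Python illustration, not the MATLAB driver itself. It assumes every hidden layer in a configuration has the same width, as the [5 5] and [7 7 7] notation on the results slide suggests, and it uses a plain shuffle split rather than the with-replacement resampling used for the 50 retraining runs. `split_indices` and the seed are hypothetical.

```python
import random

LAYER_COUNTS = (1, 2, 3)
LAYER_WIDTHS = (3, 5, 7)

# Assumption: all hidden layers in one configuration share the same width,
# e.g. [5 5] is two hidden layers of 5 perceptrons each.
configurations = [[width] * depth for depth in LAYER_COUNTS for width in LAYER_WIDTHS]

def split_indices(n_samples, seed):
    """Shuffle sample indices and split them 70% / 15% / 15%."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(0.70 * n_samples)
    n_test = int(0.15 * n_samples)
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    validation = indices[n_train + n_test:]  # remainder goes to validation
    return train, test, validation

train, test, validation = split_indices(683, seed=0)
```

Under the equal-width assumption the sweep covers nine layer configurations for each training or transfer function.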

6 Support Vector Machine Analysis
- MATLAB driver program
- Parameters tuned:
  - Kernels
    - Radial Basis Function
    - Linear kernel
    - Polynomial kernel, degrees 2 and 3
  - Box constraint (C): support vector cost/penalty
    - 1e-5 to 1e5, increasing by a factor of 10
  - Kernel scale (gamma): how strongly individual examples influence the hyperplane
    - 1e-5 to 1e5, increasing by a factor of 10
- 10-fold cross-validation
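A rough Python sketch of the parameter sweep and the cross-validation folds is shown below. It only enumerates the grid and the fold indices and does not train an SVM. The kernel labels, the `k_fold_indices` helper, and the contiguous (unshuffled) folds are assumptions made for illustration; the results slide also mentions an "auto" kernel scale, which this grid does not include.

```python
from itertools import product

KERNELS = ["rbf", "linear", "polynomial-2", "polynomial-3"]  # illustrative labels
POWERS_OF_TEN = [10.0 ** exponent for exponent in range(-5, 6)]  # 1e-5 ... 1e5

# Every (kernel, box constraint, kernel scale) combination in the sweep.
grid = list(product(KERNELS, POWERS_OF_TEN, POWERS_OF_TEN))

def k_fold_indices(n_samples, k=10):
    """Yield (train, held_out) index lists for k-fold cross-validation."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        held_out = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, held_out
        start += size

folds = list(k_fold_indices(683, k=10))
```

Each power-of-ten range has 11 values, so a full sweep of this grid would evaluate 4 x 11 x 11 = 484 settings, each scored on 10 held-out folds. In practice the samples would usually be shuffled before being cut into folds.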

7 Artificial Neural Network Results
Configurations with 97.4% accuracy:
- Scaled Conjugate Gradient (SCG) without normalization: [5], [7], [7 7]
- Scaled Conjugate Gradient (SCG) with normalization: [5 5], [7]
- Logistic transfer function (logsig) without normalization: [5 5], [7]
- Logistic transfer function (logsig) with normalization: [5], [7]
- Tan-Sigmoid transfer function (tansig) without normalization: [5], [5 5], [7], [7 7]
- Tan-Sigmoid transfer function (tansig) with normalization: [7], [7 7], [7 7 7]
Sensitivity: 98%
Specificity: 96%
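For reference, accuracy, sensitivity and specificity all follow from the four confusion-matrix counts. The Python sketch below assumes the malignant class is treated as "positive", which the slides do not state. The function name and the counts are made up for illustration and are not the presentation's results.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }

# Made-up counts for illustration only.
metrics = classification_metrics(tp=45, fn=5, tn=90, fp=10)
```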

8 Support Vector Machine Results
Maximum accuracy: 94.24%
- 2nd-order polynomial kernel
- Box constraint of 1e-1
- Kernel scale (gamma) of 1
- Sensitivity: 96.58%
- Specificity: 91.43%
Second most accurate configurations, accuracy: 94.13%
- Radial Basis Function kernel with box constraint of 1 and kernel scale set to auto
- Polynomial kernel of degree 3 with box constraint of 1e-1 and kernel scale set to auto
- Sensitivity: 96.35%
- Specificity: 91.02%

9 Conclusion
- None of the ANN layer configurations with 3 perceptrons achieved the maximum accuracy of 97.4%.
- ANN configurations with 3+ layers and perceptrons do not generalize well and overfit the data.
- Classification of breast cancer data using an ANN should be kept to one or two hidden layers of about 5 perceptrons in order to achieve the highest classification accuracy.
- Compared with the SVM, the neural network achieved higher accuracy by about 3 percentage points, sensitivity by about 2 and specificity by about 5.
- A neural network appears to be a better algorithm for classifying the data.
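The comparison can be checked against the figures reported on the two results slides. The short Python sketch below uses only those reported values; the variable names are illustrative. Because the ANN sensitivity and specificity are given as whole percentages, the computed gaps are approximate.

```python
# Best reported figures from the two results slides, in percent.
ann = {"accuracy": 97.4, "sensitivity": 98.0, "specificity": 96.0}
svm = {"accuracy": 94.24, "sensitivity": 96.58, "specificity": 91.43}

# Gap between ANN and SVM in percentage points, per metric.
difference = {name: round(ann[name] - svm[name], 2) for name in ann}
```

With these rounded inputs, the sensitivity gap comes out smaller than the accuracy and specificity gaps.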

10 Discussion
- Other configurations for the ANN and SVM that were not analyzed.
- More fine-tuning of parameters:
  - Artificial Neural Network
    - Learning rates, transfer functions, more or fewer layers and perceptrons
  - Support Vector Machines
    - More kernels
    - Different cost function
- Removing the "bare nuclei" feature and using all 699 samples.

