_____KOSYR 2001______ Rules for Melanoma Skin Cancer Diagnosis Włodzisław Duch, K. Grąbczewski, R. Adamczak, K. Grudziński, Department of Computer Methods,


1 _____KOSYR 2001______ Rules for Melanoma Skin Cancer Diagnosis
Włodzisław Duch, K. Grąbczewski, R. Adamczak, K. Grudziński, Department of Computer Methods, Nicholas Copernicus University, Torun, Poland. http://www.phys.uni.torun.pl/kmk
Zdzisław Hippe, Department of Computer Chemistry and Physical Chemistry, Rzeszów University of Technology, zshippe@prz.rzeszow.pl

2 Content:
• Melanoma skin cancer data
• 5 methods: GTS, SSV, MLP2LN, FSM, SBL, and their results
• Final comparison of results
• Conclusions & future prospects

3 Skin cancer
Most common skin cancers:
• Basal cell carcinoma
• Squamous cell carcinoma
• Melanoma: uncontrolled growth of melanocytes, the skin cells that produce the skin pigment melanin.
• Linked to too much exposure to the sun and to sunburn.
• Melanoma accounts for 4% of skin cancers and is the most difficult to control; 1 in 79 Americans will develop melanoma.
• Almost 2000 percent increase since 1930.
• Survival is now 84%; with early detection, 95%.

4 Melanoma skin cancer data summary
• Collected in the Outpatient Center of Dermatology in Rzeszów, Poland.
• Four classes: benign nevus, blue nevus, suspicious, or malignant.
• 250 cases, with almost equal class distribution.
• Each record in the database has 13 attributes.
• TDS (Total Dermatoscopy Score): a single index.
• 26 new test cases.
• Goal: understand the data, find a simple description.

5 Melanoma AB attributes
• Asymmetry: symmetric spot, 1-axial asymmetry, or 2-axial asymmetry.
• Border irregularity: the edges are ragged, notched, or blurred. Integer from 0 to 8.

6 Melanoma CD attributes
• Color: white, blue, black, red, light brown, and dark brown; several colors are possible simultaneously.
• Diversity: pigment globules, pigment dots, pigment network, branched streaks, structureless areas.

7 Melanoma TDS index
• Combine the ABCD attributes to form one index.
• TDS index, ABCD formula:
    TDS = 1.3 · Asymmetry + 0.1 · Border + 0.5 · Σ{Colors} + 0.5 · Σ{Diversities}
• Coefficients come from statistical analysis.
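As a rough illustration of the ABCD formula, the sketch below computes TDS in Python. The slides give only the coefficients; the numeric coding of Asymmetry (0, 1, 2 for symmetric, 1-axial, 2-axial), the 0/1 flags for colors and structures, the function name and the sample lesion are all assumptions made for this example.

```python
def tds(asymmetry, border, colors, diversities):
    """Total Dermatoscopy Score following the ABCD formula on the slide.

    asymmetry   -- assumed coding: 0 symmetric, 1 one-axial, 2 two-axial
    border      -- integer from 0 to 8
    colors      -- iterable of 0/1 flags, one per color (assumed coding)
    diversities -- iterable of 0/1 flags, one per structure (assumed coding)
    """
    return (1.3 * asymmetry
            + 0.1 * border
            + 0.5 * sum(colors)
            + 0.5 * sum(diversities))


# Hypothetical lesion: 1-axial asymmetry, border 4, two colors, three structures.
example = tds(1, 4, [0, 0, 0, 0, 1, 1], [1, 1, 1, 0, 0])
print(round(example, 2))
```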

8 Remarks on testing
• Test: only 26 cases for 4 classes.
• Expected statistical accuracy should be estimated on the 276 training + test cases with 10-fold crossvalidation. Not done with most methods!
• Risk matrices are desirable: identifying a Blue nevus instead of a Benign nevus carries no risk, but confusion with Malignant carries great risk.
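A minimal sketch of how the 276 cases could be partitioned for 10-fold crossvalidation is given below. It is plain, unstratified splitting with a fixed seed; whether the original experiments stratified by class is not stated in the slides, and the function name is hypothetical.

```python
import random


def k_fold_indices(n_cases, k=10, seed=0):
    """Split range(n_cases) into k disjoint test folds of near-equal size."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    # Taking every k-th index gives fold sizes that differ by at most one.
    return [indices[i::k] for i in range(k)]


folds = k_fold_indices(276, k=10)
for test_fold in folds:
    held_out = set(test_fold)
    train = [i for i in range(276) if i not in held_out]
    # A model would be fitted on `train` and scored on `test_fold` here;
    # the model itself is omitted from this sketch.
    assert len(train) + len(test_fold) == 276

fold_sizes = sorted(len(f) for f in folds)
print(fold_sizes)
```

The mean and standard deviation of the ten per-fold accuracies would give figures of the form quoted later in the slides (for example 97.5±0.3%).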

9 Methods used: GTS
• GTS covering algorithm (Hippe, 1997) + recursive reduction of the number of decision rules.
• Interactive: the user guides the development of the learning model.
• The combination of attributes that generates the learning model is selected on the basis of Frequency and Ranking.
• GTS can create many different sets of rules.
• In complex situations it may be rather difficult to use.

10 GTS results
• GTS generated a large number (198) of rules.
• Experimentation identified the important attributes.
• Various sets of decision rules were generated:
    TDS & C-blue & Asymmetry & Border (4 attributes, based on the experience of medical doctors)
    TDS & C-blue & D-structureless-areas (3 attributes)
    TDS & C-Blue (2 attributes)
    TDS (1 attribute): poor results.
  Models with 2-4 attributes give 81-85% accuracy.
• Combining and generalizing these rules yielded 4 simplified best rules.
• Overall: 6 errors on training, 0 errors on test set.

11 Methods used: SSV
• Decision tree (Grąbczewski, Duch 1999).
• Based on a separability criterion: choose the split value (for a continuous attribute) or the subset of values (for a discrete attribute) with the maximum index of separability.
• Easily converted into a set of crisp logical rules.
• Pruning is used to obtain the simplest set of rules that generalizes well.
• Fully automatic, very efficient; crossvalidation tests provide an estimate of statistical accuracy.
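The slide does not spell out the separability index. The sketch below assumes one plausible form: for a threshold on a continuous attribute, count the pairs of vectors from different classes that the split separates (weighted by 2), and subtract a penalty for classes that end up on both sides. The exact formula, the function names and the toy data are assumptions for illustration and are not taken from the slides.

```python
from collections import Counter


def ssv(values, labels, threshold):
    """Assumed separability index of a split `value < threshold`."""
    left = Counter(c for v, c in zip(values, labels) if v < threshold)
    right = Counter(c for v, c in zip(values, labels) if v >= threshold)
    n_right = sum(right.values())
    classes = set(labels)
    # Pairs (left vector of class c, right vector of another class).
    separated = sum(left[c] * (n_right - right[c]) for c in classes)
    # Penalty: same-class vectors that the split pulls apart.
    split_same = sum(min(left[c], right[c]) for c in classes)
    return 2 * separated - split_same


def best_split(values, labels):
    """Try midpoints between consecutive distinct values, keep the best."""
    points = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
    return max(candidates, key=lambda s: ssv(values, labels, s))


# Toy data, invented for this example only.
toy_tds = [3.9, 4.2, 4.6, 5.0, 5.2, 5.8, 6.1]
toy_cls = ["benign"] * 3 + ["suspicious"] * 2 + ["malignant"] * 2
chosen = best_split(toy_tds, toy_cls)
print(chosen, ssv(toy_tds, toy_cls, chosen))
```

A tree is then grown by applying the same search recursively to each side of the chosen split, and pruned afterwards.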

12 SSV results
• Pruning degree is the only user-defined parameter.
• Finds TDS and C-BLUE as most important. Rules are easy to understand:
    IF TDS ≤ 4.85 ∧ C-BLUE is absent => Benign-nevus
    IF TDS ≤ 4.85 ∧ C-BLUE is present => Blue-nevus
    IF 4.85 < TDS < 5.45 => Suspicious
    IF TDS ≥ 5.45 => Malignant
• 98% accuracy on training, 100% test.
• 5 errors: vector pairs from C1/C2 have identical TDS & C-BLUE.
• 10xCV on all data: 97.5±0.3%
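The four threshold rules can be written as one short function. The inequality symbols did not survive in this transcript, so the treatment of the boundary values (TDS exactly 4.85 or 5.45) in the sketch is an assumption, as are the function and argument names.

```python
def classify(tds, c_blue_present):
    """Apply the four crisp SSV rules on TDS and C-BLUE.

    Boundary handling (<= 4.85, >= 5.45) is assumed, not confirmed by the slide.
    """
    if tds <= 4.85:
        return "Blue-nevus" if c_blue_present else "Benign-nevus"
    if tds < 5.45:
        return "Suspicious"
    return "Malignant"


# Hypothetical inputs, one per rule.
for tds_value, blue in [(4.2, False), (4.2, True), (5.1, False), (6.0, True)]:
    print(tds_value, blue, classify(tds_value, blue))
```

Note that C-BLUE matters only below the first threshold, which is why two attributes suffice for four classes.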

13 Methods used: MLP2LN
• Constructive, constrained MLP algorithm; weights are 0 or ±1 at the end of training.
• The MLP is converted into an LN, a network performing a logical function (Duch, Adamczak, Grąbczewski 1996).
• The network function is written as a set of crisp logical rules.
• Automatic determination of crisp and fuzzy "soft-trapezoidal" membership functions.
• Tradeoff: simplicity vs. accuracy explored.
• Tradeoff: confidence vs. rejection rate explored.
• Almost fully automatic algorithm.

14 MLP2LN results
• Rules very similar to those of SSV were found.
• Confusion matrix (rows: calculated class; columns: original class):

Calculated      Benign-nevus  Blue-nevus  Malignant  Suspicious
Benign-nevus    62            5           0          0
Blue-nevus      0             59          0          0
Malignant       0             0           62         0
Suspicious      0             0           0          62
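The overall accuracy implied by this confusion matrix can be checked with a few lines of Python. The matrix values are copied from the slide; the variable names and the per-class summary are additions for illustration.

```python
labels = ["Benign-nevus", "Blue-nevus", "Malignant", "Suspicious"]

# Rows: calculated class, columns: original class (values from the slide).
confusion = [
    [62, 5, 0, 0],
    [0, 59, 0, 0],
    [0, 0, 62, 0],
    [0, 0, 0, 62],
]

total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(len(labels)))
errors = total - correct
accuracy = correct / total

# Fraction of each original class that was recognised correctly.
per_class = {}
for j, name in enumerate(labels):
    column_total = sum(confusion[i][j] for i in range(len(labels)))
    per_class[name] = confusion[j][j] / column_total

print(total, errors, accuracy)
print(per_class)
```

All five errors are Blue-nevus cases labelled Benign-nevus, the confusion that the remarks on testing describe as carrying no risk.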

15 Methods used: FSM
• Feature-Space Mapping (Duch 1994).
• FSM estimates the probability density of the training data.
• Neuro-fuzzy system, based on separable transfer functions.
• Constructive learning algorithm with feature selection and network pruning.
• Each transfer function component is a context-dependent membership function.
• Crisp logic rules from rectangular functions.
• Trapezoidal, triangular, Gaussian functions for fuzzy logic rules.
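To make "separable transfer function" concrete, the sketch below builds a node whose activation is the product of one-dimensional membership functions, one per feature. The specific functional forms (in particular the Gaussian parametrisation), the function names and the example interval are assumptions chosen for illustration; the slides do not define them.

```python
import math


def rectangular(x, low, high):
    """Crisp membership: 1 inside [low, high], 0 outside."""
    return 1.0 if low <= x <= high else 0.0


def triangular(x, left, peak, right):
    """Fuzzy membership rising to 1 at `peak`; requires left < peak < right."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def gaussian(x, center, width):
    """Fuzzy membership with value 1 at `center` (assumed parametrisation)."""
    return math.exp(-((x - center) / width) ** 2)


def node_activation(x, memberships):
    """Separable transfer function: product of per-feature memberships."""
    activation = 1.0
    for value, membership in zip(x, memberships):
        activation *= membership(value)
    return activation


# Hypothetical crisp node over (TDS, C-BLUE): reads as a logical rule
# "TDS in [4.85, 5.45] AND C-BLUE absent".
crisp_node = [
    lambda tds: rectangular(tds, 4.85, 5.45),
    lambda c_blue: rectangular(c_blue, 0, 0),
]
print(node_activation((5.0, 0), crisp_node))
```

With rectangular components the product is 0 or 1, so each node is a conjunction of interval conditions; replacing them with triangular or Gaussian components gives graded, fuzzy rules.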

16 FSM results
• Rectangular functions used for C-rules.
• 7 nodes (rules) created on average.
• 10xCV accuracy on training 95.5±1.0%, test 100%.
• Committee of 20 FSM networks: 95.5±1.1%, test 92.6%.
• F-rules, Gaussian membership functions: 15 fuzzy rules, lower accuracy.
• The simplest solution should be strongly preferred.

17 Methods used: SBL
• Similarity-Based Methods (SBM): many models based on evaluation of similarity.
• Similarity-Based Learner (SBL): software implementation of SBM.
• Various extensions of the k-nearest neighbor algorithms.
• S-rules, more general than C-rules and F-rules.
• A small number of prototype cases is used to explain the class structure of the data.

18 SBL results
• SBL optimized by performing 10xCV on the training set.
• Manhattan distance, feature selection: TDS & C_Blue.
• 97.4 ± 0.3% on training, 100% test.
• S-rules of the form:
    IF (X sim P_i) THEN C(X) = C(P_i)
    IF (|TDS(X) - TDS(P_i)| + |C_blue(X) - C_blue(P_i)|) < T(P_i) THEN C(X) = C(P_i)
• Prototype selection left 13 vectors (7 for the Benign-nevus class, 2 for every other class): 97.5% or 6 errors on training (237 vectors), 100% test.
• 7 prototypes: 91.4% training (243 vectors), 100% test.
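A prototype-based S-rule of this form can be sketched as follows. The prototypes, their thresholds T(P_i) and the tie-breaking choice (take the nearest prototype among those whose rule fires) are invented for illustration; the slides give only the general form of the rule and the Manhattan distance on TDS and C_blue.

```python
def manhattan(x, p):
    """Manhattan (city-block) distance between two feature vectors."""
    return sum(abs(a - b) for a, b in zip(x, p))


def s_rule_classify(x, prototypes):
    """Return the class of the nearest prototype whose S-rule fires.

    prototypes -- list of (vector, threshold, label); a rule fires when
                  the distance from x to the vector is below the threshold.
    Returns None when no rule fires (the case is left unclassified).
    """
    fired = [
        (manhattan(x, vector), label)
        for vector, threshold, label in prototypes
        if manhattan(x, vector) < threshold
    ]
    if not fired:
        return None
    return min(fired)[1]


# Hypothetical prototypes over (TDS, C_blue); values are not from the data.
prototypes = [
    ((4.0, 0), 0.8, "Benign-nevus"),
    ((4.0, 1), 0.8, "Blue-nevus"),
    ((5.15, 0), 0.3, "Suspicious"),
    ((6.0, 0), 0.5, "Malignant"),
]

print(s_rule_classify((4.1, 0), prototypes))
print(s_rule_classify((4.2, 1), prototypes))
print(s_rule_classify((9.0, 0), prototypes))
```

Because each rule names a concrete case, the explanation offered to a user is "this lesion resembles prototype P_i", which is the sense in which a few prototypes describe the class structure.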

19 Results - comparison

Method                      Rules  Training %   Test %
SSV Tree, crisp rules       4      97.5±0.3     100
MLP2LN, crisp rules         4      98.0 all     100
GTS - final simplified      4      97.6 all     100
FSM, rectangular f.         7      95.5±1.0     100±0.0
knn + prototype selection   13     97.5±0.0     100
FSM, Gaussian f.            15     93.7±1.0     95±3.6
GTS initial rules           198    85 all       84.6
knn k=1, Manh, 2 feat.      250    97.4±0.3     100
LERS, weighted rules        21     --           96.2

20 Conclusions:
• TDS is the most important attribute; Color-blue is second.
• Without TDS, many rules are needed.
• Optimize TDS: automatic aggregation of features, e.g. with a 2-layered neural network.
• Very simple and reliable rules have been found.
• S-rules are being improved: prototypes obtained from learning instead of selection.
• The database is expanding; non-cancer data are needed.

