_____KOSYR 2001______ Rules for Melanoma Skin Cancer Diagnosis Włodzisław Duch, K. Grąbczewski, R. Adamczak, K. Grudziński, Department of Computer Methods,

_____KOSYR 2001______ Rules for Melanoma Skin Cancer Diagnosis
Włodzisław Duch, K. Grąbczewski, R. Adamczak, K. Grudziński, Department of Computer Methods, Nicholas Copernicus University, Torun, Poland.
Zdzisław Hippe, Department of Computer Chemistry and Physical Chemistry, Rzeszów University of Technology.

Content:
• Melanoma skin cancer data
• 5 methods: GTS, SSV, MLP2LN, FSM, SBL, and their results
• Final comparison of results
• Conclusions & future prospects

Skin cancer
Most common skin cancers:
• Basal cell carcinoma (Polish: rak podstawnokomórkowy)
• Squamous cell carcinoma (Polish: rak kolczystonabłonkowy)
• Melanoma: uncontrolled growth of melanocytes, the skin cells that produce the skin pigment melanin.
• Cause: too much exposure to the sun, sunburn.
• Melanoma accounts for 4% of skin cancers and is the most difficult to control; 1 in 79 Americans will develop melanoma.
• Almost 2000 percent increase since [year missing].
• Survival is now 84%; with early detection, 95%.

Melanoma skin cancer data summary
• Collected in the Outpatient Center of Dermatology in Rzeszów, Poland.
• Four classes: benign nevus, blue nevus, suspicious, and malignant.
• 250 cases, with almost equal class distribution.
• Each record in the database has 13 attributes.
• TDS (Total Dermatoscopy Score) - a single index.
• 26 new test cases.
• Goal: understand the data and find a simple description.

Melanoma AB attributes
• Asymmetry: symmetric spot, 1-axial asymmetry, or 2-axial asymmetry.
• Border irregularity: the edges are ragged, notched, or blurred. Integer from 0 to 8.

Melanoma CD attributes
• Color: white, blue, black, red, light brown, and dark brown; several colors may be present simultaneously.
• Diversity: pigment globules, pigment dots, pigment network, branched streaks, structureless areas.

Melanoma TDS index
• Combine the ABCD attributes to form one index.
• TDS index, ABCD formula:
  $\mathrm{TDS} = 1.3\,\mathrm{Asymmetry} + 0.1\,\mathrm{Border} + 0.5 \sum \{\mathrm{Colors}\} + 0.5 \sum \{\mathrm{Diversities}\}$
• Coefficients come from statistical analysis.
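The TDS index can be sketched in a few lines of Python. This is an illustration only: it assumes the standard ABCD-rule weights (1.3 for asymmetry, 0.1 for border, 0.5 for each color and each diversity present), and it assumes asymmetry is coded 0, 1, or 2 for symmetric, 1-axial, and 2-axial. The function name `tds` and that coding are not taken from the slides.

```python
# Names of the colors and diversities listed on the attribute slides.
COLORS = {"white", "blue", "black", "red", "light brown", "dark brown"}
DIVERSITIES = {
    "pigment globules", "pigment dots", "pigment network",
    "branched streaks", "structureless areas",
}


def tds(asymmetry, border, colors, diversities):
    """Total Dermatoscopy Score from the ABCD attributes.

    asymmetry   -- assumed coding: 0 symmetric, 1 one-axial, 2 two-axial
    border      -- integer from 0 to 8
    colors      -- iterable of color names present in the lesion
    diversities -- iterable of structure names present in the lesion
    """
    if asymmetry not in (0, 1, 2):
        raise ValueError("asymmetry must be 0, 1 or 2")
    if not 0 <= border <= 8:
        raise ValueError("border must be between 0 and 8")
    colors, diversities = set(colors), set(diversities)
    if not colors <= COLORS or not diversities <= DIVERSITIES:
        raise ValueError("unknown color or diversity name")
    # Each color or diversity that is present contributes 0.5.
    return (1.3 * asymmetry + 0.1 * border
            + 0.5 * len(colors) + 0.5 * len(diversities))
```

With these weights a symmetric lesion with a smooth border, one color, and one structure scores 1.0, and a lesion with every attribute at its maximum scores 8.9.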

Remarks on testing
• Test set: only 26 cases for 4 classes.
• Expected statistical accuracy should be estimated with 10-fold crossvalidation on all 276 cases (training + test). Not done with most methods!
• Risk matrices are desirable: identifying a blue nevus as a benign nevus carries no risk, but confusing it with a malignant lesion carries great risk.
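A risk matrix replaces the plain error count with a cost per kind of mistake. The sketch below shows the idea only; the cost values (0, 1, 10) and the function names are invented for illustration and are not taken from the slides.

```python
HARMLESS = {"Benign-nevus", "Blue-nevus"}


def risk(true_class, predicted_class):
    """Hypothetical cost of predicting predicted_class for a true_class case."""
    if true_class == predicted_class:
        return 0.0
    # Confusing the two harmless classes with each other carries no risk.
    if true_class in HARMLESS and predicted_class in HARMLESS:
        return 0.0
    # Calling a malignant lesion harmless is the most dangerous error
    # (the value 10 is an assumed cost).
    if true_class == "Malignant" and predicted_class in HARMLESS:
        return 10.0
    # Every other confusion gets a unit cost (also an assumption).
    return 1.0


def mean_risk(pairs):
    """Average risk over (true_class, predicted_class) pairs."""
    pairs = list(pairs)
    return sum(risk(t, p) for t, p in pairs) / len(pairs)
```

Two classifiers with the same accuracy can then be ranked by `mean_risk`, which penalizes the one that misses malignant cases.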

Methods used: GTS
• GTS covering algorithm (Hippe, 1997) + recursive reduction of the number of decision rules.
• Interactive: the user guides the development of the learning model.
• The combination of attributes that generates the learning model is selected on the basis of Frequency and Ranking.
• GTS can create many different sets of rules.
• In a complex situation it may be rather difficult to use.

GTS results
• GTS generated a large number (198) of rules.
• Experimentation made it possible to find the important attributes.
• Various sets of decision rules were generated:
  - TDS & C-blue & Asymmetry & Border (4 attributes, based on the experience of medical doctors)
  - TDS & C-blue & D-structureless-areas (3 attributes)
  - TDS & C-blue (2 attributes)
  - TDS (1 attribute) - poor results.
• Models with 2-4 attributes give 81-85% accuracy.
• Combining and generalizing these rules made it possible to select the 4 best simplified rules.
• Overall: 6 errors on the training set, 0 errors on the test set.

Methods used: SSV
• Decision tree (Grąbczewski, Duch 1999).
• Based on a separability criterion: the index of separability is maximized over split values for a continuous attribute, or over subsets of values for a discrete attribute.
• Easily converted into a set of crisp logical rules.
• Pruning is used to obtain the simplest set of rules that generalizes well.
• Fully automatic and very efficient; crossvalidation tests provide an estimate of statistical accuracy.
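The split search on a continuous attribute can be sketched as follows. The slides do not state the criterion; the formula used here is the one usually quoted for the SSV index (twice the number of pairs from different classes that the split separates, minus the number of same-class cases it splits apart) and should be read as an assumption, not as the authors' code. The names `ssv` and `best_split` are illustrative.

```python
from collections import Counter


def ssv(values, labels, split):
    """Separability of a split value for one continuous feature.

    Cases with value < split go left, the rest go right.
    """
    left = Counter(l for v, l in zip(values, labels) if v < split)
    right = Counter(l for v, l in zip(values, labels) if v >= split)
    n_right = sum(right.values())
    classes = set(labels)
    # Pairs (left case, right case) that belong to different classes.
    separated = sum(left[c] * (n_right - right[c]) for c in classes)
    # Same-class cases that end up on both sides of the split.
    torn_apart = sum(min(left[c], right[c]) for c in classes)
    return 2 * separated - torn_apart


def best_split(values, labels):
    """Try midpoints between consecutive distinct values; return the best.

    Requires at least two distinct values.
    """
    points = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
    return max(candidates, key=lambda s: ssv(values, labels, s))
```

A tree is grown by applying `best_split` recursively to each side, and pruning then removes the splits that do not help generalization.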

SSV results
• Pruning degree is the only user-defined parameter.
• Finds TDS and C-BLUE to be the most important attributes. Rules are easy to understand:
  IF TDS ≤ 4.85 ∧ C-BLUE is absent => Benign-nevus
  IF TDS ≤ 4.85 ∧ C-BLUE is present => Blue-nevus
  IF 4.85 < TDS < 5.45 => Suspicious
  IF TDS ≥ 5.45 => Malignant
• 98% accuracy on training, 100% on test.
• 5 errors: vector pairs from C1/C2 have identical TDS & C-BLUE.
• 10xCV on all data: 97.5±0.3%
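The four SSV rules translate directly into a small function. This sketch assumes the thresholds 4.85 and 5.45 and reads the comparison signs as "TDS ≤ 4.85" for the two nevus classes, "4.85 < TDS < 5.45" for Suspicious, and "TDS ≥ 5.45" for Malignant; the function name and the boolean coding of C-BLUE are illustrative.

```python
def classify_ssv(tds, c_blue):
    """Apply the four crisp SSV rules.

    tds    -- Total Dermatoscopy Score of the lesion
    c_blue -- True if the color blue is present, False otherwise
    """
    if tds <= 4.85:
        # Low TDS: the presence of blue separates the two nevus classes.
        return "Blue-nevus" if c_blue else "Benign-nevus"
    if tds < 5.45:
        return "Suspicious"
    return "Malignant"
```

The rules cover the whole TDS axis without overlap, so every case receives exactly one class. Two cases with the same TDS and C-BLUE values cannot be told apart, which is the source of the 5 training errors.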

Methods used: MLP2LN
• Constructive, constrained MLP algorithm; weights are 0 or ±1 at the end of training.
• The MLP is converted into an LN, a network performing a logical function (Duch, Adamczak, Grąbczewski 1996).
• The network function is written as a set of crisp logical rules.
• Automatic determination of crisp and fuzzy "soft-trapezoidal" membership functions.
• Tradeoff explored: simplicity vs. accuracy.
• Tradeoff explored: confidence vs. rejection rate.
• Almost fully automatic algorithm.

MLP2LN results
• Rules very similar to those found by SSV.
• Confusion matrix (rows: calculated class, columns: original class) over Benign-nevus, Blue-nevus, Malignant, and Suspicious; the cell counts are missing from the transcript.

Methods used: FSM
• Feature-Space Mapping (Duch 1994).
• FSM estimates the probability density of the training data.
• Neuro-fuzzy system, based on separable transfer functions.
• Constructive learning algorithm with feature selection and network pruning.
• Each transfer function component is a context-dependent membership function.
• Crisp logic rules come from rectangular functions.
• Trapezoidal, triangular, and Gaussian functions give fuzzy logic rules.
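A separable transfer function is a product of one-dimensional membership functions, one per feature. The sketch below shows three of the shapes named above and how they combine into one node. It illustrates the idea only and is not the FSM implementation; all names and parameter choices are assumptions.

```python
import math


def rectangular(x, lo, hi):
    """Crisp membership: 1 inside [lo, hi], 0 outside."""
    return 1.0 if lo <= x <= hi else 0.0


def triangular(x, lo, peak, hi):
    """Fuzzy membership rising from lo to peak, then falling to hi."""
    if x <= lo or x >= hi:
        return 0.0
    if x <= peak:
        return (x - lo) / (peak - lo)
    return (hi - x) / (hi - peak)


def gaussian(x, center, width):
    """Fuzzy membership with a Gaussian shape around center."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)


def node_activation(x, components):
    """Separable transfer function: product of per-feature memberships.

    x          -- sequence of feature values
    components -- one single-argument membership function per feature
    """
    activation = 1.0
    for value, membership in zip(x, components):
        activation *= membership(value)
    return activation
```

A node built only from rectangular components outputs 0 or 1 and reads as a crisp rule (an interval condition on each feature), while triangular or Gaussian components give a degree of membership and hence a fuzzy rule.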

FSM results
• Rectangular functions used for C-rules (crisp rules).
• 7 nodes (rules) created on average.
• 10xCV accuracy on training 95.5±1.0%, test 100%.
• Committee of 20 FSM networks: 95.5±1.1%, test 92.6%.
• F-rules (fuzzy rules) with Gaussian membership functions: 15 fuzzy rules, lower accuracy.
• The simplest solution should be strongly preferred.

Methods used: SBL
• Similarity-Based Methods (SBM): many models based on the evaluation of similarity.
• Similarity-Based Learner (SBL): software implementation of SBM.
• Various extensions of the k-nearest neighbor algorithm.
• S-rules (similarity rules) are more general than C-rules and F-rules.
• A small number of prototype cases is used to explain the class structure of the data.

SBL results
• SBL optimized by performing 10xCV on the training set.
• Manhattan distance, feature selection: TDS & C_blue.
• 97.4 ± 0.3% on training, 100% on test.
• S-rules of the form:
  IF (X sim P_i) THEN C(X) = C(P_i)
  IF (|TDS(X) − TDS(P_i)| + |C_blue(X) − C_blue(P_i)|) < T(P_i) THEN C(X) = C(P_i)
• Prototype selection left 13 vectors (7 for the Benign-nevus class, 2 for every other class): 97.5%, or 6 errors, on training (237 vectors), 100% on test.
• 7 prototypes: 91.4% on training (243 vectors), 100% on test.
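An S-rule assigns a case the class of a prototype P_i when its Manhattan distance to P_i on (TDS, C_blue) is below that prototype's threshold T(P_i). A minimal sketch follows. The prototype values and thresholds in the usage note are invented for illustration, and returning the nearest firing prototype (or `None` when no rule fires) is a design choice of this sketch, not something stated on the slides.

```python
from typing import NamedTuple, Optional, Sequence


class Prototype(NamedTuple):
    tds: float        # TDS value of the prototype case
    c_blue: int       # 1 if blue is present, 0 otherwise
    threshold: float  # T(P_i): maximum distance at which the rule fires
    label: str        # class of the prototype


def manhattan(x_tds: float, x_blue: int, p: Prototype) -> float:
    """Manhattan distance on the two selected features."""
    return abs(x_tds - p.tds) + abs(x_blue - p.c_blue)


def classify_s_rules(x_tds: float, x_blue: int,
                     prototypes: Sequence[Prototype]) -> Optional[str]:
    """Class of the nearest prototype whose S-rule fires, else None."""
    best = None
    for p in prototypes:
        d = manhattan(x_tds, x_blue, p)
        if d < p.threshold and (best is None or d < best[0]):
            best = (d, p.label)
    return best[1] if best is not None else None
```

For example, with the hypothetical prototypes `Prototype(4.0, 0, 0.9, "Benign-nevus")` and `Prototype(4.0, 1, 0.9, "Blue-nevus")`, a case with TDS 4.2 and blue present is at distance 0.2 from the second prototype and 1.2 from the first, so only the Blue-nevus rule fires.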

Results - comparison
(… marks a value missing from the transcript)

Method | Rules | Training % | Test %
SSV Tree, crisp rules | 4 | 97.5±… | …
MLP2LN, crisp rules | … | … all | 100
GTS - final simplified | … | … all | 100
FSM, rectangular functions | … | …±… | …±0.0
kNN + prototype selection | … | …±… | …
FSM, Gaussian functions | … | …±1.0 | 95±3.6
GTS initial rules | 198 | 85 all | 84.6
kNN, k=1, Manhattan, 2 features | … | …±… | …
LERS, weighted rules | … | … | …

Conclusions:
• TDS is the most important attribute; Color-blue is second.
• Without TDS, many rules are needed.
• Optimize TDS: automatic aggregation of features, e.g. with a 2-layered neural network.
• Very simple and reliable rules have been found.
• S-rules are being improved: prototypes obtained from learning instead of selection.
• The database is expanding; non-cancer data is needed.